Marc Fischer
@marc_r_fischer
Co-Founder of @InvariantLabsAI, PhD student at ETH Zurich. I care about security and reliability of AI systems. @[email protected]
😈 BEWARE: Claude 4 + GitHub MCP will leak your private GitHub repositories, no questions asked. We discovered a new attack on agents using GitHub’s official MCP server, which can be exploited by attackers to access your private repositories. creds to @marco_milanta (1/n) 👇
One of our engineers, Hemang, has created this nice example repo of an MCP Streamable HTTP implementation. This is where things are heading for MCP, post SSE. We are also adding support to Gateway right now. github.com/invariantlabs-…
Thanks @ai_risks for the generous prize! AgentDojo is the reference for evaluating prompt injections in LLM agents, and is used for red-teaming at many frontier labs. I had a blast working on this with @edoardo_debe @JieZhang_ETH @marc_r_fischer @lbeurerkellner @mbalunovic
We are proud to share that AgentDojo, an Invariant research project done with @ETH, has won the first price of the @ai_risks SafeBench competition. We truly appreciate this recognition from the community. Learn More: invariantlabs.ai/blog/agentdojo…
Great write-up of MCP security, including our research from @InvariantLabsAI.
MCP is the hottest thing in AI right now, but people aren't really talking about the security implications... I covered a recently discovered exploit and mitigations on the @thenewstack today: thenewstack.io/building-with-…
We recently shipped a lot of updates to mcp-scan: - whitelisting of tools - Improvements to the server (reducing false-positives, improving detection) - run via npm/npx Much more coming soon! github.com/invariantlabs-… #mcp
I think Simon raises an important point here. LLM and agent security cannot be solved by a simpler classifier. Instead, Guardrails focuses on detecting guardrail violations on a behavioral level. It analyzes the data flow and active agent context, to make sure, that even if a…
It uses this model which isn't fit for purpose - but I don't believe that ANY trained model can credibly detect attacks well enough to be worth recommending huggingface.co/protectai/debe…
4/ How to safeguard? - Make sure only trusted MCP servers are being downloaded and used - Keep minimal funds in your crypto wallet MCP - Allow minimal access for MCP actions - Use MCP-Scan
🚀🔒 We created a security scanner to detect MCP attacks. Please check it out, and give feedback. * Supports Claude, Cursor, Windsurf • Checks for tool poisoning • Checks for rug pull (tool hashing) • Detects cross-origin violations (shadowing) uvx mcp-scan@latest
After covering MCP vulnerabilities over the last few days, today, we are launching MCP-scan, a security scanner to detect MCP attacks. Run it now: uvx mcp-scan@latest 🧵
🚀🔒 We created a security scanner to detect MCP attacks. Please check it out, and give feedback. * Supports Claude, Cursor, Windsurf • Checks for tool poisoning • Checks for rug pull (tool hashing) • Detects cross-origin violations (shadowing) uvx mcp-scan@latest
🛡️Thoughts on the MCP vulnerability and why it's not an easy fix (1/n) To stay updated about agent security, please follow and sign up for early access to Invariant below. We have been working on this problem for years (at Invariant and in research). invariantlabs.ai/guardrails