Luca Beurer-Kellner
@lbeurerkellner
working on secure agentic AI @invariantlabsai PhD @the_sri_lab, ETH Zürich. Prev: @lmqllang and @projectlve.
Attackers are playing a new game — automating rule-bending with AI. MCPs can introduce insecure code that looks legit on paper. Tools like MCP-Scan from @InvariantLabsAI help catch these hidden risks in real time. Check out the CSO breakdown: csoonline.com/article/401522…
Check my video on “Design Patterns for Securing LLM Agents against Prompt Injections” by @lbeurerkellner et all, with live code demos (repo in the comments). This also includes an implementation of Dual LLM by @simonw and CaMeL by @edoardo_debe et all. youtu.be/2Er7bmyhPfM
Thrilled to share a major step forward for AI for mathematical proof generation! We are releasing the Open Proof Corpus: the largest ever public collection of human-annotated LLM-generated math proofs, and a large-scale study over this dataset!
🚨 Security Advisory: Anthropic's Slack MCP Server leaks data via link unfurling ☠️ See a demo exploit with Claude Code connected to the MCP server, and how a prompt injection attack can leak developer secrets. Watch and learn!
We’re proud to announce we’ve acquired @InvariantLabsAI to deepen our defense against agentic AI threats. Invariant Labs joins Snyk Labs to advance real-time protection for AI-native apps. One platform. Full AI security. Learn more: bit.ly/3GdBTMk
Very important read. As MCP leads to broad adoption of agent systems, user education and posture are absolutely key.
If you use "AI agents" (LLMs that call tools) you need to be aware of the Lethal Trifecta Any time you combine access to private data with exposure to untrusted content and the ability to externally communicate an attacker can trick the system into stealing your data!
🌊Honored to announce that I was invited to present a talk at the European Lighthouse on Secure and Safe AI, GA 2025 @elsa_lighthouse If you are in and around Brussels this week and you want to meet and discuss AI agent safety, let me know. The talk will be on Wed morning.

There’s incredible potential in combining LLM-based code generation (“vibe coding”) with e.g. model-driven SWE. Also promising: programming languages and libraries designed specifically for both LLM generation and human readability. Still a lot of greenfield here.
Re: vibe coding, our field has a failure of imagination.
LLMs won’t tell you how to make fake IDs—but will reveal the layouts/materials of IDs and make realistic photos if asked separately. 💥Such decomposition attacks reach 87% success across QA, text-to-image, and agent settings! 🛡️Our monitoring method defends with 93% success! 🧵
Advanced prompt injection targets the LLM’s core logic, aiming to not only make it output some weird things but also to manipulate how the model interprets complex data and use its internal tools.
The ZombAIs have arrived in Codex! Prompt injection to C2. Be careful out there! This PoC uses a domain from the Common Dependencies allowlist when enabling restricted Internet access That allowed for a compromise via indirect prompt injection and getting command and control