Piotr Ryciak

AI IDEs and coding agents expand the practical attack surface of development workflows by introducing new paths from untrusted workspace inputs to high-impact actions. This talk presents a catalog of exploitation patterns derived from vulnerability research across major AI-assisted IDEs and agents, including OpenAI Codex, Amazon Kiro, Google Antigravity, Cursor, and others, with a mix of issues already patched and others in active remediation. We organize findings by attacker effort and trigger model: zero-click paths, one-click paths, autorun behavior, and time-delayed execution. The talk is demo-driven and then generalizes beyond the demos to a repeatable playbook and checklist that security teams and builders can apply to assess and harden any AI-assisted IDE deployment.
When developers adopt AI-assisted IDEs like Cursor, Amazon Kiro, OpenAI Codex, or Google Antigravity, they're not just adding a productivity tool. They're introducing a new agent — one that can read files, execute commands, browse the web, and operate across their entire workspace — often without meaningful human supervision between actions.
Mindgard's vulnerability research team, building on our decade-long expertise in AI security, has conducted in-depth analysis across the major AI-assisted development environments. What we found confirms a pattern we've documented across the broader AI ecosystem: the more capable an AI agent becomes, the more exploitable its trust model is.
[un]prompted is an AI security practitioner conference focused on real-world offensive and defensive research, bringing together researchers, red teamers, and security leaders to explore how emerging AI systems are actually being exploited and secured in practice. Piotr Ryciak, AI Red Teamer at Mindgard, presented “Vibe Check: Security Failures in AI-Assisted IDEs,” a deeply technical, demo-driven session that dissected how modern coding agents introduce new attack paths from untrusted workspace inputs to high-impact execution. His talk went beyond isolated bugs, presenting a structured taxonomy of exploitation patterns, from zero-click to time-delayed attacks, derived from research across major AI IDEs and agents. The session emphasized that these vulnerabilities are not edge cases but systemic, offering security teams a repeatable playbook to assess and harden AI-assisted development environments.
A lot of developers run AI coding agents unsandboxed on their local machines. One crafted repo file can turn that setup into arbitrary code execution or data exfiltration. We've been here before — the browser wars taught us sandboxing the hard way. AI coding agents are making the same mistakes. Happy to share the talk I gave at [un]prompted where I break it all down.
Download the slides + demo videos here: https://lnkd.in/dgX6tZes
Traditional IDE security was straightforward — a developer's environment was largely isolated, and untrusted input (a file, a URL, a dependency) had limited ability to trigger meaningful actions. AI agents fundamentally break this model.
An AI coding agent is, by design, supposed to take initiative. It reads your workspace, infers intent, runs commands, makes web requests, and produces outputs — all without a one-to-one human prompt for each action. This is what makes it useful. It's also exactly what makes it dangerous when that initiative can be hijacked.
The core problem is the lethal trifecta of (1) simultaneous access to untrusted input, (2) private data, and (3) external communication channels. When all three conditions are met — and in most AI IDEs they are, by default — data exfiltration and arbitrary code execution become structurally achievable.
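The trifecta can be made concrete as a simple capability check over an agent's tool set. The sketch below is illustrative only (the tool names are hypothetical, not any vendor's API): it flags a configuration in which the agent can simultaneously be fed attacker-controlled input, reach private data, and communicate externally.

```python
# Illustrative sketch of the "lethal trifecta" as a capability check.
# Tool names are hypothetical, not taken from any specific IDE.

UNTRUSTED_INPUT = {"read_file", "fetch_url"}    # channels an attacker can write to
PRIVATE_DATA = {"read_file", "read_env"}        # access to secrets and source
EXFILTRATION = {"fetch_url", "run_command"}     # ways data can leave the machine

def lethal_trifecta(tools: set[str]) -> bool:
    """True when all three conditions hold at once for a given tool set."""
    return (bool(tools & UNTRUSTED_INPUT)
            and bool(tools & PRIVATE_DATA)
            and bool(tools & EXFILTRATION))

# A typical default tool set in an AI IDE meets all three conditions.
default_ide_agent = {"read_file", "run_command", "fetch_url"}
print(lethal_trifecta(default_ide_agent))  # → True
```

Removing any one leg of the trifecta (for example, denying outbound network access) breaks the structural condition, which is why the hardening guidance later in this piece targets individual legs rather than the model itself.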
Mindgard security researchers have surfaced vulnerabilities across all major AI-assisted IDEs. These can be organized by attacker effort and trigger model — from attacks requiring zero user interaction to those that lie dormant until a specific condition is met.
Static analysis (SAST) tools scan code; dynamic analysis (DAST) tools probe application inputs and outputs. Neither is designed to reason about the behaviour of an embedded AI agent operating across your filesystem, shell, and browser simultaneously.
The attacks we've documented don't exploit a buffer overflow or an unpatched library. They exploit the intent-following behaviour of a language model — its tendency to reason toward completing a task, even when doing so means working around the restrictions placed on it. When Antigravity's agent wanted to access a .env file that its file-reading tool couldn't reach, it didn't fail — it reasoned its way to using run_command as an alternative path. That kind of adaptive exploitation is invisible to conventional tooling.
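The bypass pattern is easy to reproduce in miniature. The sketch below (hypothetical tool names, not Antigravity's actual implementation) shows a file-reading tool with a deny-list sitting next to a shell tool that shares no such policy — so the restriction constrains one path to the data while leaving another wide open.

```python
# Illustrative sketch: a per-tool restriction that a command-running tool
# does not share. Tool names and policy are hypothetical.
import subprocess
from pathlib import Path

BLOCKED = {".env"}  # files the read tool refuses to open

def read_file(path: str) -> str:
    """File-reading tool with a deny-list on sensitive filenames."""
    if Path(path).name in BLOCKED:
        raise PermissionError(f"{path} is blocked")
    return Path(path).read_text()

def run_command(cmd: list[str]) -> str:
    """Shell tool with no file-level policy: the deny-list lives only
    in read_file, so the same data is one tool call away."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout

Path(".env").write_text("API_KEY=secret")
# read_file(".env") raises PermissionError...
print(run_command(["cat", ".env"]))  # ...but the shell path returns it anyway
```

An agent reasoning toward task completion will find this alternative path on its own; the policy has to be enforced at a layer both tools pass through, not per tool.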
Research across these platforms has produced a repeatable set of hardening actions that security teams and developers can apply to any AI coding agent deployment.
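One such hardening action can be sketched as code: route every shell invocation through an allowlist, and require explicit human approval before anything outside it runs. This is a minimal illustration under assumed names (`ALLOWED`, `guarded_run`), not a production control.

```python
# Minimal sketch of a human-approval gate for agent shell commands.
# The allowlist and function names are illustrative assumptions.
import shlex
import subprocess

ALLOWED = {"ls", "git"}  # commands the agent may run without asking

def guarded_run(command: str, approve=input) -> str:
    """Run a shell command, pausing for human approval when the
    executable is not on the allowlist."""
    argv = shlex.split(command)
    if argv[0] not in ALLOWED:
        answer = approve(f"Agent wants to run {command!r} [y/N]: ")
        if answer.strip().lower() != "y":
            raise PermissionError(f"blocked: {command}")
    return subprocess.run(argv, capture_output=True, text=True).stdout
```

The design choice matters: the gate sits below the agent, in the tool-execution layer, so a model that reasons its way around one tool's restrictions still hits the same checkpoint.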
What's happening in AI-assisted IDEs is a concentrated version of a challenge playing out across the entire enterprise AI landscape. As organisations race to deploy agentic AI — in development tools, in customer-facing products, in internal automation — the attack surface is expanding faster than security controls are being put in place.
The vulnerabilities found in Cursor, Kiro, Antigravity, and Codex are not bugs that will eventually be patched away. Some have been fixed. Others are being addressed. But the underlying structural condition — agents with broad system access operating on untrusted input with limited human oversight — is the product design, not a flaw in it. Security has to be engineered in from the start, not bolted on after researchers demonstrate exploitation.