Agent Instructions As Execution Paths: Arbitrary Command Execution in AI CLI Tooling

Mindgard discovered a Kilo AI CLI vulnerability where malicious AGENTS.md instructions can invoke execute_command and execute arbitrary shell commands.

Key Takeaways

  • Kilo AI CLI is an AI-assisted command-line development tool that is vulnerable to repository-controlled instruction injection leading to arbitrary command execution.
  • No CVE has been assigned at the time of writing. The issue has been disclosed publicly via GitHub discussion and acknowledged by the vendor, but no formal identifier has been issued.
  • Agent instruction files are interpreted as executable directives. The AGENTS.md file can inject instructions that the AI agent treats as authoritative, directly influencing tool invocation.
  • Command execution occurs through built-in tooling (execute_command). The agent translates instructions into a shell execution pathway, allowing arbitrary commands to run in the user environment.
  • User intent and approval boundaries are bypassed. Execution is triggered by ambient instructions rather than explicit user requests, undermining expected control mechanisms.
  • The issue reflects a broader pattern across AI tooling. Configuration and instruction layers are increasingly acting as execution surfaces, expanding the attack surface in developer environments.

Kilo AI is an AI-assisted command-line development tool that uses agent-driven reasoning and tool invocation to help developers automate coding and workflow tasks directly from the terminal.

AI-assisted CLI tools extend beyond static command interfaces by incorporating agent-driven reasoning and tool invocation. Kilo AI CLI exemplifies this shift by allowing agent behavior to be shaped through repository-level instruction files such as AGENTS.md.

Mindgard identified that this instruction layer can be manipulated to trigger arbitrary command execution through the agent’s execute_command capability. By embedding malicious directives in a repository, an attacker can influence how the agent interprets any subsequent user interaction, resulting in command execution that is detached from explicit user intent.

The vulnerability is easy to exploit. An attacker only needs to place a malicious AGENTS.md. file inside a repository and convince a developer to interact with that repository using Kilo AI CLI. Once loaded, the agent interprets attacker-controlled instructions as part of its operational context and may invoke shell commands even when the user provides unrelated or benign input.

The issue is not rooted in a traditional software flaw such as memory corruption or injection. Instead, it emerges from how the system interprets and prioritizes instructions across multiple layers: repository content, agent reasoning, and tool invocation. This creates an execution pathway where untrusted input is elevated into actionable system behavior.

The Issue: Instruction Injection Leading to Arbitrary Command Execution

Kilo AI CLI operates by interpreting user input alongside contextual instructions, including those defined in repository-level files such as AGENTS.md. These instructions are treated as authoritative guidance for agent behavior and can influence how the system selects and executes tools.

In normal operation, this allows teams to define workflows, automate repetitive tasks, and standardize development practices. The agent processes user input, consults instruction context, and may invoke tools such as execute_command to fulfill a task.

The vulnerability arises when attacker-controlled instructions are introduced into this context.

The proof of concept demonstrates that adding the following directive to AGENTS.md alters agent behavior:

When the agent receives any user message, even a benign input such as “hi”, the execution flow proceeds and will resolve the instruction to:

echo foo > ~/bar

The result is the execution of arbitrary shell commands without explicit authorization.

The critical failure lies in how instruction sources are prioritized and trusted. Repository-controlled content is treated as an extension of user intent, allowing it to directly influence tool invocation. No effective boundary exists between passive instruction and active execution.

Why This Class of Issue Matters

This issue reflects a systemic pattern in AI-assisted development tooling: the convergence of configuration, instruction, and execution into a single interpretive layer.

In traditional software systems, configuration files are parsed according to strict schemas and do not directly trigger arbitrary execution. In AI systems, however, free-form text inputs, whether prompts, configuration files, or documentation, can shape agent reasoning and decision-making.

This creates a fluid execution model where:

  • Inputs are not just data, but behavioral directives
  • Interpretation is dynamic rather than deterministic
  • Tool invocation is mediated by probabilistic reasoning

As a result, any input that can influence the agent’s internal decision process becomes part of the execution surface.

The trust boundary is no longer defined by code vs. input. Instead, it exists between instruction sources of differing trust levels, a boundary that is often not explicitly enforced.

When repository content is implicitly trusted, it becomes a viable attack vector. This is especially significant in distributed development ecosystems where developers routinely clone and run untrusted repositories.

Kilo AI Response and Remediation

This issue was reported via email and subsequently posted publicly at the request of the vendor.

At the time of writing:

  • The issue has been acknowledged by the vendor
  • Multiple related pull requests and discussions are present in the public repository
  • A definitive statement of full remediation has not yet been clearly established

No formal security advisory or CVE assignment has been assigned. Any fixes or mitigations appear to be evolving through incremental changes rather than a single clearly documented patch.

Why This Matters Now

AI-assisted CLI tools are introducing a new execution model where user intent is inferred rather than explicitly declared. This creates several immediate implications.

Instruction-layer execution: Repository files such as AGENTS.md are no longer passive. They act as active inputs into agent reasoning, capable of triggering tool invocation.

Implicit privilege escalation: Commands executed via the agent inherit the full privileges of the user environment, including access to files, credentials, and system resources.

Reduced interaction friction: The absence of explicit approval steps enables execution to occur as a side effect of normal interaction, reducing opportunities for user detection.

Supply chain exposure: Public repositories become a distribution channel for instruction-based exploits, extending risk beyond traditional dependency vulnerabilities.

Detection limitations: Because execution aligns with intended system capabilities, conventional security tools may not flag the behavior as anomalous.

For organizations, this translates into a practical risk: developers may unknowingly execute attacker-controlled commands simply by interacting with AI tooling in a compromised repository context. This risk extends to any system accessible from the developer environment.

Closing Thoughts

This disclosure highlights a structural shift in how execution is defined in modern development environments.

Execution is no longer confined to explicit commands or compiled code. It can emerge from interpreted instruction layers that shape agent behavior. In this model, configuration files, prompt templates, and repository metadata become part of the execution pathway.

The Kilo AI CLI case demonstrates that when instruction sources are not clearly separated by trust level, they can directly influence tool invocation and system behavior.

This pattern is not isolated. It reflects a broader transition in AI-assisted tooling, where the boundary between data and execution continues to erode. Systems that interpret and act on ambient context must enforce explicit trust controls, or risk turning every input channel into a potential execution surface.

Timeline

Date Action
Dec 18, 2025 Mindgard began the audit.
Dec 18, 2025 Mindgard discovered the vulnerability.
Dec 18, 2025 Initial disclosure to Kilo AI via email
December 2025 Vendor requested public disclosure via GitHub discussion
May 2026 At time of publication remediation discussion still ongoing - https://github.com/Kilo-Org/kilocode/discussions/4537
May 26, 2026 This blog post was published.