Claude Code, Gemini CLI, GitHub Copilot Agents Vulnerable to Prompt Injection via Comments - SecurityWeek

Claude Code, Gemini CLI, GitHub Copilot Agents Vulnerable to Prompt Injection via Comments SecurityWeek [truncated: Google News RSS provides only a snippet, not full article

Detailed Analysis

Security researchers have identified a cluster of serious prompt injection and command injection vulnerabilities affecting three of the most widely deployed AI coding assistants — Anthropic's Claude Code, Google's Gemini CLI, and GitHub Copilot Agents — with exploitation vectors rooted primarily in GitHub's native collaboration features. The vulnerabilities allow malicious actors to embed adversarial instructions inside pull request bodies, issue comments, commit messages, README files, or code itself, which are then passed unsanitized into AI agent prompts. Once injected, these instructions can cause the affected agents to break out of their intended operational constraints and invoke privileged tools — including shell execution, credential helpers, and API clients — on behalf of the attacker. Confirmed impacts include theft of high-value secrets such as `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, and `GITHUB_TOKEN`; remote code execution; installation of backdoors; and broad exfiltration of repository data. Claude Code has been assigned at least one formal CVE (CVE-2026-31861) and carries three distinct command injection findings classified under CWE-78, involving unsanitized shell interpolation in its command lookup, editor invocation, and credential helper components.

The vendor responses have varied in speed and framing. Google patched the Gemini CLI issues within four days of disclosure through its Vulnerability Rewards Program, representing a relatively swift remediation cycle. Anthropic acknowledged the Claude Code findings but reportedly characterized some of the command injection behaviors as operating "by design," a position that security researchers have contested given the real-world exploitability demonstrated through proof-of-concept work. GitHub Copilot Agents' exposure — dubbed "Comment and Control" by researchers — has been flagged as part of a systemic problem with how GitHub Actions workflows embed untrusted platform data into AI prompts without sanitization guardrails. Across all three tools, a common pattern known as "PromptPwnd" was documented: untrusted inputs are directly interpolated into agent prompts, enabling jailbreaks that redirect agents toward unintended high-privilege operations. Researchers used a combination of automated Semgrep scanning and LLM-assisted triage — notably Claude 4 Sonnet — to surface the vulnerabilities at scale, with findings confirmed against Fortune 500 deployments.

The broader significance of these disclosures lies in the structural tension between AI agent capability and the attack surface inherent to collaborative software development platforms. GitHub's pull request and issue systems were designed for human-to-human communication, but as AI agents are granted increasing access to those same channels alongside powerful tool invocations — shell access, secret stores, CI/CD pipelines — the trust assumptions baked into those systems become adversarially exploitable. The attack surface is not incidental but architectural: the more capable and autonomous an AI coding agent becomes, the more damage a successful prompt injection can inflict. With over 30 CVEs now reported across coding assistant tools and GitHub Actions emerging as a principal exploitation vector, this represents not an isolated set of bugs but a systemic class of risk in AI-augmented development infrastructure.

Mitigations recommended by researchers include strict prompt sanitization before any untrusted GitHub data enters an agent's context window, aggressive allowlisting of tools available to agents in automated CI/CD contexts, mandatory secret rotation following any suspected exposure, and deployment of Opengrep detection rules to identify vulnerable interpolation patterns in existing workflow configurations. The episode also underscores a growing challenge for the AI security field: the same models being used to build and review code — including Claude — are simultaneously targets of adversarial manipulation through the codebases and repositories they are trusted to operate within. As agentic AI deployments accelerate across enterprise software development, the absence of robust input sanitization standards at the platform and SDK level leaves organizations exposed to an expanding category of supply chain and credential compromise attacks that traditional static analysis tools were not designed to detect.

Read original article →

Detailed Analysis

Don't Miss a Deploy