Detailed Analysis
Security researcher Aonan Guan and a team from Johns Hopkins University disclosed a series of prompt injection vulnerabilities affecting AI coding agents from three of the industry's most prominent players — Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and Microsoft's GitHub Copilot — all integrated with GitHub Actions. The exploits, revealed starting in October 2025, demonstrated that malicious instructions embedded within routine GitHub data such as pull request titles, issue bodies, and comments could hijack agent behavior, causing the tools to exfiltrate API keys, GitHub tokens, and other repository secrets. All three vendors acknowledged the findings through bug bounty payments: Anthropic paid $100 after upgrading the vulnerability's severity score from 9.3 to 9.4 in November 2025, GitHub paid $500 for one case, and Google paid an undisclosed amount. Despite the seriousness of these payouts — a CVSS score above 9.0 is classified as critical — none of the companies issued public advisories or Common Vulnerabilities and Exposures (CVE) identifiers, and none responded to media inquiries from The Register.
The mechanics of the attacks highlight a structural challenge in agentic AI design. These indirect prompt injection exploits succeeded because the affected agents did not reliably distinguish between trusted instructions and untrusted environmental data ingested during task execution. When an agent processes a GitHub pull request as part of its operational context, adversarial text embedded in that PR can be interpreted as a command rather than content. Researchers further demonstrated that attackers could erase their traces by reverting edits — closing PRs, deleting bot messages, or relabeling titles — making the intrusion difficult to detect after the fact. The fact that these agents operated inside GitHub Actions environments, which typically carry elevated access to repository secrets, significantly amplified the potential blast radius of each exploit.
The vendors' decision to handle these disclosures quietly — patching or updating documentation without issuing CVEs or public notices — raises serious concerns about the security posture of the broader AI agent ecosystem. Users running pinned or older versions of these tools, or those with unattended deployments in automated pipelines, have had no formal mechanism prompting them to upgrade or rotate credentials. Anthropic's response, for instance, amounted to quietly updating a "security considerations" section in its documentation — a measure unlikely to reach developers who are not actively monitoring changelogs. This approach stands in tension with standard software vulnerability disclosure norms, where a critical-severity flaw would typically trigger coordinated disclosure, vendor advisories, and remediation timelines communicated to the user base.
The broader significance of these findings extends well beyond GitHub integrations. Guan explicitly warned that the same attack pattern likely affects any AI agent that ingests untrusted external content — Slack bots, Jira-integrated agents, deployment automations, and customer-facing assistants that process user-submitted data. As enterprises accelerate adoption of agentic AI tools that operate with real permissions across production systems, the threat surface expands correspondingly. The absence of robust input sanitization, privilege separation, and auditability mechanisms in current agent frameworks reflects an industry that has prioritized capability over security hygiene. Researchers and practitioners are now urging organizations to treat all repository and third-party data as untrusted within agent workflows, enforce minimal secret scopes in CI/CD environments, and conduct proactive audits of any agent with write or exfiltration-capable permissions.
This episode arrives at a pivotal moment for AI agent deployment, when tools like Claude Code, Gemini CLI, and GitHub Copilot are transitioning from developer experiments to production infrastructure embedded in critical software supply chains. The pattern of paying bug bounties without public disclosure — effectively privatizing security information that affects a broad user base — may generate regulatory and reputational pressure on AI vendors to adopt more transparent vulnerability management frameworks akin to those long established in traditional software security. As governments and standards bodies increasingly scrutinize AI system safety, the handling of agent-level security flaws is likely to become a focal point for policy, particularly given the potential for cascading compromise in interconnected developer toolchains.
Read original article →