Anthropic, Google, Microsoft paid AI bug bounties – quietly - theregister.com

Anthropic, Google, Microsoft paid AI bug bounties – quietly theregister.com [truncated: Google News RSS provides only a snippet, not full article

Detailed Analysis

Anthropic, Google, and Microsoft each paid bug bounties to security researchers who discovered critical prompt injection vulnerabilities in their AI agents integrated with GitHub Actions, but all three companies did so without publishing public advisories or assigning CVEs. The vulnerabilities were identified by researcher Aonan Guan and a team from Johns Hopkins University, who demonstrated that Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and Microsoft's GitHub Copilot Agent could all be hijacked by malicious actors through manipulated GitHub data — including pull request titles, issue bodies, and comments. Because these agents treat external repository content as trusted task context, attackers can embed instructions that cause the agents to leak API keys and access tokens. Anthropic received the submission in October 2025, paid a $100 bounty in November, and quietly upgraded the vulnerability's severity rating from 9.3 to 9.4 while updating its documentation. Microsoft initially dismissed the Copilot finding as a known and non-reproducible issue before paying a $500 bounty in March 2026. Google paid an undisclosed amount with similarly minimal transparency.

The core technical failure across all three platforms is an inability to reliably distinguish between legitimate content and adversarially injected instructions — a foundational challenge in deploying large language models as autonomous agents in real-world, untrusted environments. Prompt injection exploits the same mechanism that makes LLMs useful: their responsiveness to natural language instructions. When an agent reads a pull request comment or issue body as part of its task context, it has no cryptographic or structural means of verifying whether those instructions originate from a legitimate user or a malicious actor. The fact that vulnerabilities with severity ratings above 9.3 — effectively critical on the CVSS scale — were addressed without public disclosure represents a significant departure from standard security practice, which typically involves coordinated disclosure and user-facing advisories to ensure that affected deployments can be identified and remediated.

The ramifications extend well beyond these three specific tools. Guan explicitly warned that the vulnerability class likely affects a broad ecosystem of AI agents that interact with GitHub Actions, including Slack bots, Jira agents, email automation systems, and deployment pipelines. This warning carries particular weight because such agents are increasingly being granted elevated permissions — the ability to commit code, trigger deployments, and access secrets — making them high-value targets for supply chain attacks. Users and organizations running pinned or self-hosted versions of these tools have no reliable way to learn they are running vulnerable software in the absence of public CVEs or advisories, meaning exposure windows could extend indefinitely without coordinated disclosure.

The quiet handling of these bounties reflects a broader tension in the AI industry between competitive reputational concerns and the security community's established norms of transparency. Traditional software vendors operating under frameworks like the Common Vulnerabilities and Exposures system have long accepted that public disclosure, while sometimes uncomfortable, is essential to enabling the broader ecosystem of defenders, auditors, and users to take protective action. AI companies, moving quickly to deploy agentic products and facing intense scrutiny over safety and reliability, appear to be prioritizing internal remediation over that transparency norm. The bounty amounts themselves — $100 from Anthropic and $500 from Microsoft for critical-severity vulnerabilities — also drew implicit criticism, as they are strikingly low relative to what traditional software vendors pay for comparable findings.

This incident arrives at a pivotal moment in the maturation of AI agent deployment, as organizations across industries begin integrating LLM-based agents into sensitive development and operations workflows. The vulnerability class demonstrated here is not an isolated edge case but rather a structural property of how current-generation language models process input — one that requires architectural solutions, not merely documentation updates. The security research community's findings, and the muted industry response, are likely to intensify calls for formal regulatory frameworks or industry standards governing vulnerability disclosure for AI systems, particularly those operating as autonomous agents with access to sensitive infrastructure.

Read original article →

Detailed Analysis

Don't Miss a Deploy