Hardening claude-code-action after the April 2026 Comment and Control CVE - actual YAML changes

A critical vulnerability disclosed in April 2026 demonstrated how a crafted PR title could steal authentication credentials from Claude Code running in GitHub Actions. The article outlines six specific security controls to harden workflows, including allowlisting tools rather than blocklisting them, scoping GitHub tokens to read-only, routing secrets through OIDC instead of storing static API keys, capping script loops, filtering comment actors, and implementing network-level restrictions. The author acknowledges that prompt injection remains an inherent risk requiring human oversight for sensitive operations like code merges.

Detailed Analysis

A critical security vulnerability disclosed in April 2026 exposed fundamental weaknesses in how Anthropic's Claude Code operates within GitHub Actions CI/CD pipelines. Dubbed "Comment and Control" and carrying a CVSS score of 9.4 Critical, the vulnerability was demonstrated by security researcher Aonan Guan, who showed that a single crafted pull request title was sufficient to exfiltrate both an `ANTHROPIC_API_KEY` and a `GITHUB_TOKEN` from a running Claude Code action. Notably, the attack vector was not unique to Claude — the same attack shape successfully compromised Gemini CLI and GitHub Copilot Agent, suggesting the vulnerability class reflects an industry-wide architectural weakness in agentic AI integrations rather than a flaw specific to any one product. Anthropic's own `security.md` documentation had long contained the candid admission that the action "is not designed to be hardened against prompt injection," a disclaimer that, according to the author, most tutorials and onboarding guides omit entirely.

Anthropic's response — a quiet commit (`25e460e`) adding `--disallowed-tools 'Bash(ps:*)'` — addressed only a narrow slice of the attack surface. The author's central technical critique is that blocklist-based defenses are categorically insufficient in this threat model: blocking `ps` does nothing to prevent equivalent credential exfiltration via `cat /proc/self/environ`, `printenv`, or `env | base64`. The recommended countermeasure is an inversion of the security model — moving from blocklisting dangerous tools to allowlisting only the specific, scoped tools required for each agent role. A review agent, for instance, should receive only `Read`, `Grep`, and `gh pr view:*`, with nothing beyond that surface available. This principle of least privilege, well-established in traditional security engineering, is presented here as the foundational shift that Anthropic's patch failed to make.

The broader hardening framework described in the article assembles six layered controls that collectively address the exfiltration chain. Scoping `GITHUB_TOKEN` to `read-all` at the workflow level eliminates the wide-scope token dumping vector that was demonstrated in the Copilot incident. Migrating from static `ANTHROPIC_API_KEY` secrets to OIDC-based authentication — routing Claude through AWS Bedrock or Vertex AI with role assumption — removes static credentials from the secret store entirely, eliminating both the exfiltration target and the post-incident rotation burden. Script loop caps and actor filtering address secondary injection vectors, while `harden-runner` in block mode (rather than audit mode) with an explicit allowed-endpoints list serves as a last-resort network egress control that prevents a compromised agent from successfully posting stolen data to an attacker-controlled server.

The vulnerabilities described here sit within a rapidly expanding documented landscape of Claude Code security issues. Research context reveals a cluster of related CVEs disclosed in early April 2026: CVE-2026-35020 through CVE-2026-35022 cover command lookup injection via the `TERMINAL` environment variable, editor path injection via shell metacharacters, and an auth helper injection capable of exfiltrating `.claude` directory credentials over HTTP — the last of which Anthropic's vulnerability disclosure program closed as "informative" and characterized as working as designed. CVE-2026-25723, patched in v2.0.55, involved improper validation of `echo | sed` pipes that bypassed the file write sandbox. These CVEs chain together in dangerous ways, with CVE-2026-35020 enabling the writing of a malicious `settings.json` that then facilitates CVE-2026-35022 exfiltration. The pattern across all of these issues reflects a consistent underlying problem: Claude Code's sandboxing model was built around usability assumptions that do not hold in adversarial environments.

The article's most important contribution may be its honest residual-risk section: none of these controls solve prompt injection at its conceptual root. File contents appearing in a diff constitute legitimate context the agent is designed to process, and a sufficiently crafted diff can still steer agent behavior even after every described mitigation is in place. This reflects a broader truth in the AI security field that distinguishes agentic systems from traditional software: the attack surface is not limited to inputs the system was never designed to receive, but extends to inputs it was explicitly built to interpret. The author's recommendation to keep humans in the loop for merge decisions acknowledges that deterministic security guarantees remain out of reach for systems whose core function is open-ended language understanding. As agentic AI integrations proliferate across software development workflows, the gap between what developers assume about CI/CD security and what these systems actually provide represents one of the most consequential near-term risks in applied AI deployment.

Read original article →

Detailed Analysis

Don't Miss a Deploy