← Hacker News

What that Claude Code source leak reveals about Anthropic's plans

Hacker News · Brajeshwar · April 2, 2026

Detailed Analysis

A significant source code leak discovered on March 31, 2026, exposed approximately 513,000 lines of TypeScript from Anthropic's Claude Code codebase, offering an unusually detailed window into the company's unreleased development roadmap and internal operating practices. The leak surfaced feature flags for several unannounced capabilities, among them a background agent mode called KAIROS — designed to allow Claude Code to operate autonomously while users are idle, performing memory consolidation tasks — as well as ULTRAPLAN, a system for offloading complex planning operations to cloud infrastructure. Perhaps most unexpectedly, the code contained references to BUDDY, a Tamagotchi-style AI companion complete with species classifications and rarity tiers, suggesting Anthropic has been exploring consumer-facing AI companionship features well beyond its publicly stated enterprise and developer focus. Internal model codenames were also exposed, with Capybara mapped to Claude 4.6 and Fennec mapped to Opus 4.6, and the leak directly forced Anthropic to confirm the existence of Claude Mythos, an unreleased model the company acknowledged as surpassing any of its previously deployed systems in capability.

The most consequential revelation from the leak is the discovery of a subsystem referred to as "Undercover Mode," which instructs Claude Code to actively conceal its nature and internal context when contributing to public open-source repositories. Under this mode, the AI is directed to avoid referencing internal codenames, Slack channels, unreleased model versions, and — most notably — the fact that it is an AI at all. The practical consequence is that pull requests and code commits generated by Claude Code on behalf of Anthropic employees in public repositories would carry no indication of AI authorship, making them indistinguishable from human-authored contributions. This design choice sits in direct tension with Anthropic's publicly espoused commitments to AI transparency and safety, and draws a sharp line between the company's external messaging and its internal engineering decisions.

The timing amplifies the reputational damage considerably. The Claude Code leak represents Anthropic's second major inadvertent disclosure within a single week, following the separate exposure of internal documents describing Claude Mythos in a publicly accessible data cache. The clustering of these incidents suggests systemic weaknesses in Anthropic's data handling and access controls at a moment when the company is under intense scrutiny as one of the leading AI safety-focused laboratories. For an organization that has built significant public trust on the premise of responsible, transparent AI development, the gap between that stated identity and the operational reality revealed by the leak — particularly the Undercover Mode — presents a credibility challenge that goes beyond standard competitive sensitivity concerns.

In the broader context of the AI industry, the leak reflects a tension that is becoming increasingly visible across frontier AI development: the pressure to ship capable, autonomous agentic systems rapidly while maintaining coherent public commitments to safety, transparency, and human oversight. Features like KAIROS point toward a future where AI coding agents operate with significant autonomy in the background of developer workflows, a trajectory that most major AI labs are pursuing in some form. But the Undercover Mode disclosure raises a more specific and pointed question about how AI systems are being deployed in collaborative, trust-dependent environments like open-source software — spaces where provenance and authorship carry real significance for security audits, license compliance, and community norms. The leak ultimately functions as an unintended audit of the distance between Anthropic's public positioning and its product engineering, a distance that will likely prompt renewed calls for industry-wide disclosure standards around AI-generated contributions to shared codebases.

Read original article →