Leaked Code for Anthropic’s Claude Code Tests Copyright Challenges in A.I. Era - The New York Times


Detailed Analysis

Anthropic's Claude Code AI coding agent became the center of a significant intellectual property and security incident on March 31, 2026, when a build misconfiguration caused the accidental publication of approximately 512,000 lines of unobfuscated TypeScript source code to the public npm registry. The leak originated through a faulty release of the `@anthropic-ai/claude-code` v2.1.88 package, which included a 59.8 MB source map file containing the complete client-side agent harness. Security researcher Chaofan Shou first disclosed the exposure on X, after which the code spread rapidly — downloaded from Anthropic's Cloudflare R2 bucket, mirrored across GitHub repositories that collectively accumulated over 84,000 stars and 82,000 forks, and ported into languages including Python and Rust. Anthropic confirmed that no user data was compromised in the incident, making it a proprietary code exposure rather than a privacy breach.
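As general background on why a single source map can expose an entire codebase: in the Source Map v3 format, a `.map` file is JSON whose optional `sourcesContent` array can carry the full text of every original file listed in `sources`. The sketch below shows a pre-publish check that flags such files. It is only an illustration under stated assumptions: the article does not describe Anthropic's build or release tooling, and the file names, the `package` listing, and both helper functions are hypothetical.

```python
import json


def embedded_sources(map_text: str) -> list[str]:
    """Return the source paths whose full text is embedded in a source map."""
    data = json.loads(map_text)
    sources = data.get("sources", [])
    contents = data.get("sourcesContent") or []
    # sourcesContent is positional: entry i holds the text of sources[i], or null.
    return [path for path, text in zip(sources, contents) if text is not None]


def risky_maps(package_files: dict[str, str]) -> dict[str, list[str]]:
    """Map each .map file in a package listing to the sources it embeds."""
    report = {}
    for name, text in package_files.items():
        if name.endswith(".map"):
            found = embedded_sources(text)
            if found:
                report[name] = found
    return report


# Hypothetical package contents (file name -> file text), for illustration only.
example_map = json.dumps({
    "version": 3,
    "sources": ["src/agent.ts", "src/tools.ts"],
    "sourcesContent": ["export const run = () => {};", None],
    "mappings": "",
})
package = {"cli.js": "/* bundled */", "cli.js.map": example_map}

print(risky_maps(package))  # {'cli.js.map': ['src/agent.ts']}
```

A check of this kind would typically run against the list of files about to be published, so that a source map with embedded sources fails the release instead of shipping.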

Anthropic responded by issuing DMCA takedown notices against GitHub repositories and forks hosting the leaked TypeScript, and some mirrors were removed. Containment is nonetheless impractical: the code now persists across hundreds of public repositories, and the decentralized nature of code hosting makes comprehensive removal effectively unachievable. Beyond the enforcement logistics, cybersecurity firm Zscaler reported that threat actors were actively analyzing and redistributing the leaked code, with malicious forks reportedly containing trojans, backdoors, and cryptominers. Those forks create a supply chain risk for developers who inadvertently install a compromised copy of the package.

The copyright enforcement campaign is complicated by a legally novel and ironic problem: Anthropic has publicly acknowledged that approximately 90% of Claude Code's source was itself generated by Claude. That detail intersects directly with the D.C. Circuit's March 2025 ruling in Thaler v. Perlmutter, which held that the Copyright Act requires a human author, so a work generated by AI alone does not qualify for copyright protection. The ruling raises serious questions about whether Anthropic can assert enforceable copyright over code it did not meaningfully author in the traditional legal sense. Additionally, a Python "clean-room rewrite" of the leaked code, dubbed `claw-code`, may be transformative enough to escape DMCA liability, since it reimplements the now publicly accessible logic in a new programming language rather than copying the leaked files directly.

The incident also places Anthropic in a contradictory legal position. The company has faced at least two significant intellectual property actions: an authors' class action over the use of copyrighted books in training data, which it agreed to settle for $1.5 billion in September 2025, and a scraping suit brought by Reddit in June 2025. Aggressively pursuing copyright litigation over the Claude Code leak risks producing precedents or legal arguments that undermine Anthropic's own fair use defenses in training data disputes, where the company's posture relies on broad interpretations of permissible use of third-party content. Legal strategists and observers note that opponents in those disputes would readily cite any courtroom victory Anthropic wins on Claude Code's copyright.

At a broader level, the Claude Code incident crystallizes one of the defining tensions of the current AI development era: the companies most reliant on expansive access to others' intellectual property for model training are also the most motivated to enforce strict copyright protections over their own outputs and proprietary systems. The leak invites scrutiny of whether existing copyright frameworks, designed for human authorship and conventional software development, are equipped to adjudicate disputes over code that is predominantly machine-generated, rapidly redistributed across decentralized platforms, and entangled with the same fair use debates in which its creators are embroiled. The outcome of any litigation stemming from this incident could establish consequential precedents governing the ownership and enforceability of AI-generated intellectual property across the industry.
