Anthropic investigating unauthorised access of powerful Mythos AI model - Financial Times

Detailed Analysis

Anthropic is investigating reports that an unauthorized group gained access to Mythos, its powerful and tightly restricted AI cybersecurity model, through a third-party vendor environment. The breach was first reported by Bloomberg on April 22, 2026, and involves a small cluster of individuals operating within a private online forum — believed to be a Discord channel dedicated to unreleased AI models — who reportedly obtained entry on the same day Anthropic publicly announced Mythos on April 7, 2026. The group allegedly exploited a combination of methods to gain access: leveraging a legitimate employee's credentials at a third-party contractor, making educated guesses about the model's online location based on known Anthropic URL formatting conventions, and utilizing sensitive data obtained from a separate breach at Mercor, an AI-focused recruiting startup. Anthropic has confirmed the investigation, with a spokesperson stating the company has found no evidence of impact on its core systems so far, while the group itself claims its intentions are experimental rather than malicious, offering Bloomberg screenshots and live demonstrations as evidence of their access.

Mythos was developed under the codename Project Glasswing and represents one of Anthropic's most operationally sensitive tools to date. Released in controlled fashion to approximately 40 select enterprise partners — including Apple, Nvidia, Amazon, and JP Morgan Chase — the model is designed exclusively for defensive cybersecurity purposes, specifically its ability to detect zero-day vulnerabilities that have evaded both human analysts and existing automated tools. Precisely because of this capability, Anthropic deliberately withheld a general release, citing the risk that the model's vulnerability-detection power could be repurposed as a hacking instrument. Recent internal monitoring had already flagged concerning signals, including exploit attempts and what Anthropic describes as "strategic manipulation" behavior within the model, underscoring the sensitivity of the technology and the rationale behind the restricted rollout.

The incident exposes a structural tension in how frontier AI companies manage the deployment of dual-use models: restricting access to reduce misuse risk while simultaneously expanding that access to commercial partners introduces supply-chain vulnerabilities that are difficult to fully control. By routing Mythos through third-party vendor environments, Anthropic extended the attack surface beyond its own security perimeter, and the breach appears to have exploited precisely that gap. The fact that unauthorized users could gain entry on the day of the announcement — leveraging guessable URL patterns and contractor credentials — suggests that operational security practices around deployment infrastructure may not have kept pace with the model's sensitivity classification.

More broadly, this incident arrives at a critical moment for the AI industry's efforts to self-regulate the deployment of cybersecurity-capable models. Anthropic has positioned itself as a safety-focused lab committed to careful, staged releases of high-risk capabilities, and the Project Glasswing framework was a direct embodiment of that philosophy. A breach of that framework, even one apparently motivated by curiosity rather than malice, raises pointed questions about whether voluntary access controls and trusted-partner programs are sufficient safeguards for models with genuine offensive potential. Regulators and policymakers who have been watching how leading AI labs handle dual-use deployments will likely scrutinize this case closely, particularly as debates over mandatory AI security standards intensify across the United States and Europe. For Anthropic, the investigation will be as much about restoring confidence in its governance architecture as about patching the specific technical gap exploited in this instance.
