Detailed Analysis
Anthropic has developed an advanced AI model called Mythos that demonstrates such exceptional and dangerous capabilities in cybersecurity exploitation that the company has declined to release it to the general public, opting instead for a tightly controlled preview program. Through an initiative called Project Glasswing, Anthropic is providing access exclusively to over 40 technology and cybersecurity firms, as well as security-focused organizations including the Open Source Security Foundation (OpenSSF) and the Apache Software Foundation. To support defensive applications of the model, Anthropic is backing participating organizations with up to $100 million in usage credits. CEO Dario Amodei has explicitly stated that broad public release will remain on hold until adequate safeguards exist to contain the model's most dangerous functionalities — a posture that marks one of the most significant voluntary restrictions a major AI lab has placed on its own technology.
The capabilities that prompted this caution are substantive and well-documented. The UK's AI Security Institute evaluated Mythos and confirmed it is significantly more capable of facilitating complex cyberattacks than comparable systems from OpenAI and Google, particularly against poorly defended targets. What distinguishes Mythos is not merely its raw power but its ability to perform multi-step attack chaining — linking multiple software vulnerabilities into coordinated, compounding exploits — a technique that previously required the skill of an elite human hacker. The model can also identify and exploit zero-day vulnerabilities in open-source codebases and reverse-engineer exploits from closed-source software, capabilities that dramatically expand the attack surface for both state and criminal actors. Anthropic's own internal assessments compound the concern: over 99% of vulnerabilities discovered during testing remain unpatched, meaning the knowledge encoded in Mythos maps closely onto real, live weaknesses in global infrastructure.
The broader threat environment in which Mythos arrives makes the stakes of its release decisions particularly acute. China has already reportedly leveraged Anthropic's existing models to automate surveillance operations targeting 30 entities, and cybercriminals have used AI systems to generate functional ransomware. Perhaps most alarming is the compression of the exploitation window: the average time between vulnerability disclosure and a functional attack has collapsed from 771 days in 2018 to under four hours today. Mythos, with its capacity to automate complex exploit chains at scale, could theoretically reduce that window further — or eliminate it entirely for certain classes of vulnerabilities. This context explains why even a controlled preview, rather than full release, represents a calculated and contentious risk management decision.
Anthropic's handling of Mythos situates the company at the center of a growing tension within the AI industry between competitive development pressure and safety-driven restraint. Notably, reports indicate that OpenAI and other major technology companies are developing models with comparable cybersecurity capabilities, raising the prospect of a race dynamic where unilateral restraint by one actor does little to reduce aggregate global risk. The implicit challenge embedded in the article's framing — contrasting Anthropic's cautious posture with OpenAI's continued development — reflects a structural dilemma the field has not yet resolved: whether safety-conscious withholding of a dangerous capability is strategically durable when competitors are converging on the same capability without equivalent hesitation.
Anthropic's Mythos moment may ultimately represent a defining test case for how AI labs navigate the publication and deployment of genuinely dual-use models at the capability frontier. The Project Glasswing approach — restricted access, defensive partnerships, significant financial subsidies for protective use — constitutes a novel governance framework that falls somewhere between full secrecy and open release. Whether this middle path proves adequate will depend on how well Anthropic can prevent lateral proliferation, how quickly the security community can deploy Mythos defensively relative to how fast adversaries can replicate its capabilities independently, and whether the broader industry coalesces around comparable access controls. The precedent being set here will likely shape how frontier AI developers handle future models whose capabilities outpace existing societal and institutional safeguards.
Read original article →