Claude Mythos - update and system card

Claude Mythos Preview is a new class of model, released as a gated research preview with access prioritized for defensive cybersecurity use cases. It features adaptive thinking and strong vision capabilities, accepts both image and text input, and excels at cybersecurity vulnerability detection, autonomous coding across the full engineering cycle, and long-running agent tasks sustained over extended periods. Its training data runs through December 2025, and it supports a wide range of input and output formats across multiple languages, including English, Mandarin Chinese, Spanish, and Japanese.

Detailed Analysis

Anthropic's Claude Mythos Preview represents a significant architectural and capability leap beyond the company's prior flagship models, and is positioned as a dedicated tool for high-stakes technical domains: cybersecurity, autonomous software engineering, and long-horizon agentic tasks. Released in early 2026 as a gated research preview (internally codenamed "Capybara" before it leaked in late March via a misconfigured CMS), the model introduces "adaptive thinking," an evolution of Anthropic's extended thinking paradigm that dynamically calibrates reasoning depth to task complexity rather than applying a fixed compute budget. The model accepts both image and text inputs, supports a wide range of output formats, and draws on a training corpus with a cutoff at the end of December 2025. Access is deliberately restricted, with priority given to defensive cybersecurity practitioners, a choice that reflects Anthropic's attempt to manage the model's pronounced dual-use risk profile.

The cybersecurity results reported for Claude Mythos are among the most striking in the model's public record. On CyberGym, a vulnerability analysis benchmark, Mythos scores 83.1%, compared to 66.6% for Claude Opus 4.6, a gap of 16.5 percentage points that represents a qualitative jump in automated security reasoning. In applied testing, the model identified thousands of high-severity zero-day vulnerabilities spanning every major operating system and web browser, as well as critical media libraries like FFmpeg, including codec-level bugs in H.264, H.265, and AV1 that had escaped detection across five million automated tests. Some of the vulnerabilities date back decades, including a 27-year-old flaw in OpenBSD. Mythos converts 72.4% of identified vulnerabilities into working exploits, achieving register control in 11.6% of cases, and can develop in hours what expert penetration testers typically require weeks to produce. As of the model's preview launch, 99% of these vulnerabilities remained unpatched.

To manage the risks inherent in deploying a system of this capability, Anthropic launched Project Glasswing, a coordinated pre-disclosure partnership with a consortium of major technology and security firms including AWS, Apple, Cisco, CrowdStrike, Google, Microsoft, NVIDIA, and Palo Alto Networks, among others. The initiative channels Mythos's vulnerability discovery capabilities toward remediation before public exposure, giving institutional defenders a window to patch critical systems ahead of any potential adversarial exploitation. This approach reflects a broader philosophy Anthropic has articulated around its most powerful models: restricted deployment now can help harden global infrastructure for a future in which equivalent capabilities will inevitably be more broadly accessible. The 244-page system card accompanying the release signals the depth of internal deliberation over the model's deployment conditions.

Claude Mythos arrives at a broader competitive and strategic inflection point in frontier AI development, where leading labs are deploying models not merely as productivity assistants but as autonomous technical agents capable of operating over multi-hour task horizons with minimal human intervention. The model's framing around "long-running agents," meaning coherent execution sustained across extended, evolving tasks, reflects an industry-wide shift toward agentic architectures in which AI systems manage complex workflows end-to-end rather than responding to isolated queries. In the cybersecurity domain specifically, Mythos marks what security researchers have characterized as a qualitative threshold: the arrival of AI that can independently traverse the full offensive security pipeline from discovery through exploitation, compressing timelines and lowering the expertise bar in ways that fundamentally alter the threat landscape. Anthropic's decision to pair that capability with tight access controls and a coordinated patching initiative represents one model for how labs might sequence society's exposure to dual-use AI breakthroughs. Whether such controls prove durable as similar capabilities proliferate across competing systems remains an open and urgent question.
