Anthropic CVP (Cyber Verification Program) — 6 evaluation runs across 4 Claude models. Family scoreboard live.

A third-party operator completed six evaluation runs of the same prompt suite across four Claude models under Anthropic's Cyber Verification Program, with the sixth run closing on April 26. A family synthesis of the results from all runs is now available, allowing performance comparison across the Claude model family.

Detailed Analysis

Anthropic's Cyber Verification Program (CVP) is the framework for a series of independent evaluations testing Claude model behavior on cybersecurity-relevant prompts. A third-party operator reports completing six evaluation runs spanning four distinct Claude models over a ten-day period ending April 26, 2026. The effort, documented at sunglasses.dev/cvp, culminates in a "family scoreboard": a comparative synthesis designed to surface behavioral differences across Claude's current model lineup when each model receives an identical prompt suite under CVP-authorized conditions. The announcement, shared on Reddit's r/Anthropic community, describes one of the more structured independent stress tests of Claude's cybersecurity posture conducted outside Anthropic's internal red-teaming.

The CVP itself is a free, application-based program through which vetted cybersecurity professionals can gain adjusted access to Claude for legitimate dual-use activities — including vulnerability exploitation research, offensive security tooling development, and similar tasks that Claude's default guardrails would otherwise restrict. Applicants must submit a Cyber Use Case Form through their organization's Claude.ai, Claude Code, or Anthropic API settings, and the program excludes organizations operating under Zero Data Retention accounts. The structure of the program reflects Anthropic's broader strategy of tiered access: rather than applying a single behavioral policy universally, the CVP creates a credentialed layer where professional context modifies the model's default restrictions in controlled, accountable ways.

The significance of running an identical prompt suite across multiple Claude models under CVP conditions lies in what such comparative data can reveal about model-to-model consistency and capability drift. As Anthropic iterates across its Claude family — which currently spans frontier reasoning models, faster mid-tier versions, and legacy variants — behavioral alignment in high-stakes domains like cybersecurity is not guaranteed to remain uniform. A family scoreboard approach allows observers to assess whether newer or larger models are more permissive, more restrictive, or simply more capable when handling dual-use cybersecurity requests, and whether Anthropic's safety tuning produces coherent cross-model behavior or introduces unexpected divergences.
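As a rough illustration of what a family scoreboard could compute, the Python sketch below tallies per-model refusal rates and lists the prompts on which models disagree. Everything in it is an assumption made for illustration: the model labels, prompt IDs, outcome categories, and function names are hypothetical placeholders, not data or methodology from the sunglasses.dev/cvp project.

```python
from collections import Counter, defaultdict

# Hypothetical records: (model label, prompt id, outcome).
# Placeholder values only; these are not results from any real run.
RUNS = [
    ("model-a", "p1", "complied"),
    ("model-a", "p2", "refused"),
    ("model-b", "p1", "complied"),
    ("model-b", "p2", "complied"),
    ("model-c", "p1", "refused"),
    ("model-c", "p2", "refused"),
]


def family_scoreboard(records):
    """Return per-model prompt counts and refusal rates."""
    counts = defaultdict(Counter)
    for model, _prompt_id, outcome in records:
        counts[model][outcome] += 1
    board = {}
    for model, tally in counts.items():
        total = sum(tally.values())
        board[model] = {
            "prompts": total,
            "refusal_rate": tally["refused"] / total,
        }
    return board


def divergent_prompts(records):
    """Return prompt ids on which the models did not all give the same outcome."""
    outcomes = defaultdict(set)
    for _model, prompt_id, outcome in records:
        outcomes[prompt_id].add(outcome)
    return sorted(p for p, seen in outcomes.items() if len(seen) > 1)


if __name__ == "__main__":
    for model, row in sorted(family_scoreboard(RUNS).items()):
        print(f"{model}: {row['refusal_rate']:.0%} refused of {row['prompts']} prompts")
    print("divergent prompts:", divergent_prompts(RUNS))
```

A refusal rate alone cannot separate "more permissive" from "more capable"; a real comparison would presumably also need some measure of response quality, which this sketch leaves out.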

This type of external, structured evaluation connects to a broader trend in AI development: the growing ecosystem of third-party auditing and red-teaming that operates adjacent to, but independently of, official model evaluation programs. As AI labs including Anthropic, OpenAI, and Google DeepMind release increasingly capable models, the demand for transparent, reproducible benchmark comparisons — particularly in sensitive domains — has accelerated. Anthropic's decision to create a formal program like the CVP, rather than leaving cybersecurity use cases entirely ungoverned or entirely blocked, implicitly invites this kind of structured external scrutiny, since CVP access provides both the authorization and the methodological legitimacy for such evaluations to occur at all. The six-run synthesis from this project, if methodologically sound, could serve as a useful reference point for security professionals evaluating which Claude variant is most appropriate for their specific defensive or research workflows.

