Claude Code tip: 10 seconds fix to avoid the Opus 4.7 token burn

Anthropic released Opus 4.7 on April 16 and silently switched active Claude Code sessions from Opus 4.6, causing token consumption to increase roughly fourfold through a more token-intensive tokenizer, elimination of automatic context compaction at 200K, and degraded context recall. Users can restore normal token usage by editing ~/.claude/settings.json to specify Opus 4.6. The unannounced change prompted over 20 GitHub issues within 24 hours, with minimal Anthropic response.

Detailed Analysis

Anthropic's silent rollout of Claude Opus 4.7 on April 16, 2026, triggered an immediate and widespread crisis among Claude Code users, with many reporting that their usage quotas — particularly Max plan 5-hour allocations — were exhausted within 30 minutes of normal work. The root cause was a compounding set of architectural changes introduced with 4.7 that were never communicated to active users. First, Anthropic automatically migrated existing Claude Code sessions from Opus 4.6 to 4.7 without notification. Second, the only available 4.7 variant ships with a 1 million token context window — five times the 200K window of 4.6 — and critically, it does not trigger the automatic context compaction that 4.6 employed at the 200K threshold. Third, Opus 4.7 uses an updated tokenizer that generates approximately 1.35× more tokens for identical input. The cumulative effect of these three factors produced an estimated 4× to 5× increase in token burn rate, devastating quota allocations that users had calibrated around 4.6's behavior.

The community-sourced fix, as described in the original post, is straightforward: pinning the model to `claude-opus-4-6` via a one-line addition to `~/.claude/settings.json` and restarting the session. This restores the 200K context window with its auto-compaction behavior, returning token consumption to expected levels. The fix gained rapid traction, corroborated by over 20 GitHub issues filed within 24 hours against the `anthropics/claude-code` repository, spanning complaints ranging from silent mid-session model switches and quota explosions to a broken model picker that displays 4.7 while setting 4.6, and a bash classifier hardcoded to an unavailable model ID. The volume and variety of issues points not only to the token burn problem but to a broader release quality concern around the 4.7 launch.

The performance regression data cited in the article adds a significant dimension beyond cost. Independent MRCR v2 benchmarks on context recall show Opus 4.7 regressing sharply from 4.6: at 256K tokens, recall dropped from 91.9% to 59.2%, and at the full 1M token window, from 78.3% to 32.2%. This means the 1M context that 4.7 defaults to is not only more expensive to fill but yields substantially worse retrieval performance over that extended window — a counterintuitive outcome that undermines the primary advertised benefit of the larger context. For developers working on long, multi-file agentic tasks, this regression could translate directly into degraded output quality precisely when they need the model to reason over large codebases.

The incident reflects a broader tension in the rapid deployment cycles characterizing frontier AI tooling in 2026. As Anthropic accelerates model iteration — moving from 4.6 to 4.7 within what appears to be a short release cadence — the gap between internal release readiness and the expectations of production-oriented developer users is widening. The absence of migration notes, the silent session switching, and the lack of timely Anthropic responses on GitHub issues all suggest that developer experience infrastructure has not scaled in proportion to model capability advancement. The research context corroborates this, noting that Claude Code now defaults to an `xhigh` effort level (raised from `high` in 4.6), adding another invisible cost multiplier on top of the tokenizer and context window changes — a layering of defaults that collectively represent substantial undisclosed changes to the cost profile.

Looking ahead, the community's rapid self-organization around this issue — diagnosing the problem, publishing a fix, and documenting regressions with benchmark data within 24 hours — underscores the maturation of the Claude Code developer ecosystem. The fix described in the article is a stopgap, however, and the deeper resolution will require Anthropic to either release a 200K-context variant of Opus 4.7 with auto-compaction, address the context recall regression, or implement transparent communication protocols around model transitions that preserve user-configured behaviors across updates. Until those structural changes are made, the episode serves as a case study in how silent defaults and compounding architectural changes can erode developer trust even when the underlying model capability may represent a genuine advancement.

Read original article →

Detailed Analysis

Don't Miss a Deploy