Detailed Analysis
A subscriber to Anthropic's "Max 5x" plan reported unexpectedly rapid usage consumption with Claude Opus 4.7 in Claude Code, describing two separate incidents in a single day. In the first, they hit their usage limit after a single message upon waking. In the second, 37% of their available usage was gone within 17 minutes of a session devoted solely to generating a simple PowerShell script. The user stressed that no background processes were consuming tokens (a cron-based session had logged zero overnight activity) and noted that they had used Claude Code since January without similar problems, which suggests a real departure from prior experience rather than a routine limitation. A later edit to the post added that downgrading the model tier did not resolve the issue: 100% of usage was consumed in under ten minutes.
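To put the reported figures on a common scale, the following minimal sketch projects how long a session would last if usage kept draining at the observed pace. It assumes a constant, linear burn rate, which the post does not establish; the function name `minutes_to_exhaustion` is purely illustrative.

```python
def minutes_to_exhaustion(percent_used: float, minutes_elapsed: float) -> float:
    """Project total minutes until 100% usage, assuming a constant burn rate."""
    if percent_used <= 0 or minutes_elapsed <= 0:
        raise ValueError("percent_used and minutes_elapsed must be positive")
    rate_per_minute = percent_used / minutes_elapsed  # percent of allowance per minute
    return 100.0 / rate_per_minute


# Incident two: 37% consumed in 17 minutes (figures from the user's report).
scripting_session = minutes_to_exhaustion(37, 17)

# Post-edit report: 100% in under ten minutes, so ten minutes is an upper bound.
after_downgrade = minutes_to_exhaustion(100, 10)

print(f"PowerShell session: about {scripting_session:.1f} minutes to exhaustion")
print(f"After downgrade: at most {after_downgrade:.1f} minutes to exhaustion")
```

Under that linear assumption, the scripting session implies roughly 2.2% of the allowance per minute, or about 46 minutes to exhaust it, while the post-edit report implies at least 10% per minute.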
The incident points to a broader tension in how Anthropic has structured Claude Opus 4.7's token economy. Opus 4.7 leans on extended thinking and more sophisticated agentic reasoning loops, both of which can generate substantial internal token overhead that is not readily visible to users during a session. Anthropic has introduced controls, including a new `xhigh` effort level, an advisory task budget mechanism, and a hard `max_tokens` ceiling, but these are developer-facing settings surfaced through the API, not consumer-facing transparency features built into Claude Code's interface. A user generating what they perceive as a trivial artifact, such as a single PowerShell script, may unknowingly trigger extensive internal reasoning chains that consume tokens at rates disproportionate to the observable output.
This disconnect between perceived task complexity and actual token consumption reflects a structural challenge in deploying frontier reasoning models at the consumer tier. As models grow more capable through internal chain-of-thought processing, the computational and financial cost of a single interaction can vary dramatically with the model's autonomous decisions about how deeply to reason, decisions the user neither directs nor observes. The Max 5x plan, by name, implies a quantifiable premium resource pool. If the model's internal reasoning can exhaust that pool in under twenty minutes of light scripting, the tier name stops telling the user anything useful about how much work the plan will cover.
The broader trend at play is the growing misalignment between model capability marketing and user-legible resource accounting. Anthropic, like other frontier AI labs, is racing to ship increasingly powerful reasoning systems while the tooling for transparent, real-time token consumption monitoring lags behind. Anthropic's own guidance suggests awareness of the issue: it explicitly recommends `high` or `xhigh` effort levels for coding tasks and warns that adjusting these controls can trade off model intelligence, a tradeoff the company has yet to fully surface to consumer-tier subscribers. Until per-session token dashboards, effort-level selectors, and task budget controls are exposed natively within products like Claude Code, reports of unexplained usage spikes are likely to persist and multiply as Opus 4.7 adoption broadens.
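As a rough illustration of what per-session accounting could look like, the sketch below keeps a running ledger of token counts per turn and reports the share of a budget consumed. Everything here is hypothetical: the `SessionLedger` class, the 200,000-token budget, and the per-turn figures are invented for the example, and nothing in the report indicates that Claude Code exposes such counts or that plan limits are defined this way.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLedger:
    """Hypothetical per-session token ledger; not part of any Anthropic product."""
    budget_tokens: int
    turns: list = field(default_factory=list)

    def record(self, input_tokens: int, output_tokens: int) -> None:
        # Store one turn's counts; output is assumed to include reasoning tokens.
        self.turns.append((input_tokens, output_tokens))

    @property
    def used(self) -> int:
        # Total tokens consumed across all recorded turns.
        return sum(i + o for i, o in self.turns)

    @property
    def percent_used(self) -> float:
        return 100.0 * self.used / self.budget_tokens

    def over(self, threshold_percent: float) -> bool:
        # True once usage reaches the given share of the budget.
        return self.percent_used >= threshold_percent


# Invented figures: a short turn, then one with heavy internal reasoning.
ledger = SessionLedger(budget_tokens=200_000)
ledger.record(input_tokens=12_000, output_tokens=3_000)
ledger.record(input_tokens=15_000, output_tokens=44_000)

print(f"{ledger.used} tokens used, {ledger.percent_used:.1f}% of budget")
if ledger.over(25):
    print("Warning: more than a quarter of the session budget is gone")
```

The point of the design is that a single reasoning-heavy turn dominates the total, which is exactly the kind of spike a user cannot see without a running tally.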