Claude Usage limit bug?

Reddit · BulogHD · May 1, 2026
A user asked how their Claude account had reached its usage limit when screenshots of the account's usage page showed usage statistics that appeared to fall short of it, suggesting a possible discrepancy between displayed usage and limit enforcement.

Detailed Analysis

A widespread bug in Claude's usage limit system has drawn significant attention across developer communities. Users on multiple subscription tiers, particularly the Max 5x and Max 20x plans, report that their 5-hour session quotas drain far faster than intended. The issue began surfacing prominently around March 23, 2026, and has persisted into May 2026. Its effects can be extreme: some Max 20x subscribers report their usage jumping from 21% to 100% on a single prompt, while Max 5x users find their expected 5-hour sessions exhausted within one to two hours. The Reddit post reflects a common experience, in which a user hits the usage ceiling even though the usage dashboard appears to show consumption well below the limit. Dozens of users have reported the same behavior across GitHub issues and community forums.

The technical roots of the bug appear to involve several factors. A primary driver is prompt cache invalidation in newer Claude models, particularly Opus 4.6, which offers a 1 million token context window and whose prompt cache expires after five minutes of inactivity. When a user returns to a session after even a brief break, the cached context may have expired, and the system may then re-process the entire conversation as uncached input, potentially 250,000 tokens or more. A large percentage of a session's quota is thereby burned on what amounts to a session resumption. The effect compounds for users running agentic or automated workflows through Claude Code, where iterative scripts and batch codebase operations generate token loads far beyond those of conversational usage. Anthropic has also acknowledged that it actively adjusts 5-hour session limits during peak demand hours (roughly 5–11am PT / 1–7pm GMT), so effective quotas are lower than advertised during high-traffic weekday windows.
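To make the cache-expiry effect concrete, here is a minimal sketch of how a metering system might charge a resumed session. It is a toy model, not Anthropic's actual accounting: the `billable_tokens` function, the 10% weight applied to cached tokens, and the 5,000,000-token session quota are all hypothetical values chosen for illustration. Only the five-minute expiry and the 250,000-token context come from the reports described above.

```python
CACHE_TTL_SECONDS = 5 * 60        # five-minute cache expiry, as reported above
CACHED_TOKEN_WEIGHT = 0.1         # ASSUMPTION: cached tokens count at 10% of full cost
SESSION_QUOTA_TOKENS = 5_000_000  # ASSUMPTION: hypothetical quota, not a real plan limit


def billable_tokens(context_tokens: int, new_tokens: int, idle_seconds: float) -> float:
    """Return the weighted tokens charged for one request in this toy model.

    If the user was idle for less than the cache lifetime, the existing
    context is charged at the discounted cached weight; otherwise the whole
    context is charged again at full weight.
    """
    cache_is_warm = idle_seconds < CACHE_TTL_SECONDS
    context_weight = CACHED_TOKEN_WEIGHT if cache_is_warm else 1.0
    return context_tokens * context_weight + new_tokens


# Same 250,000-token conversation, same 500-token follow-up prompt.
warm = billable_tokens(250_000, 500, idle_seconds=60)    # back after one minute
cold = billable_tokens(250_000, 500, idle_seconds=600)   # back after ten minutes

print(f"warm cache: {warm:,.0f} tokens ({warm / SESSION_QUOTA_TOKENS:.1%} of quota)")
print(f"cold cache: {cold:,.0f} tokens ({cold / SESSION_QUOTA_TOKENS:.1%} of quota)")
```

Under these assumed numbers, the identical follow-up prompt costs roughly ten times more after a ten-minute break than after a one-minute pause, which is the shape of the jump users describe even if the real weights and quotas differ.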

This is not Anthropic's first encounter with this class of problem. Less than a month before the current surge in reports, the company reset Claude Code rate limits for all users after a separate prompt caching bug caused similarly abnormal quota drain. That precedent is notable: Anthropic issued a system-wide reset in that instance, but as of early May 2026 it has not done so for the current bug. Support teams have stated that they cannot manually reset or bypass individual usage limits, and they direct affected users to wait for natural quota resets or upgrade their plans. That response has frustrated subscribers who pay premium Max-tier prices and receive materially less compute access than contracted.

The broader significance of this issue lies in what it reveals about the operational complexity of managing large-scale AI infrastructure with token-based metering systems. As Anthropic pushes Claude into increasingly agentic use cases — particularly through Claude Code, which is designed for extended, multi-step software development tasks — the gap between user expectations formed by chat-style interactions and the token economics of agentic workflows becomes a serious product reliability problem. A user running a codebase refactor loop has fundamentally different consumption patterns than one asking single-turn questions, and the current quota architecture does not appear to gracefully communicate or account for that distinction in real time.
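The difference between the two consumption patterns can be sketched with simple arithmetic. The example below assumes a loop in which every step re-sends the full, growing context and in which no caching applies; the function names and all token counts are hypothetical and are not measurements of Claude Code.

```python
def single_turn_tokens(prompt_tokens: int, reply_tokens: int) -> int:
    """Tokens processed by one standalone question and answer."""
    return prompt_tokens + reply_tokens


def agentic_loop_tokens(base_context: int, tokens_added_per_step: int, steps: int) -> int:
    """Total tokens processed when each step re-reads the whole context so far.

    ASSUMPTION: no caching, and each step appends a fixed number of tokens
    (tool output plus model reply) to the context.
    """
    total = 0
    context = base_context
    for _ in range(steps):
        total += context + tokens_added_per_step  # read the context, produce this step's tokens
        context += tokens_added_per_step          # the context grows for the next step
    return total


chat = single_turn_tokens(prompt_tokens=500, reply_tokens=800)
loop = agentic_loop_tokens(base_context=20_000, tokens_added_per_step=2_000, steps=30)

print(f"single-turn question: {chat:,} tokens")
print(f"30-step refactor loop: {loop:,} tokens")
```

Because the context is re-read on every step, the loop's total grows roughly with the square of the number of steps, which is why a quota sized around chat-style usage can disappear quickly under an agentic workload.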

The episode underscores a tension central to the current moment in AI product development: the race to expand model capabilities and context windows (Opus 4.6's 1M token context being a prime example) is outpacing the robustness of the billing and rate-limiting infrastructure that governs access to those capabilities. Until Anthropic patches the cache invalidation accounting, revises how agentic session quotas are calculated, or provides more granular real-time usage transparency, Max-tier subscribers using Claude Code for professional workflows face an unpredictable and unreliable service experience. That is particularly damaging given that the product's core value proposition rests on sustained, uninterrupted developer productivity.
