← Reddit

Limit reached in 5 minutes

Reddit · Elektrik-trick · April 9, 2026
This morning, Claude Code set a new record for the time limit. I ran a prompt that I use every now and then (for administrative tasks). Normally, it uses up about 4–6% of the 5-hour limit. This time, not only did it use up the entire 5-hour limit within 5

Detailed Analysis

Claude Code users across multiple subscription tiers began reporting in early 2026 that their session usage limits were being consumed at dramatically accelerated rates, with some subscribers exhausting the full 5-hour rolling cap within minutes of beginning work. The Reddit post in question exemplifies a pattern affecting a meaningful portion of the user base: a task that historically consumed 4–6% of the available session limit instead consumed the entire allotment in under five minutes, halting work mid-execution. Anthropic publicly acknowledged the situation, with the company stating that users hitting usage limits "way faster than expected" had become "the top priority for the team," a notable admission that the degradation was systemic rather than isolated.

Several converging technical and operational factors explain the sudden spike in token consumption. Claude Code's prompt cache carries a short 5-minute expiration window, meaning that even a brief interruption in an active session can invalidate a cached context, forcing the system to reprocess potentially hundreds of thousands of tokens from scratch upon resumption. A user on the Max 5 plan ($100/month) reported that simply resuming work after a cache expiry on a 250,000-token context consumed roughly 10% of a session limit in a single re-initialization. Compounding this, automated workflow users faced a particularly severe failure mode: rate-limit errors triggered silent retry loops that could drain an entire daily budget in minutes without any user-visible warning or interruption mechanism.

Anthropic also quietly reduced session allocations during weekday peak hours — specifically 5 a.m. to 11 a.m. PT — to manage infrastructure demand, a change disclosed not through official documentation but via a statement from engineer Thariq Shihipar. The reduction affected approximately 7% of users and was not formally announced at rollout, leaving many subscribers unaware that their effective quota had been reduced during the hours most likely to coincide with professional workday usage. A Pro subscriber paying $200 per year reported being able to use Claude productively for only roughly 12 days per month as a result, an outcome that starkly contradicts the value proposition of a paid productivity tool.

The episode highlights a structural tension in deploying large-scale AI coding assistants under subscription models: token consumption in agentic, long-context workflows scales non-linearly in ways that differ fundamentally from traditional software-as-a-service usage patterns. Unlike a word processor or a search engine, Claude Code's resource consumption is tightly coupled to task complexity, context window size, cache state, and network behavior — variables that are difficult for end users to monitor or control. The absence of real-time consumption visibility or automatic warnings before limit exhaustion transformed what might have been a manageable inconvenience into a workflow-stopping failure, particularly for developers running multi-step automated pipelines.

Broader trends in AI development make this problem increasingly acute. As frontier models grow more capable and are integrated deeper into developer workflows — handling code generation, test execution, repository navigation, and iterative debugging in extended autonomous sessions — the gap between user expectations around subscription pricing and actual infrastructure costs is likely to widen. Anthropic's response, prioritizing the issue at the team level and acknowledging the mismatch publicly, reflects a growing industry reckoning with the fact that agentic AI products require substantially different consumption accounting, quota design, and user communication strategies than their simpler, turn-based predecessors.

Read original article →