Anthropic investigating Claude Code usage limits hitting faster than expected

Detailed Analysis

Anthropic is actively investigating widespread complaints from Claude Code users who are exhausting their usage limits significantly faster than anticipated, a problem the company has acknowledged and designated as a top priority. Pro subscribers paying $200 annually have reported their token quotas depleting so rapidly that access is effectively available for only around 12 of every 30 days — limits maxing out each Monday and not resetting until Saturday. Users on Claude Max plans have similarly noted that 5-hour session windows are collapsing to just 1-2 hours under normal workloads, a marked regression from earlier behavior. Anthropic has publicly confirmed the issue but has not committed to a specific resolution timeline.

Technical investigation by the user community has surfaced two probable causes. First, users reverse-engineering the Claude Code binary identified a pair of bugs in the prompt caching system that effectively disable the cache, inflating token consumption by an estimated 10 to 20 times. A 5-minute cache expiration window compounds the problem, as even brief pauses during development sessions trigger full re-tokenization. Rolling back to version 2.0.61 of the npm package has served as a practical workaround for some affected users. Second, Anthropic appears to have introduced, without public announcement, more aggressive session-limit burn rates during peak hours — specifically weekdays between 5am and 11am Pacific Time — a change affecting roughly 7% of users, with Pro-tier developers running token-intensive tasks disproportionately impacted. The company maintains that weekly aggregate limits remain unchanged, but the redistribution of capacity within the week has functionally degraded day-to-day usability.

The episode exposes a structural transparency problem in how Anthropic communicates resource allocation to paying customers. The company does not disclose precise token limits, offering only vague guarantees such as "at least 5x" free-tier usage per session for Pro subscribers and 1.25x Pro for Team subscribers. This opacity leaves developers with no reliable way to plan workloads, budget sessions, or diagnose anomalous consumption — a dynamic that has driven users to crowdsource workarounds on Hacker News, Reddit, GitHub Issues, and Discord. The community-discovered fixes, including downgrading package versions and avoiding long conversation contexts, represent exactly the kind of informal compensatory knowledge-sharing that tends to emerge when vendor documentation and tooling fail. A similar rate-limit bug in February 2026 that required a manual reset suggests this is not an isolated incident but part of a pattern of reliability issues with the product's quota infrastructure.

The broader significance of the Claude Code usage-limit crisis lies in the competitive dynamics of the AI coding assistant market. Claude Code competes directly with GitHub Copilot, Cursor, and other tools that have made predictable, always-available access a baseline expectation. For professional developers, unpredictable capacity constraints are not merely an inconvenience but a workflow liability — one that shapes purchasing decisions and organizational adoption. Anthropic's decision to impose undisclosed peak-hour throttles without advance communication risks undermining trust at a critical moment when the company is scaling Claude Code toward enterprise use cases. The situation also highlights a tension inherent in flat-rate subscription pricing for large language model services: token consumption patterns vary enormously across users, and the economics of offering "unlimited" access to intensive coding workloads are structurally difficult to sustain without either transparent limits or tiered consumption pricing.

This incident fits into a wider trend of AI product companies grappling publicly with the gap between the aspirational framing of their subscription tiers and the operational realities of serving compute-heavy workloads at scale. OpenAI, Google, and others have all faced similar backlash over rate limits and capacity constraints as their products have matured beyond early adopters into mainstream developer tooling. What distinguishes the Claude Code situation is the degree to which the failure mode — a caching bug amplifying costs by an order of magnitude — appears to have been a preventable engineering regression rather than a simple capacity decision. How swiftly and transparently Anthropic resolves the issue will serve as a meaningful signal of its operational maturity as it positions Claude Code as a serious, enterprise-grade development tool.

Read original article →

Detailed Analysis

Don't Miss a Deploy