@kieranklaassen @AnthropicAI We've been landing token-efficiency improvements so

Anthropic announced token-efficiency improvements intended to help users accomplish more per session, but the community response reveals significant concern about unexpectedly rapid token consumption, particularly in Claude Code. Many users report hitting weekly limits within days despite unchanged workflows, and some report reaching 100% in minutes. Hidden usage, such as scheduled instances running repeatedly, explains at least some cases, which highlights the need for better usage analytics and per-session transparency. The discussion underscores that effective token management requires careful monitoring of tool usage, especially with agent-based features, and that some reported problems were resolved through users' own investigation (for example, the scheduled-instance issue).

Detailed Analysis

Anthropic's Claude Code product is facing sustained user frustration over token consumption rates, with a public Twitter thread revealing a widening gap between the company's messaging about efficiency improvements and the lived experience of paying subscribers. An Anthropic representative, identified as Boris Cherny (@bcherny) and likely an engineering lead on Claude Code, acknowledged the problem and solicited bug reports via the `/bug` command, framing the issue as one the team is working to resolve with "token-efficiency improvements." This public engagement suggests the complaints reached a threshold significant enough to warrant a direct, visible response from internal staff rather than routing users through standard support channels.

The user complaints cluster around several recurring patterns: session limits being exhausted far faster than expected, usage counters behaving inconsistently (reaching 100% abruptly after minimal prompting), and reset timers appearing longer than advertised. One user reported burning 11% of their session allocation on a single Sonnet prompt, while another described hitting their Max subscription cap during a five-minute task and being prompted to purchase "Extra Usage." Multiple users noted that the degradation appeared to coincide with a pricing restructuring, with one subscriber reporting a perceived 12x cost increase after switching between models. These specifics point to something systemic: in how tokens are counted and reported, in how context windows are managed across agentic tasks, or in how rate-limiting thresholds were recalibrated during a backend change.

The thread exposes a deeper structural tension in how AI coding tools are being sold and used. Claude Code, particularly in its Max subscription tiers, is being marketed as an "unlimited" or generously capped agentic coding environment, but users are discovering that MCP server integrations, multi-step agentic loops, and large context operations consume tokens at rates that rapidly exhaust even premium allocations. One commenter flagged MCP servers as a particularly aggressive source of token consumption, a technically plausible observation given that tool-call scaffolding, error handling, and memory retrieval all add substantial overhead beyond the visible prompt-response exchange. The absence of per-session analytics, a feature multiple users explicitly requested, leaves subscribers without the visibility needed to manage their consumption intelligently, turning the usage meter into, as one user put it, "a timer rather than a counter."
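To illustrate why a multi-step agentic loop can exhaust an allocation far faster than a single prompt, here is a minimal Python sketch. It assumes that every turn re-sends the full accumulated context and that no prompt caching or context compaction applies. The function name `cumulative_input_tokens` and all figures are hypothetical and are not taken from Claude Code's actual metering.

```python
def cumulative_input_tokens(turns: int, base_context: int, per_turn_growth: int) -> int:
    """Total input tokens across an agentic loop.

    Assumption (hypothetical model, not Anthropic's metering): each turn
    re-sends the whole accumulated context, and each tool call plus its
    result adds `per_turn_growth` tokens to the context for the next turn.
    """
    total = 0
    context = base_context
    for _ in range(turns):
        total += context             # the full context is sent on this turn
        context += per_turn_growth   # tool call and result are appended
    return total


if __name__ == "__main__":
    # Illustrative figures only: 5,000-token starting context,
    # 2,000 tokens of tool scaffolding and output added per turn.
    single = cumulative_input_tokens(turns=1, base_context=5_000, per_turn_growth=2_000)
    loop = cumulative_input_tokens(turns=20, base_context=5_000, per_turn_growth=2_000)
    print(f"single turn: {single} tokens, 20-turn loop: {loop} tokens")
```

Under these assumed figures, a 20-turn loop accounts for 480,000 input tokens against 5,000 for a single turn, because the total grows roughly quadratically with the number of turns. Real consumption depends on caching, compaction, and how the provider counts tokens.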

The competitive implications of this friction are already visible within the thread itself. Multiple users cited the episode as the catalyst for switching to or evaluating alternatives, including GPT-based tools, open-source models, and competing agentic harnesses. One user described abandoning their Max subscription and purchasing a Mac Mini to run local models, a notable escalation that reflects both economic frustration and the broader availability of capable open-weight alternatives. Another highlighted the ability to switch mid-session between Claude Opus and GPT models as a key reason for migrating to a competing product. This churn risk is particularly acute because the users most affected are heavy, technically sophisticated developers, precisely the cohort whose endorsement drives word-of-mouth adoption in the developer tools market.

The episode illustrates a recurring challenge for frontier AI companies commercializing agentic products: the operational cost structure of long-horizon, tool-augmented tasks is fundamentally different from that of simple chat interactions, but subscription pricing and user expectations were largely shaped by the chat paradigm. Anthropic is not alone in navigating this — similar complaints have emerged around OpenAI's agentic offerings — but the public and somewhat adversarial nature of this thread underscores how quickly trust erodes when opaque metering collides with high subscription prices. The fact that at least one user self-identified a misconfigured scheduled process as the source of their own overconsumption further complicates the picture: some portion of the reported anomalies may be user-side configuration errors, making accurate diagnosis and transparent communication from Anthropic all the more critical to retaining the confidence of its developer base.
