Detailed Analysis
A Reddit user in the r/ClaudeAI community has raised a pointed question about token efficiency in Claude's Cowork feature, reporting that even modest document processing tasks — small Excel files and slide-deck reports of ten to twenty pages — consume upward of 40% of the Pro plan's usage limit per query when running on the Opus 4.7 model. The post reflects a growing pattern of user frustration with the hidden computational costs of agentic AI workflows, where the overhead of autonomous task orchestration often far exceeds what users intuitively expect based on the size of the input content alone.
The core technical issue the user is identifying, though not fully articulating, is a well-documented characteristic of agentic and multi-step AI systems: context window inflation. Tools like Cowork that operate over file directories and multi-document environments do not simply read a single file in isolation. They typically load file manifests, maintain persistent working memory across task steps, re-inject prior outputs as new context, and execute multiple model calls within a single user-facing "query." When this is done with a frontier-tier model like Opus 4.7 — Anthropic's most capable and computationally intensive offering — each of those sub-calls carries a high per-token cost, and the cumulative token load across a full agentic run can multiply well beyond what the raw document size would suggest.
The user's comparison to Claude's Code CLI is instructive and points toward a real tradeoff landscape. Code CLI and similar direct-interface tools tend to give users more granular control over what context is loaded and when, reducing incidental token burn from directory-scanning or ambient file ingestion. Cowork, by contrast, appears to optimize for ease of use and autonomous task completion, making architectural decisions — like reading the full working directory — that trade token efficiency for reduced user friction. This is a deliberate product design philosophy rather than a bug, but it creates a significant mismatch between user expectations (set by the apparent simplicity of the input) and actual resource consumption.
More broadly, this post reflects a structural tension in the commercialization of agentic AI products. As Anthropic and competitors push users toward higher-order autonomous workflows, the per-task cost curves diverge sharply from those of traditional prompt-response interactions. Pro plan users, who are accustomed to thinking in terms of conversation turns, are increasingly encountering usage limits calibrated to a different usage paradigm. The implicit promise of agentic AI — that it will do more work on your behalf — is directly at odds with flat-rate or capped usage plans unless those plans are scaled to account for multi-call, high-context workloads. Anthropic's pricing and plan architecture will likely face continued pressure to either expose more granular cost visibility to users or tier offerings more explicitly around agentic versus conversational use cases.
Read original article →