Detailed Analysis
A Reddit user posting in r/ClaudeAI has surfaced a significant practical limitation of Claude's Opus 4.7 Extended model: running complex workflows in German — or likely any non-English language with morphologically dense vocabulary — can consume session token limits dramatically faster than equivalent English-language prompts. The user, a Pro subscriber, reported that an identical analytical task completed at roughly 33–37% of session capacity in English across Opus 4.6 and Opus 4.7, but exhausted 100% of the Opus 4.7 session almost instantaneously when run in German. Sonnet and Opus 4.6 handled the German-language version of the same prompt without triggering the limit, suggesting the problem is amplified specifically in Opus 4.7's context consumption behavior.
The root cause, as Claude itself explained when the user queried it directly, lies in how tokenization works for morphologically complex languages. English averages roughly 0.75 words per token (about 1.3 tokens per word), while German, with its compound nouns, umlauts, and comparatively lower representation in training data, can average 0.5 words per token (2 tokens per word) or worse. A word like "Aktienmarktanalyse" (stock market analysis) fragments into significantly more tokens than its English counterpart. For the same semantic content, a German-language exchange can therefore consume 1.5× to 2× the tokens of an English one. Compounded by the complexity of the specific task, which involved multi-timeframe stock forecasting, options data, 13F filing analysis, chart generation, and Excel output, the token multiplication effect was enough to exhaust the session limit.
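The arithmetic behind that multiplier can be sketched in a few lines of Python. This is a rough illustration only: the words-per-token figures are the rules of thumb quoted above, not measurements from Claude's tokenizer, and the function names and the 1,500-word prompt are hypothetical. It also assumes the German and English versions have the same word count, which understates the effect of compound nouns.

```python
# Rule-of-thumb ratios quoted in the text; not measured tokenizer output.
ENGLISH_WORDS_PER_TOKEN = 0.75
GERMAN_WORDS_PER_TOKEN = 0.5


def estimated_tokens(word_count: int, words_per_token: float) -> float:
    """Approximate a token count from a word count and a words-per-token ratio."""
    if words_per_token <= 0:
        raise ValueError("words_per_token must be positive")
    return word_count / words_per_token


def token_multiplier(base_words_per_token: float, other_words_per_token: float) -> float:
    """How many times more tokens the 'other' language needs for the same word count."""
    if base_words_per_token <= 0 or other_words_per_token <= 0:
        raise ValueError("ratios must be positive")
    return base_words_per_token / other_words_per_token


if __name__ == "__main__":
    words = 1500  # hypothetical prompt length, chosen only for illustration
    english = estimated_tokens(words, ENGLISH_WORDS_PER_TOKEN)
    german = estimated_tokens(words, GERMAN_WORDS_PER_TOKEN)
    ratio = token_multiplier(ENGLISH_WORDS_PER_TOKEN, GERMAN_WORDS_PER_TOKEN)
    print(f"English: ~{english:.0f} tokens, German: ~{german:.0f} tokens, ratio: {ratio:.2f}x")
```

With these ratios the multiplier is 1.5×; a German ratio below 0.5 words per token pushes it toward the 2× end of the range.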
This matters because Claude.ai's session limits are token-based rather than message-based, meaning multilingual users are structurally penalized relative to English-language users paying identical subscription prices. The disparity is not merely a performance quirk but an economic and access issue: a German-speaking Pro subscriber running a sophisticated analytical workflow effectively receives a fraction of the usable session capacity compared to an English speaker executing the same task. The user's observation that Opus 4.6 and Sonnet handled the same German prompt without hitting the wall suggests that Opus 4.7 Extended either processes more verbose internal reasoning, consumes tokens at a higher rate per operation, or has a lower effective session ceiling — all of which compound the tokenization penalty for non-English users.
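A quick back-of-the-envelope check shows why tokenization alone does not seem to account for the reported behavior. The sketch below scales the reported English session share (33–37%) by the 1.5× to 2× multiplier. It assumes, as a simplification, that session usage grows linearly with token count; the function names are hypothetical and nothing here reflects how Claude.ai actually meters sessions.

```python
# Figures taken from the user's report and the multiplier range quoted in the text.
REPORTED_ENGLISH_SHARES = (0.33, 0.37)
ASSUMED_MULTIPLIERS = (1.5, 2.0)


def projected_session_share(english_share: float, multiplier: float) -> float:
    """Scale an English session share by a token multiplier, capped at 100%."""
    if not 0.0 <= english_share <= 1.0:
        raise ValueError("english_share must be between 0 and 1")
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return min(1.0, english_share * multiplier)


def projection_range(shares, multipliers):
    """Return the lowest and highest projected share over all combinations."""
    projections = [
        projected_session_share(share, multiplier)
        for share in shares
        for multiplier in multipliers
    ]
    return min(projections), max(projections)


if __name__ == "__main__":
    low, high = projection_range(REPORTED_ENGLISH_SHARES, ASSUMED_MULTIPLIERS)
    print(f"Projected German session share: {low:.1%} to {high:.1%}")
```

Under these assumptions the German run would land at roughly 50% to 74% of the session, well short of the 100% the user reported, which is consistent with the suggestion that something specific to Opus 4.7 Extended compounds the tokenization penalty.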
Anthropic has acknowledged the multilingual token-cost disparity as a structural property of the tokenizer rather than a fixable bug, which places it in a broader category of foundational limitations in large language model deployment. Modern LLM tokenizers, most commonly variants of byte-pair encoding (BPE), are calibrated primarily against English-dominant training corpora, resulting in systematically less efficient encoding for languages that are either morphologically richer or underrepresented in pretraining data. This is a well-documented phenomenon across the industry — affecting GPT models, Gemini, and others — but it takes on heightened significance as frontier models like Opus 4.7 are positioned for complex, professional use cases where session limits directly constrain workflow completion. The practical workaround suggested — prompting in German but requesting English-language output — illustrates the kind of friction non-English users must absorb, and raises longer-term questions about whether Anthropic and the broader industry will invest in language-balanced tokenization strategies as global adoption grows.