'Claude couldn't finish this response. Try again in a moment.'

A Claude Pro subscriber encountered repeated failures when running long prompts designed to generate Excel documents: each attempt consumed up to 75% of the session usage limit before the error message appeared. A follow-up prompt asking the model to resume from the incomplete section also exhausted the remaining allowance without completing. The user was running Claude Sonnet 4.6 adaptive within a Claude Project and sought advice on preventing these failures and getting the optimized prompt to execute in a single pass.

Detailed Analysis

A Claude Pro subscriber on Reddit's r/ClaudeAI community has documented a recurring frustration with output truncation when attempting to generate complex Excel documents using Claude's Sonnet 4.6 adaptive model. The user reports that despite optimizing their prompt for token efficiency, even enlisting other large language models to compress it, Claude consistently fails to complete the task and displays the error message "Claude couldn't finish this response. Try again in a moment." The problem is compounded because the incomplete attempt consumes approximately 75% of the session's usage limit before stopping, leaving too little capacity to finish the task even after a continuation prompt.

The technical dynamics at play here reflect a well-documented tension in large language model deployment: the gap between a model's context window for input processing and its practical output generation limits. Claude's Pro subscription imposes usage caps measured across a rolling session window, and computationally expensive tasks — such as generating structured, multi-sheet Excel documents with reference data — can exhaust those limits before the model reaches completion. The user's continuation prompt strategy ("Continue exactly where you left off. Do not re-read or summarise prior context.") is a commonly recommended workaround, but it fails in this case because the first incomplete attempt already consumed the bulk of available tokens, leaving too little capacity for a meaningful continuation pass.
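The arithmetic behind that failure can be sketched in a few lines of Python. This is an illustration only: apart from the 75% figure reported in the thread, every number here is a made-up estimate, and the helper names `remaining_after`, `can_continue`, and `plan_batches` are hypothetical rather than part of any Claude tooling. The sketch assumes usage can be approximated as whole percentage points of a session allowance and that each sheet's cost can be estimated in advance, neither of which the thread confirms.

```python
from typing import List


def remaining_after(used_pct: int) -> int:
    """Percentage points of the session allowance left after a run."""
    return max(0, 100 - used_pct)


def can_continue(used_pct: int, needed_pct: int) -> bool:
    """True if a continuation needing `needed_pct` still fits in the session."""
    return needed_pct <= remaining_after(used_pct)


def plan_batches(sheet_costs: List[int], cap_pct: int) -> List[List[int]]:
    """Group sheets, in order, into requests whose estimated cost stays
    at or below `cap_pct`. Returns lists of sheet indices, one per request.
    Costs are hypothetical estimates, not measured values."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, cost in enumerate(sheet_costs):
        if cost > cap_pct:
            raise ValueError(f"sheet {index} alone exceeds the per-request cap")
        # Close the current batch when adding this sheet would exceed the cap.
        if current and used + cost > cap_pct:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # The thread's scenario: the failed first pass used about 75%.
    print(remaining_after(75))        # 25
    # Hypothetical: the unfinished part needs 40 points, so it cannot fit.
    print(can_continue(75, 40))       # False
    # Hypothetical per-sheet estimates, capped at 40 points per request.
    print(plan_batches([30, 25, 20, 15, 10], 40))  # [[0], [1], [2, 3], [4]]
```

Under these assumed numbers, a continuation that needs 40 points cannot fit in the 25 left after the failed pass, which matches the behavior the user describes. Splitting the workbook into smaller requests does not reduce total usage, but it bounds how much a single failed attempt can waste.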

The choice of the "adaptive" variant of Sonnet 4.6 is notable context. Adaptive modes in Claude are designed to modulate response length and computational depth based on task complexity, but this can work against users in scenarios where a long, structured output is genuinely required. The model may be interpreting the complexity of the Excel generation task as a signal to expand its reasoning or preamble, front-loading token consumption before producing the actual deliverable content. Starting a new thread within the same Claude Project for each spreadsheet, as the user mentions doing, helps isolate context but does not resolve the underlying output cap problem.

This thread reflects a broader and growing tension in the commercial deployment of frontier AI models: the mismatch between what subscription pricing implies (premium, unlimited-feeling access) and the real operational constraints imposed by compute costs. Anthropic and its competitors have structured their Pro tiers around rolling usage windows rather than hard per-query limits, which creates opacity for users who cannot easily predict when a task will exhaust their allowance mid-execution. Unlike simple Q&A interactions, agentic or document-generation tasks have nonlinear resource demands — a task that appears straightforward can consume disproportionate capacity based on output length requirements alone.

The incident also points to a structural gap in how AI platforms currently handle long-form generation tasks. Competitors like OpenAI and Google have experimented with structured output streaming and resumable generation, but none have yet solved the user-experience problem of gracefully managing partial completions within subscription limits. For Anthropic, as it positions Claude increasingly as a productivity and enterprise tool — particularly through the Projects feature the user references — the failure mode documented here represents a meaningful adoption barrier. Users attempting to automate document workflows are precisely the high-value, recurring-use customers that Pro subscriptions are designed to retain, and repeated truncation failures without clear remediation paths risk driving those users toward alternative tools.
