Why does 4.8 keep thinking when I've disabled the feature?

A user reported that Claude 4.8 continues to use the thinking feature despite having disabled it, causing unnecessary token consumption across all prompts. The issue persists even after refreshing the page and starting new conversations, while other model versions do not exhibit this problem.

Detailed Analysis

A user on the r/ClaudeAI subreddit has reported a persistent bug with Claude's extended thinking feature that affects the Claude 4.8 model specifically: thinking mode continues to activate and consume tokens even after the user has explicitly disabled it through the interface toggle. The issue reportedly persists across page refreshes and new conversation threads, which suggests it is not a simple session-based glitch. The user notes that other Claude models do not exhibit this behavior, suggesting the problem is specific to the 4.8 release rather than a platform-wide interface issue.

The complaint raises both a usability concern and a cost concern. Extended thinking lets Claude perform more deliberate internal reasoning before responding, and that reasoning consumes additional tokens and, by extension, additional API credits or usage quota. When users disable the feature, they reasonably expect both the reasoning overhead and the associated token expenditure to stop. A toggle that is not honored takes away user control over a billable or rate-limited resource, which is particularly frustrating for developers and power users who manage consumption carefully.

The behavior described points to a potential disconnect between the front-end toggle state and the model configuration actually sent with each request. It is possible that Claude 4.8's default behavior includes a baseline level of extended reasoning that cannot be fully suppressed through the standard UI control, or that the toggle's state is not being correctly passed to the inference backend. Bugs of this kind, where a setting appears to be respected visually but is not functionally enforced, are a known category of issue in rapidly deployed AI products whose model capabilities and interface controls evolve asynchronously.
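One way to test the "toggle not enforced" hypothesis is to compare what a request asked for with what the response contains. The sketch below is illustrative only: it assumes the request carries a "thinking" object with a "type" field and that the response is a list of typed content blocks, a shape loosely modeled on content-block style APIs. The function name find_toggle_mismatch and the sample payloads are hypothetical, and nothing here reflects how the interface in the report actually builds its requests.

```python
from typing import Any


def find_toggle_mismatch(request: dict[str, Any], response: dict[str, Any]) -> list[str]:
    """Report thinking blocks that appear in a response even though the
    request disabled thinking. Payload shapes are assumptions for illustration."""
    problems: list[str] = []

    # Assumption: the request carries {"thinking": {"type": "disabled"}} when
    # the toggle is off. A missing "thinking" key is treated as "not requested".
    thinking_cfg = request.get("thinking") or {}
    if thinking_cfg.get("type", "disabled") != "disabled":
        return problems  # thinking was requested, so thinking blocks are expected

    # Assumption: the response holds a "content" list of blocks, each with a "type".
    for index, block in enumerate(response.get("content", [])):
        block_type = block.get("type")
        if block_type in ("thinking", "redacted_thinking"):
            problems.append(
                f"content[{index}] is a {block_type} block although thinking was disabled"
            )
    return problems


# Hypothetical payloads: the toggle is off, yet a thinking block comes back.
sample_request = {"thinking": {"type": "disabled"}}
sample_response = {
    "content": [
        {"type": "thinking", "thinking": "(reasoning text)"},
        {"type": "text", "text": "Hello"},
    ]
}

for problem in find_toggle_mismatch(sample_request, sample_response):
    print(problem)
```

A check like this only distinguishes the two explanations if the outgoing request can be inspected: a request that still asks for thinking points to the front end, while a correctly disabled request that still yields thinking blocks points to the backend or the model's defaults.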

More broadly, the report reflects a growing tension in AI development between capability defaults and user agency. As models like Claude become more powerful and reasoning-intensive, providers face pressure to surface those capabilities prominently, sometimes at the expense of clear, reliable opt-out mechanisms. Because the behavior appears isolated to a specific model version, Anthropic's rollout of Claude 4.8 may have introduced new default behaviors or configuration parameters that were not fully synchronized with the existing toggle infrastructure. User-facing control over compute-intensive features like extended thinking will likely become an increasingly important product consideration as token costs and latency trade-offs grow more consequential for end users.

