← Reddit

Switching between effort and models

Reddit · seh0872 · June 2, 2026
An experiment with Claude's browser interface demonstrated that users can now switch between effort levels and models within the same conversation thread, as well as toggle the Thinking feature on and off between prompts. Although the AI is aware of its current model version and thinking status, it cannot internally assess which effort level is active, making it impossible to verify whether effort level changes actually reduce token consumption.

Detailed Analysis

Claude.ai has introduced interface-level flexibility that allows users to switch between AI models and effort settings within a single, continuous conversation thread — a capability that previously required starting an entirely new conversation. A user experimenting with the browser-based claude.ai interface documented several specific behavioral changes: model switching mid-conversation is now permitted, the "Thinking" toggle (labeled "Extended" for Haiku-tier models) can be enabled or disabled between individual prompts, and these changes appear consistent across both the browser client and the Claude desktop application on macOS.

A notable technical nuance emerges around Claude's self-awareness of its operational parameters. The model appears to have reliable introspective access to which version it is running and whether the Thinking or Extended reasoning toggle is active. However, it cannot identify which "Effort" level — a setting that presumably governs token budget allocation for a given response — is currently in effect. This asymmetry raises a practical concern flagged by the observer: if a user switches from a High to a Low effort setting mid-conversation hoping to reduce computational expenditure on a simpler prompt, there is no internal mechanism by which the model can confirm that the lower token budget is actually being applied. The user's ability to verify the behavioral impact of effort-level switching is therefore limited to external observation rather than any model-reported confirmation.

This development reflects a broader trend in frontier AI product design toward greater conversational continuity and user-controlled inference flexibility. Historically, switching models in chat interfaces required context resets, which disrupted long-running conversations and forced users to choose their model before beginning any substantive work. Allowing mid-thread model and setting changes positions claude.ai more competitively alongside interfaces that have offered analogous flexibility, and it aligns with Anthropic's ongoing expansion of the Claude model family — including the availability of multiple tiers such as Haiku, Sonnet, and Opus — which creates a natural user need to shift between capability and cost profiles within a single session.

The gap between UI-level controls and model-level self-knowledge also points to a meaningful architectural distinction worth monitoring. Effort levels, unlike model identity or reasoning mode toggles, appear to be parameters set at the inference infrastructure layer without being surfaced to the model itself as part of its system context. This is consistent with how many commercial AI deployments handle resource-allocation parameters — they are operator-facing rather than model-facing — but it creates friction for power users who want confirmation that their configuration choices are producing the intended computational effects. As Anthropic continues to build out tiered inference products, closing this observability gap could become an important UX priority, particularly for users managing cost-performance tradeoffs across complex, multi-stage workflows within a single conversation.

Read original article →