← X

@quantumaidev @Hesamation Everyone gets the same default, and it’s sticky when y

X · bcherny · April 4, 2026
The conversation addresses default settings for Claude, which remain consistent across sessions except for the effort=max setting that can consume significant tokens. API-based access provides better cost optimization for running multiple agents in parallel compared to subscription plans, with users noting that infrastructure-level compute allocation differs between enterprise and individual accounts. Users report that Claude's performance fluctuates week to week regardless of pricing tier, suggesting factors beyond plan level influence output quality.

Detailed Analysis

A Twitter/X thread centered on the account @Hesamation sparked a notable public debate about how Anthropic's Claude handles model defaults, compute allocation, and subscription tier differentiation — with a key clarification emerging from what appears to be an Anthropic-affiliated account (likely Boris Cherny, @bcherny, a known Anthropic engineer) confirming that all users receive the same underlying models regardless of plan. The central technical claim under discussion is that Claude's settings are "sticky" across sessions — meaning user-configured preferences such as response style or effort level persist between conversations — with one notable exception: the `effort=max` parameter resets between sessions due to its high token consumption. This clarification was offered in direct response to speculation that Anthropic was silently serving degraded or "nerfed" models to lower-tier subscribers, a claim that was explicitly labeled false by the Anthropic-affiliated participant.

The thread surfaces meaningful tension between Claude's subscription tiers — specifically the $20 Pro plan and the higher-cost Max plan — and what users actually receive in exchange for the price difference. A recurring point made by technically sophisticated participants is that the Pro subscription imposes throughput caps that make parallel or agentic workloads impractical: running 12 agents simultaneously, for example, is simply not feasible under a consumer subscription. API access, by contrast, allows users to route different models to different tasks based on complexity, cache prompts, and optimize spend — with one participant claiming a 60% cost reduction through such task-matching strategies. This distinction between raw model quality and infrastructure flexibility is a nuanced one that many casual users miss, and the thread suggests the gap between "same model" and "same effective experience" is wider than Anthropic's public framing implies.

The `/effort` parameter — which can apparently be set to levels including `high` and `max` — emerges as a focal point for power users, with participants debating which combination of model and effort level produces the best results. References to "Opus 4.6 high" and flat per-request pricing through GitHub Copilot ($0.04–$0.12 per request) point to an emerging ecosystem of third-party access paths that circumvent the token-budget anxiety of direct Anthropic subscriptions. The observation that GitHub Copilot's Claude integration allows users to set effort to "High" with it remaining persistent — at a flat per-request cost with no usage caps — represents a meaningful arbitrage that technically sophisticated users are already exploiting, and one that puts pressure on Anthropic's own subscription value proposition.

Broader themes in the thread connect directly to a longstanding pattern in AI discourse: the recurring belief, dating back at least to GPT-4's early days, that model providers secretly degrade model quality over time or by tier. @Hesamation flatly dismisses this: "People claimed nerfed models since original gpt4 at least." The Anthropic-affiliated respondent reinforces this by stating the same models are served universally and pointing to a verifiable indicator that gave away the false nature of the original claim. This dynamic — where technically credible voices must repeatedly correct viral misinformation about model tiering — reflects a structural trust deficit between AI providers and their power-user base, one that transparency around parameters like `/effort` and their discoverability (as one participant notes, most users don't even know `/effort` exists) could meaningfully address. The "choice" to optimize one's experience, as that participant puts it, remains theoretical when the tools to exercise it are undiscoverable by default.

Read original article →