可算有解决Claude降智和偷Token的神配置了

A user documented configuration settings for local Claude Code designed to address performance degradation and excessive token consumption, including modifications to disable adaptive thinking and set fixed thinking token limits. The article provides quantifiable metrics to detect performance degradation, such as read/modification ratios and thinking depth measurements, and notes that different Claude models experience performance variations at specific times. Additionally, usage habits like maintaining conversations within the cache window rather than starting new ones can significantly reduce token consumption.

Detailed Analysis

A Chinese-language Reddit post on r/ClaudeAI has attracted significant attention by proposing a set of configuration tweaks for Claude Code aimed at mitigating two widely reported user grievances: perceived degradation in Claude's reasoning quality (colloquially termed "降智," or "intelligence downgrade") and unexpectedly rapid token consumption. The post was published in the context of Anthropic launching Claude Routines — a 24/7 agentic workflow feature — while simultaneously rolling out Opus 4.6, a model that prompted immediate user complaints about sluggish reasoning and excessive token burn rates. The proposed fix centers on modifying the local `~/.claude/settings.json` configuration file with four key parameters: setting `effortLevel` to `"high"` to invoke stronger reasoning, enabling `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING` to prevent the model from dynamically reducing its own thinking budget, capping `MAX_THINKING_TOKENS` at 31,999 (with an optional ceiling of 128k), and disabling the 1M-token context window in favor of compressing context every 200,000 tokens. The author also warns against setting `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1`, noting that while intended to stop usage data from reaching Anthropic, this flag silently reduces the one-hour context cache validity window to just five minutes — a significant hidden cost for subscription users who rely on that cache to avoid re-spending tokens on system prompts and project memory.

The degradation phenomenon the post attempts to quantify is corroborated by independent research and user analysis. According to tracked metrics, Claude's median thinking depth dropped from approximately 2,200 characters to around 560 characters beginning in early 2025 — a 67% reduction — shifting the model's behavior from "read extensively, then modify" to "glance and act," with an observed read-to-edit ratio falling from approximately 6.6:1 to 2:1. The post offers three diagnostic signals users can monitor: the read-modify ratio during code edits, the character count of reasoning traces in Plan mode, and the frequency with which Claude interrupts mid-task to ask whether to continue. These are practical, observable heuristics that allow users to self-diagnose degradation without relying solely on subjective impression. The author also points to a third-party site (aistupidlevel.info) that aggregates crowd-sourced model performance data, which reportedly identifies recurring performance dips for Opus 4.6 around 7 PM and 11 PM — times consistent with peak server load.

Broader research context confirms that the degradation problem is not purely perception-driven. Anthropic officially acknowledged that between August and September 2025, three infrastructure bugs caused a subset of short- and long-context requests to be routed to incorrect servers, affecting roughly 30% of Claude Code users as well as portions of Bedrock and Vertex AI traffic. A patch deployed on September 4th reportedly restored quality. Separately, version-specific issues with Claude Code have been documented, with newer releases (1.0.81 and above) introducing heavier system-prompt overhead that increases cognitive load and contributes to apparent reasoning degradation; rolling back to versions such as 1.0.51 or 1.0.112 has been cited as an effective short-term remedy. The convergence of infrastructure bugs, version regressions, and adaptive thinking throttling creates a layered problem that no single configuration can comprehensively address, which is why the community has developed a portfolio of mitigations rather than a single authoritative fix.

The token economy dimension of the post reflects a growing user frustration with the resource costs of agentic AI workflows. The author's observation that each new conversation session consumes 40,000 to 60,000 tokens just to reload system prompts, project memory, and plugins highlights a structural inefficiency in stateless LLM architectures when used in persistent, tool-heavy environments. This cost is largely invisible to end users who measure usage by task completion rather than token accounting. The advice to avoid opening new conversations when a task is continuous, and to rely on the one-hour cache window, represents practical financial hygiene for heavy Claude Code users. The broader implication is that as Anthropic pushes toward agentic, multi-step AI workflows, the token overhead of context initialization becomes a non-trivial operating cost that the product design has not yet made transparent or easily manageable.

The post concludes with a sardonic suggestion that OpenAI "distill" Claude into GPT, framing GPT Pro as a lower-friction alternative — a sentiment that reflects competitive tension at the frontier of developer-facing AI tools. This framing underscores a broader industry dynamic in which user trust in a given model is fragile and easily disrupted by inconsistent performance, even when that inconsistency stems from infrastructure issues rather than fundamental model quality changes. Anthropic's challenge is not merely technical: the perception of degradation, whether or not it maps precisely to measurable capability decline, directly influences developer loyalty in a market where switching costs between Claude, GPT-4o, and Gemini are relatively low. Configuration-level workarounds shared by power users, while valuable, also signal a gap in official tooling and transparency that Anthropic will need to close as agentic deployment scales.

Read original article →

Detailed Analysis

Don't Miss a Deploy