Please don't take Opus 4.6 and Extended thinking away. 4.7 is absolutely useless.

A user reported experiencing significant issues with Claude 4.7 and its adaptive thinking feature, including repeated errors requiring additional instructions, missed search results, and fabricated information presented as facts. The user requested that Anthropic revert 4.7 and continue supporting the previous version 4.6 with extended thinking, expressing preference for the older model's performance.

Detailed Analysis

A Reddit user posting to r/Anthropic has expressed strong frustration with Claude Opus 4.7 and its "Adaptive Thinking" reasoning mode, calling for Anthropic to preserve access to Opus 4.6 and its Extended Thinking functionality. The user describes persistent hallucinations, unreliable file search behavior, and a compounding cycle of prompt patching that still fails to stabilize the model's outputs. The post characterizes Opus 4.7 as a regression severe enough to compare unfavorably to competing models, and frames the situation as Anthropic breaking a well-functioning product in pursuit of changes that — from the user's perspective — deliver no practical benefit.

The core technical grievance centers on Anthropic's deliberate removal of Extended Thinking in Opus 4.7, replaced entirely by Adaptive Thinking. This is not incidental but a design decision: Opus 4.7 reasons more decisively with fewer intermediate steps, averaging 7.1 LLM calls per task versus 16.3 for Opus 4.6, and reduces median task latency from 242 seconds to 183 seconds. Anthropic also dropped support for sampling parameters such as temperature, top_p, and top_k — tools that power users and developers often relied upon to fine-tune model behavior. For users who had carefully calibrated prompts and workflows around 4.6's Extended Thinking architecture, this constitutes a breaking change, not an upgrade. The user's experience of hallucinations and missed file retrievals likely reflects prompts that were optimized for 4.6's reasoning cadence and have not been re-tuned to account for 4.7's fundamentally different internal decision-making structure.

Benchmark data, however, tells a largely contrary story. Opus 4.7 scores 64.3% on SWE-bench Pro versus 53.4% for 4.6, improves on SWE-bench Verified from 80.8% to 87.6%, and expands vision input capacity more than threefold to 3.75 million pixels. An added "xhigh" reasoning tier and a knowledge cutoff updated to January 2026 further position 4.7 as a substantive technical advancement. These gains are particularly pronounced in coding-heavy and agentic workflows, where the efficiency improvements translate into meaningful real-world cost and speed reductions. The gap between benchmark performance and individual user experience is a recurring tension in AI model releases, and this post exemplifies it clearly: aggregate gains across thousands of use cases do not prevent specific, previously functional workflows from breaking when underlying reasoning architecture changes.

The broader pattern here reflects a structural challenge Anthropic faces as Claude matures into an increasingly segmented user base. Developers building production systems, researchers using fine-tuned prompting strategies, and casual users all interact with the same model under the same version label. When Anthropic removes a capability like Extended Thinking — even in favor of something more efficient — it creates real discontinuity for users whose workflows depended on it. The user's plea to "leave us alone with 4.6" echoes a sentiment common across AI platform transitions: the optimal model is often not the newest one, but the one that has been most thoroughly adapted to a particular task. Anthropic's challenge going forward will be managing version deprecation with enough runway and transparency to allow users to migrate workflows rather than face abrupt capability loss.

This episode also highlights a communication gap around model versioning that affects the broader AI industry. Benchmark improvements in coding and vision are legible to enterprise buyers and researchers, but they do not automatically translate into continuity of experience for users with specialized prompt architectures. As Anthropic continues iterating rapidly — evident in the jump from 4.6 to 4.7 with significant architectural changes — the company may need to invest more heavily in migration documentation, longer parallel availability windows for prior versions, or more granular capability flags that allow users to access specific reasoning modes regardless of which numbered version is current.

Read original article →

Detailed Analysis

Don't Miss a Deploy