Why does Opus 4.7 sound like ChatGPT?

Opus 4.7 is great at writing code, and it also writes better comments. But conversationally, it sounds exactly like ChatGPT, which is such a shame because I think GPT's delivery style is so un-human sounding. Opus 4.7 has the exact same patterns, e.g. "Why

Detailed Analysis

A Reddit user's complaint about Claude Opus 4.7 sounding "exactly like ChatGPT" reflects a growing tension in the AI model community between capability advancement and personality consistency. The post, shared to r/Anthropic, cites specific conversational patterns — structured headers, leading questions, and abbreviated "rules of thumb" phrasing — that the user associates with OpenAI's GPT models rather than Anthropic's traditionally more naturalistic Claude voice. The user notes that Opus 4.6 did not exhibit these patterns, framing the shift as a potential regression in Claude's identity rather than an improvement, even while conceding that Opus 4.7 performs better at coding tasks.

The perception of similarity to ChatGPT appears to stem from a deliberate and significant behavioral shift in how Opus 4.7 processes instructions. Unlike its predecessor, Opus 4.7 follows prompts literally, refusing to infer unstated intentions or take interpretive liberties. This strict adherence to explicit direction reduces hallucination and improves reliability in agentic and multi-step tasks, but it also produces outputs that feel more rigid and less contextually adaptive — qualities users have historically associated with GPT-style responses. Anthropic's iterative model tuning, which corrected for Sonnet 3.7's over-eagerness and Opus 4.6's tendency to over-interpret, appears to have swung the pendulum toward a more "neutral" tone that some longtime Claude users find jarring. The absence of auto-prompting behavior, where 4.6 would fill in gaps without being asked, forces operators and users to front-load their instructions with greater specificity, shifting the prompt engineering burden and altering the conversational texture of interactions.

Objectively, Opus 4.7 outperforms both its predecessor and competing models on nearly every measurable benchmark. It achieves 87.6% on SWE-bench Verified, 64.3% on SWE-bench Pro compared to GPT-5.4's 57.7%, and 77.3% on MCP-Atlas versus GPT-5.4's 68.1%. Real-world deployments reinforce these gains: Cursor's internal testing showed a 70% task completion rate versus 58% for Opus 4.6, and Rakuten reported a threefold increase in production task resolutions. The model also received a major visual processing upgrade, tripling its resolution to 2,576 pixels and pushing visual acuity benchmarks to 98.5%. These improvements position Opus 4.7 firmly as a technical leader, not an imitation of a competitor, despite surface-level stylistic perceptions to the contrary.

The broader significance of this debate lies in what it reveals about user expectations as AI models mature. The Reddit post illustrates that a model's "personality" — its tonal warmth, conversational spontaneity, and willingness to read between the lines — carries genuine weight for power users who have built workflows and emotional familiarity around a particular model's voice. This dynamic is not unique to Anthropic; OpenAI faced similar backlash when GPT-4o was perceived as excessively sycophantic, and Google has navigated comparable criticism around Gemini's register. The challenge for frontier AI labs is that optimizing for precision, reliability, and reduced hallucination often trades against the fluid, contextually generous behavior that generates user loyalty. Anthropic's decision to prioritize literal instruction-following in Opus 4.7 reflects a deliberate architectural philosophy — prioritizing trustworthy agentic performance over conversational naturalness — which may serve enterprise and developer use cases well while alienating users who valued Claude's historically distinctive voice.

Read original article →

Detailed Analysis

Don't Miss a Deploy