Opus 4.7 is insanely bad — Claude Learning Daily

A user reported that Anthropic's Opus 4.7 performs worse than its predecessor 4.6, generating verbose and token-heavy responses with irrelevant follow-up questions instead of the concise and clarifying interactions of the previous version. The outputs from 4.7 are described as either oversimplified or unnecessarily complex and nonsensical, suggesting poor execution of an apparent attempt to add more depth to the model. The reviewer expressed concerns about dependency on proprietary AI solutions and considered canceling their subscription.

Detailed Analysis

A Reddit user posting to r/Anthropic in late April 2026 voices sharp dissatisfaction with Claude Opus 4.7, contrasting it unfavorably with its predecessor, Opus 4.6. The user's core complaints center on verbosity, token inefficiency, and a perceived decline in the quality and relevance of the clarifying questions the model poses. Where Opus 4.6 reportedly handled iterative, complex modifications concisely and offered a focused form-based interface to narrow the scope of a request, Opus 4.7 is characterized as rambling and unfocused, producing outputs that oscillate between oversimplified and nonsensically complex. The post reflects a user who relied heavily on the model for substantive work and now perceives a meaningful decline in day-to-day utility.

The broader reception of Opus 4.7, released on April 16, 2026, tells a more mixed story than the Reddit post alone suggests. Anthropic and independent reviewers largely position the model as a significant leap forward, particularly in agentic software engineering, where it is credited with 10–15% better task success on difficult benchmarks like Factory Droids. Other cited gains include improved vision capabilities supporting image analysis up to ~3.75 megapixels and more reliable long-running task execution with self-verification. Reviewers at outlets like Tom's Guide describe it as a shift from "reactive chatbot to collaborative tool," praising its structured reasoning and precision in code review. However, specific and credible regression complaints do exist in the technical community: at least one developer reported context hallucinations, excessive token consumption, and degraded performance at production scale, concerns that closely mirror the Reddit poster's experience.

The tension between official benchmarks and individual user experience highlights a persistent challenge in large language model deployment: aggregate performance improvements do not uniformly translate across all use cases or user workflows. The poster's specific grievance about the clarifying-question interface — a feature they found genuinely useful in 4.6 — points to the risk that UX and interaction design regressions can undermine technical gains. If Anthropic restructured how Opus 4.7 surfaces follow-up prompts in favor of more verbose inline reasoning, users who valued the tighter, form-driven approach of 4.6 would rationally perceive a step backward even as the model scores higher on autonomous coding benchmarks.

The post's closing remarks extend beyond product criticism into something more philosophically charged: a concern about skill atrophy, dependency on closed, compute-heavy systems, and the suspicion that consumer-tier users are effectively providing training data that subsidizes more capable products for enterprise or high-paying clients. This anxiety reflects a broader undercurrent in the AI user community — particularly among power users who have integrated these tools deeply into their professional workflows. The "we end up with 4.7" framing encapsulates a fear that tiered AI access could create a two-speed ecosystem where the most capable and refined models are effectively reserved for well-resourced actors, while mass-market users receive versions optimized for breadth over precision. Whether or not that characterization is accurate in this case, it signals a trust and transparency gap that Anthropic, like other frontier AI labs, will need to actively manage as model versioning becomes more frequent and consequential.
