Why do people keep complaining about Claude not working?

A developer described getting better results from Claude Code through three specific strategies: a CONTINUE.md file for tracking mistakes and state, Hooks to prevent context drift and repetitive tool calls, and Skills to improve performance on specific tasks. With these techniques, the developer built a working ecosystem around Claude that delivers better outcomes, and asked other users to share their own approaches and opinions.

Detailed Analysis

Claude's reliability and performance have become a focal point of discussion across developer communities, with user frustrations stemming from a combination of infrastructure strain, capacity limits, and shifting perceptions of output quality. A Reddit thread in r/ClaudeAI captures this tension directly: while many users report dissatisfaction, others, particularly those working extensively with Claude Code, argue that much of the friction can be mitigated through deliberate workflow engineering and should not be read as a fundamental product failure. The original poster describes a progression from frustration to proficiency, attributing improved results to structured techniques rather than any change in the underlying model.

The complaints documented across social media, GitHub, and outage trackers are not without merit. Anthropic's infrastructure has faced visible stress under surging demand, with a notable outage on April 13, 2026, disrupting Claude.ai and Claude Code for nearly an hour, and a broader disruption in early March affecting login systems, developer APIs, and the web interface simultaneously. Capacity throttling — where Anthropic proactively reduces throughput during peak periods — has generated user-facing errors and timeouts that are particularly disruptive for developers mid-workflow. These are structural challenges tied directly to rapid adoption outpacing infrastructure scaling, a pattern common across AI platforms that have seen sudden mainstream uptake.

Beyond outages, a more nuanced complaint concerns perceived quality degradation. GitHub issue trackers show an upward trend in quality-related reports through early 2026, with April on pace to exceed March's totals. Yet benchmark performance on evaluations like SWE-Bench-Pro appears stable for models such as Opus 4.6, so subjective user experience is diverging from benchmark outcomes. This perception gap is significant: it points to the difficulty of translating controlled evaluation metrics into the messiness of real-world, open-ended coding and reasoning tasks. Hallucinations remain a persistent concern as well, particularly in domains where Claude's training data may be outdated or where users expect grounded citations that the model cannot reliably provide.

The Reddit poster's proposed solutions — CONTINUE.md files for state persistence, hooks for preventing context drift, and skill modules for task-specific tuning — represent an emerging class of power-user workflows designed to compensate for the inherent limitations of large language model sessions. Context drift, where a model loses coherence or regresses on earlier instructions over a long session, is a well-documented challenge across all frontier models. The techniques described essentially externalize memory and enforce behavioral guardrails that the model cannot maintain autonomously. That experienced users are independently converging on these patterns suggests both the genuine capability ceiling of current AI coding assistants and the degree to which their practical utility is mediated by user sophistication.
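To make the idea of externalized memory and guardrails concrete, here is a minimal sketch of the two mechanisms in plain Python. The thread does not give an implementation, so everything here is an assumption for illustration: `append_note`, `RepeatGuard`, and the `window` and `max_repeats` parameters are hypothetical names, and the code does not use or represent Claude Code's actual hook interface.

```python
import json
from collections import deque
from pathlib import Path


def append_note(path: Path, heading: str, note: str) -> None:
    """Append one note under a heading to a plain Markdown state file.

    Hypothetical helper: the layout of a real CONTINUE.md is up to the user.
    """
    with path.open("a", encoding="utf-8") as f:
        f.write(f"## {heading}\n- {note}\n\n")


class RepeatGuard:
    """Flags a tool call that duplicates recent identical calls.

    Hypothetical guardrail: keeps the last `window` calls and rejects a call
    once the same (name, input) pair already appears `max_repeats` times.
    """

    def __init__(self, window: int = 5, max_repeats: int = 2) -> None:
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def allow(self, tool_name: str, tool_input: dict) -> bool:
        # Serialize with sorted keys so equal inputs compare equal
        # regardless of key order.
        key = (tool_name, json.dumps(tool_input, sort_keys=True))
        repeats = sum(1 for seen in self.recent if seen == key)
        self.recent.append(key)
        return repeats < self.max_repeats
```

In a workflow like the one described, a script of this kind might be called before each tool invocation, with rejected calls logged to the state file via `append_note` so that the next session can read what went wrong. How that wiring is done depends on the tool and is not specified in the thread.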

This dynamic reflects a broader trend in the AI industry: the gap between a model's raw capability and its reliable, production-grade performance is increasingly being bridged not by the model itself, but by scaffolding, tooling, and workflow discipline built around it. Anthropic's development of Claude Code as an agentic coding environment acknowledges this reality, embedding structured tool use and session management directly into the product. However, the persistence of quality complaints and infrastructure disruptions underscores that the path from capable AI to dependably useful AI remains an engineering and operational challenge as much as a research one. As competition intensifies from alternatives like ChatGPT and Gemini in coding contexts, Anthropic's ability to stabilize infrastructure and close the gap between benchmark performance and real-world user satisfaction will be critical to retaining the developer trust it has cultivated.
