claude gets worse the longer the chat goes, and my fix

Claude's performance degrades during extended chat sessions, with the model forgetting earlier information and providing generic responses after approximately two hours. A user resolved this by maintaining project context, decisions, and notes in markdown files accessible through Claude Code and MCP rather than relying on a single conversation thread. This approach preserves response quality without requiring re-explanation when starting fresh chats.

Detailed Analysis

A Reddit user on r/ClaudeAI has documented a widely recognized behavioral pattern in Claude: performance degradation over the course of extended conversations. The author describes a clear temporal arc — sharp, contextually aware responses in the first hour of a session, followed by progressive deterioration in the second hour, manifesting as forgotten earlier statements, contradictions of previously agreed-upon decisions, and increasingly generic outputs. The post has generated community interest not just for naming the problem but for proposing a practical architectural workaround that sidesteps the issue rather than waiting for it to be solved at the model level.

The root cause of this degradation is the context window, the fixed-length buffer of tokens that large language models use to process conversation history. As a chat grows longer, earlier portions of the conversation are either compressed, summarized, or effectively deprioritized as the model attempts to fit everything into its available attention capacity. Claude, like other frontier models, does not maintain a persistent memory between turns in any traditional computational sense — it re-reads the full conversation on each response generation, meaning that very long threads create real retrieval and attention challenges. The author's observation that the model begins "forgetting" early decisions is a practical manifestation of this architectural constraint, not a bug in the conventional software sense.

The solution the author developed centers on externalizing persistent context from the chat thread entirely. By maintaining project state, decisions, and notes as plain markdown files — and using Claude Code alongside the Model Context Protocol (MCP) to make those files accessible to any new chat session — the author effectively decouples conversational memory from the chat window itself. MCP, Anthropic's open standard for connecting AI models to external data sources and tools, is central to this architecture: it allows a freshly initiated conversation to ingest structured context from external files rather than relying on the degraded signal buried deep in a long thread. This transforms what would otherwise be a costly reset into a seamless continuation.

This approach reflects a broader trend in the AI developer community toward treating large language models as stateless inference engines that require external scaffolding for persistent, coherent long-term work. Tools like memory layers, retrieval-augmented generation, and agent frameworks all address the same fundamental limitation from different angles. The author's solution — which they have partially productized into a tool called Taproot — represents a pragmatic, low-overhead implementation of this philosophy using widely available components. The willingness to share the markdown-plus-MCP wiring openly also signals an emerging culture of workaround knowledge-sharing among power users who are building serious workflows on top of Claude before native long-context memory solutions mature.

The post lands at a moment when Anthropic and competitors are actively expanding context window sizes — Claude's context window has grown substantially in recent model iterations — yet the community response suggests that raw context length alone does not fully resolve quality degradation in very long sessions. Attention diffusion and the model's ability to maintain coherent priority hierarchies across hundreds of thousands of tokens remain active research and engineering challenges. The author's architectural pattern, separating ephemeral conversation from durable project state, may prove to be a durable design principle even as underlying model capabilities continue to improve.

Read original article →

Detailed Analysis

Don't Miss a Deploy