[BUG] Claude Code “Context Pollution” & Massive Token Exhaustion on WSL due to CRLF Churn

In Windows/WSL environments, Git reports files that differ only in CRLF/LF line endings as modified, and Claude Code ingests the resulting `git diff` output, rapidly exhausting the context window. Once in context, this whitespace-only diff persists across subsequent interactions and leaves the session effectively unusable. Normalizing line endings with a `.gitattributes` file and `git add --renormalize .` removes the false modifications.

Detailed Analysis

A user working in a Windows Subsystem for Linux (WSL) environment has documented a significant bug in Anthropic's Claude Code tool wherein mixed line-ending conventions between Windows (CRLF) and Linux (LF) systems trigger runaway token consumption and persistent context degradation. The issue manifests when Claude Code executes a `git diff` operation on a repository that has not been normalized for cross-platform line endings. Git interprets every file with line-ending discrepancies as "modified," even when no substantive code changes have been made, and Claude Code proceeds to ingest the entirety of that diff output into its context window. The user reports that this process consumed approximately 84% of the available context window within roughly one minute across a branch containing 69 affected files — a volume of whitespace-only diff data sufficient to effectively saturate the session.
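To illustrate why a line-ending mismatch produces so much diff output, the following minimal sketch uses Python's standard `difflib` module as a stand-in for `git diff` (an assumption made for illustration; Git's own diff machinery is not involved). The helper name `count_changed_lines` is hypothetical. It shows that converting a file from LF to CRLF marks every line as both removed and added, even though the visible text is identical.

```python
import difflib


def count_changed_lines(old: str, new: str) -> int:
    """Count the removed (-) and added (+) lines in a unified diff of two texts."""
    # keepends=True preserves "\n" vs "\r\n", so line endings take part in the comparison.
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="")
    return sum(
        1
        for line in diff
        # Skip the "---"/"+++" file headers. Simplification: a content line that
        # itself begins with "--" or "++" would be miscounted as a header.
        if line.startswith(("-", "+")) and not line.startswith(("---", "+++"))
    )


lf_version = "a\nb\nc\n"
crlf_version = "a\r\nb\r\nc\r\n"
print(count_changed_lines(lf_version, crlf_version))
```

In this sketch a three-line file yields six changed lines, so the diff is roughly twice the size of the file itself. Multiplied across dozens of files, that is the kind of volume the report describes.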

The bug manifests as two compounding problems. The first is token exhaustion: the sheer volume of CRLF-to-LF diff output rapidly fills the context window, leaving little room for meaningful work. The second, arguably more disruptive, is what the user terms "context pollution" — once the bloated diff data is cached into the session context, Claude Code continues to reference it in every subsequent interaction, including simple informational queries that would otherwise require minimal context. This effectively renders the session unusable without a restart, creating what the user characterizes as a "bricked" context state. The self-reinforcing nature of the problem means that the damage persists beyond the initial `git diff` call, making recovery difficult without manually resetting the session.

The user-identified fix involves adding a `.gitattributes` file to the repository root with the directive `* text=auto`, followed by running `git add --renormalize .` to re-stage tracked files with normalized line endings; the re-staged files then need to be committed. This prevents Git from flagging unmodified files as changed because of line-ending differences, which eliminates the source of the spurious diff output. While this workaround addresses the root cause at the repository level, the user notes that Claude Code itself lacks any internal safeguard (such as diff truncation logic or whitespace-only change filtering) that would prevent this class of problem from affecting users who have not yet configured their repositories for cross-platform compatibility.
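The fix described above can be scripted. The following is a minimal sketch, not the reporter's own tooling: the function names `ensure_gitattributes` and `renormalize` are hypothetical, and the only Git invocation used is `git add --renormalize .`, the command named in the report. It assumes `git` is on the `PATH` and that `repo_root` points at the top of a working tree.

```python
import subprocess
from pathlib import Path

RULE = "* text=auto"


def ensure_gitattributes(repo_root: Path) -> bool:
    """Add the `* text=auto` rule to .gitattributes if it is not already present.

    Returns True if the file was created or modified, False otherwise.
    """
    path = repo_root / ".gitattributes"
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if any(line.strip() == RULE for line in existing.splitlines()):
        return False
    # Start on a fresh line if the existing file does not end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    # newline="\n" keeps .gitattributes itself LF-terminated on every platform.
    with path.open("a", encoding="utf-8", newline="\n") as handle:
        handle.write(f"{prefix}{RULE}\n")
    return True


def renormalize(repo_root: Path) -> None:
    """Re-stage tracked files so Git applies the line-ending rules (requires git)."""
    subprocess.run(["git", "add", "--renormalize", "."], cwd=repo_root, check=True)
```

A typical use would be to call `ensure_gitattributes(Path("."))` followed by `renormalize(Path("."))`, then review the staged changes and commit them. The renormalization only stages the files; it does not create a commit.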

The incident highlights a broader architectural concern about how agentic coding tools handle tool output ingestion. Claude Code, like similar AI-integrated development environments, relies on programmatic tool calls, including shell commands like `git diff`, to gather context about the state of a codebase. When those tools return unexpectedly large or low-signal outputs, the system, according to the report, has no mechanism to triage or compress that data before committing it to the context window. Implementing heuristics to detect and truncate whitespace-only or line-ending-only diffs would represent a meaningful defensive measure, particularly as Claude Code sees broader adoption in enterprise and cross-platform development environments where CRLF/LF mismatches are common.
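As a rough illustration of the kind of heuristic suggested above, the sketch below inspects unified diff text before it would be handed to a model. It is a hypothetical example, not a description of how Claude Code works or of any planned fix: the function names, the placeholder messages, and the `max_chars` budget are all assumptions.

```python
from typing import List


def is_line_ending_only_diff(diff_text: str) -> bool:
    """Return True if the removed and added lines match once a trailing CR is ignored."""
    removed: List[str] = []
    added: List[str] = []
    for line in diff_text.split("\n"):
        # Skip file headers. Simplification: a removed or added content line that
        # itself begins with "--" or "++" would be mistaken for a header.
        if line.startswith("---") or line.startswith("+++"):
            continue
        if line.startswith("-"):
            removed.append(line[1:].rstrip("\r"))
        elif line.startswith("+"):
            added.append(line[1:].rstrip("\r"))
    return bool(removed) and removed == added


def compact_diff(diff_text: str, max_chars: int = 2000) -> str:
    """Replace low-signal diffs with a short note and truncate oversized ones."""
    if is_line_ending_only_diff(diff_text):
        return "[diff omitted: changes are line-ending only (CRLF/LF)]"
    if len(diff_text) > max_chars:
        omitted = len(diff_text) - max_chars
        return diff_text[:max_chars] + f"\n[diff truncated: {omitted} more characters]"
    return diff_text
```

The design choice here is to summarize before truncating: a line-ending-only diff collapses to a single line regardless of size, while a genuine but oversized diff keeps its leading portion along with a note about how much was dropped.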

The user's frustration with Anthropic's support response — which reportedly attributed the problem to general "user usage" rather than acknowledging a tool-side deficiency — points to a secondary issue in how AI tooling companies handle bug reports from power users. The report is technically precise and includes self-diagnostics performed by Claude Code itself confirming the token consumption pattern, which makes a purely user-side explanation difficult to sustain. As agentic AI development tools mature and penetrate more complex, heterogeneous development environments, the expectation for robust error handling, graceful degradation, and responsive technical support will only intensify.
