Detailed Analysis
A developer has identified and attempted to solve a recurring inefficiency in Claude Code's behavior when working with large codebases: the agent repeatedly reconstructs architectural understanding from scratch each time it is asked a repository-level question, leading to excessive token consumption and redundant file reading. The solution, an open-source MCP server called Provenant, interposes a structured memory layer between the agent and raw source files. Rather than allowing Claude Code to perform broad searches and open large files to rebuild its mental model of a codebase, Provenant pre-computes a compact representation consisting of attributed wiki pages, dependency graphs, file localization indices, source citations, and confidence tracking with asynchronous repair mechanisms for low-confidence entries.
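To make the idea of cited, confidence-tracked memory concrete, here is a minimal Python sketch. It is not Provenant's actual data model: the names `Citation`, `MemoryEntry`, `ArchitecturalMemory`, the 0.5 repair threshold, and the 0.6 penalty are all assumptions chosen for illustration. The sketch only shows how an entry's confidence might drop when a cited file changes, and how a low-confidence entry could be queued for later repair.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Citation:
    path: str        # source file the summary was drawn from
    start_line: int
    end_line: int


@dataclass
class MemoryEntry:
    topic: str                       # e.g. "request routing"
    summary: str                     # compact wiki-style description
    citations: List[Citation] = field(default_factory=list)
    confidence: float = 1.0          # 0.0 (untrusted) to 1.0 (verified)


class ArchitecturalMemory:
    """Hypothetical store: entries plus a queue of topics needing repair."""

    def __init__(self, repair_threshold: float = 0.5):
        self.entries: Dict[str, MemoryEntry] = {}
        self.repair_threshold = repair_threshold
        self.repair_queue = deque()  # topics awaiting asynchronous repair

    def add(self, entry: MemoryEntry) -> None:
        self.entries[entry.topic] = entry
        if entry.confidence < self.repair_threshold:
            self.repair_queue.append(entry.topic)

    def mark_changed(self, path: str, penalty: float = 0.6) -> None:
        """Lower confidence for every entry that cites a changed file."""
        for entry in self.entries.values():
            if any(c.path == path for c in entry.citations):
                was_trusted = entry.confidence >= self.repair_threshold
                entry.confidence = max(0.0, entry.confidence - penalty)
                # Queue the entry only when it first crosses the threshold.
                if was_trusted and entry.confidence < self.repair_threshold:
                    self.repair_queue.append(entry.topic)
```

In this sketch a background worker would drain `repair_queue` and regenerate the affected summaries; that worker is omitted because the source does not describe how the repair step works.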
The reported results are notable. On SWE-bench Verified, a standard benchmark of 500 real GitHub issues across 12 repositories, Provenant improved correct file localization at 10 candidates (C@10) from 69.0% to 75.2%, a gain of 6.2 percentage points on a benchmark designed to reflect genuine software engineering difficulty. More striking is the token reduction figure: for Flask-related queries, retrieved context dropped from 69,044 tokens to 1,070 tokens, a 64.5× reduction. Taken together, the two figures suggest that the architectural memory layer is not simply discarding information to save tokens but is surfacing more precise, relevant context than the agent's default search-and-read behavior.
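The arithmetic behind these figures can be checked with a short Python sketch. One assumption is worth flagging: the source does not spell out how C@10 is scored, so the function below assumes an instance counts as correct when at least one gold (ground-truth) file appears among the top 10 candidates. The whitepaper may use a stricter definition, such as requiring all gold files.

```python
def correct_at_k(ranked_candidates, gold_files, k=10):
    """True if any gold file appears in the top-k ranked candidates.

    Assumed definition; the benchmark's exact scoring rule may differ.
    """
    return any(path in gold_files for path in ranked_candidates[:k])


def c_at_k(instances, k=10):
    """Percentage of (ranked_candidates, gold_files) pairs localized at k."""
    hits = sum(correct_at_k(ranked, gold, k) for ranked, gold in instances)
    return 100.0 * hits / len(instances)


def reduction_factor(tokens_before, tokens_after):
    """How many times smaller the retrieved context became."""
    return tokens_before / tokens_after


# Figures reported in the article.
flask_reduction = reduction_factor(69_044, 1_070)
localization_gain = 75.2 - 69.0  # percentage points

print(f"Token reduction: {flask_reduction:.1f}x")
print(f"C@10 gain: {localization_gain:.1f} percentage points")
```

On a 500-issue benchmark, a 6.2-point gain corresponds to 31 additional issues for which a correct file appears in the top 10.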
The MCP (Model Context Protocol) integration is architecturally significant. Anthropic's MCP standard, which Claude Code natively supports, allows external servers to expose structured tool interfaces to the agent. By implementing Provenant as an MCP server, the developer lets Claude Code query architectural memory as a first-class tool call, with no need to patch prompts or modify the agent's internals. The developer's analogy, giving the agent a map before asking it to walk through the entire city, captures the design philosophy: pre-computed navigational structure replaces emergent, expensive exploration.
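As a rough illustration of what "a first-class tool call" means, the Python sketch below handles a single MCP-style `tools/call` request, which is a JSON-RPC 2.0 message carrying a tool name and arguments. It is a simplified, standard-library-only sketch, not a complete MCP server (it omits initialization, tool listing, and transport) and not Provenant's real interface. The tool name `locate_files`, the `FILE_INDEX` contents, and the file paths are hypothetical.

```python
import json

# Hypothetical pre-computed index: topic keyword -> candidate files.
FILE_INDEX = {
    "routing": ["app/routing.py", "app/url_map.py"],
    "sessions": ["app/sessions.py"],
}


def locate_files(query: str, limit: int = 10):
    """Return candidate files whose topic keyword appears in the query."""
    words = query.lower().split()
    hits = []
    for topic, paths in FILE_INDEX.items():
        if topic in words:
            hits.extend(paths)
    return hits[:limit]


TOOLS = {"locate_files": locate_files}  # illustrative tool registry


def _error(request_id, code, message):
    body = {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC request; only `tools/call` is supported here."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return _error(request_id, -32602, "Unknown tool")
    result = tool(**params.get("arguments", {}))
    # Tool results are returned as a list of content items.
    body = {"jsonrpc": "2.0", "id": request_id,
            "result": {"content": [{"type": "text",
                                    "text": json.dumps(result)}],
                       "isError": False}}
    return json.dumps(body)
```

The point of the sketch is the shape of the exchange: the agent sends a small structured query and receives a short list of candidate files, so it does not have to open and read large source files to orient itself.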
This work connects to a broader challenge in agentic AI development: context window economics. As coding agents like Claude Code are deployed on larger and more complex repositories, the cost of repeated context reconstruction becomes a practical bottleneck in both latency and API expense. The SWE-bench results suggest that targeted retrieval augmentation can reduce cost and improve accuracy at the same time, which contradicts the naive assumption that more context always produces better results. The file localization improvement in particular implies that, for orientation tasks, architectural summaries can carry more signal than raw file contents.
The developer's public solicitation of criticism from heavy Claude Code users positions Provenant as an early-stage research artifact rather than a production tool, and the whitepaper accompanying the release suggests a desire for academic engagement alongside practical adoption. Two open questions remain: which repository-level queries still fail despite the memory layer, and how well confidence tracking handles rapidly evolving codebases. Both point to areas where the current architecture likely has meaningful gaps. Dependency staleness, monorepo complexity, and dynamically generated code are categories where static architectural memory could degrade quickly, and how Provenant's asynchronous repair mechanism handles these cases at scale will determine whether the approach generalizes beyond benchmark conditions.