MCP server that fact-checks AI bug diagnoses against AST evidence

Unravel is an MCP server that fact-checks bug diagnoses from AI coding agents by running deterministic static analysis on the actual code and verifying every structural claim before output. The system combines AST parsing, cross-file dataflow analysis, and an 11-phase reasoning protocol followed by verification checks to catch confident but incorrect AI diagnoses, such as wrong line numbers or references to variables that are out of scope. A supporting Task Codex mechanism prevents context decay by logging verified findings with exact citations, so the agent can retrieve accurate code details across large codebases without re-reading files it has already examined.

Detailed Analysis

Unravel represents a novel architectural approach to a well-documented failure mode in AI-assisted software development: the tendency of large language models to produce bug diagnoses that are syntactically convincing but factually ungrounded. Built as a Model Context Protocol (MCP) server, Unravel inserts itself as a deterministic intermediary layer between an AI coding agent and the developer, running static analysis via Tree-sitter's Abstract Syntax Tree parsing to extract verifiable structural facts — mutation chains, async boundaries, closure captures, and floating promises — before the agent ever begins reasoning. Crucially, no language model operates inside Unravel itself; the system is explicitly designed as evidence infrastructure and a verification harness, not an inference engine. The agent remains the reasoning component, but its claims are now subject to hard computational checks against actual source code before any diagnosis reaches the developer.
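To make the idea of checking a claim against parsed code concrete, here is a minimal sketch. It is not Unravel's implementation: it uses Python's standard-library ast module as a stand-in for Tree-sitter, and the Claim type, the verify function, and the sample source are all hypothetical names chosen for illustration. It checks one kind of structural claim, that a given variable is assigned on a given line.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical structural claim: `name` is assigned on `line`."""
    name: str
    line: int


def assignment_lines(source: str) -> dict[str, set[int]]:
    """Map each assigned variable name to the lines where it is bound."""
    found: dict[str, set[int]] = {}
    for node in ast.walk(ast.parse(source)):
        # A Name node with a Store context is a binding site (assignment target).
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            found.setdefault(node.id, set()).add(node.lineno)
    return found


def verify(claim: Claim, source: str) -> bool:
    """Return True only if the parsed code really assigns `name` on `line`."""
    return claim.line in assignment_lines(source).get(claim.name, set())


# Illustrative input, not taken from the article.
SOURCE = """\
counter = 0
def bump():
    total = counter + 1
    return total
"""

print(verify(Claim("total", 3), SOURCE))    # claim matches the parsed code
print(verify(Claim("total", 2), SOURCE))    # wrong line number is rejected
```

The point of the design is that the check never consults a model: a claim either matches the syntax tree or it does not.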

The system's central innovation is what its creator calls the "Sandwich Protocol," a three-layer verification architecture. The bottom layer produces deterministic static evidence. The middle layer requires the agent to follow an eleven-phase structured reasoning protocol, which includes generating three competing hypotheses with distinct causal mechanisms. The top layer runs six verification checks against the real codebase before a diagnosis is released. Two gates fire before any claim-level checking begins: one confirms that competing hypotheses were genuinely generated rather than skipped, and the other confirms that the stated root cause contains a specific file-and-line citation rather than vague natural-language inference. Failing either gate is a hard rejection that terminates the diagnostic session outright. This design directly targets a known LLM failure pattern, confident narrative construction that bypasses the hypothesis-testing discipline a skilled engineer would apply, by making that discipline a precondition of output rather than a recommendation.
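The two pre-verification gates can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the Diagnosis and Hypothesis types, the run_gates function, and the "path/file.ext:LINE" citation format are all hypothetical, since the article does not specify how citations or hypotheses are represented.

```python
import re
from dataclasses import dataclass, field

# Assumed citation shape "path/to/file.ext:LINE"; the real format is not given.
CITATION = re.compile(r"[\w./-]+\.\w+:\d+")


@dataclass
class Hypothesis:
    summary: str
    mechanism: str  # the causal mechanism this hypothesis proposes


@dataclass
class Diagnosis:
    root_cause: str
    hypotheses: list[Hypothesis] = field(default_factory=list)


def run_gates(diagnosis: Diagnosis) -> list[str]:
    """Return hard-rejection reasons; an empty list means both gates pass."""
    reasons = []
    # Gate 1: at least three hypotheses whose mechanisms actually differ.
    mechanisms = {h.mechanism.strip().lower() for h in diagnosis.hypotheses}
    if len(mechanisms) < 3:
        reasons.append("fewer than three hypotheses with distinct mechanisms")
    # Gate 2: the root cause must cite a specific file and line.
    if not CITATION.search(diagnosis.root_cause):
        reasons.append("root cause lacks a file-and-line citation")
    return reasons
```

A caller would stop the session whenever run_gates returns a non-empty list, and only then proceed to the claim-level checks.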

Several subsystems within Unravel address separate problems in agent-assisted code analysis. Cross-file dataflow tracing resolves imports and follows symbol origins through a module graph, so a cross-file race condition can be confirmed with an exact citation for every step in the state propagation chain. A knowledge graph component embeds hub nodes as 768-dimensional vectors using Gemini's embedding API, allowing symptom descriptions to be routed to the 6–12 most relevant files in a large repository rather than flooding the agent's context window. A self-improving pattern store, initialized with over twenty structural bug patterns mapped to CWE identifiers, updates pattern weights after every verified or rejected diagnosis, creating a lightweight reinforcement signal tied to real debugging outcomes in a specific codebase. A cross-modal visual routing component extends the same vector space to screenshots, so a broken UI image can be semantically matched to relevant source files. The approach is nascent, but it points toward richer multimodal grounding for debugging workflows.
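The embedding-based routing step amounts to a nearest-neighbour lookup. The sketch below assumes the vectors have already been computed elsewhere; it makes no call to any embedding API, and the route function, the file paths, and the toy 3-dimensional vectors are hypothetical stand-ins for real 768-dimensional embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(symptom_vec: list[float], file_vecs: dict[str, list[float]], k: int = 12) -> list[str]:
    """Return the k file paths whose embeddings are closest to the symptom."""
    ranked = sorted(
        file_vecs,
        key=lambda path: cosine(symptom_vec, file_vecs[path]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors for illustration only; real embeddings would be 768-dimensional.
FILE_VECS = {
    "src/cart.js": [0.9, 0.1, 0.0],
    "src/auth.js": [0.0, 1.0, 0.0],
    "src/theme.css": [0.0, 0.1, 0.9],
}

print(route([1.0, 0.2, 0.0], FILE_VECS, k=2))
```

Capping the result at k files is what keeps the agent's context window small; a linear scan like this is adequate for a sketch, while a large repository would likely want an index.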

The Task Codex subsystem addresses a separate but related problem: context decay in long-horizon code reading sessions. The developer's own testing with Claude on a multi-file codebase revealed measurable degradation in recall of early files by the time the agent reached later ones — a phenomenon Claude confirmed when asked directly. The Codex functions as a structured external memory system, allowing the agent to record precise citations and observations while code is freshly parsed and retrieve them accurately later, rather than reconstructing details from increasingly compressed summaries. This reframes the problem not as retrieval but as decay prevention, a distinction that matters architecturally because it prioritizes write-time fidelity over search-time accuracy. The system auto-seeds Codex entries from verified diagnoses, meaning the memory infrastructure grows passively as the verification pipeline operates.
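A write-time memory of this kind can be sketched as an append-only log keyed by file. The TaskCodex class, its record and recall methods, and the Entry fields are hypothetical names chosen for illustration; the article does not describe the Codex's actual storage format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    file: str
    line: int
    note: str       # observation recorded while the code was freshly read
    verified: bool  # True when seeded from a diagnosis that passed verification


class TaskCodex:
    """Append-only log of findings, keyed by file for later lookup."""

    def __init__(self) -> None:
        self._by_file: dict[str, list[Entry]] = {}

    def record(self, file: str, line: int, note: str, verified: bool = False) -> Entry:
        """Store an exact citation at the moment the code is read."""
        entry = Entry(file, line, note, verified)
        self._by_file.setdefault(file, []).append(entry)
        return entry

    def recall(self, file: str) -> list[Entry]:
        """Return a file's entries in line order, without re-reading the file."""
        return sorted(self._by_file.get(file, []), key=lambda e: e.line)


codex = TaskCodex()
codex.record("src/cart.js", 42, "items captured by closure", verified=True)
codex.record("src/cart.js", 7, "items declared with let")
for entry in codex.recall("src/cart.js"):
    print(f"{entry.file}:{entry.line} {entry.note}")
```

Because each entry is frozen and stored verbatim, what the agent reads back later is exactly what it wrote down, not a summary of it.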

Unravel's broader significance lies in its demonstration that deterministic static analysis and probabilistic LLM reasoning can be composed into a credible verification architecture without requiring the LLM itself to be more reliable. Rather than attempting to improve model faithfulness through prompting or fine-tuning, the system treats model output as inherently unverified and constructs a computational envelope that filters it against ground truth. This approach maps onto a wider trend in AI tooling — sometimes described as "LLMs as reasoning engines, formal systems as oracles" — that has gained traction across domains from theorem proving to contract analysis. As AI coding agents are deployed on increasingly complex production codebases, infrastructure that enforces epistemic discipline on agent outputs, rather than trusting developers to catch hallucinations manually, is likely to become a meaningful engineering requirement rather than an optional enhancement.
