Detailed Analysis
A developer tool called **crag** has emerged to address a significant gap in AI-assisted development workflows: the disconnect between manually written AI context files and the actual enforcement rules baked into a project's continuous integration infrastructure. Available via `npx @whitehatd/crag`, the tool scans a repository's CI workflows, package manifests, and directory structure to produce a `governance.md` file that accurately reflects what the project's build pipeline actually enforces — lint rules, test frameworks, quality gates, and build steps — rather than what a developer remembers to document by hand.
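To make the extraction idea concrete, here is a minimal sketch of reading enforcement steps out of a CI workflow. It is not crag's actual implementation: it assumes a GitHub Actions-style workflow, uses a regular expression rather than a real YAML parser, and handles only single-line `run:` entries. The function names (`extract_run_commands`, `render_governance`), the sample workflow, and the output layout are all hypothetical.

```python
import re

# Hypothetical workflow text; a real tool would read files under .github/workflows/.
WORKFLOW = """\
name: ci
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint
        run: npm run lint
      - name: Test
        run: npm test
      - run: npm run build
"""

# Matches single-line `run:` entries, with or without a leading list dash.
RUN_LINE = re.compile(r"^\s*(?:-\s+)?run:\s*(.+?)\s*$", re.MULTILINE)


def extract_run_commands(workflow_text: str) -> list[str]:
    """Return single-line `run:` commands in file order, without duplicates."""
    seen: set[str] = set()
    commands: list[str] = []
    for match in RUN_LINE.finditer(workflow_text):
        command = match.group(1)
        # Skip block scalars (`run: |`), which this sketch does not handle.
        if command in ("|", ">") or command in seen:
            continue
        seen.add(command)
        commands.append(command)
    return commands


def render_governance(commands: list[str]) -> str:
    """Render the commands as a Markdown section (layout is an assumption)."""
    lines = ["# Governance", "", "## Gates", ""]
    lines += [f"- `{command}`" for command in commands]
    return "\n".join(lines) + "\n"


governance_md = render_governance(extract_run_commands(WORKFLOW))
print(governance_md)
```

Because the output depends only on the workflow text, running the sketch twice on the same repository state yields the same document, which is the property that makes this kind of extraction suitable for automation.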
The core value proposition of crag is its compilation model. Rather than treating CLAUDE.md as a standalone artifact, it treats `governance.md` as a single source of truth that can then be compiled into 12 different AI tool configuration formats in one pass, including those used by Cursor and GitHub Copilot. This architecture directly addresses the fragmentation problem that teams face when they use multiple AI coding assistants: previously, keeping context files accurate across tools meant either redundant manual effort or accepting inconsistency. The tool does this without invoking a large language model: it is entirely deterministic and runs in approximately 500 milliseconds, which makes it viable as a step in automated pipelines.
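The single-source model can be sketched as a pure function from one governance text to a set of output files. The sketch below is an illustration under stated assumptions, not crag's code: the `compile_targets` helper, the generated header, and the sample governance text are hypothetical, and only three targets are shown rather than twelve. The paths are ones commonly used for project-level instructions by those tools, but the source does not confirm which paths crag actually writes.

```python
import hashlib

# Hypothetical governance text standing in for a generated governance.md.
GOVERNANCE = """\
# Governance

## Gates

- `npm run lint`
- `npm test`
"""

# Target paths are assumptions based on each tool's commonly used
# project-level instruction file; crag's real output set may differ.
TARGETS = {
    "claude": "CLAUDE.md",
    "copilot": ".github/copilot-instructions.md",
    "cursor": ".cursorrules",
}


def compile_targets(governance: str) -> dict[str, str]:
    """Map each output path to its content, derived only from `governance`."""
    digest = hashlib.sha256(governance.encode("utf-8")).hexdigest()[:12]
    header = f"<!-- generated from governance.md ({digest}); do not edit -->\n\n"
    # sorted() keeps the output order independent of how TARGETS was built.
    return {path: header + governance for path in sorted(TARGETS.values())}


first = compile_targets(GOVERNANCE)
second = compile_targets(GOVERNANCE)

# No model call, clock, or randomness is involved, so repeated runs agree.
print(first == second)
for path in first:
    print(path)
```

Embedding a digest of the source in each output is one possible design choice: a CI step could regenerate the files and fail the build when a committed copy no longer matches its source.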
The empirical findings shared alongside the tool's release underscore just how widespread the problem is. An analysis of 50 prominent open-source repositories found that 9 out of 13 top projects had zero AI configuration at all, and Grafana's existing CLAUDE.md contained only a single line despite crag identifying 67 distinct quality gates in the project's CI configuration. These figures point to a systemic issue: even sophisticated engineering organizations have not yet developed practices for keeping AI context files current and comprehensive. The gap matters because an AI coding assistant operating without accurate project context is more likely to generate code that fails linting, breaks tests, or violates architectural conventions, wasting developer time on review cycles for problems that automation should have caught.
Crag's approach sits in productive tension with Anthropic's own recommended workflow for CLAUDE.md generation. Claude Code's built-in `/init` command similarly analyzes a codebase to auto-generate a starter CLAUDE.md covering the tech stack, project structure, and build/test commands, with the expectation that developers will then refine the output manually. The distinction is that `/init` is a one-time bootstrapping step requiring human curation afterward, while crag frames itself as a repeatable, deterministic extraction pipeline — closer to infrastructure-as-code than to documentation. Where `/init` leverages Claude's own reasoning to interpret a codebase, crag's LLM-free approach offers reproducibility and auditability that may appeal to teams with strict compliance requirements.
Taken together, the tool reflects a broader maturation trend in how engineering teams think about AI developer tooling. The first wave of AI code assistant adoption was largely ad hoc — developers used tools opportunistically without formalizing the context those tools needed to be effective. A second wave is now emerging that treats AI configuration as a first-class engineering artifact: versioned, generated from authoritative sources, and kept in sync with the actual project state. Crag is an early example of tooling built specifically for this second wave, and its multi-tool compilation model anticipates an environment where teams routinely run several AI assistants in parallel and need consistent behavior across all of them.