I built a Claude Code plugin that actually enforces your rules instead of hoping the model follows them

A developer created Writ, a Claude Code plugin that enforces coding rules through a retrieval engine and bash hooks rather than relying on prompts alone. The retrieval engine uses a Neo4j knowledge graph to identify the rules relevant to each task, reducing context consumption from approximately 83,000 tokens to 1,600 tokens per query, while the bash hooks block tool calls that violate the configured rules. The tool ships with 276 pre-configured rules and skills, automatic project linter integration for languages including PHP, JavaScript, and Python, and custom analyzers for security and performance issues.

Detailed Analysis

A developer has published an open-source tool called Writ, designed to address one of the most persistent complaints among heavy users of Claude Code: the model's tendency to selectively ignore user-defined rules and coding standards. The tool operates on two distinct layers. The first is a retrieval engine built on a Neo4j knowledge graph that uses a five-stage pipeline to surface only the rules relevant to a given task. This cuts context load from approximately 83,000 tokens to roughly 1,600 per query, a reduction of about 98 percent, at a reported median retrieval latency of 0.338 milliseconds. The second is an enforcement layer composed of 30 bash scripts wired to Claude Code's hook events (PreToolUse, PostToolUse, and SessionEnd). Hooks that run before a tool call can block it outright, making rule compliance structural rather than advisory. The project ships with 276 rules and skills across 12 domains and includes 1,442 tests.
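To make the blocking mechanism concrete, here is a minimal sketch of the decision a PreToolUse hook makes. It is written in Python for readability, whereas Writ's hooks are bash scripts, and it is not taken from Writ's source. It relies on documented Claude Code hook behavior: the hook receives a JSON payload on stdin with fields such as tool_name and tool_input, and an exit code of 2 blocks the call and returns the stderr text to the model. The names evaluate and run_hook, the WRITE_TOOLS set, and the plan_approved flag are hypothetical; how a real hook would track plan approval is an assumption left open here.

```python
import json

# Claude Code tools that modify files (illustrative subset).
WRITE_TOOLS = {"Write", "Edit"}


def evaluate(event: dict, plan_approved: bool) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse event.

    Exit code 0 lets the tool call proceed. Exit code 2 blocks it, and
    Claude Code feeds the message (written to stderr) back to the model.
    """
    tool = event.get("tool_name", "")
    if tool in WRITE_TOOLS and not plan_approved:
        path = event.get("tool_input", {}).get("file_path", "<unknown>")
        return 2, f"Blocked {tool} on {path}: no approved plan yet."
    return 0, ""


def run_hook(stdin_text: str, plan_approved: bool) -> tuple[int, str]:
    """Parse the JSON payload a hook receives on stdin and evaluate it."""
    return evaluate(json.loads(stdin_text), plan_approved)


# Simulated payload; a real hook would read sys.stdin, write the message
# to sys.stderr, and call sys.exit(code).
payload = json.dumps(
    {"tool_name": "Write", "tool_input": {"file_path": "src/app.py"}}
)
print(run_hook(payload, plan_approved=False))
print(run_hook(payload, plan_approved=True))
```

The point of the design is that the decision is made by ordinary code with an exit status, so it does not depend on the model choosing to comply.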

The significance of Writ lies in its architectural philosophy: it deliberately removes the AI model from the compliance decision. Rather than prompting Claude to follow rules and hoping the instruction survives context drift or model discretion, Writ enforces them in bash hooks that Claude Code runs outside the model. Concrete examples include blocking code generation until a plan and test skeletons have been approved, and preventing Claude from reporting passing tests without first running static analysis tools. On every file write, the system also auto-discovers and runs project linters (PHPStan, ESLint, ruff, cargo check, go vet) alongside custom analyzers targeting injection vulnerabilities, authentication flaws, cryptographic issues, and N+1 query patterns.
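One plausible way to auto-discover linters is to look for each tool's conventional marker file at the project root. The sketch below assumes that approach; the source does not say how Writ actually detects linters, and the LINTER_MARKERS table and the discover_linters function are hypothetical. The commands themselves are the standard invocations of each tool.

```python
from typing import Iterable

# Marker file at the project root -> command to run.
# This mapping is an illustrative assumption, not Writ's discovery table.
LINTER_MARKERS = [
    ("phpstan.neon", ["vendor/bin/phpstan", "analyse"]),
    ("eslint.config.js", ["npx", "eslint", "."]),
    ("ruff.toml", ["ruff", "check", "."]),
    ("Cargo.toml", ["cargo", "check"]),
    ("go.mod", ["go", "vet", "./..."]),
]


def discover_linters(root_files: Iterable[str]) -> list[list[str]]:
    """Return the linter commands whose marker file is present.

    root_files is the list of file names found at the project root,
    for example the result of os.listdir(project_root).
    """
    present = set(root_files)
    return [cmd for marker, cmd in LINTER_MARKERS if marker in present]


# A mixed Rust and Go repository.
print(discover_linters(["Cargo.toml", "go.mod", "README.md"]))
```

A post-write hook could then run each returned command with subprocess.run and treat a non-zero exit status as a failed check.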

The tool speaks directly to a well-documented limitation of large language models used in agentic coding contexts: instruction following degrades under complexity, especially when rulebooks grow large and compete for finite context windows. The token-reduction mechanism Writ employs is a pragmatic acknowledgment that stuffing every rule into every prompt is both expensive and counterproductive. By using a graph-based retrieval system that automatically surfaces dependency and conflict relationships between rules, the author has operationalized a principle that the AI community has so far discussed mostly in theory: relevant context, not maximal context, produces better outputs.
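The idea of retrieving only relevant rules, along with their dependencies and conflicts, can be illustrated with a toy example. This is not Writ's five-stage pipeline, whose stages the source does not describe, and it uses an in-memory dictionary rather than Neo4j. The Rule fields, the keyword-overlap scoring, and the token budget are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    rule_id: str
    keywords: set[str]
    tokens: int  # approximate cost of including this rule in the prompt
    depends_on: list[str] = field(default_factory=list)
    conflicts_with: list[str] = field(default_factory=list)


def select_rules(task_terms: set[str], rules: dict[str, Rule], budget: int):
    """Pick rules matching the task, pull in direct dependencies, respect budget."""
    # Rank matching rules by keyword overlap, breaking ties by id.
    ranked = sorted(
        (r for r in rules.values() if r.keywords & task_terms),
        key=lambda r: (-len(r.keywords & task_terms), r.rule_id),
    )
    selected: list[str] = []
    used = 0
    for rule in ranked:
        # Include a rule together with its direct dependencies (one level only).
        bundle = [rid for rid in [*rule.depends_on, rule.rule_id] if rid not in selected]
        cost = sum(rules[rid].tokens for rid in bundle)
        if used + cost <= budget:
            selected.extend(bundle)
            used += cost
    # Report conflicting pairs among the selected rules, each pair once.
    conflicts = sorted(
        {tuple(sorted((a, b))) for a in selected
         for b in rules[a].conflicts_with if b in selected}
    )
    return selected, used, conflicts


RULES = {
    "sql-params": Rule("sql-params", {"sql", "query"}, 120),
    "orm-eager-load": Rule("orm-eager-load", {"query", "orm"}, 150,
                           depends_on=["sql-params"]),
    "raw-sql-ok": Rule("raw-sql-ok", {"sql"}, 80,
                       conflicts_with=["orm-eager-load"]),
    "css-naming": Rule("css-naming", {"css"}, 200),
}

print(select_rules({"sql", "query", "orm"}, RULES, budget=400))
```

In this example the unrelated CSS rule never enters the prompt, the dependency is pulled in automatically, and the conflicting pair is reported instead of being silently included.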

More broadly, Writ represents a class of tooling that is emerging in response to the gap between LLM capability and LLM reliability in production developer workflows. As AI coding assistants become embedded in professional development pipelines, the demand for deterministic guardrails — mechanisms that do not depend on model behavior to enforce correctness — is growing. This mirrors patterns seen in other high-stakes automation domains where probabilistic systems are paired with rule-based constraint layers. The fact that Claude Code exposes a hooks API capable of intercepting tool calls suggests Anthropic has anticipated this need, and third-party developers like the author of Writ are beginning to build the enforcement infrastructure that enterprise and security-conscious teams require before trusting AI agents with consequential codebases.
