RTFM v0.4 — MCP retrieval server that cuts vault context by 90% (Obsidian + Claude Code)

Problem: Karpathy-style LLM wikis inject everything into context. On a 1,700-file vault, that's your entire quota in minutes. I built an MCP server that does retrieval instead of scanning. **How it works with Claude Code:** The agent calls

Detailed Analysis

RTFM v0.4 is a targeted response to a persistent practical limit on integrating large language models with personal knowledge management systems: context window exhaustion. The tool is an open-source MCP (Model Context Protocol) server, installable via `pip install rtfm-ai[mcp]`. It addresses the pattern, popularized by Andrej Karpathy and others, of maintaining LLM-readable wikis or vaults. That practice becomes self-defeating at scale: injecting every document from a 1,700-file Obsidian vault into an AI agent's context can consume the entire available token budget within minutes. Instead of scanning, RTFM performs retrieval. The agent calls `rtfm_search()` and receives only a handful of scored results (~300 tokens), then optionally calls `rtfm_expand()` to retrieve specific sections. With this progressive disclosure architecture, context grows incrementally and deliberately instead of being loaded up front and indiscriminately.
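The search-then-expand pattern can be illustrated with a toy sketch. This is not RTFM's implementation or API: the `VAULT` dictionary, the `Hit` type, and the `search` and `expand` functions below are hypothetical stand-ins, and the scoring is plain keyword overlap, chosen only to show how the agent first receives small pointers and pulls full text on request.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    heading: str
    score: float


# Hypothetical in-memory vault: doc_id -> {section heading -> section body}.
VAULT = {
    "notes/mcp.md": {
        "Overview": "MCP lets an agent call tools on a server.",
        "Transport": "Servers can speak over stdio or HTTP.",
    },
    "notes/obsidian.md": {
        "Wikilinks": "Obsidian links notes with double brackets.",
    },
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, limit=3):
    """Step 1: return small scored pointers to sections, never the bodies."""
    terms = tokenize(query)
    hits = []
    for doc_id, sections in VAULT.items():
        for heading, body in sections.items():
            overlap = len(terms & tokenize(heading + " " + body))
            if overlap:
                # Score is the fraction of query terms found in the section.
                hits.append(Hit(doc_id, heading, overlap / len(terms)))
    return sorted(hits, key=lambda h: h.score, reverse=True)[:limit]


def expand(doc_id, heading):
    """Step 2: return the full text of one section the agent asked for."""
    return VAULT[doc_id][heading]


if __name__ == "__main__":
    for hit in search("agent tools"):
        print(hit, "->", expand(hit.doc_id, hit.heading))
```

The point of the split is that the cost of `search` stays roughly constant as the vault grows, while the cost of `expand` is paid only for sections the agent decides it needs.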

The v0.4 update centers on native Obsidian vault integration, adding automatic corpus mapping that translates folder hierarchies into searchable corpora, resolution of Obsidian's `[[wikilink]]` syntax into a traversable knowledge graph with centrality ranking, and auto-generation of `_rtfm/` navigation files that remain human-readable within Obsidian itself. The parser suite — covering Markdown, Python AST, LaTeX, PDF, YAML, JSON, shell scripts, and more — reflects an ambition to index heterogeneous knowledge bases rather than code repositories alone. The extensibility claim of adding new format support in approximately 50 lines of Python positions RTFM as a platform rather than a fixed tool. Measured performance gains reported by the developer on real repositories are substantial: 51% cost reduction, 61% fewer tokens consumed, and 16% shorter task duration compared to standard grep-based navigation approaches.
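As a rough illustration of what mapping a folder hierarchy to corpora could look like, the sketch below groups vault paths by their top-level folder. The rule itself (one corpus per top-level folder, a `root` corpus for loose files, hidden folders skipped) and the `map_corpora` name are assumptions made for this example; the source does not describe the mapping RTFM actually applies.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def map_corpora(paths, root_corpus="root"):
    """Group relative vault paths into corpora keyed by top-level folder.

    Assumed rule, for illustration only: files directly in the vault root
    go to `root_corpus`, and anything under a dot-folder is ignored.
    """
    corpora = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # Skip hidden folders and files such as .obsidian/ settings.
        if any(part.startswith(".") for part in path.parts):
            continue
        corpus = path.parts[0] if len(path.parts) > 1 else root_corpus
        corpora[corpus].append(str(path))
    return dict(corpora)


if __name__ == "__main__":
    vault = [
        "projects/rtfm/design.md",
        "projects/todo.md",
        "daily/2025-01-01.md",
        "inbox.md",
        ".obsidian/app.json",
    ]
    for name, files in map_corpora(vault).items():
        print(name, files)
```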

The broader significance of RTFM lies in its embodiment of a design philosophy increasingly central to production AI agent deployment: surgical context delivery over brute-force context injection. As AI coding assistants like Claude Code, Cursor, and OpenAI's Codex become standard development tools, the inefficiency of naive document ingestion translates directly into dollar costs and latency penalties at scale. The MCP standard, originally introduced by Anthropic as an open protocol for connecting AI models to external data sources and tools, provides the architectural foundation that makes RTFM's retrieval-on-demand approach possible — agents can call structured tools rather than receiving all information passively at session start. RTFM's knowledge graph construction from wikilinks is particularly notable because it moves beyond flat document retrieval toward exploiting the relational structure users have already encoded in their notes.

RTFM's emergence also reflects a maturation in how developers think about the relationship between personal knowledge management and AI assistance. Early integrations treated vaults as context dumps; newer approaches like RTFM treat them as queryable databases with graph topology. The centrality ranking derived from wikilink structure borrows an insight from PageRank-style algorithms — documents referenced by many other documents are structurally more important — and surfaces that signal to the AI agent without requiring the agent to traverse the graph itself. With approximately 6,600 recorded visitors since its January 2025 release and MIT licensing ensuring open adoption, RTFM occupies a niche that is likely to see significant competition as the MCP ecosystem matures and developers increasingly demand efficient, cost-aware agent tooling rather than token-intensive integrations.
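The PageRank-style idea can be sketched in a few lines. The code below is a generic illustration, not RTFM's ranking code: it extracts `[[wikilink]]` targets with a regular expression (handling the `[[Note|alias]]` and `[[Note#Heading]]` forms), builds a link graph, and runs a standard PageRank power iteration. The function names, the damping factor of 0.85, and the sample notes are all assumptions.

```python
import re

# Captures the note name in [[Note]], [[Note|alias]], and [[Note#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_graph(notes):
    """Map each note name to the set of existing notes it links to."""
    graph = {name: set() for name in notes}
    for name, text in notes.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            # Ignore links to missing notes and self-links.
            if target in notes and target != name:
                graph[name].add(target)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict-of-sets link graph."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A note with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for other in graph:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    notes = {
        "Index": "Start at [[MCP]] or [[Obsidian|the vault]].",
        "MCP": "See [[Index]].",
        "Obsidian": "Back to [[Index#Top]] and [[Missing]].",
        "Orphan": "Links to [[Index]].",
    }
    ranks = pagerank(build_graph(notes))
    for name in sorted(ranks, key=ranks.get, reverse=True):
        print(f"{name}: {ranks[name]:.3f}")
```

In this sample, `Index` is linked from every other note and ends up ranked highest, while `Orphan`, which nothing links to, ends up lowest. Precomputing such scores at index time is what lets a server hand the agent a ranking without the agent walking the graph.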
