Built an MCP Claude Connector for SEC filings after I nuked through my Claude usage limit

An engineer built an MCP Claude Connector to reduce token consumption when analyzing SEC filings by creating a navigation map that splits documents into logical sections instead of dumping entire filings into context. The tool generates a table of contents allowing agents to identify and fetch only relevant sections, each with a direct link to the original EDGAR filing, reducing token usage by approximately 85% compared to raw retrieval. The system supports over 6,000 US public companies' 10-K and 10-Q filings, works across multiple AI models, and is available for free.

Detailed Analysis

A developer building financial research workflows with Claude's API constructed a custom Model Context Protocol (MCP) connector for SEC filings after repeatedly exhausting Claude's weekly usage limits by feeding entire 10-K filings into context windows. The core engineering insight driving the project was straightforward: large-cap 10-K filings routinely exceed 80,000 tokens, and retrieving a full document to answer a question whose answer sits in a few paragraphs is both economically wasteful and technically counterproductive. The solution, published at alphacreek.ai, centers on a navigation-map approach that parses SEC filings into logical sections based on formatting, generates a table of contents, and allows the AI agent to selectively fetch only the relevant nodes rather than ingesting the full document. The developer claims approximately 85% token reduction compared to raw retrieval. The tool covers more than 6,000 U.S. public companies and supports 10-K and 10-Q filings; support for 8-K filings and earnings transcripts is in development.
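The navigation-map idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the connector's actual implementation: it assumes a plain-text filing whose sections begin with "Item N." headings, and the names `split_sections`, `table_of_contents`, `fetch_section`, and `token_savings`, along with the four-characters-per-token estimate, are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: sections start on their own line as "Item 1.", "Item 1A.", etc.
ITEM_HEADING = re.compile(
    r"^Item[ \t]+(\d+[A-Z]?)\.[ \t]*(.*)$", re.IGNORECASE | re.MULTILINE
)

@dataclass
class Section:
    item: str   # e.g. "1A"
    title: str  # e.g. "Risk Factors"
    text: str   # section body without the heading line

def split_sections(filing_text: str) -> list[Section]:
    """Split a filing into sections at each 'Item N.' heading."""
    matches = list(ITEM_HEADING.finditer(filing_text))
    sections = []
    for i, m in enumerate(matches):
        # A section runs until the next heading, or to the end of the filing.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(filing_text)
        body = filing_text[m.end():end].strip()
        sections.append(Section(m.group(1).upper(), m.group(2).strip(), body))
    return sections

def estimate_tokens(text: str) -> int:
    """Rough size estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)

def table_of_contents(sections: list[Section]) -> list[dict]:
    """The small map an agent reads first, instead of the full filing."""
    return [
        {"item": s.item, "title": s.title, "tokens": estimate_tokens(s.text)}
        for s in sections
    ]

def fetch_section(sections: list[Section], item: str) -> Section:
    """Return only the requested section."""
    for s in sections:
        if s.item == item.upper():
            return s
    raise KeyError(item)

def token_savings(full_tokens: int, fetched_tokens: int) -> float:
    """Fraction of tokens avoided by fetching a subset instead of everything."""
    return 1 - fetched_tokens / full_tokens

SAMPLE = """Item 1. Business
We make widgets.
Item 1A. Risk Factors
Competition is intense.
Item 7. Management's Discussion and Analysis
Revenue grew.
"""

sections = split_sections(SAMPLE)
print(table_of_contents(sections))
print(fetch_section(sections, "1A").text)
```

For scale, fetching 12,000 tokens of relevant sections from an 80,000-token filing would give `token_savings(80_000, 12_000)` of about 0.85, which matches the order of the reduction the developer reports. A real parser would also need to handle HTML markup and the filing's own table of contents, which repeats the same "Item" headings.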

The technical architecture addresses two distinct failure modes common in naive LLM-over-documents workflows. The first is context bloat: flooding a model with irrelevant text not only increases cost but degrades answer quality, as signal-to-noise ratios drop and model attention disperses across irrelevant material. The second is citation opacity, a particularly serious problem in financial research where a hallucinated revenue figure or misattributed footnote can have meaningful downstream consequences. The connector resolves the citation problem by attaching a `reader_url` to each retrieved chunk, linking directly to the corresponding passage in the original EDGAR HTML filing. This preserves a verifiable audit trail, a capability conspicuously absent from simpler approaches that return extracted text without provenance metadata.
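A short Python sketch shows how provenance might travel with each retrieved chunk. Only the `reader_url` field name comes from the project description; the `RetrievedChunk` type, the `format_citations` helper, and the example URL are hypothetical placeholders, and the connector's real response shape may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievedChunk:
    section: str     # e.g. "Item 7"
    text: str        # extracted passage
    reader_url: str  # link back to the passage in the original filing

def format_citations(chunks: list[RetrievedChunk]) -> str:
    """Build a numbered source list, rejecting chunks without provenance."""
    lines = []
    for n, chunk in enumerate(chunks, start=1):
        if not chunk.reader_url:
            # Text without a source link cannot be audited, so fail loudly.
            raise ValueError(f"chunk {n} ({chunk.section}) has no reader_url")
        lines.append(f"[{n}] {chunk.section}: {chunk.reader_url}")
    return "\n".join(lines)

chunks = [
    # Placeholder URL for illustration only, not a real EDGAR link.
    RetrievedChunk("Item 7", "Revenue grew.", "https://example.com/filing#item7"),
]
print(format_citations(chunks))
```

Rejecting chunks that lack a link, rather than silently passing them through, keeps every figure in an answer traceable to its source passage.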

The project enters a moderately active ecosystem. Several open-source and commercial SEC MCP servers already exist, including SEC-MCP, EdgarTools MCP, and edgar.tools, which collectively offer features such as real-time filing streams, XBRL financial data parsing, visual analysis of charts and tables, and multi-client integration across Claude Desktop, Cursor, and Cline. What distinguishes the alphacreek.ai approach is its explicit emphasis on hierarchical document navigation as the primary retrieval strategy rather than semantic search or full-document ingestion, a design choice that prioritizes structural fidelity to the filing's own organizational logic over embedding-based similarity. Whether this offers material advantages over vector-search retrieval methods in practice remains an open empirical question, but for highly structured regulatory documents like 10-Ks, which follow mandated section formats, document-structure-aware navigation is a defensible and potentially superior approach.

The project illustrates a broader pattern in the Claude and LLM developer ecosystem: practitioners hitting context and cost limits are increasingly building infrastructure layers that mediate between raw data sources and model context, rather than simply expanding context windows or upgrading API tiers. This reflects a maturing understanding that token efficiency is not merely a cost consideration but a quality one: excess context degrades model performance in ways that cannot be compensated for by model capability alone. The MCP standard, which Anthropic introduced to provide a consistent protocol for connecting AI models to external tools and data sources, is clearly functioning as intended here: it lowered the barrier to building a production-grade connector sufficiently that a single developer resolved a personal workflow problem and published a general-purpose tool within a short iteration cycle. As the SEC filing MCP ecosystem grows more crowded, differentiation will likely hinge on depth of financial data parsing, real-time ingestion latency, and robustness of citation infrastructure, all dimensions the alphacreek.ai project has at least partially addressed in its initial release.
