My Claude dreams at night and remembers everything. Better than mempalace.

A developer created iai-mcp, a local memory daemon that captures Claude conversations and organizes them into three memory tiers to provide context across sessions without requiring explicit memory commands. After five months of daily use, the system has learned the developer's coding style, project structures, and preferences through normal conversation and achieves 99% verbatim recall with retrieval times under 100ms. The tool has been released as open source under an MIT license with all benchmark harnesses included for verification.

Detailed Analysis

A developer identified as CodeAbra has released iai-mcp, an open-source, MIT-licensed local daemon designed to solve one of the most frequently cited friction points in working with Claude: the absence of persistent memory across sessions. Built and refined over five months of daily personal use beginning in January 2026, the tool operates by capturing every conversation with Claude, organizing the captured data into three distinct memory tiers, and injecting relevant context back into new sessions automatically. The result is a system that, from the user's perspective, behaves as though Claude retains knowledge of prior interactions indefinitely, without requiring any manual prompting or copy-paste workflows.

The technical architecture reflects a deliberate emphasis on performance, privacy, and verifiability. iai-mcp runs neural embeddings locally rather than relying on external API calls, encrypts stored data at rest using AES-256, and performs memory consolidation passively during machine idle time. The developer reports verbatim recall rates above 99%, retrieval latency under 100 milliseconds, and a session-start token cost under 3,000 tokens — a meaningful constraint given that context-window overhead directly affects both cost and response quality. Critically, the project ships with benchmark harnesses, allowing independent users to reproduce and validate the performance claims rather than accepting them on faith.

What distinguishes iai-mcp from similar community experiments is the depth of its real-world validation. Five months of continuous daily use with Claude Code, Anthropic's agentic coding product, produced emergent behavior the developer did not explicitly engineer: the system learned coding style, project structures, and preferences through observed conversation rather than through deliberate "save this" commands. This passive accumulation of implicit preferences points toward a more sophisticated model of what AI memory can mean — not just recall of stated facts, but the gradual crystallization of working context that mirrors how human collaborators build shared understanding over time.

The release arrives during a period of intense industry activity around AI memory and personalization. Anthropic itself has been expanding Claude's native memory capabilities in its consumer products, and competing labs have made persistent memory a marquee feature. The emergence of community-built tools like iai-mcp indicates that official implementations have not yet fully satisfied power users, particularly developers running local or code-heavy workflows where session continuity is operationally significant. The open-source release transforms what was a private productivity tool into infrastructure that other developers can audit, extend, and adapt, potentially accelerating the broader ecosystem's understanding of effective memory architecture for large language model workflows.

The project also surfaces a recurring dynamic in the AI tooling space: foundational capabilities that major labs treat as roadmap items are frequently prototyped and shipped by individual developers working from personal frustration. The gap between what Anthropic offers natively and what a motivated developer built and battle-tested over five months is itself a signal about unmet demand. Whether iai-mcp's architecture anticipates or influences how Anthropic eventually handles persistent memory natively remains to be seen, but its technical specificity — tiered storage, local embeddings, encrypted persistence, transparent benchmarks — sets a concrete performance baseline against which official solutions will likely be compared.

Read original article →

Detailed Analysis

Don't Miss a Deploy