Combine persistent global memory and task management into one uniform system

A software engineer developed a system combining global memory and task management through parallel markdown documentation that mirrors the codebase structure, enabling agents to retrieve context by construction rather than search. The approach uses "inputs" (scoped documentation subsets) to detect implementation scope and risks, followed by "outputs" that document planned changes and architecture before code implementation, allowing iteration on logic before deployment. While the system demonstrates benefits for multi-repo workflows through improved agent awareness and architectural planning, scalability concerns remain regarding token costs and system complexity.

Detailed Analysis

A software engineer working across a multi-repository codebase has developed a custom agentic workflow system that unifies persistent documentation memory with structured task management, publishing the approach on GitHub under the name `ai-context-system`. The core innovation is a parallel folder tree of markdown files that mirrors the actual directory structure of the codebase, such that any agent — whether GPT or Claude — can locate relevant documentation simply by knowing where the corresponding code lives. Rather than relying on retrieval-augmented generation (RAG) search or a traditional wiki, the system practices what the author calls "retrieval by construction": documentation is authored and updated as a byproduct of completing tasks, ensuring that context surfaced to the agent is always structurally co-located with the code it describes. The engineer reports that agents are independently and eagerly opening these files alongside code files in practice, suggesting the low-friction, guaranteed-relevance retrieval model aligns well with how large language models navigate context decisions.
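As a rough illustration of "retrieval by construction", the lookup can be a pure path transformation with no search step. The sketch below is not taken from the `ai-context-system` repository: the root directory names, the `.md` naming rule, and the `_index.md` convention for directories are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical roots: the article does not specify the actual layout.
CODE_ROOT = Path("repo")
DOCS_ROOT = Path("ai-context")


def doc_path_for(code_path, is_dir=False, code_root=CODE_ROOT, docs_root=DOCS_ROOT):
    """Map a code path to its mirrored markdown path.

    Assumed convention (not from the source): a file `x.py` is documented
    in `x.py.md`, and a directory is documented in `<dir>/_index.md`.
    """
    relative = Path(code_path).relative_to(code_root)
    if is_dir:
        return docs_root / relative / "_index.md"
    # Append ".md" to the full file name so `a.py` and `a.ts` do not collide.
    return docs_root / relative.with_name(relative.name + ".md")


file_doc = doc_path_for(Path("repo/services/billing/invoice.py"))
dir_doc = doc_path_for(Path("repo/services/billing"), is_dir=True)
print(file_doc.as_posix())
print(dir_doc.as_posix())
```

Because the mapping is deterministic, an agent that already knows which code file it is editing needs no index or embedding lookup to find the matching note; it only needs the convention.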

The workflow is divided into discrete phases — inputs and outputs — that impose a structured review layer before any code is modified. During the input phase, an agent detects the scope of a given requirement and pulls a corresponding subtree of the documentation, flagging missing branches as indicators of incomplete scope and annotating likely change locations and collision risks without yet touching implementation. The output phase then requires the agent to draft concrete implementation changes and their corresponding documentation simultaneously, forcing architectural reasoning before execution and requiring traceability back to specific requirement IDs. This architecture deliberately delays code modification until the logic, contracts, and documentation have been reviewed and approved, addressing a common failure mode in agentic coding where planning diverges from code reality. The use of branched local documentation — which is merged into the global tree only after task approval — adds a form of version control to the memory layer itself, preventing incremental documentation drift or "poisoning" that the author identifies as a near-uncontrollable problem in opaque RAG-backed systems.
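The input phase described above can be sketched as a function that pulls the mirrored notes for a task's scope and reports which branches have no documentation yet. This is a minimal sketch under assumed conventions, not the author's implementation: `collect_inputs`, the `ai-context` directory name, and the `<file>.md` naming rule are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_inputs(scope, code_root, docs_root):
    """Gather mirrored docs for the code paths in scope.

    Returns (found, missing): `found` maps each relative code path to the
    text of its note; `missing` lists paths with no note, which the
    workflow treats as a sign that the scope is incomplete.
    """
    code_root, docs_root = Path(code_root), Path(docs_root)
    found, missing = {}, []
    for code_path in scope:
        relative = Path(code_path).relative_to(code_root)
        # Assumed naming rule: `x.py` is documented in `x.py.md`.
        doc = docs_root / relative.with_name(relative.name + ".md")
        if doc.is_file():
            found[relative.as_posix()] = doc.read_text(encoding="utf-8")
        else:
            missing.append(relative.as_posix())
    return found, missing


# Demo on a throwaway tree: one documented file, one undocumented file.
with tempfile.TemporaryDirectory() as tmp:
    code_root = Path(tmp) / "repo"
    docs_root = Path(tmp) / "ai-context"
    note = docs_root / "billing" / "invoice.py.md"
    note.parent.mkdir(parents=True)
    note.write_text("# invoice.py\nComputes invoice totals.\n", encoding="utf-8")

    scope = [code_root / "billing" / "invoice.py", code_root / "billing" / "refund.py"]
    found, missing = collect_inputs(scope, code_root, docs_root)

print("found:", sorted(found))
print("missing:", missing)
```

Nothing in this step writes to the code or to the global documentation tree; it only reads, which matches the idea of annotating scope and risks before any implementation is touched.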

The approach intersects directly with developments Anthropic has been pursuing at the platform level. Anthropic's memory tool (`memory_20250818`), released as an API primitive, enables Claude agents to create, read, update, and delete persistent files for just-in-time retrieval across conversations and multi-agent orchestration scenarios — a structural parallel to what the Reddit author has built manually at the filesystem level. Similarly, Claude's Projects feature and knowledge base system in Claude Cowork are designed to provide selective, per-project memory isolation, preventing the kind of cross-contamination the author guards against through branched documentation. That a developer independently converged on many of the same architectural principles — persistent structured context, scope-bounded retrieval, session-spanning memory separate from ephemeral chat — reflects a broader consensus forming around the requirements for reliable long-horizon agentic work.

The principal concern the author raises is scaling cost: maintaining documentation parity across a large multi-repo system requires a non-trivial token budget for reading, creating, and updating files at each step, and it is not yet clear whether the quality gains justify the latency and cost penalties at scale. This tradeoff is well-recognized in the broader agentic systems community. Anthropic's own research on long-running Claude agents acknowledges that persistent memory primitives introduce overhead that must be managed through selective retrieval strategies rather than wholesale context loading. The author's branching model — where global docs are updated only post-approval — is a pragmatic mitigation, but the token cost of documentation maintenance embedded within the task loop remains an open variable. Security considerations are also relevant: Cisco's disclosure of a memory poisoning vulnerability in Claude Code (patched in v2.1.50) underscores that any system where agent-generated content feeds back into persistent memory requires careful control over what gets written and when, a risk the author's approval gate partially addresses.

The system described represents a thoughtful practitioner-level synthesis of memory architecture and workflow orchestration that is arriving at the same structural conclusions as industry-level tooling from a bottom-up direction. The emphasis on human-readable, co-located documentation as the memory substrate — rather than vector embeddings or opaque summaries — carries meaningful advantages for auditability, collaborative use, and agent self-orientation in complex codebases. As Anthropic and others continue formalizing persistent memory primitives, the patterns pioneered by developers in production multi-repo environments will likely inform the design of more standardized tooling. The central open question — whether documentation-as-memory can scale without becoming a bottleneck — is not merely a personal concern for this engineer but a foundational challenge for the agentic software development paradigm as a whole.
