compression skills

Reddit · MrChurch2015 · May 28, 2026
A Reddit user has shared two compression skills, system-prompt-trimmer and lean-project-instructions, that reduce token consumption in AI instruction files and system prompts. Inspired by caveman-compress, the skills reportedly cut character counts by 40-75%, which could save hundreds of thousands of tokens. The author reports that the compressed instructions still produced the expected behavior in their own testing.

Detailed Analysis

A member of the Claude AI community has developed and shared two token-compression tools designed to reduce the character and token overhead of Claude instruction files and system prompts. Drawing on the existing Caveman plugin and its caveman-compress library, the developer built a skill called "lean-project-instructions," which targets CLAUDE.md files and similar AI agent instruction documents, and a second skill called "system-prompt-trimmer," which applies comparable compression logic to system prompts and potentially to user prompts as well. The tools work by "mutilating" verbose natural language instructions, that is, aggressively stripping and compressing them into a more token-efficient form while reportedly preserving functional fidelity.
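To make the idea of stripping verbose instructions concrete, here is a minimal sketch of rule-based compression. It is not the caveman-compress algorithm or the logic of either skill, neither of which is described in the post; the phrase rules, the filler-word list, and the function names `compress_instruction` and `reduction_pct` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical rewrite rules for illustration only; not taken from caveman-compress.
PHRASE_REWRITES = [
    (r"\bin order to\b", "to"),
    (r"\bmake sure (that |to )?", ""),
    (r"\byou should\b", ""),
]

# Hypothetical filler words that rarely change what an instruction means.
FILLER_WORDS = {"a", "an", "the", "please", "very", "really", "just", "kindly"}


def compress_instruction(text: str) -> str:
    """Apply the phrase rewrites, then drop filler words."""
    out = text
    for pattern, replacement in PHRASE_REWRITES:
        out = re.sub(pattern, replacement, out, flags=re.IGNORECASE)
    kept = [
        word
        for word in out.split()
        # Compare without trailing punctuation so "the," still counts as filler.
        if word.lower().strip(".,;:") not in FILLER_WORDS
    ]
    return " ".join(kept)


def reduction_pct(original: str, compressed: str) -> float:
    """Character reduction as a percentage of the original length."""
    if not original:
        return 0.0
    return 100.0 * (len(original) - len(compressed)) / len(original)


if __name__ == "__main__":
    original = "Please make sure to run the tests in order to verify the build."
    compressed = compress_instruction(original)
    print(compressed)
    print(f"{reduction_pct(original, compressed):.0f}% fewer characters")
```

A real compressor would need to be far more careful than this: dropping the wrong word (a "not" or an "only", for example) changes what the instruction means, which is why checking that the compressed file still produces the expected behavior matters.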

The developer reports character reductions ranging from 40% to 75% in initial testing across their own agent instructions and CLAUDE.md files. At the upper end of that range, the savings are operationally significant: depending on the length and complexity of a project's instruction set, this can translate into hundreds of thousands of tokens saved across extended agentic sessions. The author emphasizes that their own testing demonstrated the compressed instructions continued to produce expected behavior, suggesting the compression is semantically conservative enough not to degrade model performance in practical use cases.
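The arithmetic behind a figure like "hundreds of thousands of tokens" can be sketched as follows. Every number here is an illustrative assumption rather than data from the post: a 20,000-character instruction file, roughly 4 characters per token (a common rule of thumb for English text, since real tokenizers vary), and 100 requests that each resend the full instructions. The helper name `estimate_tokens_saved` is hypothetical.

```python
def estimate_tokens_saved(
    original_chars: int,
    reduction: float,
    chars_per_token: float = 4.0,  # assumed rule of thumb, not a measured value
    requests: int = 1,
) -> int:
    """Rough estimate of tokens saved when compressed instructions are resent on each request."""
    saved_chars = original_chars * reduction
    return round(saved_chars / chars_per_token * requests)


if __name__ == "__main__":
    # Hypothetical 20,000-character instruction file sent with 100 requests.
    for reduction in (0.40, 0.75):
        saved = estimate_tokens_saved(20_000, reduction, requests=100)
        print(f"{reduction:.0%} reduction: about {saved:,} tokens saved")
```

Under these assumptions the reported 40% to 75% range works out to roughly 200,000 to 375,000 tokens. Actual savings depend on the tokenizer, on how often the instructions are actually resent, and on how the compressed text tokenizes, since character reduction and token reduction are not the same measure.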

The significance of this tooling lies in the economics and architecture of large language model usage. Claude usage, like that of other frontier models, is billed by token consumption and bounded by context window limits. As agentic workflows have grown more complex, with CLAUDE.md files, multi-step system prompts, and layered tool instructions becoming commonplace, the overhead from verbose natural language instructions has become a meaningful cost and performance variable. Projects running continuous or high-volume agentic loops are particularly exposed to this overhead, and compression at the instruction layer is a relatively low-risk intervention compared with altering model behavior or reducing functional scope.

This development reflects a broader pattern within the Claude user community of grassroots tooling built around context optimization. The rise of CLAUDE.md as a standardized instruction convention, particularly within Claude's agentic coding environments, has created an ecosystem of complementary tools aimed at making that convention more efficient. The Caveman plugin that inspired this work is one node in this ecosystem; the new compression skills extend that logic more directly into Anthropic's own instruction file formats. The community-driven nature of this work also shows that pressure to improve token efficiency is not being addressed solely by Anthropic at the model or API level, but is also prompting active engineering responses from power users operating at scale.

More broadly, the emergence of prompt and instruction compression as a recognized engineering discipline points to a structural tension in the current generation of LLM-based systems: the expressiveness that makes natural language instruction powerful also makes it token-expensive. Tools like caveman-compress and its derivatives attempt to resolve this tension through heuristic or rule-based compression, trading some human readability for computational efficiency. As context windows expand and agentic deployments grow longer and more complex, demand for such compression utilities is likely to increase, and community-built solutions like the ones shared here may eventually inform more formalized tooling within development environments or API ecosystems.
