Made Claude Code warn you in time, before it hits its usage limit, so you can transfer pending work, all dynamically

A developer created agent-baton, a tool that uses Anthropic's usage API to alert Claude Code as it approaches its rate limits, through three hooks that monitor usage at different stages. When usage crosses a warning threshold (91% in the example given), the tool prompts the user to continue the task, write a handoff document, or switch to lightweight mode, preventing silent rate-limit failures mid-task. The tool can be installed via npm and supports handoffs to other AI systems such as Cursor or Gemini.

Detailed Analysis

A developer identified as codeprakhar25 has released an open-source tool called **agent-baton** that addresses a significant usability gap in Anthropic's Claude Code: the absence of proactive usage limit warnings during active coding sessions. The tool hooks directly into Claude Code's existing hook architecture, specifically the `SessionStart`, `UserPromptSubmit`, and `PreToolUse` lifecycle events, and cross-references Anthropic's usage API to give the model real-time awareness of its own consumption. Without such a tool, Claude Code continues executing tasks until it abruptly halts at a rate limit boundary, leaving work incomplete, and the codebase potentially unstable, mid-refactor or mid-execution.
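To make the three-hook arrangement concrete, here is a minimal sketch of how one command could be registered on all three lifecycle events. It assumes the `hooks` settings layout that Claude Code documents (event name, then a list of entries, each holding `command` hooks). The command name `agent-baton-check` and its `--event` flag are hypothetical placeholders, not agent-baton's actual entry point.

```python
import json

# Hypothetical command name; agent-baton's real entry point may differ.
CHECK_COMMAND = "agent-baton-check"

# The three lifecycle events named in the article.
EVENTS = ("SessionStart", "UserPromptSubmit", "PreToolUse")


def build_hook_settings(command: str = CHECK_COMMAND) -> dict:
    """Build a settings fragment that runs one command on each event.

    The nested shape (event -> list of entries -> list of command hooks)
    is an assumption based on Claude Code's documented settings format.
    The "--event" flag is illustrative only.
    """
    return {
        "hooks": {
            event: [
                {
                    "hooks": [
                        {"type": "command", "command": f"{command} --event {event}"}
                    ]
                }
            ]
            for event in EVENTS
        }
    }


if __name__ == "__main__":
    # Print the fragment so it can be inspected or merged into a settings file.
    print(json.dumps(build_hook_settings(), indent=2))
```

Passing the event name to a single command keeps the usage-check logic in one place, while each hook can still apply its own check frequency.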

The technical design of agent-baton reflects careful attention to API efficiency and practical experience with rate-limit failure patterns. The `UserPromptSubmit` hook employs a TTL-aware caching strategy that scales check frequency with proximity to the limit — polling every 15 minutes under normal usage and every minute as the threshold approaches. The `PreToolUse` hook is the most operationally critical component, intercepting execution within one or two tool calls of the danger zone before the model commits to work it cannot complete. When a configurable warning threshold is crossed — the article cites 91% as an example — the tool surfaces an interactive prompt via Claude Code's native `AskUserQuestion` interface, offering the user three structured choices: continue, generate a handoff document, or switch to a lightweight operating mode.
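The caching and threshold behaviour described above can be sketched roughly as follows. Only the two endpoints (15 minutes and 1 minute) and the 91% example come from the article. The linear ramp between them, the `ramp_start` value, and all names here are assumptions for illustration, not agent-baton's actual implementation.

```python
import time
from typing import Callable, Optional

MAX_TTL = 15 * 60.0     # seconds between checks under normal usage (from the article)
MIN_TTL = 60.0          # seconds between checks near the limit (from the article)
WARN_THRESHOLD = 91.0   # percent; the article's example value, configurable

# The three choices offered to the user once the threshold is crossed.
CHOICES = ("continue", "write handoff document", "switch to lightweight mode")


def ttl_seconds(usage_pct: float,
                warn_threshold: float = WARN_THRESHOLD,
                ramp_start: float = 50.0) -> float:
    """Shrink the cache TTL as usage nears the threshold.

    The linear ramp and ramp_start are assumptions; the source only says
    that check frequency scales with proximity to the limit.
    """
    if usage_pct <= ramp_start:
        return MAX_TTL
    if usage_pct >= warn_threshold:
        return MIN_TTL
    frac = (usage_pct - ramp_start) / (warn_threshold - ramp_start)
    return MAX_TTL - frac * (MAX_TTL - MIN_TTL)


class UsageCache:
    """Cache the last usage reading and refetch only once its TTL expires."""

    def __init__(self, fetch: Callable[[], float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # stand-in for a call to the usage API
        self._clock = clock
        self._value: Optional[float] = None
        self._fetched_at = 0.0

    def get(self) -> float:
        now = self._clock()
        stale = (self._value is None
                 or now - self._fetched_at >= ttl_seconds(self._value))
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


def should_prompt(usage_pct: float,
                  warn_threshold: float = WARN_THRESHOLD) -> bool:
    """True once usage has reached the warning threshold."""
    return usage_pct >= warn_threshold
```

Injecting `fetch` and `clock` keeps the sketch independent of any real API client and makes the TTL behaviour easy to exercise with a fake clock.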

The handoff functionality represents the most sophisticated aspect of the tool. Agent-baton can write a structured markdown document summarizing in-progress work and pass it programmatically to alternative AI coding environments including Cursor, OpenAI Codex, and Google Gemini. This capability reframes the rate-limit problem not merely as an interruption but as an orchestration challenge — one where continuity of work across different AI systems becomes a solvable engineering problem rather than a hard stop. The architecture implicitly acknowledges that no single AI coding assistant operates in isolation, and that complex development tasks increasingly span session boundaries and potentially agent boundaries.
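As an illustration of what a structured handoff document might contain, the sketch below renders in-progress work as markdown. The field names and section headings are assumptions chosen for the example. The article does not specify agent-baton's actual document format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Handoff:
    """Hypothetical shape for the state captured at handoff time."""
    task: str
    completed: List[str] = field(default_factory=list)
    remaining: List[str] = field(default_factory=list)
    files_touched: List[str] = field(default_factory=list)
    usage_pct: Optional[float] = None


def render_handoff(h: Handoff) -> str:
    """Render the handoff state as a markdown document for another agent."""
    lines = [f"# Handoff: {h.task}", ""]
    if h.usage_pct is not None:
        lines += [f"Usage at handoff: {h.usage_pct:.0f}%", ""]
    lines.append("## Completed")
    lines += [f"- [x] {item}" for item in h.completed] or ["- (nothing yet)"]
    lines += ["", "## Remaining"]
    lines += [f"- [ ] {item}" for item in h.remaining] or ["- (nothing left)"]
    lines += ["", "## Files touched"]
    lines += [f"- `{path}`" for path in h.files_touched] or ["- (none)"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_handoff(Handoff(
        task="Refactor auth module",
        completed=["Extract token validation"],
        remaining=["Update call sites", "Run test suite"],
        files_touched=["auth/tokens.py"],
        usage_pct=91,
    )))
```

A plain markdown checklist is a reasonable choice here because any receiving tool, whether Cursor, Codex, or Gemini, can read it without a shared schema.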

The emergence of tools like agent-baton points to a broader maturation in the agentic AI tooling ecosystem. As AI coding assistants like Claude Code are increasingly deployed on long-running, multi-step tasks, the kind where a 40-minute refactor is routine rather than exceptional, the failure modes of those systems become proportionally more costly. The gap between what the UI exposes to a human user (visible usage bars) and what the model itself can perceive is a systemic design tension in current agentic architectures. Developers are filling that gap with middleware instrumentation, essentially building observability layers for AI agents the same way the industry built APM tooling for distributed software systems a decade earlier.

Anthropic's decision to expose both a usage API and a hook system within Claude Code creates a surface area that the developer community is now actively exploiting to build production-grade tooling. Agent-baton is one early example of a class of tools that treat AI coding assistants as infrastructure to be managed rather than simply as interactive applications. As agentic workflows grow longer and more autonomous, demand for this category of meta-tooling — covering session management, graceful degradation, cross-agent handoffs, and resource awareness — is likely to accelerate well ahead of first-party support from AI providers themselves.
