Scale to many tools with tool search - Claude Code Docs

Tool search enables agents to work with hundreds or thousands of tools by dynamically discovering and loading only the tools needed on demand, rather than loading all tool definitions into the context window upfront. This approach addresses context efficiency and tool selection accuracy challenges that arise when scaling tool libraries beyond 30-50 tools. Tool search is enabled by default, uses a search mechanism that matches queries against tool names and descriptions to return the 3-5 most relevant tools, and includes configuration options through the ENABLE_TOOL_SEARCH environment variable.

Detailed Analysis

Anthropic's Claude Code platform has introduced a tool search capability designed to address a fundamental scalability constraint in agentic AI systems: the inability to efficiently operate across large libraries of available tools. The feature allows Claude-powered agents to dynamically discover and load relevant tools on demand rather than ingesting all tool definitions upfront, enabling catalogs of up to 10,000 tools without overwhelming the model's context window. When tool search is active, the agent receives only a summary of available tools and issues targeted searches to retrieve the three to five most relevant definitions when a required capability is not already in context. Those retrieved tools then persist across subsequent turns, and if the SDK compacts earlier conversation history, the agent simply re-searches as needed.

The technical motivation for this architecture reflects two well-documented failure modes in large-tool deployments. First, context window consumption: tool definitions are verbose, and a set of just 50 tools can consume between 10,000 and 20,000 tokens, displacing space that could otherwise hold task-relevant information or intermediate reasoning. Second, and more consequentially, tool selection accuracy degrades meaningfully when more than 30 to 50 tools are simultaneously visible to the model. By keeping only the most contextually relevant tools in-window at any given moment, the system maintains higher selection fidelity while dramatically expanding the total available capability surface. Anthropic acknowledges the tradeoff explicitly: tool search adds one extra network round-trip the first time a tool is discovered, but for large tool sets this overhead is offset by reduced context pressure on every subsequent turn.

The configuration system is notably granular, offering five distinct behavioral modes ranging from always-on to always-off, with two adaptive modes — `auto` and `auto:N` — that trigger tool search only when the combined token footprint of all tool definitions crosses a configurable percentage threshold of the context window. This graduated approach lets developers tune the tradeoff between search latency and context efficiency based on their specific use case, avoiding unnecessary round-trips for small tool sets while automatically activating deferred loading as tool libraries grow. Notably, tool search is disabled by default on Vertex AI and when routing through third-party proxy endpoints, since those environments do not reliably support the `tool_reference` block mechanism the feature depends on — a meaningful caveat for enterprise deployments that route API traffic through intermediary infrastructure.

The announcement situates tool search within the broader trajectory of agentic AI development, where systems are increasingly expected to interface with entire organizational software ecosystems — Slack, GitHub, Jira, CRM platforms, internal databases — rather than a handful of curated functions. The 10,000-tool ceiling and the restriction of support to Claude Sonnet 4, Claude Opus 4, and later models (explicitly excluding Haiku) signals that this is positioned as infrastructure for serious production deployments rather than lightweight integrations. Anthropic's guidance on optimizing tool discovery — using descriptive, keyword-rich tool names and descriptions, and injecting system prompt summaries of available tool categories — reflects how much of the system's effectiveness depends on the quality of tooling metadata rather than model capability alone. As agentic frameworks mature, the ability to navigate vast, heterogeneous tool ecosystems with minimal context overhead is becoming a core architectural requirement, and this feature represents Anthropic's direct response to that emerging standard.

Read original article →

Detailed Analysis

Don't Miss a Deploy