
When a conversation uses the TaskCreate tool the entire conversation is thrown through cache write.

Reddit · TheTwistedTabby · June 3, 2026
According to the post, using the TaskCreate tool triggers a cache write for the entire conversation, which shows up as a non-zero `cache_creation_input_tokens` value in the author's color-coded status line. The cache write appears to occur only once per conversation, but it can become expensive when the conversation is approaching the full 1-million-token context window. The context window data returned by Claude Code includes counts of cache creation tokens and cache read tokens, along with overall usage percentages.

Detailed Analysis

A Reddit user monitoring Claude Code's context window telemetry has identified a notable caching behavior: whenever the TaskCreate tool is invoked to set up a visible task list within a conversation, the entire conversation goes through a cache write, registering a non-zero `cache_creation_input_tokens` value in the returned `context_window.current_usage` structure. The user discovered this through a custom status line tool that visualizes context window consumption in real time, using color-coded indicators tied to the token categories Claude Code returns after each model call. The observation rests on concrete data: the `current_usage` object exposes four token counters (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens`), and the cache creation counter becomes non-zero at the moment the task list is initialized.
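To make the mechanism concrete, here is a minimal Python sketch of how a status line might map these counters to an indicator. It is not the user's tool, which the post summary does not show. The function name `classify_usage`, the label strings, and the shape of the sample payload are assumptions for illustration; only the field names and the 35,616 / 1 token values come from the post.

```python
import json


def classify_usage(payload: dict) -> str:
    """Return a coarse label for the most recent model call.

    Assumes the payload nests the counters under
    context_window.current_usage, as described in the post.
    Missing fields are treated as zero.
    """
    usage = (payload.get("context_window") or {}).get("current_usage") or {}
    created = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0

    if created > 0:
        return "cache-write"   # part of the prompt was written to the cache
    if read > 0:
        return "cache-read"    # prompt was served from an existing cache entry
    return "uncached"          # no cache activity reported


# Hypothetical payload: only the two values quoted in the post are filled in.
sample = json.loads(
    '{"context_window": {"current_usage": '
    '{"input_tokens": 1, "cache_read_input_tokens": 35616}}}'
)
print(classify_usage(sample))
```

A real status line would map each label to a color; the labels here are stand-ins for that step.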

The practical significance of this behavior scales directly with how deep into a conversation the TaskCreate invocation occurs. Claude Code supports a context window of up to one million tokens, and if a user is well into that capacity before creating a task list, the cache write covers the entire context accumulated up to that point. The user notes that the operation appears to occur only once per conversation, which suggests the system caches the state at the moment of task creation rather than re-caching on subsequent task updates. A one-time cost is consistent with how prompt caching generally works: the system stores a snapshot of the prompt prefix up to a certain point so that later requests sharing that prefix can read it far more cheaply. Even so, it can be a significant cost if triggered late in a dense, long-running session.
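A rough sense of scale can be had from a back-of-the-envelope calculation. The sketch below is an estimate under stated assumptions, not a billing tool: the base price of 3.00 USD per million input tokens is a hypothetical placeholder, and the 1.25x write and 0.1x read multipliers reflect Anthropic's published prompt-caching pricing for the default cache lifetime as understood at the time of writing. Check current pricing, including any long-context surcharges, before relying on the numbers.

```python
def cache_cost_usd(tokens: int, base_usd_per_mtok: float, multiplier: float) -> float:
    """Cost of processing `tokens` at the base input price times a multiplier."""
    return tokens / 1_000_000 * base_usd_per_mtok * multiplier


BASE_USD_PER_MTOK = 3.00   # hypothetical placeholder, not a quoted price
WRITE_MULTIPLIER = 1.25    # assumption: default-lifetime cache write
READ_MULTIPLIER = 0.10     # assumption: cache read

# Compare writing vs. reading the same context at several conversation sizes.
for context_tokens in (50_000, 200_000, 1_000_000):
    write = cache_cost_usd(context_tokens, BASE_USD_PER_MTOK, WRITE_MULTIPLIER)
    read = cache_cost_usd(context_tokens, BASE_USD_PER_MTOK, READ_MULTIPLIER)
    print(f"{context_tokens:>9,} tokens: write ${write:.2f} vs read ${read:.2f}")
```

Under these assumptions the write costs 12.5 times as much as reading the same tokens from cache, which is why the timing of the write matters more as the conversation grows.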

This observation connects to the broader design of prompt caching in Anthropic's API. The mechanism caches the processed form of a prompt prefix (tool definitions, system prompt, and messages up to a cache breakpoint), so that later requests beginning with the same prefix can read it from cache rather than reprocess the full input. The `cache_read_input_tokens` value in the user's example, 35,616 tokens compared to just 1 input token, illustrates how effective cache reuse can be once established: cached tokens are billed at a fraction of the normal input rate. A fresh cache write covering the whole conversation at TaskCreate time suggests that the tool invocation changes the cacheable prefix, so that the existing cache entry no longer matches and a new one must be written. The post does not establish why; one possibility is that creating a task list changes content near the start of the prompt that every subsequent turn must reference.
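The reuse in that example can be quantified with a short calculation. This sketch assumes, as in Anthropic's API usage reporting, that the three input-side counters are disjoint and sum to the total prompt size, and it assumes `cache_creation_input_tokens` was 0 on that turn, which the post summary does not state. The helper name `cached_share` is made up for illustration.

```python
def cached_share(input_tokens: int,
                 cache_creation_input_tokens: int,
                 cache_read_input_tokens: int) -> float:
    """Fraction of the prompt's input tokens that were read from cache."""
    total = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
    return cache_read_input_tokens / total if total else 0.0


# Values from the post: 1 uncached input token, 35,616 cache-read tokens.
# The cache-creation count of 0 is an assumption.
share = cached_share(1, 0, 35_616)
print(f"{share:.3%}")  # prints 99.997%
```

On the turn where TaskCreate fires, the same function would return a value near zero, since most of the prompt would be counted under cache creation instead.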

For developers and power users, this finding has practical implications for managing session cost in Claude Code. Anyone who relies heavily on TaskCreate should be aware that creating a task list late in a large session appears to incur a one-time but potentially substantial cache-write cost, so creating it early, while the context is still small, is likely cheaper. The transparency of the `context_window` return structure, which is surfaced on every model call, is itself noteworthy: it gives users the instrumentation to observe and reason about otherwise opaque infrastructure behavior. The Reddit post is an example of the empirical reverse-engineering that has become increasingly common as developers build tooling around large language model APIs and gain granular visibility into how they operate.
