troubleshooting and writing ansible configurations wrong all of a sudden this week

A longtime Claude user encountered recurring configuration errors in infrastructure as code repositories beginning this week after migrating agents from OpenClaw to Hermes. Claude generated erroneous configuration entries referencing a non-existent 'gpt-5-haiku' model while operating on Opus 4.7 at xhigh effort level. These issues had not previously occurred with Sonnet or earlier Opus releases during months of prior use.

Detailed Analysis

A longtime Claude Code subscriber reports a sudden degradation in the quality of Ansible infrastructure-as-code configurations generated by Claude, with the anomalies appearing to coincide with a broader reconfiguration of the user's multi-agent lab environment. The user, operating on a max-times-20 subscription since at least August of the prior year, describes reliable performance across prior model versions — including Sonnet and Opus going back to version 4.5 — before observing a sharp uptick in configuration errors requiring manual correction this week. Most strikingly, Claude apparently inserted references to a model named "gpt-5-haiku" into the generated configs, a model that does not exist and appears to be a hallucinated blend of OpenAI's GPT naming conventions and Anthropic's own "Haiku" model tier branding.

The timing of the degradation is the most diagnostically significant detail. The user explicitly notes that the errors began after transitioning several agents from an OpenAI-compatible setup to a different system, referencing "Hermes" as the target. This transition likely altered the broader context or prompt environment in which Claude Code was operating, potentially introducing conflicting model-naming schemas, mixed system prompts, or cross-contaminated configuration templates into the working context. The hallucination of "gpt-5-haiku" is a textbook example of context bleed — where exposure to OpenAI model naming conventions alongside Anthropic's own tiered naming (Haiku, Sonnet, Opus) caused the model to synthesize a plausible-sounding but entirely fictional identifier. This is not a random error; it reflects the model drawing on two co-present naming frameworks and producing a conflated output.

The user's attempted mitigation — escalating from medium to "xhigh" effort and switching to Opus 4.7 — reflects a reasonable but incomplete diagnostic approach. Increasing computational effort addresses reasoning depth but does not resolve problems rooted in corrupted or ambiguous context. If the surrounding agent scaffolding or system-level prompts contain residual references to OpenAI model names or mismatched configuration schemas from the migration, higher effort simply applies more processing power to flawed inputs. The underlying Ansible-specific troubleshooting literature points to similar dynamics in human-authored environments: errors that appear suddenly and systematically often trace back to environmental changes — config file precedence shifts, version mismatches, or inventory alterations — rather than to the tool itself degrading in isolation.

This incident illustrates a broader challenge at the frontier of agentic AI deployment in homelab and infrastructure-as-code contexts. As users increasingly compose multi-model, multi-agent environments — mixing Anthropic, OpenAI, and open-weight systems — the surfaces for context contamination multiply. Claude is not operating in a vacuum; it is operating within whatever scaffolding, memory, and system prompt architecture the user has constructed. When that architecture undergoes rapid change, as it did here during the agent migration, the model's outputs can become unreliable in ways that superficially resemble model regression but actually reflect environmental instability. The "gpt-5-haiku" artifact is particularly instructive because it demonstrates that the failure mode is not silent — it is detectable precisely because the hallucinated artifact is verifiably nonsensical, which is the best-case scenario for catching context-contamination errors before they propagate silently through production infrastructure.

The practical resolution likely lies not in further tuning model parameters but in auditing the system prompts, tool contexts, and memory stores that Claude Code is drawing upon post-migration. Isolating Claude Code sessions from any residual OpenAI-adjacent context, validating that no cross-model configuration templates are being fed as input, and running Ansible's own syntax-check and verbose logging passes on the outputs would collectively help distinguish between model-side error and environment-side contamination. The broader takeaway for the AI infrastructure community is that multi-agent composability, while powerful, demands the same disciplined environment hygiene that infrastructure engineers have long applied to their toolchains — version pinning, change logging, and blast-radius isolation — now extended to the AI layer itself.

Read original article →

Detailed Analysis

Don't Miss a Deploy