Is Claude Incapable of Following Directions?

A user documented repeated attempts to configure Claude with custom instructions across multiple settings—including text documents, project instructions, preferences, and a 3700-word skill—to prevent AI-like writing patterns, but found Claude ignored all directives. Despite Claude reportedly checking the skill, the model continued producing text characterized as obvious AI-isms and defaulting to established writing patterns. The user expressed concern that Claude's inability to follow even its own created instructions raises questions about the model's reliability for more complex tasks.

Detailed Analysis

A user on the r/ClaudeAI subreddit has published a pointed critique of Claude's instruction-following behavior, arguing that the model consistently ignores custom directives regardless of where those directives are placed — including project instructions, preferences, and the platform's dedicated "skills" feature. The user's central frustration is not merely that Claude fails to follow external instructions, but that it fails to follow instructions it itself generated: the user had Claude draft a detailed 3,700-word skill document establishing writing norms, including explicit prohibitions against AI-typical phrasing patterns, and found that Claude proceeded to violate those norms anyway. The specific failure mode described is Claude defaulting to what the user characterizes as generic, recognizable AI prose — the kind associated with earlier large language models — despite clear, negatively-framed directives such as "NEVER DO THIS."

The complaint surfaces a technically significant distinction between a model's apparent comprehension of instructions and its actual behavioral compliance with them. Claude demonstrably read the skill document — the user confirms it checks the skill during each session — yet continued producing outputs inconsistent with the stated constraints. This gap between instruction acknowledgment and instruction adherence points to a known challenge in large language model alignment: models trained on vast corpora of text develop strong stylistic priors that can resist override even when explicit system-level or context-level instructions attempt to suppress them. The user's metaphor of an "overbaked LORA" reflects a lay intuition about this phenomenon — that fine-tuning or instruction-tuning can calcify certain output patterns in ways that are difficult to surface-override at inference time.

The frustration also highlights a product design tension inherent in Anthropic's layered instruction architecture. Claude offers multiple tiers for user customization — preferences, project instructions, skills — which implicitly promises users meaningful behavioral control. When those tiers fail to produce the expected result, users experience not just a technical failure but a broken contract. The user's escalating experimentation across all available customization surfaces, each yielding the same outcome, suggests the failure is not one of misconfiguration but of the model's underlying compliance ceiling. If the system is architected to present customization as robust while the model's behavioral defaults remain dominant, the gap between expectation and output becomes a trust problem, not merely a usability one.

This critique connects to a broader and accelerating discourse around AI instruction-following reliability. As AI assistants are increasingly embedded in professional and creative workflows, users are demanding deterministic or near-deterministic behavioral compliance — not probabilistic adherence. The user explicitly ties their distrust of Claude's stylistic compliance to distrust of its reliability for higher-stakes tasks like code generation, articulating a reasonable inference: if a model cannot consistently suppress a surface-level stylistic habit that is explicitly flagged, confidence in its handling of complex, multi-constraint tasks is difficult to sustain. Competitors like Google's Gemini are invoked not on the basis of capability but on perceived instruction-following discipline, indicating that compliance and controllability are becoming primary competitive differentiators in the consumer AI market, potentially rivaling raw output quality as selection criteria.

Anthropic faces a product and research challenge that this post crystallizes with unusual clarity. The company has invested substantially in Constitutional AI and model-level alignment frameworks, but those frameworks are primarily oriented toward safety and ethical constraint rather than user-defined stylistic or behavioral customization. The result may be a model that is highly resistant to producing harmful content but comparably resistant to producing content that departs from its trained stylistic defaults — even when users explicitly request such departures. Closing this gap will likely require either advances in inference-time instruction weighting, more granular fine-tuning mechanisms exposed to end users, or architectural changes that give user-specified constraints genuine priority over learned priors, rather than treating them as soft suggestions competing against a heavily weighted default distribution.

Read original article →

Detailed Analysis

Don't Miss a Deploy