
Configure auto mode - Claude Code Docs

Claude Docs · April 22, 2026
Auto mode allows Claude Code to run without permission prompts by routing tool calls through a classifier that blocks actions that are irreversible, destructive, or directed outside the trusted environment. Organizations configure this classifier through the `autoMode` settings block, specifying trusted repositories, cloud buckets, and domains so routine internal operations are not blocked. The configuration can be defined across multiple scopes and includes fields for defining trusted infrastructure (`environment`), blocking rules (`soft_deny`), and exceptions to those rules (`allow`).

Detailed Analysis

Claude Code's auto mode, launched as a research preview on March 24, 2026, introduces a classifier-driven permission system that allows the tool to execute actions without interrupting developers for approval on every step. Rather than halting execution to prompt users, the system routes each tool call through an AI classifier that evaluates whether the action is irreversible, destructive, or directed outside the trusted environment. The classifier is notably "reasoning-blind," meaning it evaluates only the user's messages and tool calls themselves — not Claude's internal chain of thought — and escalates to human prompts only when repeated blocks occur. Currently available on Team plans with Enterprise and API availability forthcoming, auto mode represents Anthropic's attempt to make agentic coding workflows genuinely hands-free while preserving a meaningful safety boundary.
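The "reasoning-blind" property described above can be pictured as a filter over the session transcript. The following is a minimal sketch, not the actual implementation: the event kinds (`user_message`, `assistant_reasoning`, `tool_call`) and the helper name `classifier_view` are hypothetical labels chosen for illustration, since the real transcript format is not described here.

```python
from typing import Dict, List

# Hypothetical event kinds; the real transcript format is an assumption.
VISIBLE_TO_CLASSIFIER = {"user_message", "tool_call"}


def classifier_view(transcript: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return only the events a reasoning-blind classifier would evaluate."""
    return [event for event in transcript if event["kind"] in VISIBLE_TO_CLASSIFIER]


transcript = [
    {"kind": "user_message", "content": "Clean up the old build artifacts."},
    {"kind": "assistant_reasoning", "content": "Deleting the whole directory is fastest."},
    {"kind": "tool_call", "content": "rm -rf ./build"},
]

# The reasoning event is dropped; the user message and tool call remain.
view = classifier_view(transcript)
print([event["kind"] for event in view])
```

Under this model, nothing Claude writes in its chain of thought can talk the classifier into approving an action, because that text never reaches it.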

The central configuration mechanism is the `autoMode.environment` field, which functions as the classifier's definition of what constitutes "internal" versus "external" infrastructure. Out of the box, the classifier trusts only the active working directory and the current repository's configured remotes, meaning common developer actions like pushing to a company GitHub organization or writing to a shared cloud storage bucket are blocked by default. Operators extend this by providing natural-language prose descriptions of trusted infrastructure — source control organizations, cloud buckets, internal domains, and key services like CI systems and artifact registries — rather than regex patterns or formal access control lists. This design choice, favoring human-readable descriptions over machine syntax, reflects an architectural bet that a language model classifier can interpret intent from descriptive text more reliably than it can be exhaustively configured through pattern matching.
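To make the prose-description approach concrete, here is a minimal sketch that builds an `autoMode.environment` block and prints it as JSON. It assumes the field is a list of free-text strings, which is how the descriptions above suggest it works. Every organization name, bucket, and domain below is a hypothetical placeholder.

```python
import json

# All entries are hypothetical placeholders. The list-of-strings shape is an
# assumption based on the "natural-language prose descriptions" described above.
settings = {
    "autoMode": {
        "environment": [
            "Source control: github.example.com/acme-corp and all repositories under it",
            "Trusted cloud buckets: s3://acme-build-artifacts",
            "Trusted internal domains: *.corp.example.com",
            "Key internal services: CI at ci.example.com, artifact registry at artifacts.example.com",
        ]
    }
}

# Render the block as it might appear in a settings file.
rendered = json.dumps(settings, indent=2)
print(rendered)
```

Each entry reads like a sentence a colleague could understand, which is the point: the classifier interprets the description rather than matching a pattern against it.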

The configuration scope hierarchy reveals deliberate security architecture. The classifier reads `autoMode` settings from personal user settings, per-project local settings (which are gitignored), organization-managed settings, and inline flags. Critically, it does not read `autoMode` from shared project settings in `.claude/settings.json`. This prevents a checked-in repository from injecting its own allow rules, closing a supply-chain attack vector in which a malicious or compromised repo could silently expand the classifier's trust surface. Entries across scopes are additive: developers can extend the environment list and add allow rules, but cannot remove entries pushed down through managed settings. Notably, a developer-added `allow` entry can override an organization-level `soft_deny`, so `soft_deny` is not a hard policy boundary. This is a design trade-off that prioritizes flexibility for legitimate use cases over strict hierarchical enforcement.
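The additive behavior across scopes can be sketched as a merge that unions entries and skips the shared project scope. This is an illustrative model only: the scope labels (`managed`, `user`, `local`, `inline`, `project`) and the function `merge_auto_mode` are hypothetical names, and treating all three fields as additive is an assumption.

```python
from typing import Dict, List

AutoMode = Dict[str, List[str]]

# Illustrative scope labels. "project" (shared, checked-in settings) is
# deliberately absent, so a repository cannot inject its own rules.
READ_SCOPES = ("managed", "user", "local", "inline")
FIELDS = ("environment", "allow", "soft_deny")


def merge_auto_mode(scopes: Dict[str, AutoMode]) -> AutoMode:
    """Union entries from every scope that is read; nothing is ever removed."""
    merged: AutoMode = {field: [] for field in FIELDS}
    for scope in READ_SCOPES:
        block = scopes.get(scope, {})
        for field in FIELDS:
            for entry in block.get(field, []):
                if entry not in merged[field]:
                    merged[field].append(entry)
    return merged


scopes = {
    "managed": {"environment": ["Source control: github.example.com/acme-corp"]},
    "local": {"environment": ["Trusted cloud buckets: s3://acme-build-artifacts"]},
    # Checked-in project settings: ignored by the merge above.
    "project": {"allow": ["Force pushing to any branch is fine"]},
}

merged = merge_auto_mode(scopes)
print(merged)
```

The managed entry survives alongside the developer's local addition, while the `allow` rule from the checked-in project scope never reaches the merged result.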

The distinction between `autoMode.soft_deny`, `autoMode.allow`, and `permissions.deny` reflects a layered defense model. Soft deny rules shape the classifier's built-in judgment about what to block, while allow rules act as exceptions within the classifier's decision process. Hard blocks, however, bypass the classifier entirely: `permissions.deny` runs before any classifier evaluation and cannot be overridden by allow rules, providing a guaranteed floor for high-stakes restrictions. This architecture parallels established security design principles: a probabilistic AI layer for nuanced judgment, backed by deterministic rules for non-negotiable constraints. It is also significant that CLAUDE.md content reaches the classifier as well as Claude itself: a project-level instruction like "never force push" steers both the model's behavior and the autonomous permission system from a single source of truth.
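The ordering of the three layers can be illustrated with a toy decision function. This is a sketch under strong assumptions: the real classifier is a language model interpreting prose rules, whereas the hypothetical `contains` matcher below uses plain substring checks, and the real `permissions.deny` rule syntax is not reproduced. The classifier's built-in judgment is also omitted. Only the evaluation order is meant to carry over.

```python
from typing import Callable, List

Matcher = Callable[[str, str], bool]


def contains(rule: str, action: str) -> bool:
    # Toy stand-in for rule matching; the real classifier is not a substring check.
    return rule.lower() in action.lower()


def decide(
    action: str,
    hard_deny: List[str],
    soft_deny: List[str],
    allow: List[str],
    matches: Matcher = contains,
) -> str:
    # 1. permissions.deny runs before the classifier; allow cannot override it.
    if any(matches(rule, action) for rule in hard_deny):
        return "blocked (permissions.deny)"
    # 2. Within the classifier step, an allow exception wins over a soft_deny.
    if any(matches(rule, action) for rule in allow):
        return "allowed"
    if any(matches(rule, action) for rule in soft_deny):
        return "blocked (classifier)"
    return "allowed"


hard_deny = ["rm -rf /"]
soft_deny = ["git push --force"]
allow = ["git push --force origin scratch"]

main_push = decide("git push --force origin main", hard_deny, soft_deny, allow)
scratch_push = decide("git push --force origin scratch", hard_deny, soft_deny, allow)
# Even an allow rule naming the same command cannot lift a hard deny.
wipe = decide("rm -rf / --no-preserve-root", hard_deny, soft_deny, ["rm -rf /"])
print(main_push, scratch_push, wipe, sep="\n")
```

The third call shows the guaranteed floor: the action is rejected before any allow rule is consulted.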

The broader significance of auto mode's configuration system lies in what it reveals about the evolving relationship between AI coding agents and organizational trust models. Anthropic is effectively asking enterprises to formalize and articulate their infrastructure boundaries in natural language, an exercise that most organizations have never carried out explicitly. The recommended rollout strategy (start with source control and key services, then add domains and buckets as blocks surface) mirrors how organizations incrementally define firewall rules or IAM policies, but translated into a form a language model can operationalize. As AI agents take on longer-horizon tasks with more tool calls, the ability to configure trusted zones without requiring per-action human approval becomes a foundational capability rather than a convenience feature, and the design choices made in Claude Code's auto mode will likely influence how the broader industry approaches agentic permission systems.
