Security - Claude Code Docs

Claude Docs · April 28, 2026
Claude Code builds security into its core through a permission-based architecture that requires explicit user approval for file edits, command execution, and other sensitive operations. Built-in protections include sandboxed bash commands with filesystem and network isolation, write access limited to the project directory, and defenses against prompt injection attacks. Privacy safeguards include limited data retention and restricted access to user data; for cloud execution, sessions run in isolated virtual machines with network controls and audit logging.

Detailed Analysis

Claude Code's security architecture centers on a permission-based design that prioritizes explicit human approval over autonomous action, with layered safeguards across the development workflow. Claude Code defaults to read-only permissions and asks for confirmation before editing files, running tests, or executing bash commands, so developers retain direct oversight of consequential operations. Write access is confined to the directory from which Claude Code is launched and its subdirectories, so files in parent directories cannot be modified without explicit permission. Additional protections include a sandboxed bash tool with filesystem and network isolation, a command blocklist that restricts web-fetching utilities like `curl` and `wget` by default, and fail-closed matching logic that routes unrecognized commands to manual approval rather than permitting them silently.
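The two checks described above, fail-closed command matching and write-scope confinement, can be illustrated with a minimal Python sketch. This is not Claude Code's implementation: the names `Decision`, `BLOCKED_PROGRAMS`, `ALLOWED_PATTERNS`, `decide_command`, and `is_write_allowed` are hypothetical, the rule syntax is invented for illustration, and treating blocklisted programs as an outright denial is an assumption.

```python
import shlex
from enum import Enum
from fnmatch import fnmatchcase
from pathlib import Path


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # route to manual approval


# Hypothetical rule sets; the real defaults and rule syntax are not given in the text.
BLOCKED_PROGRAMS = {"curl", "wget"}
ALLOWED_PATTERNS = ["git status", "git diff *", "ls *"]


def decide_command(command: str) -> Decision:
    """Fail-closed matching: anything not explicitly matched needs approval."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return Decision.ASK  # unparseable input is never silently allowed
    if not tokens:
        return Decision.ASK
    if tokens[0] in BLOCKED_PROGRAMS:
        return Decision.DENY  # assumption: blocklisted programs are denied outright
    normalized = " ".join(tokens)
    if any(fnmatchcase(normalized, pattern) for pattern in ALLOWED_PATTERNS):
        return Decision.ALLOW
    # Default branch: unrecognized commands fall through to manual approval.
    return Decision.ASK


def is_write_allowed(target: str, project_root: str) -> bool:
    """Allow writes only inside project_root or one of its subdirectories."""
    root = Path(project_root).resolve()
    resolved = (root / target).resolve()  # collapses ".." and absolute targets
    return resolved == root or root in resolved.parents
```

The important design choice is the final `return Decision.ASK`: a rule set with gaps degrades to more prompts, not to silent execution. This sketch does not inspect shell operators such as `;` or `&&`, so a real matcher would need additional checks before trusting an allow pattern.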

Prompt injection, a class of attack in which adversarial text embedded in external content attempts to redirect an AI system's behavior, receives specific architectural attention. Claude Code addresses it through context-aware analysis of full requests, input sanitization to block command injection vectors, and isolated context windows for web fetch operations, which keep potentially malicious content retrieved from the web out of the primary instruction context. Trust verification for first-time codebases and new Model Context Protocol (MCP) servers adds another checkpoint, though this verification is intentionally disabled in non-interactive pipeline executions, a tradeoff reflecting the different threat profile of automated workflows. The command injection detection layer also acts as a secondary review mechanism, flagging suspicious bash commands for manual approval even when they have previously been allowlisted.
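The secondary review described above, where a suspicious command is flagged even if it was previously allowlisted, might look roughly like the following Python sketch. The pattern list and the function name `needs_manual_approval` are hypothetical, and the heuristics Claude Code actually applies are not described in the text.

```python
import re

# Hypothetical heuristic: shell constructs that can chain or embed extra commands.
SUSPICIOUS = re.compile(r"[;&|`\n]|\$\(")


def needs_manual_approval(command: str, allowlisted: set[str]) -> bool:
    """Return True when the command must be shown to the user before running."""
    if SUSPICIOUS.search(command):
        # Flagged regardless of the allowlist: an earlier approval does not
        # carry over to a command that looks like an injection attempt.
        return True
    return command not in allowlisted
```

For example, with `allowlisted = {"npm test"}`, the plain command `npm test` runs without a prompt, while `npm test; cat ~/.ssh/id_rsa` is routed to manual approval because of the `;` operator. A heuristic this coarse will also flag benign pipelines, which is consistent with erring on the side of asking.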

The cloud execution environment introduces a separate security surface, which is addressed through infrastructure-level controls. Each web-based session runs in an isolated, Anthropic-managed virtual machine with configurable network access restrictions, audit logging for all operations, and automatic environment teardown when the session completes. Credentials are protected by a secure proxy that translates scoped sandbox credentials into actual GitHub authentication tokens, so raw credentials are never exposed inside the sandboxed environment. Remote Control sessions, by contrast, route execution back to the user's local machine, with data transiting the Anthropic API over TLS, reflecting a deliberate architectural distinction between cloud-native execution and locally anchored workflows.
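The credential-proxy idea can be sketched in Python as follows. This is an illustration under stated assumptions, not Anthropic's design: the class `CredentialProxy`, its methods, the handle format, and the per-repository scoping rule are all hypothetical. The only property taken from the text is that the sandbox holds a scoped credential while the real GitHub token stays outside it.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class CredentialProxy:
    """Runs outside the sandbox; the sandbox only ever sees scoped handles."""

    # Maps scoped handle -> (real token, repository the handle is valid for).
    _real_tokens: dict[str, tuple[str, str]] = field(default_factory=dict)

    def issue_scoped(self, real_token: str, repo: str) -> str:
        """Create an opaque handle that carries no part of the real token."""
        handle = "scoped_" + secrets.token_hex(8)
        self._real_tokens[handle] = (real_token, repo)
        return handle

    def authorize(self, handle: str, repo: str) -> dict[str, str]:
        """Translate a scoped handle into real auth headers, proxy-side only."""
        entry = self._real_tokens.get(handle)
        if entry is None or entry[1] != repo:
            raise PermissionError("scoped credential not valid for this repository")
        return {"Authorization": f"Bearer {entry[0]}"}

    def revoke(self, handle: str) -> None:
        """Invalidate a handle, for example when the session is torn down."""
        self._real_tokens.pop(handle, None)
```

In this sketch, code inside the sandbox would hold only the string returned by `issue_scoped`. If that string leaked, it would be useless outside the proxy and could be revoked without rotating the underlying token.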

The security model Claude Code embodies reflects a broader tension in agentic AI systems between operational autonomy and human oversight, a challenge that has become central to applied AI safety discussions as AI coding tools become embedded in professional software development pipelines. Anthropic's approach prioritizes what its own research describes as trust calibration and access boundaries, a framework designed to prevent autonomous goal pursuit beyond explicitly authorized scope. This contrasts with more permissive agentic architectures that front-load trust and minimize interruptions: Claude Code accepts higher confirmation overhead in exchange for a reduced risk of unintended or adversarially induced actions.

The MCP integration policy, which permits user-configured external server connections while explicitly disclaiming any Anthropic auditing or management of those servers, marks an inherent boundary of the security perimeter. Once third-party MCP servers enter the workflow, the security posture of the overall system depends on the trustworthiness of those external providers, a gap the documentation addresses through guidance rather than technical control. This reflects an industry-wide pattern in which AI platform vendors establish strong internal security guarantees while acknowledging that the composability of modern developer toolchains introduces trust dependencies that cannot be fully governed from a single point. As agentic coding assistants integrate more deeply into CI/CD pipelines, code review workflows, and scheduled automation routines, the boundaries Claude Code establishes, along with the gaps it explicitly acknowledges, will likely serve as a reference point for emerging industry norms around AI agent authorization.
