Catch security issues as Claude writes code - Claude Code Docs

Claude Docs · May 26, 2026
The security guidance plugin for Claude Code automatically detects and fixes vulnerabilities such as injection, unsafe deserialization, and unsafe DOM APIs in code changes while Claude writes, so that fewer security issues reach pull requests. The plugin operates through three review mechanisms: fast pattern matching on each file edit, background model review at turn completion, and deeper agentic review during commits or pushes. Installation requires Claude Code CLI 2.1.144 or later and Python 3.8, with optional custom security rules configurable through project settings.

Detailed Analysis

Anthropic has released a security guidance plugin for Claude Code that performs automated vulnerability detection across three distinct review layers as Claude writes and commits code. The plugin operates without user invocation, running automatically at the file-edit level through pattern matching, at the end of each conversational turn through a background model-based diff review, and at the commit or push stage through a deeper agentic review that reads surrounding code for broader context. The categories of vulnerabilities it targets include injection attacks, unsafe deserialization via tools like Python's `pickle`, DOM injection through APIs such as `innerHTML` and `dangerouslySetInnerHTML`, dynamic code execution calls, and risky GitHub Actions workflow file modifications. The plugin is installable via Anthropic's official plugin marketplace and can be scoped to individual users, specific repositories via checked-in settings, or entire organizations through managed configuration.

The plugin's architecture reflects a deliberate effort to address a well-documented weakness in AI-generated code: the model that produces the code is poorly positioned to audit it objectively. Anthropic addresses this through review independence. The per-edit layer is a deterministic string match that requires no model call and therefore adds no usage cost, while the end-of-turn and commit layers invoke separate model instances rather than asking the same Claude instance to evaluate its own work. This separation matters because self-review by generative models tends to reproduce the blind spots that introduced the vulnerabilities in the first place. The end-of-turn review is non-blocking: it runs in the background after Claude's reply, so it adds no latency to the developer workflow. The commit-stage agentic review applies deeper contextual reasoning to reduce false positives for patterns that look dangerous in isolation but may be handled safely elsewhere in the codebase.
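A deterministic per-edit check of the kind described above can be sketched in a few lines of Python. This is only an illustration of plain substring matching, not the plugin's actual rule set or code: the `RISKY_SUBSTRINGS` table, the `scan_edit` function, and the warning texts are all hypothetical names chosen for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table (substring -> warning). These entries are
# assumptions for illustration and are not the plugin's real rules.
RISKY_SUBSTRINGS: Dict[str, str] = {
    "pickle.loads(": "unsafe deserialization of untrusted data",
    "eval(": "dynamic code execution",
    ".innerHTML": "DOM injection via innerHTML",
    "dangerouslySetInnerHTML": "DOM injection via dangerouslySetInnerHTML",
}


def scan_edit(new_text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, matched_substring, warning) for every hit.

    The check is a pure string match: no parsing and no model call.
    """
    findings = []
    for line_number, line in enumerate(new_text.splitlines(), start=1):
        for needle, warning in RISKY_SUBSTRINGS.items():
            if needle in line:
                findings.append((line_number, needle, warning))
    return findings


if __name__ == "__main__":
    sample = "import pickle\nobj = pickle.loads(blob)\nel.innerHTML = user_input\n"
    for finding in scan_edit(sample):
        print(finding)
```

A match this simple also flags harmless code: `ast.literal_eval(s)` contains the substring `eval(` and would be reported. That is the kind of false positive a later, context-aware review stage is meant to filter out.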

The plugin is explicitly positioned as an upstream complement to existing code review and CI scanning tooling rather than a replacement. Anthropic frames the relationship as a funnel: the in-session plugin reduces the number of vulnerabilities that reach a pull request, a separate Code Review product catches issues at the PR stage, and CI scanners handle the final layer. This layered security posture mirrors established DevSecOps thinking, which shifts vulnerability detection as far left as possible in the development lifecycle to reduce remediation costs and the burden on human reviewers. The plugin's rate-limiting mechanisms, including caps on repeat warnings per file per session, a 30-file-per-turn diff limit, and a 20-commit-per-hour cap on agentic reviews, suggest Anthropic is managing compute costs while preventing the alert fatigue that could cause developers to dismiss findings.
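The two numeric limits mentioned above can be sketched as follows. The figures 30 and 20 come from the article. Everything else is an assumption made for illustration: the sliding one-hour window, the choice to keep the first 30 files, and the names `files_to_review` and `HourlyReviewBudget`. The article does not describe how the plugin enforces these caps.

```python
import time
from collections import deque
from typing import Deque, List, Optional

MAX_FILES_PER_TURN = 30     # from the article: 30-file-per-turn diff limit
MAX_REVIEWS_PER_HOUR = 20   # from the article: 20-commit-per-hour cap
WINDOW_SECONDS = 3600.0     # assumption: the cap uses a sliding one-hour window


def files_to_review(changed_files: List[str]) -> List[str]:
    """Keep at most MAX_FILES_PER_TURN files (assumed policy: the first ones)."""
    return changed_files[:MAX_FILES_PER_TURN]


class HourlyReviewBudget:
    """Sliding-window limiter allowing MAX_REVIEWS_PER_HOUR reviews per hour."""

    def __init__(self) -> None:
        self._timestamps: Deque[float] = deque()

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Record a review and return True, or return False if over budget."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the one-hour window.
        while self._timestamps and now - self._timestamps[0] >= WINDOW_SECONDS:
            self._timestamps.popleft()
        if len(self._timestamps) >= MAX_REVIEWS_PER_HOUR:
            return False
        self._timestamps.append(now)
        return True
```

Passing `now` explicitly makes the limiter easy to test without waiting on a real clock. A caller would check `try_acquire()` before starting a commit-stage review and skip the review when it returns `False`.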

The release reflects a broader industry shift in how AI coding tools are evaluated and deployed in professional software engineering. As Claude Code, GitHub Copilot, and similar assistants move from experimental to production use in enterprise environments, security has become a primary concern for engineering and security teams. Studies have reported that AI-generated code carries vulnerability rates comparable to, and in some cases exceeding, those of human-written code, particularly in categories such as injection and authentication bypass. Anthropic's response is to embed security review natively in the agentic coding loop rather than treat it as an external gate. That choice represents a maturation in product philosophy: it acknowledges that developers who use AI assistants for extended agentic tasks cannot be expected to manually audit every code change the model produces. The plugin's extensibility through custom `security-patterns.yaml` rules further signals that Anthropic is targeting teams with domain-specific security requirements that need to enforce internal coding standards beyond the built-in checks.
