Claude Code Read tool silently downscales images

Claude Code's Read tool downscales images before the model processes them and does not warn users about the degradation. In one reported case, high-resolution retina screenshots containing clearly readable text produced confident but vague responses, revealing that the model had analyzed significantly downscaled versions of the originals. The tool's output looks the same as in normal operation, making it difficult to tell whether a poor answer stems from image downscaling or from the model's actual capability.

Detailed Analysis

Claude Code's Read tool silently downscales images before passing them to the underlying model, creating a transparency gap that leaves users unaware their visual inputs have been degraded. A Reddit user discovered this behavior while asking Claude Opus 4.7 to extract text from ten retina-resolution screenshots that were clearly legible on their monitor. The model returned confident-sounding but incomplete answers, and only after the user pressed further did the cause become apparent: the Read tool had resized the images before the model ever processed them, so the model and the user were looking at different versions of the same file. Critically, the tool gives no indication that downscaling has occurred: its output looks the same as a successful full-resolution analysis.

According to technical documentation and open GitHub issues, the downscaling behavior is an intentional architectural constraint rather than a bug. Claude Code resizes images client-side to comply with model resolution limits — Claude Opus 4.7 supports a maximum of 2,576 pixels on the long edge — and to prevent session crashes that can occur when processing large batches of high-resolution screenshots. Images exceeding approximately 2,000 pixels in either dimension are automatically resized before being sent to the API, with aspect ratios preserved and dimensions padded to multiples of 28 pixels. While this engineering decision protects session stability and avoids API rejections, it introduces a silent degradation layer that most users have no awareness of and no mechanism to detect.
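To make the resizing arithmetic concrete, here is a minimal Python sketch of the kind of computation described above. It is not Claude Code's actual implementation: the function names `fit_long_edge` and `pad_to_multiple` are hypothetical, the 2,576-pixel cap and the multiple of 28 are simply the figures quoted in this article, and the rounding behavior is an assumption.

```python
def fit_long_edge(width: int, height: int, max_long_edge: int = 2576) -> tuple[int, int]:
    """Scale dimensions so the longer side is at most max_long_edge.

    The aspect ratio is preserved. The default cap is the figure quoted
    in the article, not a value read from any real configuration.
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height  # already within the limit, nothing to do
    # Assumption: round to the nearest pixel, never below 1.
    new_width = max(1, round(width * max_long_edge / long_edge))
    new_height = max(1, round(height * max_long_edge / long_edge))
    return new_width, new_height


def pad_to_multiple(value: int, multiple: int = 28) -> int:
    """Round a dimension up to the next multiple (the padding step)."""
    return -(-value // multiple) * multiple


if __name__ == "__main__":
    resized = fit_long_edge(3840, 2160)
    print(resized)                       # resized dimensions
    print(pad_to_multiple(resized[1]))   # padded height
```

Under these assumptions, a 3840×2160 screenshot would shrink to 2576×1449, which keeps roughly 45% of the original pixels. Small text loses a corresponding share of its detail.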

The practical consequences are significant, particularly for use cases involving dense visual information such as retina screenshots, data visualizations, or documents with small text. When a model processes a downscaled image and nothing tells it or the user that the image was downscaled, it may generate plausible-sounding but inaccurate responses. This is a form of hallucination seeded not by model failure but by degraded input. The user's concern that they may have been receiving hallucinated answers across many prior interactions is well-founded: any workflow that relied on Claude Code's Read tool for precise text extraction from high-DPI images would have been systematically exposed to this degradation without any audit trail.
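One way a user could build their own audit step is to check image dimensions before handing a file to the tool. The sketch below is only an illustration: `png_dimensions` and `may_be_downscaled` are hypothetical helpers, the 2,000-pixel threshold is the approximate figure quoted in this article, and the check covers PNG files only (it reads the width and height stored in the IHDR chunk at the start of every PNG).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the first 24 bytes of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian unsigned 32-bit integers.
    return struct.unpack(">II", data[16:24])


def may_be_downscaled(data: bytes, threshold: int = 2000) -> bool:
    """Return True if either dimension exceeds the assumed threshold."""
    width, height = png_dimensions(data)
    return max(width, height) > threshold
```

Only the first 24 bytes are needed, so a call such as `may_be_downscaled(open("screenshot.png", "rb").read(24))` (with a placeholder file name) is enough to flag screenshots that may be resized before the model sees them.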

This issue reflects a broader challenge in agentic AI tooling around what might be called "silent transformation" — the gap between what a user believes they are submitting to a model and what the model actually receives. As AI coding assistants like Claude Code increasingly mediate between users and underlying model APIs, each layer in that stack introduces potential transformations that compound without user visibility. The absence of a transparency mechanism — even something as simple as a logged notice stating "image resized from 3840×2160 to 2576×1449 before analysis" — represents a design choice that prioritizes seamlessness over auditability, which is an especially problematic tradeoff in contexts where accuracy is the primary goal.
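A notice of the kind suggested above would take very little code to produce. The following Python sketch shows one possible shape. The function name `resize_notice` and its behavior are assumptions made for illustration, not an existing Claude Code feature.

```python
from typing import Optional, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def resize_notice(original: Size, resized: Size) -> Optional[str]:
    """Return a human-readable notice if the image was resized, else None."""
    if original == resized:
        return None  # nothing changed, so there is nothing to report
    (ow, oh), (rw, rh) = original, resized
    return f"image resized from {ow}×{oh} to {rw}×{rh} before analysis"


if __name__ == "__main__":
    print(resize_notice((3840, 2160), (2576, 1449)))
```

Emitting such a line alongside the tool result would give users the audit trail that is currently missing, at the cost of a slightly noisier output.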

The disclosure of this behavior underscores growing community pressure on AI tool developers to surface operational constraints that affect output quality. Anthropic's own vision API documentation recommends resizing images below model limits before submission for both latency and accuracy reasons, effectively acknowledging that downscaling degrades results — yet this guidance exists in developer-facing documentation rather than as an in-context warning within Claude Code itself. As multimodal AI workflows become more deeply embedded in professional and technical pipelines, the expectation that users will proactively consult API documentation before every screenshot analysis is increasingly untenable. The conversation signals demand for tools that make their own limitations legible in real time.
