Anthropic makes a move, and the inner monologue of AI is exposed. - 36氪

Detailed Analysis

Anthropic has taken a significant step toward AI transparency by exposing what amounts to the "inner monologue" of its Claude models — a development that has drawn considerable attention from the global technology press, including major Chinese tech outlet 36氪. The move most likely refers to Anthropic's extended thinking feature, introduced with Claude 3.7 Sonnet in early 2025, which makes the model's chain-of-thought reasoning visible to users before it produces a final response. Rather than presenting only polished outputs, Claude can now surface the intermediate deliberative steps it takes — weighing options, identifying ambiguities, and reconsidering earlier conclusions — in a way that resembles a written thought process.
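
As a rough illustration of what separating the "inner monologue" from the final answer can look like in practice, the sketch below assumes a response arrives as a list of content blocks tagged by type, with reasoning in `thinking` blocks and the answer in `text` blocks, which is how Anthropic's Messages API documentation describes extended thinking output. The helper name `split_reasoning` and the sample payload are hypothetical and hand-written for illustration; they are not real model output.

```python
from typing import Dict, List, Tuple


def split_reasoning(blocks: List[Dict[str, str]]) -> Tuple[str, str]:
    """Separate visible reasoning from the final answer.

    Assumes each block is a dict with a "type" key: "thinking" blocks carry
    the reasoning trace under "thinking", and "text" blocks carry the
    user-facing answer under "text". Any other block type is ignored.
    """
    reasoning_parts = []
    answer_parts = []
    for block in blocks:
        kind = block.get("type")
        if kind == "thinking":
            reasoning_parts.append(block.get("thinking", ""))
        elif kind == "text":
            answer_parts.append(block.get("text", ""))
    return "\n".join(reasoning_parts), "\n".join(answer_parts)


# Hand-written sample payload (an assumption, not captured from a real call).
sample_blocks = [
    {"type": "thinking", "thinking": "The question is ambiguous; consider both readings."},
    {"type": "text", "text": "Here is the answer under the more likely reading."},
]

reasoning, answer = split_reasoning(sample_blocks)
print("Reasoning trace:", reasoning)
print("Final answer:", answer)
```

Keeping the two streams apart lets an interface show the reasoning trace in a collapsible panel while still presenting the final answer on its own.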

The significance of this development extends well beyond a product feature. For years, the opacity of large language models has been one of the central criticisms leveled at the AI industry: systems make consequential decisions without any external visibility into how those decisions are reached. By exposing the reasoning trace, Anthropic is making a direct argument that transparency and interpretability are not merely academic concerns but practical design choices that can be implemented in production systems. This is consistent with Anthropic's stated mission around AI safety, which emphasizes understanding model behavior as a prerequisite for trusting and deploying it responsibly.

The coverage in 36氪 signals the global resonance of this shift, particularly in China, where competition in frontier AI development has intensified dramatically. Chinese technology companies and researchers have been closely tracking Western AI labs' technical and product decisions, and the concept of an AI "inner monologue" — a term that carries both technical and almost philosophical weight — has obvious appeal as a framing device. It raises immediate questions about authenticity, deception, and alignment: if a model's reasoning is visible, can users better detect when it is producing inconsistent or potentially misleading outputs?

This transparency push connects to a broader trend across the AI industry toward interpretability, the effort to understand not just what AI models produce, but why. Mechanistic interpretability, the strand of this research that studies a model's internal computations directly, is distinct from a visible reasoning trace, but both serve the same goal. Anthropic has been among the most active labs in publishing interpretability research, and the extended thinking feature can be understood as a user-facing counterpart to that internal research agenda. Competitors including OpenAI have introduced related reasoning features in their o-series models, although OpenAI shows users a summary of the reasoning rather than the raw chain of thought. This suggests that some form of visible reasoning is becoming a competitive and regulatory expectation rather than a differentiator. Regulators in the EU and elsewhere have increasingly demanded explainability from high-stakes AI systems, making technical moves like this both commercially and legally forward-looking.
