Please ELI5: why does AI cost so much?

A post questions why artificial intelligence usage remains expensive after model training has been completed, with the author expressing doubt that inference computational costs alone justify ongoing pricing.

Detailed Analysis

The question of why AI remains expensive even after training is complete reflects a widespread misconception about where the costs of modern large language models actually accumulate. While training costs — running into the billions of dollars per frontier model — are significant and well-publicized, the ongoing expense of *inference*, the process by which a deployed model like Claude generates a response to each user query, represents a substantial and continuous financial burden. Every token a model like Claude Opus produces requires real-time computation across thousands of high-end GPU or TPU chips. Unlike serving a static webpage or streaming a pre-rendered video, AI inference demands active, power-intensive mathematical processing for every single interaction, and that processing does not get cheaper simply because the model finished training.

The hardware economics underlying inference are particularly stark. Cutting-edge chips from Nvidia, which dominate the AI compute market, are both scarce and extraordinarily expensive to operate at scale. Anthropic's heavier use cases — particularly agentic coding workflows through products like Claude Code — can consume compute resources that, at retail pricing, would cost thousands of dollars per heavy user per month. Reported figures suggest that some power users might represent $5,000 or more in monthly compute costs against a $200 subscription fee, with Anthropic absorbing the difference out of its investor capital as a deliberate growth strategy. The company's API pricing structure reflects this reality more transparently: Claude Opus 4.6 is priced at $5 per million input tokens and $25 per million output tokens, with additional charges for extended context windows and caching operations. These rates represent both the underlying hardware cost and a margin meant to eventually make the business sustainable.

Anthropic's response to the mounting pressure of these costs has been a structural shift in how it bills enterprise customers. According to reporting from The Information, the company has moved toward usage-based pricing for large accounts, ending the era of flat-rate subscriptions that effectively subsidized heavy compute consumption. This transition comes amid a broader compute shortage in the AI industry, where demand for GPU time consistently outpaces supply. The shift signals a maturation in how AI companies think about unit economics: the period of aggressive subsidization to acquire users is giving way to models where actual consumption is more directly tied to revenue. Critics have noted that claims of extreme per-user losses may be overstated, and that inference at scale can carry meaningful margins once infrastructure is optimized — but the direction of travel, toward more transparent cost-passing, appears clear.

The broader context situates these dynamics within a pivotal moment for the AI industry's financial model. For years, the gap between what users pay and what AI companies spend has been filled by venture capital and strategic investment — Anthropic has raised billions from investors including Google and Amazon, in part to fund this subsidization strategy. As frontier models grow more capable, they also grow more computationally demanding, which means the cost-per-interaction does not automatically decline even as chip performance improves. The tension between making AI accessible at consumer price points and sustaining the enormous infrastructure required to run it represents one of the central unsolved problems in the commercialization of generative AI. For lighter users engaging in short, simple queries, margins are relatively healthy; it is the power user segment — the developers, researchers, and enterprises running long-context, multi-step agentic tasks — where the economics remain most precarious and where pricing pressure is most acutely felt.

Read original article →

Detailed Analysis

Don't Miss a Deploy