While the cost per AI query drops, companies find more complex, compute-intensive uses for it. This elasticity of demand means total AI spending becomes a significant and variable operational expense, similar to a utility bill, rather than a predictable software cost.

Related Insights

Although the cost per token is falling as models become more efficient, that efficiency gain drives a large increase in new use cases and overall consumption. This economic principle, the Jevons paradox, explains why total enterprise spending on model inference is rising sharply even as the unit cost falls.

Flat-rate AI plans are becoming economically unviable due to token-hungry agents. Companies like Google and Microsoft are pushing usage-based billing, forcing enterprises to confront the surprisingly high real cost of running models at scale, which was previously hidden by subsidized pricing experiments.

Contrary to the belief that enterprises have unlimited budgets, they are focused on the ROI of their AI spend. As agentic workflows cause token bills to skyrocket, orchestration tools that intelligently route queries to the most cost-effective model for a given task are becoming essential infrastructure.
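The routing idea can be illustrated with a minimal sketch: pick the cheapest model that still meets the capability a task needs. The model names, capability tiers, and prices below are hypothetical placeholders, not real vendor figures, and a production orchestration tool would weigh far more than price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                       # hypothetical model label
    capability: int                 # hypothetical quality tier; higher is stronger
    usd_per_million_tokens: float   # hypothetical blended price


# Illustrative catalog only; these are assumptions, not real price points.
CATALOG = [
    Model("small", 1, 0.50),
    Model("medium", 2, 3.00),
    Model("frontier", 3, 15.00),
]


def route(required_capability: int, catalog=CATALOG) -> Model:
    """Return the cheapest model whose tier meets the task's requirement."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model meets the required capability")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


def cost_usd(model: Model, tokens: int) -> float:
    """Cost of sending `tokens` tokens to `model` at its per-million price."""
    return model.usd_per_million_tokens * tokens / 1_000_000
```

Under these assumed prices, sending a simple task to the "small" tier instead of the "frontier" tier cuts the per-token cost by a factor of 30, which is the kind of saving that makes routing attractive once agentic workflows multiply token volume.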

The end of subsidized AI pricing is forcing companies to confront its true operational expense. As AI bills begin to rival payroll, a fundamental transition is occurring where capital expenditure on silicon (CapEx) is displacing operational expenditure on human neurons (OpEx), reshaping corporate budgets.

The cost for a fixed level of AI capability (e.g., GPT-4 level) has dropped 100-1000x, yet overall enterprise spend is increasing. The reason is that applications now use frontier models with massive contexts and multi-step agentic workflows, which multiply token usage and drive up total costs.
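A rough back-of-the-envelope sketch shows how the multipliers can outweigh a 100x price drop. Every number here is an assumption chosen for illustration (the baseline price, the 20x context growth, the 10 agent steps, and the premise that the frontier model costs about what the old model did); only the 100x unit-cost drop comes from the text above.

```python
def total_spend_usd(usd_per_million_tokens: float,
                    tokens_per_task: int,
                    tasks: int) -> float:
    """Total spend = unit price x tokens per task x number of tasks."""
    return usd_per_million_tokens * tokens_per_task * tasks / 1_000_000


# Hypothetical baseline: an older model at an assumed $30 per million tokens.
BASE_PRICE = 30.0
BASE_TOKENS = 2_000
TASKS = 10_000

baseline = total_spend_usd(BASE_PRICE, BASE_TOKENS, TASKS)

# Same capability today, assuming the 100x unit-cost drop cited in the text.
same_capability = total_spend_usd(BASE_PRICE / 100, BASE_TOKENS, TASKS)

# Agentic frontier workload: assumed 20x larger context and 10 agent steps,
# on a frontier model assumed to cost about what the old model once did.
CONTEXT_MULTIPLIER = 20
AGENT_STEPS = 10
agentic = total_spend_usd(
    BASE_PRICE, BASE_TOKENS * CONTEXT_MULTIPLIER * AGENT_STEPS, TASKS
)

# Ratio of the agentic bill to the original bill under these assumptions.
spend_ratio = agentic / baseline
```

With these assumed inputs, the fixed-capability workload becomes 100x cheaper while the agentic workload costs 200x the original bill, because the token multipliers (20 x 10) apply to a model whose price did not fall.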

While the per-unit cost of using AI has plummeted, total enterprise spending has soared. This is a classic example of the Jevons paradox: efficiency gains and lower prices are unlocking entirely new use cases that were previously uneconomical, leading to a net increase in overall consumption and total expenditure.

Software has long commanded premium valuations due to near-zero marginal distribution costs. AI breaks this model. The significant, variable cost of inference means expenses scale with usage, fundamentally altering software's economic profile and forcing valuations down toward those of traditional industries.

While the cost to achieve a fixed capability level (e.g., GPT-4 at launch) has dropped over 100x, overall enterprise spending is increasing. This paradox is explained by powerful multipliers: demand for frontier models, longer reasoning chains, and multi-step agentic workflows that consume far more tokens per task.

While the cost for GPT-4 level intelligence has dropped over 100x, total enterprise AI spend is rising. This is driven by multipliers: using larger frontier models for harder tasks, reasoning-heavy workflows that consume more tokens, and complex, multi-turn agentic systems.

Goldman's CIO predicts that while unit cost per token will decrease, the explosion in token usage from agentic systems will make total AI compute a major corporate expense. He suggests it should be compared to personnel costs, not traditional IT spending.

Enterprise AI Costs Act Like Electricity, Rising with Use Despite Cheaper Queries | RiffOn