While the cost of GPT-4-level intelligence has dropped by more than 100x, total enterprise AI spend is rising. This is driven by multipliers: using larger frontier models for harder tasks, reasoning-heavy workflows that consume more tokens, and complex, multi-turn agentic systems.
For the first time in years, the perceived leap in LLM capabilities has slowed. While models have improved, the cost increase (from $20 to $200/month for top-tier access) is not matched by a proportional increase in practical utility, suggesting a potential plateau or diminishing returns.
AI companies operate under the assumption that LLM prices will trend towards zero. This strategic bet means they intentionally de-prioritize heavy investment in cost optimization today, focusing instead on capturing the market and building features, confident that future, cheaper models will solve their margin problems for them.
The cost for a given level of AI capability has decreased by a factor of 100 in just one year. This radical deflation in the price of intelligence requires a complete rethinking of business models and future strategies, as intelligence becomes an abundant, cheap commodity.
The cost of a fixed level of AI capability (e.g., GPT-4 level) has dropped 100x to 1,000x, yet overall enterprise spend is increasing. The reason is that applications now use frontier models with massive contexts and multi-step agentic workflows, which multiply token usage enough to drive up total costs.
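The multiplier argument can be made concrete with a rough back-of-the-envelope sketch. Every number below is a hypothetical assumption chosen only to illustrate the arithmetic; `spend_ratio` and its parameters are illustrative names, not figures or terms from any vendor.

```python
# Illustrative sketch: all inputs are hypothetical assumptions.

def spend_ratio(price_drop: float, frontier_premium: float,
                context_multiplier: float, steps_multiplier: float) -> float:
    """Return new spend per task divided by old spend per task.

    price_drop:         how many times cheaper a fixed capability level became
    frontier_premium:   how many times pricier the frontier model is per token
                        than the cheapest model at the old capability level
    context_multiplier: growth in tokens per call from larger contexts
    steps_multiplier:   growth in calls per task from multi-step agents
    """
    return frontier_premium * context_multiplier * steps_multiplier / price_drop


# Assumed scenario: 100x cheaper baseline, 30x frontier premium,
# 20x larger contexts, 10 agent steps per task.
ratio = spend_ratio(price_drop=100, frontier_premium=30,
                    context_multiplier=20, steps_multiplier=10)
print(f"Spend per task changes by a factor of {ratio:g}")
```

Under these assumed inputs, spend per task rises even though the unit price fell 100x, because the three multipliers together (6,000x) outweigh the price drop.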
While the per-unit cost of using AI has plummeted, total enterprise spending has soared. This is a classic example of the Jevons paradox: efficiency gains and lower prices are unlocking entirely new use cases that were previously uneconomical, leading to a net increase in overall consumption and total expenditure.
The cost of AI, priced in "tokens by the drink," is falling dramatically. All inputs are on a downward cost curve, leading to a hyper-deflationary effect on the price of intelligence. This, in turn, fuels massive demand elasticity as more use cases become economically viable.
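One way to see how falling prices can raise total spend is a constant-elasticity demand model. This is a simplified textbook sketch, not a model of any real market; the elasticity values and the function name `spend_change` are assumptions made for illustration.

```python
# Illustrative sketch: constant-elasticity demand with assumed elasticities.

def spend_change(price_ratio: float, elasticity: float) -> float:
    """Return new total spend divided by old total spend.

    price_ratio: new price / old price (0.01 means a 100x price drop)
    elasticity:  assumed price elasticity of demand, as a positive number;
                 quantity demanded scales as price_ratio ** -elasticity
    """
    quantity_ratio = price_ratio ** (-elasticity)
    return price_ratio * quantity_ratio


for e in (0.5, 1.0, 1.5):
    # Spend falls when e < 1, is flat at e = 1, and rises when e > 1.
    print(f"elasticity={e}: spend changes by {spend_change(0.01, e):.2f}x")
```

In this model total spend grows only when elasticity exceeds 1, which is the condition the "newly viable use cases" argument implicitly relies on.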
A massive budget shift is underway in which some companies spend far more on AI agents than on foundational software like CRM. One small team spends $500k annually on AI agents versus just $10k on Salesforce, a 50x difference that signals a tectonic shift in software value and spending priorities.
Even for complex, multi-hour tasks requiring millions of tokens, current AI agents are at least an order of magnitude cheaper than paying a human with relevant expertise. This significant cost advantage suggests that economic viability will not be a near-term bottleneck for deploying AI on increasingly sophisticated tasks.
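The comparison can be sketched with simple arithmetic. The token count, token price, hours, and hourly rate below are all hypothetical placeholders, not measured values; substitute real figures before drawing conclusions.

```python
# Illustrative sketch: every input is a hypothetical assumption.

def agent_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of an agent run that consumes `tokens` tokens."""
    return tokens * usd_per_million_tokens / 1_000_000


def human_cost(hours: float, usd_per_hour: float) -> float:
    """Cost in USD of an expert spending `hours` on the same task."""
    return hours * usd_per_hour


# Assumed task: 5 million tokens at a blended $10 per million tokens,
# versus 4 hours of expert time at $150 per hour.
agent = agent_cost(tokens=5_000_000, usd_per_million_tokens=10)
human = human_cost(hours=4, usd_per_hour=150)
print(f"agent ${agent:.0f} vs human ${human:.0f}: {human / agent:.0f}x cheaper")
```

With these assumed inputs the agent comes out roughly an order of magnitude cheaper; the gap narrows quickly if a task needs many retries or human review.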
AI's computational needs do not come from initial training alone. They compound exponentially due to post-training (reinforcement learning) and inference (multi-step reasoning), creating a much larger demand profile than previously understood and driving a billion-fold increase in compute.
Countering the narrative of insurmountable training costs, Jensen Huang argues that architectural, algorithmic, and computing stack innovations are driving down AI costs far faster than Moore's Law. He predicts a billion-fold cost reduction for token generation within a decade.
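To put the billion-fold figure in perspective, it can be converted to an implied yearly rate and compared with a Moore's Law pace. The sketch below assumes "a decade" means exactly 10 years and takes Moore's Law as a doubling every two years, which is one common reading; `annual_factor` is an illustrative helper name.

```python
# Illustrative sketch: assumes a 10-year horizon and a 2-year doubling period.

def annual_factor(total_factor: float, years: float) -> float:
    """Constant yearly improvement factor that compounds to total_factor."""
    return total_factor ** (1.0 / years)


YEARS = 10
billion_fold_per_year = annual_factor(1e9, YEARS)  # implied yearly cost reduction
moore_total = 2 ** (YEARS / 2)                     # doubling every 2 years

print(f"Billion-fold in {YEARS} years: about {billion_fold_per_year:.1f}x per year")
print(f"Moore's Law pace over {YEARS} years: {moore_total:.0f}x in total")
```

Under these assumptions the prediction implies a cost reduction of roughly 8x every year, compared with about 32x over the whole decade at a Moore's Law pace.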