Contrary to the view that AI token intensity will drop after the initial coding boom, the move from simple queries to autonomous 'agentic' workflows will cause an order-of-magnitude (10x) increase in token usage per task. This applies across all knowledge-based jobs, ensuring sustained and explosive demand for compute.
The shift from simple chatbots (one user request, one API call) to agentic AI systems will decouple inference requests from direct user actions. A single user request could trigger hundreds or thousands of automated model calls, leading to an exponential increase in compute demand and cost.
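The fan-out described here can be put in numbers. The snippet below is a minimal sketch, not a measurement: the function name `model_calls` and the figures (1,000 user requests, 300 calls per agentic request) are illustrative assumptions chosen to fall within the "hundreds or thousands" range mentioned above.

```python
def model_calls(user_requests: int, calls_per_request: int) -> int:
    """Total model calls when each user request fans out into several calls."""
    return user_requests * calls_per_request


# Hypothetical workload: the same 1,000 user requests in both scenarios.
chatbot_calls = model_calls(1_000, 1)    # chatbot: one request, one API call
agent_calls = model_calls(1_000, 300)    # agent: assumed 300 automated calls per request

print(f"chatbot: {chatbot_calls} calls")
print(f"agent:   {agent_calls} calls")
print(f"fan-out multiplier: {agent_calls // chatbot_calls}x")
```

User activity is identical in both cases; only the number of calls behind each request changes, which is what decouples inference volume from direct user actions.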
The cost of a fixed level of AI capability (e.g., GPT-4 level) has dropped 100-1000x, yet overall enterprise spend is increasing. The paradox resolves once usage is counted: applications now run frontier models with massive contexts and multi-step agentic workflows, and these multipliers on token usage drive up total costs.
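The arithmetic behind this paradox can be sketched as spend = tokens per task × tasks × price per token. Every number below is a hypothetical assumption picked for illustration (prices per million tokens, a 20x context multiplier, a 50-step workflow); none comes from the source or from any vendor's price list.

```python
def total_spend(tokens_per_task: int, tasks: int, price_per_million: float) -> float:
    """Total spend in dollars for a workload, given a price per million tokens."""
    return tokens_per_task * tasks * price_per_million / 1_000_000


TASKS = 100_000        # assumed number of tasks per month
BASE_TOKENS = 2_000    # assumed tokens for one simple query

# Launch-era price for a fixed capability level (assumed $50 per million tokens).
launch = total_spend(BASE_TOKENS, TASKS, 50.0)

# The same capability after an assumed 100x price drop, with usage unchanged.
same_capability = total_spend(BASE_TOKENS, TASKS, 0.5)

# Frontier model at an assumed $10 per million tokens, with 20x larger contexts
# and 50 agentic steps per task (both multipliers are assumptions).
agentic_tokens = BASE_TOKENS * 20 * 50
frontier_agentic = total_spend(agentic_tokens, TASKS, 10.0)

print(f"launch-era spend:        ${launch:,.0f}")
print(f"same capability today:   ${same_capability:,.0f}")
print(f"frontier agentic today:  ${frontier_agentic:,.0f}")
```

Under these assumptions the fixed-capability bill falls 100x while the frontier agentic bill is 200x the launch-era figure, because the token multipliers (20 × 50 = 1,000x) outweigh the lower unit price.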
While growth in new consumer AI users is flattening along an S-curve, compute consumption per user is still growing exponentially. This is driven by the shift from simple queries to complex, token-intensive tasks like reasoning and agents, sustaining massive demand for GPU infrastructure.
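A toy model shows how total demand can keep climbing after user growth flattens. This is a sketch under stated assumptions: users follow a logistic (S-curve) function, per-user tokens double each period, and all parameters (`cap`, `rate`, `midpoint`, `base`, `growth`) are hypothetical values, not fitted to real data.

```python
import math


def users(t: float, cap: float = 1_000_000, rate: float = 1.0, midpoint: float = 3.0) -> float:
    """Logistic (S-curve) user count: fast growth early, saturating near `cap`."""
    return cap / (1 + math.exp(-rate * (t - midpoint)))


def tokens_per_user(t: float, base: float = 1_000.0, growth: float = 2.0) -> float:
    """Per-user token consumption, assumed to multiply by `growth` each period."""
    return base * growth ** t


def total_tokens(t: float) -> float:
    """Aggregate token demand: user count times per-user consumption."""
    return users(t) * tokens_per_user(t)


for period in range(0, 9, 2):
    print(period, round(users(period)), round(total_tokens(period)))
```

In this model the user count barely moves in later periods, yet total tokens still more than double from one period to the next, because the per-user term dominates once adoption saturates.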
Ben Thompson argues the shift from simple chatbots to AI agents creates an exponential, non-speculative demand for compute. Agents automate complex, multi-step tasks, driving constant usage that justifies the massive capex investments by hyperscalers. This suggests the current spending is based on real demand, not bubble-fueled speculation.
The next wave of AI compute demand will come not from generating more outputs but from agents performing exponentially more data collection for a single task. For example, a financial model could trigger an agent to analyze vast datasets, like satellite imagery, multiplying token usage for one result.
While the cost to achieve a fixed capability level (e.g., GPT-4 at launch) has dropped over 100x, overall enterprise spending is increasing. This paradox is explained by powerful multipliers: demand for frontier models, longer reasoning chains, and multi-step agentic workflows that consume exponentially more tokens.
While user growth for apps like ChatGPT is slowing, per-user token consumption is skyrocketing as models shift from simple queries to complex reasoning and AI agents. This creates a hidden, exponential growth in compute demand, validating Oracle's massive infrastructure investment even as front-end adoption matures.
The largest driver of future energy consumption for AI won't be human-initiated queries on chatbots. Instead, it will be the massive, continuous "machine-to-machine" traffic generated by autonomous AI agents performing tasks, which will ultimately swamp human-AI interaction and create a runaway demand for compute power.
The transition from chatbots to autonomous 'agentic' AI represents a fundamental step-change. These agents, which execute complex tasks independently, have already increased the demand for computational power by 1000x, creating a massive, ongoing need for new infrastructure and hardware.
Goldman's CIO predicts that while unit cost per token will decrease, the explosion in token usage from agentic systems will make total AI compute a major corporate expense. He suggests it should be compared to personnel costs, not traditional IT spending.