We scan new podcasts and send you the top 5 insights daily.
The recent explosion in AI agent usage is a key driver behind the massive funding rounds for inference providers like Base10. Agents, which can be autonomous and perform complex tasks, "gobble up" significantly more compute resources and tokens than previous AI applications, directly boosting revenue for the companies that run the underlying models.
Contrary to the view that AI token intensity will drop after the initial coding boom, the move from simple queries to autonomous 'agentic' workflows will cause an order-of-magnitude (10x) increase in token usage per task. This applies across all knowledge-based jobs, ensuring sustained and explosive demand for compute.
Flat-rate AI plans are becoming economically unviable due to token-hungry agents. Companies like Google and Microsoft are pushing usage-based billing, forcing enterprises to confront the surprisingly high real cost of running models at scale, which was previously hidden by subsidized pricing experiments.
Initial AI market skepticism was based on a SaaS model of selling limited-value subscriptions ('seats'). The new reality is a utility model based on consumption ('tokens'). In an agentic era, a single user can drive thousands of dollars in token usage, creating a virtually uncapped revenue stream that justifies massive infrastructure investment.
Valuations of AI companies may be artificially low because they're based on the token demand for simple chatbots. The real, explosive growth comes from reasoning models, agents, and multimodal generation, creating a near-infinite demand for tokens that is not yet priced in.
Ben Thompson argues the shift from simple chatbots to AI agents creates an exponential, non-speculative demand for compute. Agents automate complex, multi-step tasks, driving constant usage that justifies the massive capex investments by hyperscalers. This suggests the current spending is based on real demand, not bubble-fueled speculation.
The next wave of AI compute demand won't be from generating more outputs, but from agents performing exponentially more data collection for a single task. For example, a financial model could trigger an agent to analyze vast datasets, like satellite imagery, multiplying token usage for one result.
The massive spike in demand for AI tokens is a direct result of the shift from users performing simple, assisted tasks to deploying autonomous agents. A single individual can now consume billions of tokens via agents running on their behalf, overwhelming the current supply of compute.
The next wave of AI adoption involves 'agentic' workflows, where AI performs complex tasks autonomously. This shift from simple queries to agentic use is expected to increase token consumption by approximately 10x per task. This will drive a massive explosion in compute demand across all knowledge-work industries, not just coding.
The success of personal AI assistants signals a massive shift in compute usage. While training models is resource-intensive, the next 10x in demand will come from widespread, continuous inference as millions of users run these agents. This effectively means consumers are buying fractions of datacenter GPUs like the GB200.
AI agents burn tokens at a much higher rate than anticipated. This unforeseen compute cost is the direct catalyst for labs like Anthropic and OpenAI killing popular products and overhauling their pricing structures.