AI workloads, particularly for research and evals, don't follow predictable "follow-the-sun" patterns. They are extremely spiky, demanding massive compute resources instantly (e.g., 100,000 CPUs) and then dropping to zero. This forces providers like Daytona to maintain low mean utilization (15%) to handle unpredictable peaks.
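To make the utilization arithmetic concrete, here is a minimal sketch of how provisioning for a short burst drags down mean utilization. The demand trace is entirely made up for illustration (it is not Daytona data), and the helper name `mean_utilization` is hypothetical; the numbers were chosen only so the result lands near the 15% figure quoted above.

```python
from statistics import mean


def mean_utilization(demand, capacity):
    """Return the average fraction of provisioned capacity that is in use."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return mean(demand) / capacity


# Hypothetical hourly demand in CPU cores over one day (assumed values):
# idle for 20 hours, a 3-hour burst at 100,000 cores, then a partial tail.
demand = [0] * 20 + [100_000] * 3 + [60_000]

# Assumption: the provider sizes capacity to the peak, with no extra headroom.
capacity = max(demand)

util = mean_utilization(demand, capacity)
print(f"mean utilization: {util:.0%}")
```

Under these assumed numbers the fleet is busy for only about four hours a day, so capacity sized for the peak sits idle most of the time.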
The industry is fixated on the GPU shortage, but the proliferation of AI agents will create massive demand for general-purpose compute, leading to a CPU bottleneck. As millions of agents perform tasks, the availability of CPU cores—not just specialized processors—will become the primary constraint on growth for compute providers.
Unlike human-driven growth, which is limited by population and waking hours, AI agents can operate, replicate, and call each other endlessly. This creates a potentially infinite demand for compute infrastructure, far exceeding previous models and leading to massive, unpredictable strains on providers.
The intense computational demand and latency of AI models are compelling enterprises to use multiple cloud providers. Rather than staying loyal to a single vendor, companies now prioritize performance, switching between clouds like AWS and Azure to find the fastest available capacity for their AI workloads. This shift is reshaping the cloud market.
While GPUs train models, CPUs are essential for two key workloads: running reinforcement learning environments and executing the code generated by AI. This has created a massive, often overlooked demand spike, making CPUs a critical, sold-out component in the AI infrastructure stack and a hidden bottleneck.
The focus on GPUs for AI overlooks a critical bottleneck: a growing CPU shortage. AI agents rely heavily on CPUs for orchestration tasks like tool calls, database queries, and web searches. This hidden demand is causing hyperscalers to lock in multi-year CPU supply contracts.
The AI boom has created such desperation for power that hyperscalers now prioritize immediate availability ('time to power') above all else. Cost has become a secondary concern, and sustainability, once a key objective, has fallen far lower on the priority list.
A speaker theorizes that increased cloud outages are not random. Cloud providers, rushing to buy GPUs for AI, have underinvested in refreshing their general-purpose CPU infrastructure. With CPUs now hitting their 5-year end-of-life and new AI-related CPU demand rising, the system is becoming strained and unstable.
After the current memory crunch, the next AI infrastructure bottleneck will be CPU and networking. The complex orchestration required for emerging agentic AI systems will strain these resources, a trend already visible at companies like Fastly, which are seeing demand spikes just for workload orchestration.
While GPUs get the headlines, AI expert Tae Kim warns of a major coming CPU shortage. The complex orchestration, tool calls, and database queries required by AI agents are creating huge demand for CPU cores, a trend confirmed by major chipmakers and hyperscalers.
A major paradox exists in AI development: companies are desperate for scarce GPUs, yet often fail to use them efficiently. Even well-funded labs like xAI report model FLOPs utilization (MFU) as low as 11%, far below the 40% practical target, due to inconsistent workloads and data transfer bottlenecks.
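For readers unfamiliar with the metric, model FLOPs utilization is the ratio of the FLOPs a training run actually achieves to the hardware's theoretical peak. The sketch below uses the common rule of thumb of roughly 6 FLOPs per parameter per token for dense transformer training, which is an assumption here and ignores attention and other overheads. The function name and every input value are hypothetical and do not describe any lab's real setup.

```python
def model_flops_utilization(params, tokens_per_second, num_gpus, peak_flops_per_gpu):
    """Estimate MFU as achieved training FLOPs/s divided by peak hardware FLOPs/s.

    Assumes ~6 FLOPs per parameter per token (forward plus backward pass),
    a rough approximation for dense transformer training.
    """
    if num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("num_gpus and peak_flops_per_gpu must be positive")
    achieved_flops_per_second = 6 * params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run (all values assumed for illustration only).
mfu = model_flops_utilization(
    params=70e9,               # 70B-parameter model
    tokens_per_second=2.0e5,   # measured cluster-wide training throughput
    num_gpus=1024,
    peak_flops_per_gpu=1e15,   # assumed vendor peak of 1 PFLOP/s per GPU
)
print(f"estimated MFU: {mfu:.1%}")
```

Comparing an estimate like this against a target such as the 40% mentioned above shows how much purchased GPU capacity is effectively going unused.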