We scan new podcasts and send you the top 5 insights daily.
Emerging cloud providers (“NeoClouds”) are sticking exclusively with NVIDIA despite alternatives from AMD. The perceived performance risk is too high: customers demand state-of-the-art inference speed, and providers can't risk a multi-billion-dollar investment on a non-NVIDIA stack that might deliver lower throughput.
As chip manufacturers like NVIDIA release new hardware, inference providers like Base10 absorb the complexity and engineering effort required to optimize AI models for the new chips. This service is a key value proposition, saving customers from the challenging process of re-optimizing workloads for new hardware.
By funding and backstopping CoreWeave, which exclusively uses its GPUs, NVIDIA establishes its hardware as the default for the AI cloud. This gives NVIDIA leverage over major customers like Microsoft and Amazon, who are developing their own chips. It makes switching to proprietary silicon more difficult, creating a competitive moat based on market structure, not just technology.
The Rubin family of chips is sold as a complete "system as a rack," meaning customers can't just swap out old GPUs. This technical requirement creates a forced, expensive upgrade cycle for cloud providers, compelling them to invest heavily in entirely new rack systems to stay competitive.
New AI models are designed to perform well on available, dominant hardware like NVIDIA's GPUs. This creates a self-reinforcing cycle where the incumbent hardware dictates which model architectures succeed, making it difficult for superior but incompatible chip designs to gain traction.
OpenAI and Oracle canceled a major data center expansion because it wouldn't be ready before NVIDIA's next-generation "Vera Rubin" chips arrived. This reveals a key operational strategy: OpenAI wants to avoid mixing different GPU generations within its large-scale AI training campuses for maximum efficiency.
The competitive landscape for AI chips is not a crowded field but a battle between two primary forces: NVIDIA’s integrated system (hardware, software, networking) and Google's TPU. Other players like AMD and Broadcom are effectively a combined secondary challenger offering an open alternative.
NVIDIA will likely revive its ambitions to compete with AWS only if its massive hardware profit margins are threatened by competitors like AMD or by hyperscalers building their own chips. Only then would NVIDIA move up the stack to capture value through an "inference as a service" business model, going beyond hardware sales.
"NeoClouds," a new category of cloud provider, are built specifically for high-performance GPU workloads. Unlike traditional clouds like AWS, which were retrofitted from a CPU-centric architecture, NeoClouds offer superior performance for AI tasks by design and through direct collaboration with hardware vendors like NVIDIA.
OpenAI's deal structures highlight the market's perception of chip providers. NVIDIA's position was strong enough that OpenAI committed to its chips at full price, with NVIDIA investing directly in OpenAI (a premium). AMD, by contrast, had to offer OpenAI equity warrants to win its business (an effective discount), reflecting their relative negotiating power.
Newer AI cloud providers gain a performance advantage by building their infrastructure entirely on NVIDIA's integrated ecosystem, including specialized networking. Incumbent clouds often must patch their legacy, CPU-centric systems, creating inefficiencies that NeoClouds without technical debt can avoid.