Nvidia will likely revive its ambition to compete with AWS only if its massive hardware profit margins are threatened, whether by competitors like AMD or by hyperscalers building their own chips. Only then would Nvidia move up the stack, beyond hardware sales, to capture value through an "inference as a service" business model.

Related Insights

Firms like OpenAI and Meta claim a compute shortage while also exploring selling compute capacity. This isn't a contradiction but a strategic evolution. They are buying all available supply to secure their own needs and then arbitraging the excess, effectively becoming smaller-scale cloud providers for AI.

By funding and backstopping CoreWeave, which exclusively uses its GPUs, NVIDIA establishes its hardware as the default for the AI cloud. This gives NVIDIA leverage over major customers like Microsoft and Amazon, who are developing their own chips. It makes switching to proprietary silicon more difficult, creating a competitive moat based on market structure, not just technology.

Nvidia's staggering revenue growth and 56% net profit margins are a direct cost to its largest customers (AWS, Google, OpenAI). This gives them an incentive to form a de facto alliance to develop and adopt alternative chips, commoditizing the accelerator market and reclaiming those profits.

While NVIDIA's CUDA software provides a powerful lock-in for AI training, its advantage is much weaker in the rapidly growing inference market. New platforms are demonstrating that developers can and will adopt alternative software stacks for deployment, challenging the notion of an insurmountable software moat.

While Nvidia dominates the AI training chip market, training represents only about 1% of the total compute workload. The other 99% is inference. Nvidia's risk is that chips from competitors and from its own customers will deliver cheaper, more efficient inference, bifurcating the market and eroding its monopoly.

Even if Google's TPU doesn't win significant market share, its existence as a viable alternative gives large customers like OpenAI critical leverage. The mere threat of switching to TPUs forces NVIDIA to offer more favorable terms, such as discounts or strategic equity investments, effectively capping its pricing power.

Nvidia retreated from building its own cloud service due to the difficulty and unreliability of its 'cloud of clouds' model, which leased competitor infrastructure. It has now pivoted to a less complex marketplace model, connecting customers to smaller cloud providers instead.

In a power-constrained world, data center economics are dominated by the revenue each watt can generate, not by the purchase price of the hardware. If a superior NVIDIA system produces multiples more revenue per watt, its hardware cost becomes almost irrelevant. A competitor's chip would be rejected even if it were free, because every watt it occupies carries a high opportunity cost.
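
The revenue-per-watt argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: every figure in it (the 1 MW power budget, the four-year horizon, the per-watt hardware cost and revenue numbers) is a hypothetical assumption, not data from the source, and operating costs are assumed to be identical for both chips because both draw the same fixed power budget.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    hardware_cost_per_watt: float  # upfront $ per watt of capacity (hypothetical)
    revenue_per_watt_year: float   # $ earned per watt per year (hypothetical)


def net_value(chip: Accelerator, power_budget_watts: float, years: float) -> float:
    """Revenue earned from a fixed power budget minus the upfront hardware cost."""
    revenue = chip.revenue_per_watt_year * power_budget_watts * years
    hardware = chip.hardware_cost_per_watt * power_budget_watts
    return revenue - hardware


# Hypothetical scenario: the site is limited to 1 MW for four years.
POWER_BUDGET_WATTS = 1_000_000
YEARS = 4

# Assumed figures: the incumbent is expensive but earns 2.5x more per watt.
incumbent = Accelerator("incumbent", hardware_cost_per_watt=30.0, revenue_per_watt_year=20.0)
# The rival's hardware is free, but each watt it uses earns less.
free_rival = Accelerator("free rival", hardware_cost_per_watt=0.0, revenue_per_watt_year=8.0)

for chip in (incumbent, free_rival):
    value = net_value(chip, POWER_BUDGET_WATTS, YEARS)
    print(f"{chip.name}: ${value:,.0f}")
```

Under these assumed numbers the paid-for incumbent still yields more net value than the free rival, because the scarce resource is power rather than capital. The conclusion flips if the revenue gap per watt is small enough, so the argument depends entirely on how large that gap really is.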

Major AI labs aren't just evaluating Google's TPUs for technical merit; they are using the mere threat of adopting a viable alternative to extract significant concessions from Nvidia. This strategic leverage forces Nvidia to offer better pricing, priority access, or other favorable terms to maintain its market dominance.

The narrative of endless demand for NVIDIA's high-end GPUs is flawed. Two forces will crack it: the shift of AI inference to on-device flash memory, which reduces reliance on the cloud, and Google's ability to give away its increasingly powerful Gemini AI for free, which undercuts the revenue models that fuel GPU demand.