For chip founders leaving labs like Google, a primary risk is the "trust boundary": once outside, they lose visibility into next-gen model architectures, which is critical for systems co-design. The danger is spending two years taping out a chip that, by the time it reaches the market, is already obsolete for the models then in development.

Related Insights

The 2017 "Attention Is All You Need" paper, written by eight Google researchers, laid the groundwork for modern LLMs. In a striking example of the innovator's dilemma, all eight authors have since left Google to start or join other AI companies, a massive failure to retain pivotal talent at a critical juncture.

By outsourcing core AI models to Google, Apple saves on R&D but loses deep expertise in the technology that will define future devices. This dependency could hinder its ability to create tightly integrated, next-generation hardware, which has historically been its primary competitive advantage.

New AI models are designed to perform well on available, dominant hardware like NVIDIA's GPUs. This creates a self-reinforcing cycle where the incumbent hardware dictates which model architectures succeed, making it difficult for superior but incompatible chip designs to gain traction.

Startups can make big bets on emerging workloads, such as LLMs before they were proven, and accept the product risk that comes with them. In contrast, incumbents like Google or NVIDIA must ensure their next chip serves a wide range of existing customers, forcing them to be more conservative and avoid disruptive product bets.

For companies like NVIDIA or Google, moving from TSMC to Intel or Samsung is not a simple supplier switch. It necessitates a complete redesign of the chip's architecture to fit the new foundry's technology. This complex and costly process can take one to two years, making it a last resort.

The pace of AI development means a startup's competitive advantage can be erased overnight by the next model release from a major lab like Google or Anthropic. Dr. el Kaliouby stresses that true defensibility now requires more than just a proprietary algorithm; it demands unique data, distribution, or IP that cannot be easily replicated.

Designing custom AI hardware is a long-term bet. Google's TPU team co-designs chips with ML researchers to anticipate future needs. They aim to build hardware for the models that will be prominent 2-6 years from now, sometimes embedding speculative features that could provide massive speedups if research trends evolve as predicted.

For a hyperscaler, the main benefit of designing a custom AI chip isn't necessarily superior performance, but gaining control. It allows them to escape the supply allocations dictated by NVIDIA and chart their own course, even if their chip is slightly less performant or more expensive to deploy.

For its next-generation V7 TPU AI chip, Google is diversifying its supply chain. It's retaining incumbent Broadcom for the complex 'training' version while bringing in low-cost entrant MediaTek for the 'inference' version. This sophisticated strategy mitigates supply risk while keeping critical IP with a trusted partner.

True co-design between AI models and chips is currently impossible due to an "asymmetric design cycle": AI models evolve much faster than chips can be designed. Using AI to drastically speed up chip design could close that gap and create a virtuous cycle of co-evolution.

Chip Designers Leaving Big Labs Face a Critical "Trust Boundary" Risk | RiffOn