Baidu's rationale for developing its own silicon isn't to control the supply chain or dominate pre-training. It is a strategic focus on the AI inference market, which the company's CFO says accounts for 80% of incremental compute demand. Baidu's chips are optimized for this specific task, creating a positive network effect with its cloud business.

Related Insights

Intel is using less expensive LPDDR memory in its new AI chip to compete on cost in the inference market, not performance in the training market dominated by Nvidia. This niche strategy aims to capture cost-sensitive customers and potentially the restricted China market.

Despite being a full-stack AI player, Baidu's CFO identifies the cloud as the most critical layer. It serves as the central platform for deploying not only their own model (Ernie) but also third-party models, making it the key to monetization, inference deployment, and overall ecosystem control.

Tech giants often initiate custom chip projects not with the primary goal of mass deployment, but to create negotiating power against incumbents like NVIDIA. The threat of a viable alternative is enough to secure better pricing and allocation, making the R&D cost a strategic investment.

For a hyperscaler, the main benefit of designing a custom AI chip isn't necessarily superior performance, but gaining control. It allows them to escape the supply allocations dictated by NVIDIA and chart their own course, even if their chip is slightly less performant or more expensive to deploy.

Despite its high valuation post-IPO, AI chipmaker Cerebras's long-term strategy focuses on inference, not just training. The bet is that inference will become a much larger segment of the AI compute market. By developing chips specifically optimized for this task, Cerebras aims to take significant market share from NVIDIA.

The primary driver for companies like Microsoft designing their own AI chips is economic. When 80 cents of every R&D dollar goes to a single vendor like Nvidia, creating custom silicon becomes a strategic imperative to control unit economics and reduce supply chain dependency.

Unlike general-purpose NVIDIA GPUs, Microsoft's custom Maia 200 chip focuses specifically on running existing AI models (inference). Microsoft claims this makes it cheaper for certain tasks, like its own Copilot tools, creating a cost-saving value proposition for potential customers like Anthropic.

The AI hardware market is splitting into two distinct segments: training and inference. While NVIDIA dominates training, the larger, long-term opportunity lies in inference. This is creating a market for specialized, memory-optimized chips from companies like Cerebras and Groq designed for running models efficiently.

At a massive scale, chip design economics flip. For a $1B training run, the potential efficiency savings on compute and inference can far exceed the ~$200M cost to develop a custom ASIC for that specific task. The bottleneck becomes chip production timelines, not money.
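The arithmetic behind that claim can be sketched in a few lines of Python. The $1B and ~$200M figures come from the insight above; the function names and the 30% efficiency gain are hypothetical, chosen only for illustration.

```python
def break_even_savings_fraction(asic_cost: float, compute_spend: float) -> float:
    """Fraction of compute spend a custom chip must save to repay its design cost."""
    if compute_spend <= 0:
        raise ValueError("compute_spend must be positive")
    return asic_cost / compute_spend


def net_savings(asic_cost: float, compute_spend: float, efficiency_gain: float) -> float:
    """Savings left over after paying for the ASIC, given a fractional efficiency gain."""
    return compute_spend * efficiency_gain - asic_cost


ASIC_COST = 200e6      # ~$200M custom ASIC development cost (from the text)
TRAINING_RUN = 1e9     # $1B training run (from the text)
ASSUMED_GAIN = 0.30    # hypothetical 30% efficiency gain, not from the source

fraction = break_even_savings_fraction(ASIC_COST, TRAINING_RUN)
print(f"Break-even efficiency gain: {fraction:.0%}")
print(f"Net savings at assumed gain: ${net_savings(ASIC_COST, TRAINING_RUN, ASSUMED_GAIN) / 1e6:.0f}M")
```

Under these figures the chip pays for itself once it saves about a fifth of the training run's cost, and any savings on later inference workloads would come on top of that.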

As AI models become commodities, the speed and efficiency of the underlying hardware for inference become the true differentiator. The company that powers the fastest AI experiences will win, similar to how Google won with fast search, because there is no market for slow AI.

Baidu's Custom Chip Strategy Targets AI Inference, Not Pre-Training Dominance | RiffOn