While large language models (LLMs) converge by training on the same public internet data, autonomous driving models will remain distinct. Each company must build its own proprietary dataset from its unique sensor stack and vehicle fleet. This lack of a shared data foundation means different automakers' AI driving behaviors and capabilities will likely diverge over time.

Related Insights

The rapid progress of many LLMs was possible because they could leverage the same massive public dataset: the internet. In robotics, no such public corpus of robot interaction data exists. This “data void” means progress is tied to a company's ability to generate its own proprietary data.

The neural nets powering autonomous vehicles are highly generalizable, with 80-90% of the underlying software being directly applicable to other verticals like trucking. A company's long-term value lies in its scaled driving data and core AI competency, not its initial target market.

RJ Scaringe argues that successful, neural net-based autonomy requires a rare combination of ingredients: full control of the perception stack, a large vehicle fleet for data collection, massive capital, and GPU access. He believes only a handful of companies, including Rivian, Tesla, and Waymo, possess all the necessary components to compete.

Rivian's CEO explains that early autonomous systems, which were based on rigid rules-based "planners," have been superseded by end-to-end AI. This new approach uses a large "foundation model for driving" that can improve continuously with more data, breaking through the performance plateau of the older method.

By eschewing expensive LiDAR, Tesla lowers production costs, enabling massive fleet deployment. This scale generates far more real-world driving data than competitors like Waymo can collect, creating a data advantage that will likely lead to market dominance in autonomous intelligence.

Musk's decisions to choose cameras over LiDAR for Tesla and to acquire X (Twitter) are part of a unified strategy to own the largest data sets of real-world patterns (driving and human behavior). This allows him to train and perfect AI, making his companies data juggernauts.

Initially criticized for forgoing expensive LiDAR, Tesla's vision-based self-driving system compelled it to solve the harder, more scalable problem of AI-based reasoning. This long-term bet on foundation models for driving is now the direction competitors are converging on as well.

Despite rapid software advances like deep learning, the deployment of self-driving cars was a 20-year process because it had to integrate with the mature automotive industry's supply chains, infrastructure, and business models. This serves as a reminder that AI's real-world impact is often constrained by the readiness of the sectors it aims to disrupt.

Wayve's core strategy is generalization. By training a single, large AI on diverse global data, vehicles, and sensor sets, the company can adapt to new cars and countries in months, not years. This avoids the AV 1.0 pitfall of building bespoke, infrastructure-heavy solutions for each new market.

As algorithms become more widespread, the key differentiator for leading AI labs is their exclusive access to vast, private data sets. xAI has Twitter, Google has YouTube, and OpenAI has user conversations, creating unique training advantages that are nearly impossible for others to replicate.