Initially criticized for forgoing expensive LiDAR, Tesla was forced by its vision-only bet to solve the harder, more scalable problem of AI-based reasoning. That long-term bet on foundation models for driving is now converging with the direction competitors are taking.
The shift to AI makes multi-sensor arrays (including LiDAR) more valuable, not less. In older rules-based systems, fusing data from heterogeneous sensors was a complex engineering problem; an end-to-end AI model benefits directly from more diverse input data, which improves the training of the core driving model. As LiDAR gets cheaper, the case for a multi-sensor approach only strengthens.
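To make the fusion point concrete, here is a minimal PyTorch-style sketch, not any vendor's actual architecture (all module and dimension names are illustrative): each sensor gets its own learned encoder, and the fusion is just feature concatenation the network learns to exploit, so adding a modality means adding an encoder rather than hand-writing fusion rules.

```python
import torch
import torch.nn as nn

class MultiSensorPolicy(nn.Module):
    """Toy end-to-end policy: raw camera + LiDAR in, control targets out."""
    def __init__(self, cam_dim=256, lidar_dim=128, out_dim=3):
        super().__init__()
        # Hypothetical per-sensor encoders; real systems would use CNNs / point nets.
        self.cam_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(cam_dim), nn.ReLU())
        self.lidar_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(lidar_dim), nn.ReLU())
        # The "fusion" is learned end to end: no hand-written alignment logic.
        self.head = nn.Sequential(nn.Linear(cam_dim + lidar_dim, 128), nn.ReLU(),
                                  nn.Linear(128, out_dim))  # e.g. steer, throttle, brake

    def forward(self, camera, lidar):
        features = torch.cat([self.cam_encoder(camera), self.lidar_encoder(lidar)], dim=-1)
        return self.head(features)

policy = MultiSensorPolicy()
controls = policy(torch.randn(1, 3, 64, 64), torch.randn(1, 1024, 3))
```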
During a San Francisco power outage, Waymo's map-based cars failed while Teslas were reportedly unaffected. This suggests that end-to-end AI systems are less brittle and better at handling novel "edge cases" than rigid, heuristic-based autonomous driving stacks.
The latest Full Self-Driving version likely eliminates traditional `if-then` coding in favor of a pure neural network. The leap in performance comes at the cost of human auditability: no one can fully trace *how* the AI makes its life-or-death decisions, marking a profound shift in how software is built.
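A toy contrast makes the auditability trade-off concrete. This is purely illustrative, not Tesla's code: the rules-based style can be audited line by line, while the learned replacement buries the same decision in opaque weights.

```python
import torch.nn as nn

# Old style: explicit heuristics, auditable branch by branch.
def plan_rules(obstacle_dist_m: float, speed_mps: float) -> str:
    if obstacle_dist_m / max(speed_mps, 0.1) < 2.0:  # time-to-contact under 2 s
        return "brake"
    return "cruise"

# New style: one learned function from pixels to controls. The "rules" are
# implicit in millions of weights, so no single decision can be traced to a
# human-readable branch.
plan_neural = nn.Sequential(
    nn.Flatten(),
    nn.LazyLinear(512), nn.ReLU(),
    nn.Linear(512, 3),  # e.g. steer, throttle, brake targets
)
```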
Rivian's CEO explains that early autonomous systems, built around rigid rules-based "planners," have been superseded by end-to-end AI. The new approach uses a large "foundation model for driving" that improves continuously with more data, breaking through the performance plateau of the older method.
By eschewing expensive LiDAR, Tesla lowers production costs, enabling massive fleet deployment. That scale generates orders of magnitude more real-world driving data than competitors like Waymo, a data advantage that will likely lead to market dominance in autonomous intelligence.
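A back-of-envelope calculation shows why fleet size dominates. The figures below are loudly assumed rough estimates for illustration, not official numbers from either company:

```python
# All values are rough, assumed estimates for illustration only.
tesla_fleet = 4_000_000      # consumer cars on the road, approximate
waymo_fleet = 1_500          # robotaxis in service, approximate
consumer_miles_per_day = 30
robotaxi_miles_per_day = 300  # robotaxis run far longer shifts

tesla_daily_miles = tesla_fleet * consumer_miles_per_day   # ~120M miles/day
waymo_daily_miles = waymo_fleet * robotaxi_miles_per_day   # ~450K miles/day
print(tesla_daily_miles / waymo_daily_miles)               # roughly 250x more
```

Even granting robotaxis ten times the daily mileage per vehicle, the consumer fleet's sheer size keeps the data gap in the hundreds-fold range.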
Autonomous systems can perceive and react to dangers beyond human capability. In one example, a Cybertruck autonomously accelerated to lessen the impact of an impending high-speed rear-end collision from a car its driver never even saw, showcasing a level of predictive safety that humans cannot replicate and that goes beyond simple accident avoidance.
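A hypothetical sketch of the kind of computation involved (the function name and thresholds are invented here, not Tesla's logic): if the time-to-collision from behind is short and there is room ahead, accelerating reduces the relative velocity at impact, and impact energy scales with the square of that velocity.

```python
def mitigate_rear_impact(rear_gap_m: float, closing_speed_mps: float,
                         front_gap_m: float) -> float:
    """Return a throttle command in m/s^2 (0.0 = no action). Illustrative only."""
    if closing_speed_mps <= 0:
        return 0.0  # trailing car is not gaining on us
    ttc_s = rear_gap_m / closing_speed_mps  # naive time-to-collision estimate
    if ttc_s < 1.5 and front_gap_m > 30.0:
        # Accelerate at up to the rate needed to null the closing speed before
        # contact, capped at a comfortable 3 m/s^2. Even a partial reduction in
        # relative velocity cuts the energy transferred in the crash.
        return min(3.0, closing_speed_mps / ttc_s)
    return 0.0

# A car closing at 15 m/s, 12 m behind, with 80 m of clear road ahead:
print(mitigate_rear_impact(rear_gap_m=12.0, closing_speed_mps=15.0, front_gap_m=80.0))
```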
Musk's decisions—choosing cameras over LiDAR for Tesla and acquiring X (Twitter)—are part of a unified strategy to own the largest data sets of real-world patterns (driving and human behavior). This allows him to train and perfect AI, making his companies data juggernauts.
To achieve scalable autonomy, Flywheel AI avoids expensive, site-specific setups. Instead, it sells a valuable teleoperation service today, which lets it profitably collect the vast, diverse datasets required to train a generalizable autonomous system, mirroring Tesla's data-collection strategy.
Wayve treats the sensor debate as a distraction. Its goal is to build an AI flexible enough to work with any configuration: camera-only, camera-radar, or multi-sensor. This pragmatism lets it adapt its software to different OEM partners and vehicle price points without being locked into a single hardware ideology.
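One hedged sketch of what sensor-agnosticism can look like in code (a PyTorch-style illustration, not Wayve's actual design): an encoder is registered per available modality, so the same policy code ships on a camera-only trim or a camera-radar one.

```python
import torch
import torch.nn as nn

class ConfigurablePolicy(nn.Module):
    """Toy policy that accepts whatever sensors a given vehicle trim provides."""
    def __init__(self, modalities, feat_dim=128, out_dim=3):
        super().__init__()
        # One encoder per available sensor; absent sensors simply aren't registered.
        self.encoders = nn.ModuleDict(
            {name: nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
             for name in modalities}
        )
        self.head = nn.Linear(feat_dim, out_dim)

    def forward(self, inputs):
        # Average-pool features so the head is independent of sensor count.
        feats = [enc(inputs[name]) for name, enc in self.encoders.items()]
        return self.head(torch.stack(feats).mean(dim=0))

# Camera-only on one OEM's budget trim, camera+radar on another's; same code runs both.
cam_only = ConfigurablePolicy(["camera"])
cam_radar = ConfigurablePolicy(["camera", "radar"])
cam_only({"camera": torch.randn(1, 3, 64, 64)})
cam_radar({"camera": torch.randn(1, 3, 64, 64), "radar": torch.randn(1, 64, 4)})
```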
The debate over putting cameras in a robot's palm is analogous to Tesla's refusal to use LiDAR. Ken Goldberg suggests that just as LiDAR helps with edge cases in driving, in-hand cameras provide crucial, low-cost data for manipulation. Musk's purist approach may be a self-imposed handicap in both domains.