A pure "pixels-in, actions-out" model is insufficient for full autonomy. While easy to start, this approach is extremely inefficient to simulate and validate for safety-critical edge cases. Waymo augments its end-to-end system with intermediate representations (like objects and road signs) to make simulation and validation tractable.
Major AI breakthroughs like Transformers accelerate initial progress but are not silver bullets for the safety-critical long tail. The nature of the problem is that getting a prototype working is relatively easy, but achieving the final "nines" of reliability is incredibly difficult, justifying Google's early, multi-decade investment.
During a San Francisco power outage, Waymo's map-based cars failed while Teslas were reportedly unaffected. This suggests that end-to-end AI systems are less brittle and better at handling novel "edge cases" than more rigid, heuristic-based autonomous driving models.
Waymo's co-CEO argues that Level 4/5 autonomy will not emerge by incrementally improving Level 2/3 driver-assist systems. The hardest challenges of operating without a human driver are entirely absent in assist systems, requiring a "qualitative jump" and a completely different approach from the outset.
To address safety concerns of an end-to-end "black box" self-driving AI, NVIDIA runs it in parallel with a traditional, transparent software stack. A "safety policy evaluator" then decides which system to trust at any moment, providing a fallback to a more predictable system in uncertain scenarios.
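The arbitration idea can be sketched in a few lines. This is a hypothetical illustration of running a learned planner alongside a transparent fallback, with an evaluator choosing between them; none of the names or thresholds are NVIDIA's actual API.

```python
# Hypothetical sketch of parallel-stack arbitration: a learned planner
# runs alongside a transparent rule-based stack, and an evaluator picks
# which one to trust. All names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class Plan:
    steering: float      # radians
    acceleration: float  # m/s^2
    confidence: float    # planner's self-reported confidence, 0..1

def rule_based_plan(scene) -> Plan:
    # Transparent fallback: conservative, fully auditable behavior.
    return Plan(steering=0.0, acceleration=-1.0, confidence=1.0)

def end_to_end_plan(scene) -> Plan:
    # Stand-in for the learned "black box" planner's output.
    return Plan(steering=0.05, acceleration=0.5, confidence=0.92)

def safety_policy_evaluator(scene, threshold: float = 0.8) -> Plan:
    """Trust the learned plan only when its confidence is high and it
    stays within hard kinematic limits; otherwise fall back."""
    learned = end_to_end_plan(scene)
    within_limits = abs(learned.steering) < 0.5 and abs(learned.acceleration) < 3.0
    if learned.confidence >= threshold and within_limits:
        return learned
    return rule_based_plan(scene)
```

The key design point is that the fallback path is simple enough to validate exhaustively, so the system degrades to predictable behavior exactly when the learned planner is least trustworthy.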
Rivian's CEO explains that early autonomous systems, which were based on rigid rules-based "planners," have been superseded by end-to-end AI. This new approach uses a large "foundation model for driving" that can improve continuously with more data, breaking through the performance plateau of the older method.
The AI's ability to handle novel situations isn't just an emergent property of scale. Wayve actively trains "world models": internal generative simulators that let the AI reason about what might happen next, leading to sophisticated behaviors like nudging into intersections or slowing in fog.
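A toy sketch of why an internal simulator produces behaviors like nudging: the planner imagines several futures for each candidate action and prefers the one with lower expected risk. The transition and risk functions here are invented stand-ins, not Wayve's actual models.

```python
# Toy world-model planning: score candidate actions by rolling out a
# learned (here: hand-coded stand-in) next-state predictor.
import random

def world_model_step(state, action, rng):
    # Stand-in for a generative next-state model: position advances by
    # speed; the action adjusts speed, with some stochastic noise.
    pos, speed = state
    speed = max(0.0, speed + action + rng.gauss(0, 0.1))
    return (pos + speed, speed)

def risk(state):
    # Toy risk: being inside an occluded zone is bad in proportion to speed.
    pos, speed = state
    return speed if 8.0 <= pos <= 14.0 else 0.0

def score_action(state, action, horizon=5, rollouts=20, seed=0):
    """Average imagined risk over several sampled futures."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rollouts):
        s = state
        for _ in range(horizon):
            s = world_model_step(s, action, rng)
            total += risk(s)
    return total / rollouts

# Creeping forward (braking gently) scores lower imagined risk than
# accelerating through the occluded zone.
creep = score_action((5.0, 1.0), action=-0.2)
rush = score_action((5.0, 1.0), action=0.5)
```

The planner never needs an explicit "nudge at intersections" rule; the behavior falls out of comparing imagined futures.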
Waymo uses a foundation model to create specialized, high-capacity "teacher" models (Driver, Simulator, Critic) offline. These teachers then distill their knowledge into smaller, efficient "student" models that can run in real-time on the vehicle, balancing massive computational power with on-device constraints.
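The distillation step in that teacher/student split can be illustrated with the standard Hinton-style objective: the compact on-vehicle student is trained to match the large offline teacher's temperature-softened output distribution rather than hard labels. The "Driver", "Simulator", and "Critic" names come from the summary above; the code itself is a generic sketch, not Waymo's implementation.

```python
# Generic knowledge-distillation loss: KL divergence between the
# teacher's and student's temperature-softened output distributions.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student). A higher temperature exposes the
    teacher's "dark knowledge" about relative class similarities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Minimizing this loss lets a small real-time model inherit behavior from a teacher far too large to run on the vehicle.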
Tesla was initially criticized for forgoing expensive LIDAR, but its vision-only approach compelled it to solve the harder, more scalable problem of AI-based reasoning. This long-term bet on foundation models for driving is now converging with the direction competitors are also taking.
Creating realistic training environments isn't blocked by technical complexity—you can simulate anything a computer can run. The real bottleneck is the financial and computational cost of the simulator. The key skill is strategically mocking parts of the system to make training economically viable.
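"Strategically mocking" parts of a simulator amounts to making the expensive components swappable, so bulk training runs against cheap surrogates and only final validation pays for full fidelity. This is an illustrative pattern, not any particular company's stack.

```python
# Illustrative pattern: make the costly simulator component (e.g. a
# high-fidelity renderer) an injectable dependency, so it can be
# mocked for the millions of cheap training steps.
import time

def full_fidelity_render(scene):
    time.sleep(0.01)  # stand-in for an expensive physics/render step
    return {"image": "hi_res", "cost": "high"}

def mocked_render(scene):
    # Cheap surrogate: trades realism for throughput.
    return {"image": "lo_res", "cost": "low"}

def make_simulator(render_fn):
    """Build a simulator step function around whichever renderer
    the training budget allows."""
    def step(scene, action):
        return render_fn(scene)
    return step

cheap_sim = make_simulator(mocked_render)            # bulk training
faithful_sim = make_simulator(full_fidelity_render)  # final validation
```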
The winning vehicle in the 2005 DARPA Grand Challenge, led by Sebastian Thrun (who later founded the Google self-driving car project that became Waymo), used a clever machine learning approach. It overlaid precise laser sensor data onto a regular video camera feed, teaching the system to recognize the color and texture of "safe" terrain and extrapolate a drivable path far ahead.
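The core trick can be sketched as self-supervised labeling: short-range laser returns label nearby pixels as drivable, a simple color model is fit to those pixels, and far-away pixels beyond laser range are classified against it. This is a heavily simplified toy, not Stanley's actual pipeline.

```python
# Toy version of laser-supervised terrain classification: learn the
# color statistics of laser-verified "safe" road pixels, then use them
# to judge distant camera pixels. Numbers and thresholds are invented.
import statistics

def fit_drivable_color_model(safe_pixels):
    """Per-channel (mean, stddev) of pixels the laser labeled safe."""
    channels = list(zip(*safe_pixels))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in channels]

def looks_drivable(pixel, model, k=2.5):
    """A far-away pixel counts as drivable if every channel lies
    within k standard deviations of the learned safe-terrain color."""
    return all(abs(v - mu) <= k * (sigma or 1.0)
               for v, (mu, sigma) in zip(pixel, model))

# Laser says these nearby RGB pixels are safe road:
safe = [(120, 110, 100), (125, 112, 98), (118, 108, 102)]
model = fit_drivable_color_model(safe)
```

The payoff is range: the laser only sees tens of meters, but the color model extends the "drivable" judgment to the whole camera image.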