A "vanilla" end-to-end model is insufficient for safety-critical systems. Waymo's foundation model is end-to-end but is augmented with a "structured materialized intermediate representation." This representation enables crucial runtime validation, richer training, and the closed-loop evaluation necessary for superhuman performance at scale.
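To make the idea of runtime validation concrete, here is a minimal sketch of how a materialized intermediate representation could be checked against a planned trajectory before the plan is executed. It is not Waymo's implementation: the `DetectedObject` type, the `trajectory_is_clear` function, and the `margin_m` parameter are hypothetical names chosen for illustration, and the circular-obstacle geometry is a simplifying assumption.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in meters, in a shared planar frame


@dataclass
class DetectedObject:
    """Hypothetical entry in the intermediate representation."""
    position: Point
    radius_m: float  # assumed circular footprint, for simplicity


def trajectory_is_clear(
    trajectory: List[Point],
    objects: List[DetectedObject],
    margin_m: float = 0.5,
) -> bool:
    """Return False if any waypoint comes within the safety margin of any object."""
    for x, y in trajectory:
        for obj in objects:
            distance = math.hypot(x - obj.position[0], y - obj.position[1])
            if distance < obj.radius_m + margin_m:
                return False
    return True


# A straight-line plan and one object sitting close to its last waypoint.
plan = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pedestrian = DetectedObject(position=(2.0, 0.2), radius_m=0.3)
plan_ok = trajectory_is_clear(plan, [pedestrian])
```

The point of the sketch is that an explicit list of objects gives an independent checker something to inspect. A model that maps sensor input directly to controls exposes no such intermediate state to validate.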
Dmitri Dolgov explains that while AI advancements create hype, they primarily speed up progress on the initial, easier parts of a problem. They don't change the "long tail" of complex, rare edge cases, which remains the core challenge in achieving full, superhuman autonomy.
Unlike typical tech development, which focuses on capabilities first, Waymo embeds safety as a "non-negotiable foundation" from the start. This means building safety into the model architecture and the team mindset, because the approach that achieves 90% performance is fundamentally different from the one needed to reach the final "nines" of reliability.
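The gap between 90% and the final "nines" is easier to see as arithmetic. The sketch below assumes a simplified model in which each event succeeds or fails independently at a fixed rate. The function name `expected_failures` and the one-million-event figure are illustrative assumptions, not numbers from Waymo.

```python
def expected_failures(nines: int, events: int) -> float:
    """Expected failure count when reliability has the given number of nines.

    One nine is 90% reliability (failure rate 1 in 10), two nines is 99%
    (1 in 100), and so on. Assumes independent events, which is a
    simplification.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return events / 10 ** nines


EVENTS = 1_000_000  # hypothetical number of driving events

at_90_percent = expected_failures(1, EVENTS)   # 90% reliability
at_five_nines = expected_failures(5, EVENTS)   # 99.999% reliability
```

Under these assumptions, each additional nine cuts the expected failure count by a factor of ten. Moving from 90% to 99.999% therefore means eliminating all but one in every ten thousand of the remaining failures.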
In its formative years as a Google project, a dozen-person team made extreme progress by having everyone do everything: writing code, building hardware, calibrating sensors, and testing at night. This "crazy startup" model of universal contribution and rapid learning was key to solving the initial, seemingly impossible challenges.
Waymo achieved exponential growth by changing its core strategy. After years of methodically de-risking technology in a sequential manner, the company transitioned to a model of "rapid parallel global commercialization." This shift is what enabled the company to launch in four new cities in a single day, a feat that previously took eight years.
Dolgov shared a story in which a Waymo vehicle reacted to a hidden pedestrian. The system's LIDAR captured sparse returns from the person's feet moving under a bus. This sliver of data was enough for the AI not only to detect the person but also to predict their future path, demonstrating an emergent, superhuman capability.
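The source does not say how the prediction is computed, and describes it as an emergent capability of the learned model. Purely to illustrate why a handful of timestamped returns can be enough to extrapolate a path, the sketch below swaps in a classical constant-velocity least-squares fit. The function name `predict_position` and the sample observations are hypothetical.

```python
from typing import List, Tuple

Observation = Tuple[float, float, float]  # (time_s, x_m, y_m)


def predict_position(observations: List[Observation], t_future: float) -> Tuple[float, float]:
    """Extrapolate a position with a least-squares constant-velocity fit.

    This is a classical baseline, not the learned behavior described above.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = sum(t for t, _, _ in observations) / n
    x_mean = sum(x for _, x, _ in observations) / n
    y_mean = sum(y for _, _, y in observations) / n
    spread = sum((t - t_mean) ** 2 for t, _, _ in observations)
    if spread == 0:
        raise ValueError("observations must span more than one timestamp")
    # Slope of the best-fit line for each axis is the velocity estimate.
    vx = sum((t - t_mean) * (x - x_mean) for t, x, _ in observations) / spread
    vy = sum((t - t_mean) * (y - y_mean) for t, _, y in observations) / spread
    dt = t_future - t_mean
    return (x_mean + vx * dt, y_mean + vy * dt)


# Three sparse "foot" returns, 0.1 s apart, moving along x at about 1.4 m/s.
returns = [(0.0, 0.00, 5.0), (0.1, 0.14, 5.0), (0.2, 0.28, 5.0)]
predicted = predict_position(returns, t_future=1.0)
```

Even this simple fit projects where the walker will be a second later once they emerge from behind the bus. A learned system would presumably capture far richer motion cues than a straight line.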
