The debate over putting cameras in a robot's palm parallels Tesla's refusal to use LiDAR. Ken Goldberg suggests that just as LiDAR helps with edge cases in driving, in-hand cameras provide crucial, low-cost data for manipulation. Musk's sensor-purist approach may be a self-imposed handicap in both domains.

Related Insights

Ken Goldberg's company, Ambi Robotics, successfully uses simple suction cups for logistics. He argues that the industry's focus on human-like hands is misplaced: simpler grippers are more practical and reliable, and they already handle immensely complex tasks today.

Elon Musk's newly approved trillion-dollar pay package is less about the money and more about securing 25% voting control of Tesla. He views Tesla's future not in cars but in humanoid robots, and he sought this control to direct the development of this potentially world-changing technology.

While autonomous driving is complex, roboticist Ken Goldberg argues it's an easier problem than dexterous manipulation. Driving fundamentally involves avoiding contact with objects, whereas manipulation requires precisely controlled contact and interaction with them, a much harder challenge.
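
A toy way to see the asymmetry: in driving, contact is a constraint to stay away from, while in manipulation a specific, nonzero contact force is the *goal*. The sketch below contrasts the two objectives; the function names, gains, and thresholds are illustrative assumptions, not anything from the source.

```python
# Illustrative contrast: driving penalizes any proximity to contact,
# while manipulation servos toward a desired, nonzero contact force.

def driving_cost(dist_to_nearest_obstacle_m: float, margin_m: float = 1.5) -> float:
    """Driving: cost grows as predicted clearance shrinks toward contact."""
    violation = max(0.0, margin_m - dist_to_nearest_obstacle_m)
    return violation ** 2  # zero cost whenever we stay clear of contact

def grasp_force_command(measured_force_n: float,
                        target_force_n: float = 5.0,
                        kp: float = 0.8) -> float:
    """Manipulation: regulate toward a *desired* contact force (simple P control)."""
    return kp * (target_force_n - measured_force_n)  # drives force error to zero

print(driving_cost(3.0))         # 0.0 -> safely clear, contact is avoided entirely
print(grasp_force_command(2.0))  # 2.4 -> press harder until force reaches target
```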

By eschewing expensive LiDAR, Tesla lowers production costs, enabling massive fleet deployment. This scale generates orders of magnitude more real-world driving data than competitors like Waymo, creating a data advantage that will likely lead to market dominance in autonomous intelligence.
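
A rough back-of-envelope shows why fleet size dominates per-vehicle mileage. Every figure below is an illustrative assumption, not a reported number:

```python
# Hypothetical fleet-scale arithmetic (all numbers are illustrative assumptions).
camera_fleet_vehicles = 2_000_000   # assumed consumer fleet size
camera_miles_per_day  = 30          # assumed typical daily driving per car

robotaxi_fleet_vehicles = 1_000     # assumed robotaxi fleet size
robotaxi_miles_per_day  = 300       # assumed intensive daily operation per car

camera_daily_miles   = camera_fleet_vehicles * camera_miles_per_day       # 60,000,000
robotaxi_daily_miles = robotaxi_fleet_vehicles * robotaxi_miles_per_day   # 300,000

print(camera_daily_miles // robotaxi_daily_miles)  # 200x more miles per day
```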

While Figure's CEO criticizes competitors for using human operators in robot videos, this 'Wizard of Oz' technique is a critical data-gathering and development stage. Just as early Waymo cars had human operators, teleoperation is how companies collect the training data needed for true autonomy.
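
In practice, 'Wizard of Oz' teleoperation doubles as a data pipeline: each session logs paired (observation, operator action) samples that later supervise an imitation-learning policy. Here is a minimal sketch of such a logger; the robot and operator interfaces are hypothetical placeholders, not any company's actual stack:

```python
# Minimal sketch of teleop demonstration logging for imitation learning.
# The `robot` and `operator` interfaces are hypothetical placeholders.
import json, time

def record_demonstration(robot, operator, path: str, hz: float = 10.0):
    """Log (observation, operator_action) pairs while a human drives the robot."""
    with open(path, "w") as f:
        while operator.session_active():
            obs = robot.get_observation()   # e.g. camera frames, joint angles
            act = operator.read_command()   # the human's teleop command
            robot.apply(act)                # robot mirrors the operator
            f.write(json.dumps({"t": time.time(), "obs": obs, "act": act}) + "\n")
            time.sleep(1.0 / hz)

# The logged pairs become supervised training data: learn to predict `act` from `obs`.
```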

Ken Goldberg quantifies the challenge: the text data used to train LLMs would take a human 100,000 years to read. Equivalent data for robot manipulation (vision-to-control signals) doesn't exist online and must be generated from scratch, explaining the slower progress in physical AI.
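
The 100,000-year figure is easy to sanity-check with reading-speed arithmetic; the corpus size and reading rate below are assumed, illustrative values:

```python
# Back-of-envelope check of the "100,000 years of reading" claim.
corpus_words = 4.5e12       # assumed LLM training corpus: trillions of words
words_per_minute = 250      # typical adult reading speed
hours_per_day = 8           # reading as a full-time job

minutes_total = corpus_words / words_per_minute
years = minutes_total / 60 / hours_per_day / 365
print(f"{years:,.0f} years")  # ~103,000 years
```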

Musk's decisions—choosing cameras over LiDAR for Tesla and acquiring X (Twitter)—are part of a unified strategy to own the largest data sets of real-world patterns (driving and human behavior). This allows him to train and perfect AI, making his companies data juggernauts.

Wayve treats the sensor debate as a distraction. Their goal is to build an AI flexible enough to work with any configuration—camera-only, camera-radar, or multi-sensor. This pragmatism allows them to adapt their software to different OEM partners and vehicle price points without being locked into a single hardware ideology.
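
Concretely, sensor-agnosticism looks like a model interface that consumes whatever modalities a given vehicle configuration provides. A minimal sketch follows; the types and names are hypothetical, not Wayve's actual API:

```python
# Minimal sketch of a sensor-configuration-agnostic driving model interface.
# Types and names are hypothetical, not Wayve's actual API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SensorFrame:
    cameras: list                     # always present: one or more camera images
    radar: Optional[object] = None    # present only on radar-equipped OEM builds
    lidar: Optional[object] = None    # present only on lidar-equipped builds

def encode(frame: SensorFrame) -> list:
    """Fuse whichever modalities exist; absent sensors contribute nothing."""
    features = [("camera", img) for img in frame.cameras]
    if frame.radar is not None:
        features.append(("radar", frame.radar))
    if frame.lidar is not None:
        features.append(("lidar", frame.lidar))
    return features  # downstream model attends over a variable-length set

# Camera-only and camera+radar vehicles share the same code path:
print(len(encode(SensorFrame(cameras=["front", "rear"]))))         # 2
print(len(encode(SensorFrame(cameras=["front"], radar="sweep"))))  # 2
```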

Self-driving cars, a 20-year journey so far, are relatively simple robots: metal boxes on 2D surfaces designed *not* to touch things. General-purpose robots operate in complex 3D environments with the primary goal of *touching* and manipulating objects. This highlights the immense, often underestimated, physical and algorithmic challenges facing robotics.

Classical robots required expensive, rigid, and precise hardware because they were blind. Modern AI perception acts as 'eyes', allowing robots to correct for inaccuracies in real time. This enables the use of cheaper, compliant, and inherently safer mechanical components, fundamentally changing hardware design philosophy.
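
The shift is from open-loop precision to closed-loop correction: instead of machining error away, a camera measures the residual error each cycle and the controller servos it out. A minimal sketch of such perception-in-the-loop control, with hypothetical interfaces:

```python
# Minimal sketch of perception-in-the-loop control: cheap, imprecise hardware
# plus visual feedback. The `arm` and `camera` interfaces are hypothetical.

def visual_servo(arm, camera, target, tol_mm: float = 1.0, gain: float = 0.5):
    """Iteratively cancel the error measured by the camera, not by the joints."""
    while True:
        error = camera.estimate_error(target)  # perception supplies the 'eyes'
        if abs(error) < tol_mm:                # good enough despite sloppy hardware
            return
        arm.move_relative(gain * error)        # small corrective step each cycle

# Because the error is re-measured every loop, backlash and flex in a cheap,
# compliant arm are corrected online instead of machined away up front.
```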