Moving a robot from a lab demo to a commercial system reveals that AI is just one component. Success depends heavily on traditional engineering for sensor calibration, arm accuracy, system speed, and reliability. These unglamorous details are critical for performance in the real world.
Leading roboticist Ken Goldberg clarifies that while legged robots have made immense progress in navigation, fine motor skills for tasks like tying shoelaces remain far beyond current capabilities. The gap comes from the difficulty of sensing and handling deformable, unpredictable objects in the real world.
Ken Goldberg's company, Ambi Robotics, successfully uses simple suction cups for logistics. He argues that the industry's focus on human-like hands is misplaced: simpler grippers are more practical and more reliable, and they can already perform complex tasks today.
While autonomous driving is complex, roboticist Ken Goldberg argues it's an easier problem than dexterous manipulation. Driving fundamentally involves avoiding contact with objects, whereas manipulation requires precisely controlled contact and interaction with them, a much harder challenge.
Surgeons operating through surgical robots perform intricate tasks without tactile feedback, relying instead on visual cues of tissue deformation. This suggests robotics could achieve complex manipulation by advancing visual interpretation of physical interactions, bypassing the immense difficulty of creating and integrating artificial touch sensors.
Despite testing with countless objects, Ambi Robotics discovered that its system struggled with a common item the team hadn't prioritized: plastic shipping bags. Bags fold unpredictably and cause the gripper to lose suction, highlighting how real-world deployment uncovers critical edge cases that extensive lab testing misses.
Ken Goldberg quantifies the challenge: the text data used to train LLMs would take a human 100,000 years to read. Equivalent data for robot manipulation (vision-to-control signals) doesn't exist online and must be generated from scratch, which helps explain the slower progress in physical AI.
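To get a rough feel for the scale behind the 100,000-year figure, the following back-of-envelope sketch converts reading time into a word count. The reading speed of 250 words per minute and the assumption of nonstop reading are illustrative assumptions, not numbers from Goldberg, and the function name `words_read` is hypothetical.

```python
# Back-of-envelope: how many words fit into a given number of years of reading?
# ASSUMPTIONS (not from the source): 250 words per minute, reading nonstop,
# 365-day years. Only the 100,000-year figure comes from the text above.

WORDS_PER_MINUTE = 250            # assumed reading speed
MINUTES_PER_YEAR = 60 * 24 * 365  # nonstop reading, no sleep or breaks
YEARS = 100_000                   # figure quoted in the text


def words_read(years: int, wpm: int = WORDS_PER_MINUTE) -> int:
    """Return the number of words read in `years` of continuous reading."""
    return years * MINUTES_PER_YEAR * wpm


if __name__ == "__main__":
    total = words_read(YEARS)
    print(f"{total:.2e} words")
```

Under these assumptions the total comes to roughly 13 trillion words. A slower reader, or one who only reads part of the day, would lower that figure proportionally.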
The dream of a do-everything humanoid is a top-down approach that will take a long time. Roboticist Ken Goldberg argues for a bottom-up strategy: first master specific, valuable tasks such as folding clothes or making coffee, and perform them reliably. In his view, general intelligence will emerge from combining these skills over time.
The debate over putting cameras in a robot's palm is analogous to Tesla's refusal to use LIDAR. Ken Goldberg suggests that just as LIDAR helps with edge cases in driving, in-hand cameras provide crucial, low-cost data for manipulation. Musk's purist approach may be a self-imposed handicap in both domains.
