Manipulating deformable objects like towels has long been considered one of robotics' hardest remaining challenges, because such objects can take on a practically infinite number of configurations. The fact that Figure's neural networks can now successfully fold laundry indicates that the core technological hurdles for truly general-purpose robots have been overcome.

Related Insights

Figure is observing that data from one robot performing a task (e.g., moving packages in a warehouse) improves the performance of other robots on completely different tasks (e.g., folding laundry at home). This powerful transfer learning, enabled by deep learning, is a key driver for scaling general-purpose capabilities.
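
A minimal sketch of that mechanism, assuming the common shared-backbone, multi-head setup (this is not Figure's actual architecture; the dimensions and task names are hypothetical):

```python
import torch
import torch.nn as nn

class SharedBackbonePolicy(nn.Module):
    """One shared encoder, one small action head per task. Every task's
    gradients update the shared encoder, which is the basic mechanism
    behind cross-task transfer."""

    def __init__(self, obs_dim: int, act_dim: int, tasks: list[str]):
        super().__init__()
        self.encoder = nn.Sequential(          # shared across all tasks
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        self.heads = nn.ModuleDict(            # one lightweight head per task
            {t: nn.Linear(256, act_dim) for t in tasks}
        )

    def forward(self, obs: torch.Tensor, task: str) -> torch.Tensor:
        return self.heads[task](self.encoder(obs))

# Warehouse data trains the same encoder the laundry head reads from.
policy = SharedBackbonePolicy(obs_dim=64, act_dim=8,
                              tasks=["move_packages", "fold_laundry"])
action = policy(torch.randn(1, 64), task="fold_laundry")
```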

Leading roboticist Ken Goldberg clarifies that while legged robots show immense progress in navigation, fine motor skills for tasks like tying shoelaces are far beyond current capabilities. This is due to challenges in sensing and handling deformable, unpredictable objects in the real world.

The dream of a do-everything humanoid is a top-down approach that will take a long time. Roboticist Ken Goldberg argues for a bottom-up strategy: first master specific, valuable tasks, such as folding clothes or making coffee, until they are reliable. General intelligence will emerge from combining these skills over time.

The adoption of powerful AI architectures like transformers in robotics was bottlenecked by data quality, not algorithmic invention. Only after data collection methods improved to capture more dexterous, high-fidelity human actions did these advanced models become effective, reversing the typical 'algorithm-first' narrative of AI progress.

The robotics field has a scalable recipe for AI-driven manipulation (like GPT), but hasn't yet scaled it into a polished, mass-market consumer product (like ChatGPT). The current phase focuses on scaling data and refining systems, not just fundamental algorithm discovery, to bridge this gap.

Instead of simulating photorealistic worlds, robotics firm Flexion trains its models on simplified, abstract representations. For example, it uses perception models like Segment Anything to 'paint' a door red and its handle green. By training on this simplified abstraction, the robot learns the core task (opening doors) in a way that generalizes across all real-world doors, bypassing the need for perfect simulation.
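
A minimal sketch of the abstraction step, assuming boolean masks for the door and handle have already been produced by a segmentation model such as Segment Anything (the helper name and array shapes here are illustrative, not Flexion's actual pipeline):

```python
import numpy as np

def paint_abstraction(frame: np.ndarray,
                      door_mask: np.ndarray,
                      handle_mask: np.ndarray) -> np.ndarray:
    """Collapse a real camera frame into a flat abstraction: door pixels
    red, handle pixels green, everything else black. The boolean H x W
    masks would come from a segmentation model; this helper is a sketch."""
    abstract = np.zeros_like(frame)
    abstract[door_mask] = (255, 0, 0)    # "paint" the door red
    abstract[handle_mask] = (0, 255, 0)  # "paint" the handle green
    return abstract

# A policy trained only on such abstractions never sees the texture,
# color, or lighting of any particular door, so it cannot overfit to them.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
door = np.zeros((480, 640), dtype=bool); door[100:400, 200:440] = True
handle = np.zeros((480, 640), dtype=bool); handle[240:260, 400:430] = True
obs = paint_abstraction(frame, door, handle)
```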

Self-driving cars, a 20-year journey so far, are relatively simple robots: metal boxes on 2D surfaces designed *not* to touch things. General-purpose robots operate in complex 3D environments with the primary goal of *touching* and manipulating objects. This highlights the immense, often underestimated, physical and algorithmic challenges facing robotics.

Surgeons perform intricate tasks without tactile feedback, relying on visual cues of tissue deformation. This suggests robotics could achieve complex manipulation by advancing visual interpretation of physical interactions, bypassing the immense difficulty of creating and integrating artificial touch sensors.
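
As a toy illustration of the idea, and not a method from the source: one crude way to "see" contact is to measure how much the material visibly deforms between consecutive frames using dense optical flow (the function below and its tuning parameters are assumptions):

```python
import cv2
import numpy as np

def deformation_signal(prev_gray: np.ndarray,
                       curr_gray: np.ndarray) -> float:
    """Toy proxy for 'seeing' contact: mean optical-flow magnitude between
    two grayscale frames. A rising value suggests the tool is visibly
    deforming the material, standing in for a touch sensor."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return float(np.linalg.norm(flow, axis=-1).mean())
```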

A humanoid robot with 40 joints, each discretized to just one-degree resolution, has 360^40 ≈ 10^102 possible configurations, far more than the roughly 10^80 atoms in the observable universe. This combinatorial explosion makes it impossible to solve movement and interaction with traditional, hard-coded rules. Consequently, advanced AI like neural networks is not just an optimization but a fundamental necessity.
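
A quick check of the arithmetic:

```python
import math

# Exponent of 10 for 360**40 (40 joints, 1-degree resolution each).
exponent = 40 * math.log10(360)
print(f"360^40 ~= 10^{exponent:.0f}")   # -> 360^40 ~= 10^102
# Atoms in the observable universe: roughly 10^80, about 10^22 times fewer.
```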

Unlike older robots, which required precise maps and trajectory calculations, new robots draw on internet-scale common sense and learn motion by mimicking humans or training in simulation. This combination has “wiped the slate clean” for what is possible in the field.