
Figure founder Brett Adcock's new lab, Hark, is developing both new multimodal AI models and next-generation hardware interfaces. The thesis is that a true "Jarvis-like" AI experience requires fundamental breakthroughs in both the underlying intelligence and the physical devices we use to interact with it.

Related Insights

While Google has online data and Apple has on-device data, OpenAI lacks a direct feed into a user's physical interactions. Developing hardware, like an AirPod-style device, is a strategic move to capture this missing "personal context" of real-world experiences, opening a new competitive front.

Human cognition is a full-body experience, not just a brain function. Current AIs are "disembodied brains," fundamentally limited by their lack of physical interaction with the world. Integrating AI into robotics is the necessary next step toward more holistic intelligence.

Figure chose to develop its AI systems in-house rather than rely on its partnership with OpenAI: its own team proved better at the highly specialized task of designing, embedding, and running models on physical robot hardware, a challenge distinct from training purely digital LLMs.

The viral popularity of a simple, Raspberry Pi-based AI companion demonstrates that users want to interact with agents without reaching for a phone. This points to a market for dedicated hardware that offers a more immediate, voice-first, and character-driven experience than a chat app.
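To make the appeal concrete, here is a minimal sketch of what such a device's core loop might look like, assuming the SpeechRecognition and pyttsx3 libraries for audio in and out; query_model is a hypothetical placeholder for whatever local or hosted model the companion would actually call.

```python
# Minimal voice-first companion loop, as might run on a Raspberry Pi.
import speech_recognition as sr  # microphone capture and transcription
import pyttsx3                   # offline text-to-speech

def query_model(prompt: str) -> str:
    # Hypothetical placeholder: a real companion would forward the
    # transcript to a language model here and return its reply.
    return f"You said: {prompt}"

def main() -> None:
    recognizer = sr.Recognizer()
    tts = pyttsx3.init()
    with sr.Microphone() as mic:
        while True:
            audio = recognizer.listen(mic)      # block until the user speaks
            try:
                text = recognizer.recognize_google(audio)
            except sr.UnknownValueError:
                continue                        # unintelligible; keep listening
            tts.say(query_model(text))          # answer by voice, no screen
            tts.runAndWait()

if __name__ == "__main__":
    main()
```

The whole interaction fits in a dozen lines of glue code, which is part of why hobbyist builds of this kind spread so quickly.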

The ultimate winner in the AI race may not be the most advanced model, but the most seamless, low-friction user interface. Since most queries are simple, the battle is shifting to hardware that is "closest to the person's face," like glasses or ambient devices, where distribution is king.

The true evolution of voice AI is not just adding voice commands to screen-based interfaces. It's about building agents so trustworthy that they eliminate the need for screens for many tasks. This shift from hybrid voice/screen interaction to a screenless future is the next major leap in interaction modality.

AI agents move beyond simple command-response when embedded in ambient hardware like smart speakers. By passively hearing daily conversations and environmental cues, they gain the context needed for proactive, truly helpful interventions.
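As a toy illustration of that pattern, the sketch below keeps a rolling window of overheard utterances and applies a deliberately crude keyword trigger. Everything here, the Utterance type and the heuristic alike, is a hypothetical stand-in; a real ambient agent would run the accumulated context through a model to decide when a nudge actually helps.

```python
# Toy sketch of the ambient pattern: passively accumulate context,
# then decide when a proactive intervention is warranted.
from collections import deque
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    text: str

class AmbientAgent:
    def __init__(self, window: int = 50) -> None:
        # Retain only the most recent utterances as working context.
        self.context: deque[Utterance] = deque(maxlen=window)

    def hear(self, utterance: Utterance) -> str | None:
        """Ingest one overheard utterance; return a nudge or None."""
        self.context.append(utterance)
        return self._maybe_intervene()

    def _maybe_intervene(self) -> str | None:
        # Placeholder heuristic standing in for a model-driven decision.
        recent = " ".join(u.text.lower() for u in self.context)
        if "out of milk" in recent:
            return "Want me to add milk to the shopping list?"
        return None

agent = AmbientAgent()
print(agent.hear(Utterance("alice", "Looks like we're out of milk again.")))
```

The point of the pattern is that the agent never waited for a command; the context it passively gathered is what made the suggestion possible.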

The technical friction of setting up AI agents creates a market for dedicated hardware solutions that abstract away complexity, much like Sonos did for home audio, making powerful AI accessible to non-technical users.

The evolution from simple voice assistants to "omni intelligence" marks a critical shift where AI not only understands commands but can also take direct action through connected software and hardware. This capability, seen in new smart home and automotive applications, will embed intelligent automation into our physical environments.
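A compact sketch of that command-to-action step, assuming nothing about any particular vendor's API: the device handlers and the hardcoded interpret function are hypothetical stand-ins for real smart-home and automotive integrations and for model-driven command parsing.

```python
# Sketch of dispatching a parsed command to connected-device actions.
from typing import Callable

def set_thermostat(temperature: float) -> str:
    # Stand-in for a real smart-home API call.
    return f"Thermostat set to {temperature} degrees."

def start_car_climate() -> str:
    # Stand-in for a real automotive API call.
    return "Car climate control started."

# Registry mapping action names the model may emit to concrete handlers.
ACTIONS: dict[str, Callable[..., str]] = {
    "set_thermostat": set_thermostat,
    "start_car_climate": start_car_climate,
}

def interpret(command: str) -> tuple[str, dict]:
    # Hypothetical: a real system would have the model turn the spoken
    # command into a structured action. Hardcoded to keep this runnable.
    return "set_thermostat", {"temperature": 21.0}

action, args = interpret("Make it warmer in here.")
print(ACTIONS[action](**args))
```

Separating interpretation from the action registry is what lets the same intelligence layer drive very different hardware, from thermostats to cars.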

Current devices like phones and computers were designed before the advent of human-like AI and are not optimized for it. Figure's founder argues that this creates a massive opportunity for a new class of hardware, including language devices and humanoids, which will eventually replace today's dominant form factors.