General Intuition Uses Gaming Data to Create AIs That 'Play the World'

Related Insights

Internet Video Is the Best Foundational Training Data for Generalist Robots

To build generalist robots, the most effective approach is pre-training foundation models on internet-scale video datasets, not just simulation or tele-operated data. This vast, diverse data provides a deep, implicit understanding of physics and object interaction that is impossible to replicate in controlled environments, enabling true generalization.

Nvidia Invests in Thinking Machines, Meta Acquires Moltbook, BYD F1 | Olivia Moore, David Paffenholz, Adam Goldstein, Max Junestrand, Allan McLennan, Jagdeep Singh, Scott Hickle

TBPN·4 months ago

Building a Generalist Robot Brain May Be Easier Than Creating Specialized Ones

The Physical Intelligence thesis is that a foundation model learning from diverse data can achieve a "physical understanding" of the world, making it easier to adapt to new tasks than building single-purpose robots from scratch. A general model can draw on broader data, which ultimately makes it the more scalable approach.

Sergey Levine - Building LLMs for the Physical World - [Invest Like the Best, EP.465]

Invest Like the Best with Patrick O'Shaughnessy·3 months ago

GPT-4 API Powers a Top-Tier Minecraft Bot, Proving LLMs Can Replace Robotic Control Systems

A bot that plays Minecraft by generating text prompts for the GPT-4 API has become a best-in-class robotic planning system. This novel approach suggests that specialized, standalone planning systems for robots could be replaced by interacting with a general-purpose LLM.

AI Will Save The World with Marc Andreessen and Martin Casado

The a16z Show·6 months ago

General Intuition's Robotics Strategy Focuses on Robots Controllable by Game Inputs

GI is not trying to solve robotics in general. Their strategy is to focus on robots whose actions can be mapped to a game controller. This constraint dramatically simplifies the problem, allowing their foundation models trained on gaming data to be directly applicable, shifting the burden for robotics companies from expensive pre-training to more manageable fine-tuning.

World Models & General Intuition: Khosla's largest bet since LLMs & OpenAI

Latent Space: The AI Engineer Podcast·7 months ago

World Models That Grasp Physics Are the Successor to LLMs

Large Language Models are limited because they lack an understanding of the physical world. The next evolution is 'World Models'—AI trained on real-world sensory data to understand physics, space, and context. This is the foundational technology required to unlock physical AI like advanced robotics.

Humanize AI before it dehumanizes us, with Dr. Rana el Kaliouby at SXSW

Masters of Scale·3 months ago

Game Data Surpasses YouTube for Training Spatial Reasoning by Simulating Embodied Action

GI's founder argues game footage is a superior data source for spatial reasoning compared to real-world videos. Gaming directly links visual perception to hand-eye motor control ("simulating optical dynamics with your hand"), avoiding the information loss inherent in interpreting passive video, which requires solving for pose estimation and inverse dynamics.

World Models & General Intuition: Khosla's largest bet since LLMs & OpenAI

Latent Space: The AI Engineer Podcast·7 months ago

General Intuition Creates Cleaner Training Data by Logging Abstract Actions, Not Keystrokes

To protect user privacy, GI's system translates raw keyboard inputs (e.g., 'W' key) into their corresponding in-game actions (e.g., 'move forward'). This privacy-by-design approach has a key ML benefit: it removes noisy, user-specific key bindings and provides a standardized, canonical action space for training more generalizable agents.

World Models & General Intuition: Khosla's largest bet since LLMs & OpenAI

Latent Space: The AI Engineer Podcast·7 months ago
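
The key-to-action translation described in this insight can be illustrated with a minimal sketch. It is not GI's actual pipeline: the `KeyEvent` type, the `canonicalize` helper, and every action name other than "move forward" are hypothetical, chosen only to show how per-user key bindings collapse into one canonical action space.

```python
from dataclasses import dataclass

# Hypothetical default bindings; the source only gives 'W' -> 'move forward'.
DEFAULT_BINDINGS = {
    "W": "move_forward",
    "A": "strafe_left",
    "S": "move_backward",
    "D": "strafe_right",
    "SPACE": "jump",
}


@dataclass(frozen=True)
class KeyEvent:
    """A raw input event as it might be captured (assumed shape)."""
    timestamp_ms: int
    key: str


def canonicalize(events, bindings):
    """Translate raw key events into (timestamp, action) pairs.

    Keys with no binding are dropped, so the raw keystroke never
    reaches the logged output. Binding keys are expected in upper case.
    """
    actions = []
    for event in events:
        action = bindings.get(event.key.upper())
        if action is not None:
            actions.append((event.timestamp_ms, action))
    return actions


# Two players with different key bindings performing the same moves.
player_a = [KeyEvent(0, "W"), KeyEvent(16, "SPACE")]
player_b = [KeyEvent(0, "UP"), KeyEvent(16, "X")]
custom_bindings = {"UP": "move_forward", "X": "jump"}

log_a = canonicalize(player_a, DEFAULT_BINDINGS)
log_b = canonicalize(player_b, custom_bindings)
```

Both logs contain the same action sequence even though the players pressed different keys, which is the standardization benefit the summary describes.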

Computer Screen Recordings Are the Ideal Pre-Training Data for Action-Oriented AI Agents

The computer serves as a universal actuator for human work across diverse environments. This makes screen recordings an existing, large-scale dataset perfectly suited for pre-training base models for agency. This approach aims to create a foundational model for action by replicating human input (keystrokes, mouse moves) and output.

Meet the New Thiel Fellows, GPT 5.5, Thoma Bravo Loses Medallia | Victor Boyd, Alex Shieh, Nick Dobroshinsky, Ishan Gupta, Antoni Kiszka, Milan Lustig, Galen Mead, Aubrey Niederhoffer, Samuel Carvalho, Claire Wang

TBPN·2 months ago

Comma AI Trains its Driving Agent in a Generative AI 'World Model'

Instead of using traditional, rule-based simulators, Comma AI trains its driving agent inside a learned "world model." This generative model creates photorealistic, diverse driving scenarios and, crucially, responds accurately to the agent's simulated actions—a key requirement for effective robotics training.

Open Source Self-Driving with Comma AI

Practical AI·2 months ago

Robotics Agents Can Transfer Skills from Games by Leveraging a Shared 'Game Controller' Interface

General Intuition's core bet is that the transfer from simulated to physical worlds is unlocked by a shared action interface. Since many real-world robots, such as drones and robotic arms, are already operated with game controllers, an agent trained in diverse gaming environments only needs to adapt to a new visual world, not an entirely new action space.

Google’s AI Breakthrough in Cancer, Protein Powders Exposed | Marc Benioff, Eiso Kant, Dante Vaisbort, Alice Bentinck, Eric Seufert, Pim de Witte

TBPN·8 months ago
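
The shared action interface described in this insight can be sketched as one controller state consumed by two different back ends. Everything here is an assumption made for illustration: the `ControllerState` fields, the stick-to-axis assignments, and the speed limits are hypothetical and do not come from the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerState:
    """Two analog sticks, each axis nominally in [-1, 1] (assumed layout)."""
    left_x: float
    left_y: float
    right_x: float
    right_y: float


def clamp(value, low=-1.0, high=1.0):
    """Keep a stick reading inside its nominal range."""
    return max(low, min(high, value))


def to_game_command(state):
    """Interpret the sticks as a typical move/look pair in a game."""
    return {
        "move": (clamp(state.left_x), clamp(state.left_y)),
        "look": (clamp(state.right_x), clamp(state.right_y)),
    }


def to_drone_command(state, max_speed_mps=5.0, max_yaw_dps=90.0):
    """Interpret the same sticks as drone velocity setpoints.

    The axis assignment and limits are illustrative, not a real drone API.
    """
    return {
        "forward_mps": clamp(state.right_y) * max_speed_mps,
        "lateral_mps": clamp(state.right_x) * max_speed_mps,
        "vertical_mps": clamp(state.left_y) * max_speed_mps,
        "yaw_dps": clamp(state.left_x) * max_yaw_dps,
    }


# The agent emits one controller state; only the back end differs.
state = ControllerState(left_x=0.0, left_y=0.5, right_x=0.0, right_y=2.0)
game_cmd = to_game_command(state)
drone_cmd = to_drone_command(state)
```

In this sketch the policy's output type never changes between the game and the drone; only the function that consumes it does, which is why the agent would need to adapt mainly to new visuals.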
