Future progress in biology requires moving beyond static models. The new paradigm involves an AI that reasons over hypotheses, prioritizes experiments, learns from the empirical outcomes, and updates its internal world model. This creates a scalable, closed-loop system for scientific discovery.
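The loop described above (reason over hypotheses, prioritize an experiment, learn from the outcome, update the world model) can be sketched in a few lines of Python. This is only an illustrative toy, not any company's actual system: the `Hypothesis` class, the `discovery_loop` function, the `true_rates` table standing in for a real lab, and the choice of Thompson sampling to prioritize experiments are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A candidate claim with a Beta(successes, failures) belief (hypothetical model)."""
    name: str
    successes: int = 1  # Beta prior alpha (uniform prior)
    failures: int = 1   # Beta prior beta

    @property
    def belief(self) -> float:
        # Posterior mean probability that an experiment supports the hypothesis.
        return self.successes / (self.successes + self.failures)


def prioritize(hypotheses, rng):
    # Thompson sampling: draw once from each posterior and test the highest draw.
    return max(hypotheses, key=lambda h: rng.betavariate(h.successes, h.failures))


def run_experiment(hypothesis, true_rates, rng):
    # Stand-in for a physical experiment; a real system would call out to a lab.
    return rng.random() < true_rates[hypothesis.name]


def update(hypothesis, outcome):
    # Negative results are recorded too, so the belief reflects every experiment.
    if outcome:
        hypothesis.successes += 1
    else:
        hypothesis.failures += 1


def discovery_loop(true_rates, rounds, seed=0):
    rng = random.Random(seed)
    hypotheses = [Hypothesis(name) for name in true_rates]
    for _ in range(rounds):
        chosen = prioritize(hypotheses, rng)                # choose what to test
        outcome = run_experiment(chosen, true_rates, rng)   # run the experiment
        update(chosen, outcome)                             # update the world model
    return hypotheses


if __name__ == "__main__":
    # Hypothetical success rates for three made-up hypotheses.
    for h in discovery_loop({"A": 0.8, "B": 0.3, "C": 0.1}, rounds=100):
        print(h.name, h.successes, h.failures, round(h.belief, 2))
```

In this sketch the "world model" is just a pair of counts per hypothesis. The point is the shape of the loop: each experimental outcome, positive or negative, changes which experiment is chosen next.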

Related Insights

The next major AI breakthrough will come from applying generative models to complex systems beyond human language, such as biology. By treating biological processes as a unique "language," AI could discover novel therapeutics or research paths, leading to a "Move 37" moment in science.

Molly Gibson's venture, Lila Sciences, aims for AI that doesn't just analyze data but autonomously executes the entire scientific method. By connecting generative models to automated labs, the AI can formulate hypotheses, run physical experiments, and learn from the results in a continuous loop, achieving a superhuman pace of discovery.

The next leap in biotech moves beyond applying AI to existing data. CZI pioneers a model where 'frontier biology' and 'frontier AI' are developed in tandem. Experiments are now designed specifically to generate novel data that will ground and improve future AI models, creating a virtuous feedback loop.

AI models are trained on large lab-generated datasets. The models then simulate biology and make predictions, which are validated back in the lab. This feedback loop accelerates discovery by replacing random experimental "walks" with a more direct computational route, making research faster and more efficient.

The ultimate goal isn't just modeling specific systems (like protein folding), but automating the entire scientific method. This involves AI generating hypotheses, choosing experiments, analyzing results, and updating a 'world model' of a domain, creating a continuous loop of discovery.

Applying AI to biology isn't just a big data problem. The training data must be structured for reinforcement learning. This means it must be complete (including negative results) and allow for a feedback loop where AI predictions are tested in the lab, and the results are used to refine the model.

Building biologically relevant AI is not a one-off process. It demands a continuous "lab in the loop" system where wet lab experiments generate proprietary data to train models, whose outputs are then physically tested in the lab. This iterative feedback cycle constantly refines the model's predictive accuracy.

Current LLMs fail at science because they lack the ability to iterate. True scientific inquiry is a loop: form a hypothesis, conduct an experiment, analyze the result (even when it contradicts the hypothesis), and refine. AI needs this same iterative capability with the real world to make genuine discoveries.

While petabytes of observational DNA sequence data exist, they are insufficient for the next wave of AI. The key to creating powerful, functional models is causal data, generated by experiments that systematically test function. Producing that data is the current bottleneck.

The founder of AI and robotics firm Medra argues that scientific progress is not limited by a lack of ideas or AI-generated hypotheses. Instead, the critical constraint is the physical capacity to test these ideas and generate high-quality data to train better AI models.