
Unlearn.ai found that scaling digital twins from CNS to oncology isn't a matter of adjusting model parameters: oncology's radically different data structures, such as its hierarchy of rare diseases and complex treatment histories, demand entirely new modeling approaches that the more siloed data of CNS trials never required.

Related Insights

A significant part of Unlearn.ai's value is not just its advanced generative models, but its painstaking data harmonization work. The company builds internal machine learning tools to unify complex, disparate data sources like clinical trials and real-world data, which is the essential foundation for creating powerful models.

Unlike the data in image recognition or NLP, clinical trial data has a unique and complex mathematical geometry. According to Dr. Juraji, this means generic AI models are insufficient: reducing trial failures requires specialized AI built to navigate this specific, difficult data landscape.

Unlearn.ai's method for late-phase trials (PROCOVA) is acceptable to regulators because it is designed to statistically correct for any bias in the digital twin model. This ensures that inaccuracy in the model does not affect the trial's final decision procedure or error rate, a critical feature distinguishing it from simply replacing the control arm.

Instead of taking the high-risk approach of replacing a trial's control arm with digital twins, Unlearn.ai adds a counterfactual prediction for every participant. This increases the trial's statistical power, allowing a smaller control arm or a higher chance of success while still satisfying the regulatory constraints on pivotal trials.
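The power gain described above can be sketched in a few lines of NumPy. This is an illustrative simulation on synthetic data, not Unlearn.ai's actual PROCOVA implementation: it shows only the generic idea of adding a twin-style prognostic prediction as a regression covariate, which shrinks residual variance without biasing the randomized treatment comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Synthetic trial: each participant's "digital twin" prediction m
# explains much of the outcome variance (made-up data generator).
m = rng.normal(0.0, 1.0, n)              # twin-predicted control outcome
t = rng.integers(0, 2, n).astype(float)  # 1:1 randomized treatment flag
true_effect = 0.5
y = true_effect * t + m + rng.normal(0.0, 0.5, n)  # observed outcome

# Unadjusted analysis: regress y on [intercept, treatment].
X0 = np.column_stack([np.ones(n), t])
b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
resid_unadj = float(np.var(y - X0 @ b0))

# Covariate-adjusted analysis (PROCOVA-style): add the twin
# prediction as a covariate; the treatment coefficient is the
# effect estimate, now fitted against much less residual noise.
X1 = np.column_stack([np.ones(n), t, m])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
adj_effect = float(b1[1])
resid_adj = float(np.var(y - X1 @ b1))

print(f"estimate={adj_effect:.2f}, "
      f"residual var {resid_unadj:.2f} -> {resid_adj:.2f}")
```

Because randomization makes the treatment flag independent of the twin prediction, the adjustment can only reduce noise, not shift the estimate systematically, which is the property that keeps the trial's error rate intact.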

Numenos AI found that unifying biological data across traditional boundaries, such as incorporating mouse data or cancer data when modeling dermatological diseases, surprisingly increases the predictive accuracy of their models. This challenges the siloed approach of traditional research.

Simply scaling models on internet data won't solve specialized problems like curing cancer or discovering new materials. While scaling laws hold for in-domain tasks, a model must be optimized against the specific data distribution it needs to learn from, which, in science, requires generating new experimental data.

While AI excels where large, clean datasets exist (like protein folding), it struggles with modeling slow, progressive diseases like Alzheimer's or obesity. These are organ-level phenomena, and the necessary data doesn't exist yet. In vivo platforms are critical for generating this required foundational data.

The progress of AI in predicting cancer treatment is stalled not by algorithms, but by the data used to train them. Relying solely on static genetic data is insufficient. The critical missing piece is functional, contextual data showing how patient cells actually respond to drugs.

It's impossible to generate human data at the scale of in silico experiments. The key is to create highly accurate simulations of human physiology (digital twins) and then validate their predictions with limited, strategic human data. If the model proves reliable, it could drastically accelerate R&D.
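That simulate-then-validate loop can be sketched as a simple hold-out check. Everything here is synthetic and hypothetical: `simulate_outcome` stands in for a real digital-twin model, and the reliability thresholds are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_outcome(features):
    # Hypothetical in-silico model of human physiology; a stand-in
    # for a real digital-twin simulator, not a product API.
    return 0.8 * features[:, 0] - 0.3 * features[:, 1]

# Limited, strategically collected human measurements (synthetic).
X_human = rng.normal(size=(50, 2))
y_human = (0.8 * X_human[:, 0] - 0.3 * X_human[:, 1]
           + rng.normal(0.0, 0.2, 50))

# Validation step: compare in-silico predictions to observed outcomes.
pred = simulate_outcome(X_human)
rmse = float(np.sqrt(np.mean((pred - y_human) ** 2)))
corr = float(np.corrcoef(pred, y_human)[0, 1])

# Only a simulator that survives this check would be trusted to run
# experiments at in-silico scale (thresholds are illustrative).
reliable = rmse < 0.5 and corr > 0.8
print(f"rmse={rmse:.2f}, corr={corr:.2f}, reliable={reliable}")
```

The point of the pattern is the asymmetry: human data is scarce, so it is spent on auditing the simulator rather than on the experiments themselves.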

Unlearn.ai strategically avoids diseases where a single biomarker determines progression. Instead, they focus on complex, systemic diseases in which many variables each have a small impact on the outcome. These are the areas where sophisticated, multi-variable modeling provides the most significant advantage over standard statistical adjustment.