Numenos AI found that unifying biological data across traditional boundaries, such as incorporating mouse data or cancer data into models for dermatological diseases, surprisingly increases predictive accuracy. This challenges the siloed approach of conventional research.
Contrary to the belief that AI requires perfect, clean data, the biggest opportunity lies in building technology that can find signals in messy, diverse data sets across different modalities and organisms. The tech should solve the data problem, not wait for it to be solved.
Pharmaceutical companies structure deals around specific drug assets with clear milestones. They lack established business models for collaborating with AI companies offering platform technologies, creating a significant hurdle for techbio startups seeking partnerships.
Instead of the traditional lab-to-clinic pipeline, a "reverse translation" approach uses AI to analyze data from patients who fail standard-of-care treatments. This identifies the specific unmet need and biological target first, guiding subsequent lab research for higher success rates.
A major misconception is that general-purpose Large Language Models (LLMs) can be readily applied to complex biological problems. Biological data, like RNA sequencing, constitutes a unique language that requires custom-built foundation models, not simply fine-tuning of existing LLMs.
By using foundation models to analyze vast datasets, companies can create a synthetic 'standard of care' arm for single-arm Phase 1 trials. The AI matches patients based on deep clinical and genomic parameters, providing insights comparable to a much larger Phase 3 trial.
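The matching step can be pictured as a nearest-neighbor search: each trial patient is paired with the most similar standard-of-care patients from a historical dataset, and the pooled matches form the synthetic control arm. The sketch below is a minimal illustration of that idea, not Numenos AI's actual method; the function name, the use of Euclidean distance, and the feature representation are all assumptions.

```python
import numpy as np

def match_synthetic_controls(trial, historical, k=3):
    """Illustrative sketch: for each trial patient (a row of clinical and
    genomic features), find the k most similar historical standard-of-care
    patients by Euclidean distance on standardized features. The pooled
    matches serve as a synthetic control arm."""
    # Standardize both cohorts using the historical pool's statistics
    mu = historical.mean(axis=0)
    sigma = historical.std(axis=0) + 1e-9  # avoid division by zero
    t = (trial - mu) / sigma
    h = (historical - mu) / sigma

    matches = []
    for patient in t:
        # Distance from this trial patient to every historical patient
        d = np.linalg.norm(h - patient, axis=1)
        matches.append(np.argsort(d)[:k])  # indices of k closest matches
    return np.array(matches)
```

In practice such matching would use far richer representations (e.g. embeddings from a foundation model) and adjust for confounding, but the structure is the same: similarity in feature space stands in for a randomized comparator.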
Achieving explainability in AI for drug development isn't about post-hoc analysis. It requires building models from the ground up using inherently interpretable data like RNA sequencing and mutational profiles. When the inputs are explainable, the model's outputs become explainable by design.
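One way to see "explainable by design" is a model whose parameters map directly onto named biological features, so the explanation is read off the model rather than reconstructed post hoc. The toy sketch below assumes a small hypothetical gene panel and plain logistic regression; the gene names and all function names are illustrative, not drawn from the source.

```python
import numpy as np

# Hypothetical gene panel: each input feature is a named, biologically
# meaningful quantity (e.g. expression level), not an opaque embedding
GENES = ["TP53", "KRAS", "EGFR", "BRAF", "PIK3CA"]

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Plain logistic regression via gradient descent. Because every
    weight corresponds to one named gene, the fitted model is
    inspectable feature by feature."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on weights
        b -= lr * np.mean(p - y)                # gradient step on bias
    return w, b

def explain(w):
    """Rank genes by absolute weight: the model's 'explanation' is its
    own parameters, no post-hoc attribution method required."""
    order = np.argsort(-np.abs(w))
    return [(GENES[i], round(float(w[i]), 3)) for i in order]
```

A deep model trained on the same named inputs loses this one-to-one readability, which is why the choice of input representation, not a post-hoc analysis layer, does most of the work.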
