Progress in using AI to predict cancer treatment response is stalled not by algorithms but by the data used to train them. Relying solely on static genetic data is insufficient; the critical missing piece is functional, contextual data showing how a patient's cells actually respond to drugs.
Simple cell viability screens fail to identify powerful drug combinations where each component is ineffective on its own. AI can predict these synergies, but only if trained on mechanistic data that reveals how cells rewire their internal pathways in response to a drug.
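One standard way to quantify this kind of synergy is the Bliss independence model: compare the observed effect of the combination against the effect expected if the two drugs acted independently. Below is a minimal Python sketch, assuming viability is measured as a fraction between 0 and 1; the numbers in the example are invented for illustration.

```python
import numpy as np

def bliss_excess(viability_a, viability_b, viability_combo):
    """Bliss-independence synergy score from viability fractions (0-1).

    Expected combined effect under independence:
        E_ab = E_a + E_b - E_a * E_b,  where E = 1 - viability.
    A positive excess (observed effect > expected) suggests synergy.
    """
    e_a = 1.0 - np.asarray(viability_a)
    e_b = 1.0 - np.asarray(viability_b)
    e_expected = e_a + e_b - e_a * e_b
    e_observed = 1.0 - np.asarray(viability_combo)
    return e_observed - e_expected

# Each drug alone barely reduces viability (effect ~0.1), yet the
# combination kills 60% of cells: a large Bliss excess (~0.41).
print(bliss_excess(0.9, 0.9, 0.4))
```

A screen scoring each drug alone would rank both components as duds; it is the excess over the Bliss expectation that flags the pair.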
In high-stakes fields like pharma, AI's ability to generate more ideas (e.g., drug targets) is less valuable than its ability to aid in decision-making. Physical constraints on experimentation mean you can't test everything. The real need is for tools that help humans evaluate, prioritize, and gain conviction on a few key bets.
The primary barrier to AI in drug discovery is the lack of large, high-quality training datasets. The emergence of federated learning platforms, which keep raw data local while training models collectively, is a critical and underappreciated development for advancing the field.
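To make the federated approach concrete, here is a minimal NumPy sketch of federated averaging (FedAvg), the canonical aggregation scheme: each client trains on its private data, and only model weights, never raw records, leave the client. The linear-regression clients and hyperparameters are illustrative assumptions, not any particular platform's API.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local step: gradient descent on linear regression.
    The raw data (X, y) never leaves the client."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg(global_w, clients):
    """FedAvg: average the clients' locally updated weights,
    weighted by local dataset size. Only weights are exchanged."""
    total = sum(len(y) for _, y in clients)
    return sum(local_update(global_w, X, y) * (len(y) / total)
               for X, y in clients)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Three "hospitals", each holding private data from the same process.
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w = np.zeros(2)
for _ in range(20):  # communication rounds
    w = fedavg(w, clients)
print(w)  # approaches [2.0, -1.0] without ever pooling raw data
```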
An individual tumor can have hundreds of unique mutations, making it impossible to predict treatment response from a single genetic marker. This molecular chaos necessitates functional tests that measure a drug's actual effect on the patient's cells to determine the best therapy.
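In practice, such a functional test often reduces to fitting a dose-response curve to ex vivo viability measurements and ranking drugs by potency on the patient's own cells. Here is a minimal sketch using a two-parameter Hill model; the doses and viabilities below are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(dose, ic50, slope):
    """Two-parameter Hill model: fraction of viable cells vs. dose."""
    return 1.0 / (1.0 + (dose / ic50) ** slope)

# Hypothetical ex vivo screen: viability of a patient's tumor cells
# at increasing drug concentrations (micromolar).
doses = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
viability = np.array([0.98, 0.90, 0.55, 0.15, 0.04])

(ic50, slope), _ = curve_fit(hill, doses, viability,
                             p0=[1.0, 1.0], bounds=(0, np.inf))
print(f"IC50 ~ {ic50:.2f} uM, Hill slope ~ {slope:.2f}")
```

Ranking candidate drugs by fitted IC50 on the patient's own cells is the functional readout that a static mutation list cannot provide.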
While AI can accelerate the ideation phase of drug discovery, the primary bottleneck remains the slow, expensive, and human-dependent clinical trial process. We are already "drowning in good ideas," so generating more with AI doesn't solve the fundamental constraint of testing them.
The effectiveness of an AI system isn't solely dependent on the model's sophistication. It's a collaboration between high-quality training data, the model itself, and the contextual understanding of how to apply both to solve a real-world problem. Neglecting data or context leads to poor outcomes.
Despite AI's power, roughly 90% of drug candidates that enter clinical trials fail. John Jumper argues the bottleneck isn't finding molecules that bind a target protein but our fundamental lack of understanding of disease causality, as in Alzheimer's; that is a biology problem, not a technology one.
Despite billions invested over the past 20 years in targeted and genome-based therapies, the real-world benefit to cancer patients has been minimal, helping only a small fraction of them. That gap underscores the urgent need for new paradigms like functional precision oncology.
The bottleneck for AI in drug development isn't the sophistication of the models but the absence of large-scale, high-quality biological data sets. Without comprehensive data on how drugs interact within complex human systems, even the best AI models cannot make accurate predictions.
A major frustration in genetics is encountering 'variants of uncertain significance' (VUS): genetic anomalies with no established effect. AI models promise to simulate the impact of these unique variants on cellular function, moving medicine from reactive diagnostics toward truly personalized, predictive health.
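Most variant-effect predictors share a simple "delta score" pattern: score the reference and the variant sequence with the same model and read the difference as the variant's predicted impact. The sketch below shows only that pattern; `toy_score` is a deliberately crude placeholder rather than a real predictor, and the sequence is invented.

```python
from typing import Callable

def apply_variant(seq: str, pos: int, alt: str) -> str:
    """Return the sequence with a single-residue substitution (0-based pos)."""
    return seq[:pos] + alt + seq[pos + 1:]

def toy_score(seq: str) -> float:
    """Placeholder 'functional fitness' score, NOT a real predictor:
    fraction of hydrophobic residues, just to make the sketch run."""
    hydrophobic = set("AILMFWVY")
    return sum(r in hydrophobic for r in seq) / len(seq)

def delta_score(ref: str, pos: int, alt: str,
                score: Callable[[str], float] = toy_score) -> float:
    """Predicted impact of a VUS: score(variant) - score(reference).
    A large |delta| suggests the variant matters; ~0 suggests tolerance."""
    return score(apply_variant(ref, pos, alt)) - score(ref)

protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # invented sequence
print(delta_score(protein, 3, "P"))  # A -> P substitution at position 3
```

Real systems replace `toy_score` with a large trained sequence model, but the reference-versus-variant comparison is the same.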