We scan new podcasts and send you the top 5 insights daily.
To mitigate data variations caused by running experiments on different days (batch effects), Noetik employs a sophisticated arraying strategy. They take dozens of samples from a single tumor and distribute them across multiple, randomized arrays, ensuring each patient is represented in different batches for robust calibration and model training.
By using foundation models to analyze vast datasets, companies can create a synthetic 'standard of care' arm for single-arm Phase 1 trials. The AI matches patients based on deep clinical and genomic parameters, providing insights comparable to a much larger Phase 3 trial.
Noetik's core thesis is that the 95% failure rate in cancer trials isn't due to bad drug design, but an inability to identify the correct patient sub-population. Their models aim to solve this patient selection problem from the outset, rescuing potentially effective drugs.
To avoid overfitting and prove true generalization, Bolts validates its protein design models by testing them across a wide array of targets from over 25 external academic and industry labs. This diverse, real-world testing is the ultimate benchmark of a model's utility in drug discovery.
Drawing an analogy from neuroscience, Noetik argues for a top-down modeling approach. Instead of building a perfect simulation of a single cell and scaling up, they model the functional interactions at the tissue level first. This abstraction is more likely to predict patient-level outcomes, which is the ultimate goal.
A major cause of clinical trial failure is that preclinical testing uses immortalized cancer cell lines cultured for decades. These cells have abnormal genomes and gene expressions that don't represent actual tumors, creating a massive translational gap that Noetik's patient-derived data aims to solve.
To bridge the gap between animal models and human trials, Noetik trains models on its human data and then runs inference on mouse histology (H&E) images. This allows them to predict human-relevant biology and gene expression directly from the mouse model, overcoming a key translational hurdle in drug development.
Demonstrating extreme conviction, Noetik invested a year and a half in lab setup, tumor sourcing, and data processing before having a dataset large enough to train its first models. This highlights the immense upfront investment and risk required for a data-first approach in bio-AI, where no off-the-shelf data exists.
While Noetik's models are trained on complex, multimodal data like spatial transcriptomics, they are designed to run inference using only standard, ubiquitous H&E pathology slides. This creates a highly scalable and practical path to a clinical diagnostic without requiring expensive, novel assays for every patient.
Unlike cell-line derived (CDX) models, PDX models are grown directly from patient samples without a culture phase. This preserves the original tumor's heterogeneity, leading to more clinically relevant and predictive data in preclinical radiopharmaceutical studies.
Unlike general AI which leverages vast, existing datasets, Noetik believes progress in biology requires designing and generating specific, high-quality data with foresight into the models that will be trained. They compare this to the intentional, decades-long creation of the PDB dataset for protein folding.