Get your free personalized podcast brief

We scan new podcasts and send you the top 5 insights daily.

Biohub applies mechanistic interpretability to its protein language models. By analyzing the model's internal representations—learned from both known and unknown biology—researchers can uncover emergent biological principles. This turns the model from a black box predictor into an engine for scientific discovery itself.

Related Insights

AI is moving beyond simply identifying patterns in existing research papers. It is now able to extrapolate fundamental biological principles, enabling it to understand complex systems from the ground up, like the relationship between atoms, molecules, and proteins.

Just as biology deciphers the complex systems created by evolution, mechanistic interpretability seeks to understand the "how" inside neural networks. Instead of treating models as black boxes, it examines their internal parameters and activations to reverse-engineer how they work, moving beyond just measuring their external behavior.

By analyzing a model predicting Alzheimer's, Goodfire discovered it relied on the length of cell-free DNA fragments—a previously overlooked signal. This demonstrates how interpretability can extract new, testable scientific hypotheses from high-performing "black box" models.

The success of protein language models can be explained by Zellig Harris's 1954 linguistic theory. Just as a word's meaning is defined by its contexts, an amino acid's biological role is determined by the sequences it can appear in. The model learns this deep statistical structure, effectively learning biology.

We can now prove that LLMs are not just correlating tokens but are developing sophisticated internal world models. Techniques like sparse autoencoders untangle the network's dense activations, revealing distinct, manipulable concepts like "Golden Gate Bridge." This conclusively demonstrates a deeper, conceptual understanding within the models.

In partnership with institutions like Mayo Clinic, Goodfire applied interpretability tools to specialized foundation models. This process successfully identified new, previously unknown biomarkers for Alzheimer's, showcasing how understanding a model's internals can lead to tangible scientific breakthroughs.

Trained only on sequence prediction, ESM-C independently developed a hierarchical feature space mirroring decades of human scientific discovery. Its learned representations range from basic biochemical properties to complex, abstract functional concepts, all without prior biological knowledge.

Achieving explainability in AI for drug development isn't about post-hoc analysis. It requires building models from the ground up using inherently interpretable data like RNA sequencing and mutational profiles. When the inputs are explainable, the model's outputs become explainable by design.

Generate Biomedicines' AI learns the fundamental rules of protein structure and function, much like a language's grammar. This allows it to design entirely new proteins by generating novel "sentences" (sequences) that are biologically coherent and functional, rather than just mimicking existing ones found in nature.

EBMs analyze data to understand its underlying rules, storing this knowledge in inspectable 'latent variables' in the form of an energy landscape. This contrasts with LLMs, which are black boxes where the reasoning process is opaque. With EBMs, you can observe the model's internal state in real-time to see what it has learned.