BoltzGen's protein design model simplifies its task by predicting only the final 3D atomic structure. Because each amino acid has a distinct set of atoms, the model's placement of atoms implicitly determines the protein's sequence, merging two traditionally separate prediction tasks (structure and sequence) into one.
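The idea that atoms imply sequence can be illustrated with a toy lookup. This is only a sketch, not the model's actual decoding step: it assumes each residue arrives as a list of standard PDB heavy-atom names, which stand in for the geometric signal a real model would read from predicted coordinates. The function name `infer_sequence` and the table names are hypothetical.

```python
# Toy lookup: side-chain heavy-atom names (standard PDB naming) for the 20
# common amino acids. Backbone atoms (N, CA, C, O) are shared and omitted.
SIDE_CHAIN_ATOMS = {
    "G": frozenset(),
    "A": frozenset({"CB"}),
    "S": frozenset({"CB", "OG"}),
    "C": frozenset({"CB", "SG"}),
    "V": frozenset({"CB", "CG1", "CG2"}),
    "T": frozenset({"CB", "OG1", "CG2"}),
    "L": frozenset({"CB", "CG", "CD1", "CD2"}),
    "I": frozenset({"CB", "CG1", "CG2", "CD1"}),
    "M": frozenset({"CB", "CG", "SD", "CE"}),
    "P": frozenset({"CB", "CG", "CD"}),
    "F": frozenset({"CB", "CG", "CD1", "CD2", "CE1", "CE2", "CZ"}),
    "Y": frozenset({"CB", "CG", "CD1", "CD2", "CE1", "CE2", "CZ", "OH"}),
    "W": frozenset({"CB", "CG", "CD1", "CD2", "NE1", "CE2", "CE3",
                    "CZ2", "CZ3", "CH2"}),
    "H": frozenset({"CB", "CG", "ND1", "CD2", "CE1", "NE2"}),
    "D": frozenset({"CB", "CG", "OD1", "OD2"}),
    "N": frozenset({"CB", "CG", "OD1", "ND2"}),
    "E": frozenset({"CB", "CG", "CD", "OE1", "OE2"}),
    "Q": frozenset({"CB", "CG", "CD", "OE1", "NE2"}),
    "K": frozenset({"CB", "CG", "CD", "CE", "NZ"}),
    "R": frozenset({"CB", "CG", "CD", "NE", "CZ", "NH1", "NH2"}),
}

# Invert the table: each side-chain atom set maps back to one residue type.
ATOMS_TO_RESIDUE = {atoms: aa for aa, atoms in SIDE_CHAIN_ATOMS.items()}
assert len(ATOMS_TO_RESIDUE) == 20  # no two residues share an atom set

BACKBONE = {"N", "CA", "C", "O"}


def infer_sequence(residues):
    """Read a one-letter sequence off a list of per-residue atom-name lists."""
    sequence = []
    for atom_names in residues:
        side_chain = frozenset(atom_names) - BACKBONE
        # "X" marks an atom set that matches no standard residue.
        sequence.append(ATOMS_TO_RESIDUE.get(side_chain, "X"))
    return "".join(sequence)
```

For example, three residues with side chains `{}`, `{CB}`, and `{CB, OG}` decode to glycine, alanine, and serine. Element counts alone would not be enough (leucine and isoleucine have the same composition), which is why the sketch relies on named positions as a proxy for geometry.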

Related Insights

Instead of building from scratch, ProPhet leverages existing transformer models to create unique mathematical 'languages' for proteins and molecules. Their core innovation is an additional model that translates between them, creating a unified space to predict interactions at scale.

A key strategy for improving results from generative protein models is "inference-time scaling." This involves generating a vast number of potential structures and then using a separate, fine-tuned scoring model to rank them. This search-and-rank process uncovers high-quality solutions the model might otherwise miss.
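The search-and-rank loop described above can be sketched as a simple best-of-N procedure. Everything here is a placeholder: `generate_candidate` and `score_candidate` are hypothetical stand-ins for a generative model and a fine-tuned scoring model, and a "structure" is just a short list of numbers.

```python
import random


def generate_candidate(rng):
    """Hypothetical stand-in for drawing one sample from a generative model."""
    return [rng.gauss(0.0, 1.0) for _ in range(8)]


def score_candidate(candidate):
    """Hypothetical stand-in for a learned scorer (higher is better)."""
    return -sum(x * x for x in candidate)


def sample_and_rank(num_samples, top_k, seed=0):
    """Generate many candidates, rank them by score, and keep the best few."""
    rng = random.Random(seed)
    candidates = [generate_candidate(rng) for _ in range(num_samples)]
    ranked = sorted(candidates, key=score_candidate, reverse=True)
    return ranked[:top_k]
```

Under these assumptions, spending more compute at inference time (a larger `num_samples`) can only keep or improve the best score found, because the larger pool contains the smaller one when the seed is fixed.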

Modern protein models use a generative approach (diffusion) rather than regression. Instead of predicting one "correct" structure, they model a distribution of possible structures. This better captures molecular dynamics and avoids averaging between multiple valid states, a known flaw of regression models that can produce a structure matching none of them.
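A one-dimensional toy case shows the averaging problem. It assumes a single coordinate with two equally valid states, -1.0 and +1.0. The "generative model" here is just resampling from the observed data, not a diffusion model; it stands in for any method that samples from the distribution rather than summarizing it with one value.

```python
import random
import statistics

# Toy target: one coordinate that is observed in two valid states.
rng = random.Random(0)
observed_states = [rng.choice([-1.0, 1.0]) for _ in range(1000)]

# The single value that minimizes mean squared error is the mean of the
# data, which lands between the two states and matches neither of them.
regression_prediction = statistics.fmean(observed_states)


def sample_state(rng):
    """Stand-in generative model: draw one state from the observed data."""
    return rng.choice(observed_states)


# Every generated sample is one of the valid states.
generative_samples = [sample_state(rng) for _ in range(5)]
```

The regression output sits near 0.0, a state that never occurs in the data, while each sampled value is either -1.0 or +1.0.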

Models like AlphaFold don't solve protein folding from physics alone. They heavily rely on co-evolutionary data, where correlated mutations across species provide strong hints about which amino acids are physically close. This dramatically constrains the search space for the final structure.
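The co-evolution signal can be illustrated with mutual information between columns of a multiple sequence alignment. This is a deliberately simple statistic chosen for illustration, not what AlphaFold computes internally, and the alignment below is made up for the example.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts.
        mi += p_ab * log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


def top_coevolving_pair(msa):
    """Return the pair of columns with the highest mutual information."""
    length = len(msa[0])
    return max(combinations(range(length), 2),
               key=lambda pair: mutual_information(msa, *pair))


# Made-up alignment: columns 1 and 3 swap K/E together, hinting at contact.
toy_msa = ["AKLEG", "AKLEG", "AELKG", "AELKG", "AKVEG", "AELKG"]
```

Calling `top_coevolving_pair(toy_msa)` picks out columns 1 and 3, the only positions whose mutations are perfectly correlated. Fully conserved columns carry no signal, which is one reason deep, diverse alignments matter.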

AlphaFold's identification of a key protein for human fertilization (out of 2,000 candidates) shows how AI can act as a hypothesis generator, dramatically narrowing the search space for expensive and time-consuming real-world experiments.

Contrary to trends in other AI fields, structural biology problems are not yet dominated by simple, scaled-up transformers. Specialized architectures that bake in physical priors, like equivariance, still yield vastly superior performance, as the domain's complexity requires strong inductive biases.
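Equivariance means that rotating the input and then applying a layer gives the same result as applying the layer and then rotating the output. The check below is a minimal sketch in 2D for brevity (real structural models work with 3D rotations and translations); the layer functions are hypothetical toys, not taken from any published architecture.

```python
import math


def rotate(points, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pull_toward_centroid(points, step=0.1):
    """Toy equivariant layer: move each point a fraction toward the centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x + step * (cx - x), y + step * (cy - y)) for x, y in points]


def shift_along_x(points, step=0.1):
    """Counter-example: depends on a fixed axis, so it is not equivariant."""
    return [(x + step, y) for x, y in points]


def is_equivariant(layer, points, angle, tol=1e-9):
    """Check that layer(rotate(x)) equals rotate(layer(x)) within tolerance."""
    rotate_first = layer(rotate(points, angle))
    layer_first = rotate(layer(points), angle)
    return all(
        math.isclose(a, b, abs_tol=tol)
        for p, q in zip(rotate_first, layer_first)
        for a, b in zip(p, q)
    )
```

A layer that passes this check does not need to relearn the same pattern in every orientation, which is the kind of physical prior the paragraph above refers to.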

Instead of screening billions of nature's existing proteins (a search problem), AI-powered de novo design creates entirely new proteins for specific functions from scratch. This moves the paradigm from hoping to find a match to intentionally engineering the desired molecule.

John Jumper uses an analogy to explain the leap in complexity from prediction to design. Predicting a protein's structure is like recognizing a bicycle's parts. Designing a new, functional protein is like building a working bicycle, where every detail must be correct.

AlphaFold 2 was a breakthrough for predicting single protein structures. However, this success highlighted the much larger, unsolved challenges of modeling protein interactions, their dynamic movements, and the actual folding process, which are critical for understanding disease and drug discovery.

Generative AI alone designs proteins that look correct on paper but often fail in the lab. DenovAI adds a physics layer that simulates molecular dynamics (the "jiggling and wiggling") to weed out false positives by modeling how proteins actually interact in the real world.