While problems like protein folding are NP-hard in theory, the instances found in nature have structural properties that allow for efficient solutions. Real-world cases of NP-hard problems aren't the adversarial, worst-case scenarios used in complexity proofs, explaining the gap between theory and practice.
The success of neural networks on problems like Go and protein folding, long considered intractable because their general forms are at least NP-hard, is striking. It suggests our formal understanding of computational hardness, which focuses on worst-case scenarios, may be an incomplete model for how to find useful, approximate solutions in practice.
Difficult challenges often remain unsolved because they are consistently approached with the same tools and viewpoints. True progress requires introducing a novel perspective, a new tool, or temporarily shifting focus to a more tractable problem.
D. E. Shaw Research (DESRES) invested heavily in custom silicon for molecular dynamics (MD) simulation to tackle protein folding. In contrast, DeepMind's AlphaFold, which applies machine learning to experimentally determined structures, predicted protein structures accurately on commodity hardware. This suggests data-driven approaches can be vastly more effective than brute-force simulation for complex scientific problems.
Models like AlphaFold don't solve protein folding from physics alone. They rely heavily on co-evolutionary data: when two positions in a protein mutate together across species, that correlation is a strong hint that the corresponding amino acids are physically close. This dramatically constrains the search space for the final structure.
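The co-evolution signal can be illustrated with a toy calculation. The sketch below is not how AlphaFold works internally; it assumes a tiny hand-made alignment (`TOY_MSA`, hypothetical data) and scores every pair of columns by mutual information, a simple classical statistic for correlated mutations. The function names are illustrative only.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


def top_coupled_pairs(msa, k=1):
    """Return the k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical alignment: columns 1 and 3 always change together (K with E, R with D),
# column 0 is fully conserved, column 2 varies independently of the others.
TOY_MSA = ["AKLE", "AKLE", "ARLD", "ARLD", "AKVE", "ARVD"]

if __name__ == "__main__":
    print(top_coupled_pairs(TOY_MSA))
```

In this toy alignment the pair of columns that always mutate together ranks first, which is the kind of hint a structure predictor can use as evidence of spatial contact. Real methods use far larger alignments and stronger statistics or learned representations.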
With directed evolution, scientists find a mutated enzyme that works without knowing why. Even with the "answer"—the exact genetic changes—the complexity of protein interactions makes it incredibly difficult to reverse-engineer the underlying mechanism. The solution often precedes the understanding.
Humanity's intellectual pursuits, from science to engineering, inherently focus on problems where a candidate solution can be verified once discovered. We wouldn't begin searching for something we couldn't recognize when found. That property, a solution that can be checked efficiently even when it is hard to find, is what defines the complexity class NP.
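The "easy to recognize once found" property can be made concrete with Boolean satisfiability. The following is a minimal sketch, assuming a formula encoded as a list of clauses with signed integers as literals (a common convention, chosen here for illustration). Checking a proposed assignment takes a single pass over the formula, even though finding one may require a search.

```python
def verify_assignment(clauses, assignment):
    """Return True if every clause contains at least one satisfied literal.

    clauses:    list of clauses, each a list of non-zero ints
                (3 means x3, -3 means NOT x3).
    assignment: dict mapping variable number to True/False.
    Runs in time linear in the total number of literals.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
FORMULA = [[1, -2], [2, 3], [-1, -3]]

good = {1: True, 2: True, 3: False}  # a candidate that satisfies the formula
bad = {1: True, 2: False, 3: True}   # violates the last clause

if __name__ == "__main__":
    print(verify_assignment(FORMULA, good), verify_assignment(FORMULA, bad))
```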
An anecdote about a "wonky" BindCraft design with disconnected beta sheets, which experts predicted would fail, highlights a key trend. The resulting binder was one of the best ever produced, suggesting AI models are extracting structural principles that go beyond traditional human "protein literacy" and intuition.
While any NP-complete problem can be reduced to another, SAT solvers are the practical choice because of the immense effort poured into developing heuristics that efficiently handle the structured instances arising in real-world applications. Their advantage lies in engineering, not pure theory.
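As an illustration of reducing one NP-complete problem to SAT, the sketch below encodes graph 3-coloring as clauses and solves them with a bare-bones DPLL search. This is a teaching toy under stated assumptions: the helper names are hypothetical, and production solvers add conflict-driven clause learning, restarts, and carefully tuned branching heuristics, which is where the engineering advantage described above comes from.

```python
from itertools import combinations


def coloring_to_cnf(num_vertices, edges, num_colors=3):
    """Encode graph coloring as CNF; variable var(v, c) means 'vertex v has color c'."""
    def var(v, c):
        return v * num_colors + c + 1

    clauses = []
    for v in range(num_vertices):
        clauses.append([var(v, c) for c in range(num_colors)])  # at least one color
        for c1, c2 in combinations(range(num_colors), 2):
            clauses.append([-var(v, c1), -var(v, c2)])           # at most one color
    for u, v in edges:
        for c in range(num_colors):
            clauses.append([-var(u, c), -var(v, c)])             # neighbors differ
    return clauses, var


def simplify(clauses, lit):
    """Assume lit is true; return the reduced clauses, or None on a conflict."""
    reduced = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        rest = [l for l in clause if l != -lit]
        if not rest:
            return None  # empty clause: contradiction
        reduced.append(rest)
    return reduced


def dpll(clauses, assignment=()):
    """Return a set of true literals satisfying the clauses, or None if unsatisfiable."""
    if clauses is None:
        return None
    if not clauses:
        return set(assignment)
    # Prefer a forced (unit) literal; otherwise branch on the first literal seen.
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:
        return dpll(simplify(clauses, unit), assignment + (unit,))
    lit = clauses[0][0]
    result = dpll(simplify(clauses, lit), assignment + (lit,))
    if result is not None:
        return result
    return dpll(simplify(clauses, -lit), assignment + (-lit,))


def solve_coloring(num_vertices, edges, num_colors=3):
    """Return a list of colors per vertex, or None if no valid coloring exists."""
    clauses, var = coloring_to_cnf(num_vertices, edges, num_colors)
    model = dpll(clauses)
    if model is None:
        return None
    return [
        next(c for c in range(num_colors) if var(v, c) in model)
        for v in range(num_vertices)
    ]


# A 4-cycle with one diagonal is 3-colorable; the complete graph on 4 vertices is not.
SQUARE_WITH_DIAGONAL = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

if __name__ == "__main__":
    print(solve_coloring(4, SQUARE_WITH_DIAGONAL))
    print(solve_coloring(4, K4))
```

The encoding step is mechanical and problem-specific; the search step is generic, so every improvement to the solver benefits every problem that has been translated into clauses.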
John Jumper uses an analogy to explain the leap in complexity from prediction to design. Predicting a protein's structure is like recognizing a bicycle's parts. Designing a new, functional protein is like building a working bicycle—requiring every detail to be correct.
AlphaFold 2 was a breakthrough for predicting single protein structures. However, this success highlighted the much larger, unsolved challenges of modeling protein interactions, their dynamic movements, and the actual folding process, which are critical for understanding disease and drug discovery.