Noetik Found Autoregressive Models Excel on Biological Data Only at Longer Context Lengths

Related Insights

Boost Biology AI Accuracy By Massively Sampling and Then Ranking Results

A key strategy for improving results from generative protein models is "inference-time scaling." This involves generating a vast number of potential structures and then using a separate, fine-tuned scoring model to rank them. This search-and-rank process uncovers high-quality solutions the model might otherwise miss.

🔬Beyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery

Latent Space: The AI Engineer Podcast·5 months ago
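
The search-and-rank loop described in the insight above can be sketched in a few lines. This is a minimal illustration, not the Boltz pipeline: `generate` and `score` are hypothetical stand-ins for a generative model and a separately trained scoring model, and the toy versions below only draw numbers and measure distance to an arbitrary target.

```python
import random
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def sample_and_rank(
    generate: Callable[[random.Random], T],
    score: Callable[[T], float],
    n_samples: int,
    top_k: int = 1,
    seed: int = 0,
) -> List[Tuple[float, T]]:
    """Draw n_samples candidates, score each one, return the top_k by score."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n_samples)]
    scored = [(score(candidate), candidate) for candidate in candidates]
    # Higher score means better; sort on the score only.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical stand-ins: a "generator" that draws a number and a "scorer"
# that prefers values close to an arbitrary target of 0.5.
def toy_generate(rng: random.Random) -> float:
    return rng.uniform(-1.0, 1.0)


def toy_score(candidate: float) -> float:
    return -abs(candidate - 0.5)


best_of_1 = sample_and_rank(toy_generate, toy_score, n_samples=1)[0]
best_of_1000 = sample_and_rank(toy_generate, toy_score, n_samples=1000)[0]
print("best of 1:   ", best_of_1)
print("best of 1000:", best_of_1000)
```

Because both runs share a seed, the larger run contains the smaller run's single candidate, so its best score can only match or improve on it. In practice the benefit depends on how well the scorer's ranking tracks real quality.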

In-Context Learning May Be a Form of Internal Gradient Descent

Contrary to the view that in-context learning is a distinct process from training, Karpathy speculates it might be an emergent form of gradient descent happening within the model's layers. He cites papers showing that transformers can learn to perform linear regression in-context, with internal mechanics that mimic an optimization loop.

Andrej Karpathy — AGI is still a decade away

Dwarkesh Podcast·9 months ago
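
One simplified way to see the connection drawn in the insight above: for linear regression, a single gradient-descent step from zero weights gives the same prediction as an unnormalized (softmax-free) attention readout over the in-context examples. The sketch below checks that identity numerically. It assumes a squared-error loss and a hand-picked learning rate, and it says nothing about what a trained model actually computes.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(p * q for p, q in zip(a, b))


def gd_one_step_prediction(
    xs: List[Vector], ys: List[float], x_query: Vector, lr: float
) -> float:
    """Take one gradient step on 0.5 * sum((w.x - y)^2) from w = 0, then predict."""
    dim = len(x_query)
    w = [0.0] * dim
    grad = [0.0] * dim
    for x, y in zip(xs, ys):
        error = dot(w, x) - y
        for j in range(dim):
            grad[j] += error * x[j]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return dot(w, x_query)


def linear_attention_prediction(
    xs: List[Vector], ys: List[float], x_query: Vector, lr: float
) -> float:
    """Softmax-free attention: keys are x_i, values are y_i, the query is x_query."""
    return lr * sum(dot(x_query, x) * y for x, y in zip(xs, ys))


# In-context examples drawn from the toy rule y = 2*x0 - x1.
xs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
ys = [2.0 * x[0] - x[1] for x in xs]
x_query = [1.0, 1.0]

gd_prediction = gd_one_step_prediction(xs, ys, x_query, lr=0.1)
attention_prediction = linear_attention_prediction(xs, ys, x_query, lr=0.1)
print(gd_prediction, attention_prediction)
```

The two numbers agree up to floating-point rounding because the step from zero yields w = lr * sum(y_i * x_i), and taking its dot product with the query is exactly the attention sum.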

Transformer Models Natively Operate on Sets, Not Sequences

A common misconception is that Transformers are sequential models like RNNs. Fundamentally, they are permutation-equivariant and operate on sets of tokens. Sequence information is injected separately via positional embeddings, which leaves the architecture flexible enough for non-sequential data like 3D scenes or graphs.

After LLMs: Spatial Intelligence and World Models — Fei-Fei Li & Justin Johnson, World Labs

Latent Space: The AI Engineer Podcast·8 months ago
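
The permutation-equivariance claim in the insight above can be checked directly: shuffling the input tokens of a self-attention layer shuffles its outputs in the same way, and adding position-dependent vectors breaks that property. This is a minimal single-head sketch that assumes identity query, key and value projections; the `positions` table is a made-up example, not a real embedding scheme.

```python
import math
from typing import List

Vector = List[float]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head softmax self-attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        logits = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(logit - peak) for logit in logits]
        total = sum(weights)
        outputs.append(
            [
                sum(w * value[j] for w, value in zip(weights, tokens)) / total
                for j in range(dim)
            ]
        )
    return outputs


def permute(rows: List[Vector], order: List[int]) -> List[Vector]:
    return [rows[i] for i in order]


def add_positions(tokens: List[Vector], positions: List[Vector]) -> List[Vector]:
    """Add a slot-specific vector to each token (a toy positional embedding)."""
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, positions)]


def max_abs_diff(a: List[Vector], b: List[Vector]) -> float:
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
order = [2, 0, 1]
positions = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]]  # hypothetical values

# Without positions: attend(permute(x)) matches permute(attend(x)).
gap_without_positions = max_abs_diff(
    self_attention(permute(tokens, order)),
    permute(self_attention(tokens), order),
)

# With positions added per slot, the same comparison no longer matches.
gap_with_positions = max_abs_diff(
    self_attention(add_positions(permute(tokens, order), positions)),
    permute(self_attention(add_positions(tokens, positions)), order),
)
print(gap_without_positions, gap_with_positions)
```

Learned projections would not change the first result, since they are applied to each token independently of its slot.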

"Context Rot" Degrades AI Quality; Bigger Context Windows Aren't Better

Even models with million-token context windows suffer from "context rot" when overloaded with information. Performance degrades as the model struggles to find the signal in the noise. Effective context engineering requires precision, packing the window with only the exact data needed.

951: Context Engineering, Multiplayer AI and Effective Search, with Dropbox’s Josh Clemm

Super Data Science: ML & AI Podcast with Jon Krohn·7 months ago

Transformers Are Fundamentally Set Models, Not Sequence Models

The core transformer architecture is permutation-equivariant and operates on sets of tokens, not ordered sequences. Sequentiality is an add-on supplied by positional embeddings, which makes transformers naturally suited to non-sequential data structures like 3D worlds. Many practitioners overlook this.

What Comes After ChatGPT? The Mother of ImageNet Predicts The Future

a16z Podcast·8 months ago

Specialized Architectures Still Beat Transformers for Protein Structure Prediction

Contrary to trends in other AI fields, structural biology problems are not yet dominated by simple, scaled-up transformers. Specialized architectures that bake in physical priors, like equivariance, still yield vastly superior performance, as the domain's complexity requires strong inductive biases.

🔬Beyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery

Latent Space: The AI Engineer Podcast·5 months ago

The Richest Biological Data Comes from Studying Perturbed Systems Over Time

To truly understand biological systems, data scale is less important than data quality. The most informative data comes from capturing the dynamic interactions of a system *while* it's being perturbed (e.g., by a drug), not from static snapshots of a system at rest.

Alicia Zhou: The Dark Matter for Cancer Immunotherapy Translation

Behind the Breakthroughs·5 months ago

Large LLM Context Windows Don't Guarantee Recall; Models Often Fail "Needle in the Haystack" Tests

Simply having a large context window is insufficient. Models may fail to "see" or recall specific facts embedded deep within the context, a phenomenon exposed by "needle in the haystack" evaluations. Effective reasoning capability across the entire window is a separate, critical factor.

959: Building Agents 101: Design Patterns, Evals and Optimization (with Sinan Ozdemir)

Super Data Science: ML & AI Podcast with Jon Krohn·6 months ago
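
A "needle in the haystack" evaluation of the kind mentioned in the insight above can be sketched as follows: plant one fact at varying depths in filler text, ask about it, and record whether the answer contains it. Everything here is illustrative. `ask_model` is a hypothetical callable standing in for a real model, and `toy_model` only simulates a model that effectively "sees" the tail of its prompt.

```python
from typing import Callable, Dict, List


def build_haystack(filler_sentences: List[str], needle: str, depth: float) -> str:
    """Insert the needle at a fractional depth (0.0 = start, 1.0 = end)."""
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)


def run_needle_test(
    ask_model: Callable[[str], str],
    filler_sentences: List[str],
    needle: str,
    question: str,
    expected: str,
    depths: List[float],
) -> Dict[float, bool]:
    """Return, for each depth, whether the model's answer contains the expected fact."""
    results = {}
    for depth in depths:
        context = build_haystack(filler_sentences, needle, depth)
        answer = ask_model(context + "\n\n" + question)
        results[depth] = expected.lower() in answer.lower()
    return results


def toy_model(prompt: str, window_chars: int = 400) -> str:
    """Hypothetical stand-in: only the last window_chars characters are 'seen'."""
    visible = prompt[-window_chars:]
    return "The code is 7421." if "7421" in visible else "I don't know."


filler = ["The sky was grey that morning."] * 200
results = run_needle_test(
    ask_model=toy_model,
    filler_sentences=filler,
    needle="The vault code is 7421.",
    question="What is the vault code?",
    expected="7421",
    depths=[0.0, 0.5, 1.0],
)
print(results)
```

Swapping `toy_model` for a call to a real model, and sweeping context length as well as depth, gives the usual recall grid for this kind of test.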

Biology AI's Next Leap Requires Causal Data, Not Just More Sequences

Petabytes of observational DNA sequence data exist, but that data alone is insufficient for the next wave of AI. The key to creating powerful, functional models is causal data generated by experiments that systematically test function, and producing it is the current bottleneck.

Bioinfohazards: Jassi Pannu on Controlling Dangerous Data from which AI Models Learn

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·4 months ago

True Continual Learning Requires "Nested" Architectures with Varied Memory Update Speeds

The key to continual learning is not just a longer context window, but a new architecture with a spectrum of memory types. "Nested learning" proposes a model with different layers that update at different frequencies—from transient working memory to persistent core knowledge—mimicking how humans learn without catastrophic forgetting.

AI 2025 → 2026 Live Show | Part 1

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·7 months ago
