AI research involves exploring a dependency graph of ideas in which any step may fail, which makes the process stochastic. This contrasts with the more deterministic path of software engineering. Success requires "research taste", an intuition for navigating this uncertainty that is often honed in PhD programs.
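One way to picture the stochastic dependency graph is a small simulation. The sketch below is illustrative only: the idea names, the dependency structure, and the success probabilities are all invented assumptions, not figures from the episode.

```python
import random

# Hypothetical research plan (all names and probabilities are assumptions):
# each idea maps to (ideas it depends on, chance it works once they do).
IDEAS = {
    "baseline": ([], 0.9),
    "new_loss": (["baseline"], 0.5),
    "data_filter": (["baseline"], 0.6),
    "final_model": (["new_loss", "data_filter"], 0.7),
}


def attempt(target, rng, cache=None):
    """Return True if `target` and all of its dependencies succeed in one trial."""
    if cache is None:
        cache = {}
    if target in cache:
        return cache[target]
    deps, p_success = IDEAS[target]
    # An idea can only work if everything it builds on worked first.
    ok = all(attempt(dep, rng, cache) for dep in deps) and rng.random() < p_success
    cache[target] = ok
    return ok


def estimate_success(target, trials=10_000, seed=0):
    """Monte Carlo estimate of the chance that `target` is reached."""
    rng = random.Random(seed)
    wins = sum(attempt(target, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(estimate_success("final_model"))
```

Under these assumed numbers the analytic chance of reaching `final_model` is 0.9 × 0.5 × 0.6 × 0.7 = 0.189, even though every individual step looks reasonably likely. A deterministic engineering plan corresponds to setting every probability to 1.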

Related Insights

A major bottleneck in AI progress is the gap between research and production. Researchers produce powerful models but often lack software engineering discipline. This results in code that is not portable, extensible, or robust, hindering the transition from a novel idea to a scalable, reliable product.

Unlike traditional deterministic products, AI models are probabilistic; the same query can yield different results. This uncertainty requires designers, PMs, and engineers to align on flexible expectations rather than fixed workflows, fundamentally changing the nature of collaboration.

Unlike traditional engineering, breakthroughs in foundational AI research often feel binary. A model can be completely broken until a handful of key insights are discovered, at which point it suddenly works. This "all or nothing" dynamic makes it impossible to predict timelines, as you don't know if a solution is a week or two years away.

Traditional software relies on predictable, deterministic functions. AI agents introduce a new paradigm of "stochastic subroutines," in which developers give up direct control over correctness and logic. Developers must therefore design systems that achieve reliable outcomes despite the non-deterministic paths the AI might take to get there.
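A common pattern for getting reliable outcomes from a stochastic subroutine is to wrap it in a deterministic check and retry on failure. The sketch below simulates this; `stochastic_subroutine` is a hypothetical stand-in for an agent call, and its 60% success rate is an arbitrary assumption.

```python
import random


def stochastic_subroutine(rng):
    """Hypothetical stand-in for an AI agent call: usable output only sometimes."""
    return "4" if rng.random() < 0.6 else "about four, probably"


def is_valid(output):
    """Deterministic check that the caller fully controls."""
    return output.isdigit()


def reliable_call(rng, max_attempts=5):
    """Retry the stochastic step until its output passes validation."""
    for attempt_number in range(1, max_attempts + 1):
        output = stochastic_subroutine(rng)
        if is_valid(output):
            return int(output), attempt_number
    raise RuntimeError("no valid output after retries")


if __name__ == "__main__":
    value, attempts = reliable_call(random.Random(0))
    print(value, attempts)
```

The path to the answer varies from run to run, but the caller only ever sees a validated integer or an explicit error. Reliability comes from the wrapper, not from the subroutine itself.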

We don't fully understand how advanced AI models work. Creators don't program them with explicit knowledge but train them on vast datasets and then run experiments to discover their capabilities. This makes AI development more of a science—studying an unpredictable artifact—than traditional engineering, highlighting an inherent lack of control.

Current LLM agents are effective at executing and optimizing experiments within a defined research track, like hyperparameter tuning. However, they lack the crucial scientific skill of 'lateral thinking'—recognizing when a research path is a dead end and strategically pivoting to a fundamentally new approach.

In a new technological wave like AI, a high project failure rate is desirable. It indicates that a company is aggressively experimenting and pushing boundaries to discover what provides real value, rather than being too conservative.

Unlike traditional software development, which starts with unit tests for quality assurance, AI product development often begins with 'vibe testing.' Developers test a broad hypothesis to see if the model's output *feels* right, prioritizing creative exploration over rigid, predefined test cases at the outset.

A major frontier for AI in science is developing 'taste'—the human ability to discern not just if a research question is solvable, but if it is genuinely interesting and impactful. Models currently struggle to differentiate an exciting result from a boring one.

Unlike weak-link problems (e.g., food safety) where you fix the worst part, science is a strong-link problem where progress depends entirely on the best outcomes. The optimal strategy is therefore to increase variance by funding more weird, high-risk ideas.
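The strong-link argument can be checked with a small simulation. The sketch below assumes project outcomes are normally distributed with the same mean and differ only in spread. The portfolio size, trial count, and distribution are illustrative assumptions.

```python
import random


def portfolio_outcomes(sigma, n_projects=20, trials=2000, seed=0):
    """Average best and worst outcome across simulated project portfolios."""
    rng = random.Random(seed)
    best_total = 0.0
    worst_total = 0.0
    for _ in range(trials):
        results = [rng.gauss(0.0, sigma) for _ in range(n_projects)]
        best_total += max(results)   # what matters in a strong-link problem
        worst_total += min(results)  # what matters in a weak-link problem
    return best_total / trials, worst_total / trials


if __name__ == "__main__":
    for sigma in (1.0, 3.0):
        best, worst = portfolio_outcomes(sigma)
        print(f"sigma={sigma}: best={best:.2f}, worst={worst:.2f}")
```

With the mean held fixed, raising the variance improves the average best outcome and worsens the average worst outcome. That is why higher variance helps when only the best result counts and hurts when the weakest result sets the standard.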