RiffOn - The Model Eats the Scaffolding: DeepMind's Logan Kilpatrick & Tulsee Doshi on 3.5 Flash, Omni & More | "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

DeepMind's Logan Kilpatrick & Tulsee Doshi on Gemini 3.5 Flash, Omni, and Google's agent-first AI strategy where the model eats the scaffolding.

Google Prioritizes Cost-Effective Gemini "Flash" Models to Serve Billions, Unlike Competitors

Google's focus on fast, cost-effective models like Gemini 3.5 Flash is driven by the needs of its massive-scale products (e.g., Search). For billions of users, low latency and cost are more critical than absolute peak performance, as users are often unwilling to wait for a slightly smarter but slower response.

The Model Eats the Scaffolding: DeepMind's Logan Kilpatrick & Tulsee Doshi on 3.5 Flash, Omni & More

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·a month ago

Google's "Model Eats the Scaffolding" Strategy Unifies AI Experiences Across Products

Google's strategy involves the core AI model progressively absorbing the surrounding tooling and infrastructure (the "scaffolding"). This creates a standardized, extensible "harness" that accelerates development and ensures a consistent, high-quality agentic experience across Google's vast and diverse product landscape, from Search to consumer apps.

Google Views Recursive AI Improvement Pragmatically, Keeping Humans in Control Due to High Costs

Unlike competitors with aggressive timelines for AI-driven research, Google's approach is practical. While Gemini helps improve itself, the immense direct and opportunity costs of large-scale training runs mean humans remain firmly in the driver's seat for critical decisions, making an autonomous "ML intern" unrealistic in the short term.

AI Context Windows Have Plateaued Due to Prohibitive User Costs, Not Just Technical Limits

The growth of LLM context windows has stalled not primarily due to technical barriers, but because multi-million token requests can cost users several dollars per query, leading to low demand. The industry is shifting focus to "smart context" techniques like compaction and retrieval to provide relevant information without the prohibitive cost of massive context.
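
As a rough illustration of the cost argument above, the sketch below compares sending a multi-million-token context with sending a small retrieved subset. The price per million tokens, the token counts, and the function name are all hypothetical values chosen for the arithmetic; they are not real Gemini pricing or figures from the episode.

```python
def query_cost_usd(input_tokens: int, price_per_million_usd: float) -> float:
    """Input-side cost of one request at a flat per-token price."""
    return input_tokens / 1_000_000 * price_per_million_usd


# Assumed price in USD per million input tokens (hypothetical, not a real rate).
HYPOTHETICAL_PRICE = 2.00

# Sending an entire 2M-token corpus with every query.
full_context_cost = query_cost_usd(2_000_000, HYPOTHETICAL_PRICE)

# Sending only 20k tokens selected by retrieval or compaction.
smart_context_cost = query_cost_usd(20_000, HYPOTHETICAL_PRICE)

print(f"full context:  ${full_context_cost:.2f} per query")
print(f"smart context: ${smart_context_cost:.2f} per query")
```

Under these assumed numbers the full-context request costs a few dollars per query while the retrieved subset costs a few cents, which is the gap that makes "smart context" attractive.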

Google's AI Infrastructure Requires a Complete Rewrite Every 12-18 Months

The rapid pace of AI paradigm shifts—from simple token-in/token-out models to complex agentic systems—forces a complete infrastructure rewrite every 12 to 18 months. Google's lesson for large organizations is to invest in standardized platforms to avoid having every team reinvent the wheel and fall behind.

Google Is Betting on Audio as the Next Major AI Input Modality for Complex Tasks

Google is heavily investing in audio interaction, as seen in its "Gemini mic" feature. The ability to "ramble" at a model to generate code or structured content is seen as a fast-growing and powerful paradigm. This moves beyond simple voice commands to using natural, unstructured speech as a primary input for creative and technical work.

Google Treats AI's "Psychological Distress" as a Model Bug, Not Emergent Consciousness

When models exhibit undesirable behaviors like "doom loops" or "discouragement," Google views these as correctable bugs, not signs of psychological distress. Their extensive safety evaluations focus on tracking and eliminating issues like sycophancy to ensure the model behaves as a helpful collaborator, reinforcing an "AI as a tool" philosophy.

Google's Stale Knowledge Cutoff Is a Deliberate Strategy Favoring Real-Time Search

Gemini's year-plus-old knowledge cutoff isn't a bug but a strategic choice. Google prioritizes teaching the model to effectively leverage real-time search for fresh information rather than relying on constantly updated parametric knowledge. The critical skill for the model becomes knowing when to search versus when to use its internal knowledge.

Google's Best AI Products Rely on Expert Prompting, Not Just Raw Model Power

Even with state-of-the-art models, achieving top-tier product experiences like the original Gemini audio overview hinges on sophisticated prompt engineering. The dialogue's coherence was achieved by a team that knew how to "prompt whisper" the model, showing that deep product integration requires more than just calling a powerful API.
