959: Building Agents 101: Design Patterns, Evals and Optimization (with Sinan Ozdemir)

Super Data Science: ML & AI Podcast with Jon Krohn · Jan 20, 2026

Agentic AI 101: Sinan Ozdemir demystifies agents vs. workflows, LLM selection, evaluation metrics, and optimization trade-offs.

Quantized LLMs Are "Cousins," Not Clones, of the Original Model

Quantization and distillation don't simply create a smaller version of an LLM. These optimization processes alter the model's behavior enough that it becomes a new entity, a "cousin": still capable and functional, but it will not produce the same outputs as the original.
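
A minimal sketch of how you might check this yourself, using the Hugging Face transformers and bitsandbytes libraries: load the same checkpoint in full precision and in 4-bit, then compare greedy (non-sampled) outputs. The model name is a placeholder; any causal LM works, and a GPU with bitsandbytes installed is assumed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B"  # placeholder; substitute any causal LM

tok = AutoTokenizer.from_pretrained(model_id)
full = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
quant = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

prompt = "Explain quantization in one sentence."
inputs = tok(prompt, return_tensors="pt")

# Greedy decoding removes sampling randomness, so any divergence in the
# outputs comes from the quantization itself.
out_full = tok.decode(
    full.generate(**inputs.to(full.device), max_new_tokens=40, do_sample=False)[0]
)
out_quant = tok.decode(
    quant.generate(**inputs.to(quant.device), max_new_tokens=40, do_sample=False)[0]
)

print("identical outputs:", out_full == out_quant)  # frequently False
```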

Choose Agents Over Workflows When Existing Manual Processes Have Many Conditional Branches

To decide between a deterministic workflow and a flexible agent, analyze the current manual process. If the task involves numerous 'if-then' conditions and decision points, an agentic system is likely the more maintainable and effective solution.
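
One way to picture the heuristic: if the manual process already looks like the branch-heavy router below, every new condition means another hand-written rule, whereas an agent given the same tools can do the routing itself. The toy tools, keywords, and the call_llm parameter are illustrative stand-ins.

```python
# Toy tools standing in for real integrations.
TOOLS = {
    "issue_refund": lambda t: "refund issued",
    "escalate_to_finance": lambda t: "escalated to finance",
    "send_reset_link": lambda t: "reset link sent",
    "page_oncall": lambda t: "on-call paged",
    "ask_human": lambda t: "routed to a human",
}

def triage_workflow(ticket: str) -> str:
    # Deterministic workflow: every decision point is hardcoded by a developer.
    if "refund" in ticket:
        if "over 30 days" in ticket:
            return TOOLS["escalate_to_finance"](ticket)
        return TOOLS["issue_refund"](ticket)
    if "password" in ticket:
        return TOOLS["send_reset_link"](ticket)
    if "outage" in ticket or "down" in ticket:
        return TOOLS["page_oncall"](ticket)
    # ...dozens more branches, each one a future maintenance burden
    return TOOLS["ask_human"](ticket)

def triage_agent(ticket: str, call_llm) -> str:
    # Agentic alternative: hand the same tools to the LLM and let it route.
    choice = call_llm(
        f"Ticket: {ticket}\nPick exactly one tool from {list(TOOLS)} "
        "and reply with its name only."
    ).strip()
    return TOOLS.get(choice, TOOLS["ask_human"])(ticket)
```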

Increased LLM Reasoning Time Shows No Obvious Correlation With Better Task Performance

Benchmarking reasoning models revealed no clear correlation between the amount of reasoning effort and an LLM's task performance. Even when there is a slight accuracy gain (1-2%), it often comes with a significant increase in cost, making it an inefficient trade-off.
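
A rough sketch of the kind of benchmark being described: run the same small eval set at each reasoning-effort setting and compare accuracy against token spend. It assumes an OpenAI-style client whose o-series models accept a reasoning_effort parameter; the model name and the two eval items are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

eval_set = [  # placeholder eval items: (question, expected substring)
    ("What is 17 * 24?", "408"),
    ("What is the capital of Australia?", "Canberra"),
]

for effort in ["low", "medium", "high"]:
    correct, tokens = 0, 0
    for question, expected in eval_set:
        resp = client.chat.completions.create(
            model="o3-mini",               # placeholder reasoning model
            reasoning_effort=effort,
            messages=[{"role": "user", "content": question}],
        )
        answer = resp.choices[0].message.content or ""
        correct += int(expected.lower() in answer.lower())  # crude check for a sketch
        tokens += resp.usage.total_tokens                   # proxy for cost
    print(f"{effort:>6}: accuracy={correct / len(eval_set):.0%}, tokens={tokens}")
```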

Differentiate AI Systems by Agency: Workflows are Deterministic, Agents Choose Their Own Tools

An AI agent uses an LLM with tools, giving it agency to decide its next action. In contrast, a workflow is a predefined, deterministic path where the LLM's actions are forced. Most production AI systems are actually workflows, not true agents.
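
A minimal sketch of that distinction, assuming an OpenAI-compatible chat client (the model name, toy tools, and prompt format are illustrative): in the workflow every step is fixed in code, while the agent loop lets the LLM pick its next action.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; model name is a placeholder

def call_llm(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content or ""

# Toy "tools" standing in for real integrations.
def search_docs(query: str) -> str:
    return f"(search results for: {query})"

def run_sql(query: str) -> str:
    return f"(rows returned by: {query})"

# Workflow: every step is hardcoded; the LLM is forced down one path.
def workflow(question: str) -> str:
    docs = search_docs(question)
    summary = call_llm(f"Summarize for answering '{question}':\n{docs}")
    return call_llm(f"Answer '{question}' using only this summary:\n{summary}")

# Agent: the LLM chooses its next action itself, in a loop.
def agent(question: str, max_steps: int = 5) -> str:
    history = f"Question: {question}"
    for _ in range(max_steps):
        decision = call_llm(
            f"{history}\n\nReply with exactly one line:\n"
            "SEARCH <query>, SQL <query>, or FINAL <answer>."
        )
        if decision.startswith("FINAL"):
            return decision.removeprefix("FINAL").strip()
        tool, _, arg = decision.partition(" ")
        result = search_docs(arg) if tool == "SEARCH" else run_sql(arg)
        history += f"\nAction: {decision}\nObservation: {result}"
    return call_llm(f"{history}\n\nGive your best final answer now.")
```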

OpenAI's Deep Research Uses a Hybrid "Agentic Workflow" to Mitigate Risk Before Execution

Purely agentic systems can be unpredictable. A hybrid approach, like OpenAI's Deep Research forcing a clarifying question, inserts a deterministic workflow step (a "speed bump") before unleashing the agent. This mitigates risk, reduces errors, and ensures alignment before costly computation.
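
A sketch of that "speed bump" pattern, reusing the call_llm and agent helpers from the sketch above: one deterministic step forces a clarifying question and waits for the user before the open-ended agentic phase starts. The prompt wording is illustrative, not OpenAI's actual implementation.

```python
def deep_research_style(task: str) -> str:
    # Forced workflow step: the model must ask exactly one clarifying question.
    clarifier = call_llm(
        f"Before researching '{task}', ask the user exactly one clarifying "
        "question about scope, sources, or desired output format."
    )
    reply = input(f"{clarifier}\n> ")  # cheap alignment check before expensive work

    # Only now does the costly, less predictable agentic phase begin.
    return agent(
        f"Task: {task}\nClarification asked: {clarifier}\nUser reply: {reply}"
    )
```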

Large LLM Context Windows Don't Guarantee Recall; Models Often Fail "Needle in the Haystack" Tests

Simply having a large context window is insufficient. Models may fail to "see" or recall specific facts embedded deep within the context, a phenomenon exposed by "needle in the haystack" evaluations. Effective reasoning capability across the entire window is a separate, critical factor.
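
A minimal needle-in-the-haystack probe, again assuming the call_llm helper sketched earlier: plant one fact at different depths inside a long block of filler and check whether the model retrieves it. The filler sentence and the "needle" are synthetic.

```python
FILLER = "The committee reviewed quarterly logistics reports in detail. "
NEEDLE = "The secret launch code is MANGO-42."

def haystack_prompt(depth: float, total_chars: int = 60_000) -> str:
    """Bury the needle at a fractional depth inside `total_chars` of filler."""
    body = (FILLER * (total_chars // len(FILLER) + 1))[:total_chars]
    cut = int(len(body) * depth)
    return body[:cut] + NEEDLE + body[cut:] + "\n\nWhat is the secret launch code?"

for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    answer = call_llm(haystack_prompt(depth))
    print(f"needle at {depth:.0%} depth -> recalled: {'MANGO-42' in answer}")
```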

Use Autoencoding "Reader" LLMs like BERT for Non-Generative Tasks to Drastically Reduce Model Size

Autoencoding models (e.g., BERT) are "readers" that fill in blanks, while autoregressive models (e.g., GPT) are "writers." For non-generative tasks like classification, a tiny autoencoding model can match the performance of a massive autoregressive one, offering huge efficiency gains.
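
For instance, a sentiment-style classification task can be handled by a small autoencoding "reader" through the Hugging Face transformers pipeline; the checkpoint below is one publicly available fine-tuned DistilBERT model, shown as an illustration rather than a recommendation.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # ~66M parameters
)

print(classifier("The onboarding flow was confusing and slow."))
# e.g. [{'label': 'NEGATIVE', 'score': 0.99...}]; no generation needed,
# at a tiny fraction of the cost of prompting a multi-billion-parameter "writer".
```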

Select LLM Size by Task Tier: Small (<10B) for Retrieval, Medium (10-100B) for Agents, Large (100B+) for Enterprise

Use a tiered approach for model selection based on parameter count. Models under 10B parameters handle simple tasks like RAG. The 10-100B range is the sweet spot for agentic systems. Models over 100B parameters are for complex, multilingual, enterprise-wide deployments.
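
One way to encode that rule of thumb is a routing table plus a rough keyword-based selector; the tier boundaries mirror the heuristic above, while the example model names and keywords are placeholders.

```python
SIZE_TIERS = [
    # (max params in billions, example model, typical use)
    (10,   "small-model-8b",  "RAG answering, extraction, classification"),
    (100,  "mid-model-70b",   "agentic systems with tool use"),
    (1000, "frontier-model",  "complex, multilingual, enterprise-wide work"),
]

def pick_tier(task: str) -> tuple[int, str, str]:
    """Very rough router: keyword-match the task to a tier (illustrative only)."""
    task = task.lower()
    if any(k in task for k in ("agent", "tool", "multi-step")):
        return SIZE_TIERS[1]
    if any(k in task for k in ("enterprise", "multilingual", "multi-domain")):
        return SIZE_TIERS[2]
    return SIZE_TIERS[0]

print(pick_tier("Build an agent that files expense reports with tools"))
```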
