Purely probabilistic LLMs are unreliable for critical business processes. GetVocal's architecture uses a deterministic "context graph" based on user intentions as the core decision-making engine. This provides traceability and reliability, while selectively calling generative models for conversational nuance.
To avoid AI hallucinations, Square's AI tools translate merchant queries into deterministic actions. For example, a query about sales on rainy days prompts the AI to write and execute real SQL code against a data warehouse, ensuring grounded, accurate results.
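The query-to-SQL pattern described above can be sketched in a few lines. This is a minimal illustration, not Square's implementation: the table names, the sample rows, and the `stub_llm_to_sql` function (a stand-in for a real model call) are all assumptions made for the example.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for a data warehouse (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, amount REAL)")
    conn.execute("CREATE TABLE weather (day TEXT, rainy INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2024-01-01", 100.0), ("2024-01-02", 250.0), ("2024-01-03", 50.0)],
    )
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?)",
        [("2024-01-01", 1), ("2024-01-02", 0), ("2024-01-03", 1)],
    )
    return conn


def stub_llm_to_sql(question: str) -> str:
    """Hypothetical stand-in for a model call that turns a question into SQL."""
    return (
        "SELECT SUM(s.amount) FROM sales s "
        "JOIN weather w ON s.day = w.day WHERE w.rainy = 1"
    )


def answer(conn: sqlite3.Connection, question: str) -> float:
    """Generate SQL for the question, check it is read-only, then execute it."""
    sql = stub_llm_to_sql(question)
    # Deterministic guard: refuse anything that is not a plain SELECT.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only read-only SELECT statements are allowed")
    return conn.execute(sql).fetchone()[0]
```

The number returned comes from the database, not from the model, so the model can only be wrong about which query to run, never about the arithmetic.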
Fully autonomous agents are not yet reliable for complex production use cases because accuracy collapses when chaining multiple probabilistic steps. Zapier's CEO recommends a hybrid "agentic workflow" approach: embed a single, decisive agent within an otherwise deterministic, structured workflow to ensure reliability while still leveraging LLM intelligence.
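The collapse in accuracy is simple arithmetic. As a rough sketch, assume each probabilistic step succeeds independently with the same probability; the 95% figure used below is an illustrative assumption, not a number from the source.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Compare a workflow with one probabilistic step against longer agent chains.
    for n in (1, 5, 10, 20):
        print(n, round(chain_success(0.95, n), 3))
```

Under these assumptions, ten chained steps all succeed only about 60% of the time, while a workflow with a single probabilistic step keeps the full 95%.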
The effectiveness of enterprise AI agents is limited not by data access, but by the absence of context for *why* decisions were made. 'Context graphs' aim to solve this by capturing 'decision traces': the exceptions, precedents, and overrides that currently live in Slack threads and employees' heads. The goal is a true source of truth for automation.
An AI agent uses an LLM with tools, giving it agency to decide its next action. In contrast, a workflow is a predefined, deterministic path where the LLM's actions are forced. Most production AI systems are actually workflows, not true agents.
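The distinction can be made concrete with a small sketch. The tool names and the `stub_policy` function (standing in for an LLM that picks the next action) are hypothetical; the point is only who controls the sequence of steps.

```python
from typing import Callable, Optional

# Hypothetical tools: each takes the current state and returns a new state.
TOOLS: dict[str, Callable[[list[str]], list[str]]] = {
    "lookup": lambda state: state + ["looked_up"],
    "summarize": lambda state: state + ["summarized"],
}


def run_workflow(state: list[str]) -> list[str]:
    """Workflow: the code fixes the order of steps in advance."""
    for name in ("lookup", "summarize"):
        state = TOOLS[name](state)
    return state


def run_agent(
    state: list[str],
    choose_next: Callable[[list[str]], Optional[str]],
    max_steps: int = 5,
) -> list[str]:
    """Agent: a policy (an LLM in practice) decides the next tool at each turn."""
    for _ in range(max_steps):
        name = choose_next(state)
        if name is None:  # the policy decides it is done
            break
        state = TOOLS[name](state)
    return state


def stub_policy(state: list[str]) -> Optional[str]:
    """Stand-in for a model call that chooses the next action."""
    if "looked_up" not in state:
        return "lookup"
    if "summarized" not in state:
        return "summarize"
    return None
```

Both functions reach the same result here, but only `run_agent` could have taken a different path; `max_steps` is the kind of deterministic bound that production systems tend to add around that freedom.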
During the development of Spiral, a single large language model tasked with both interviewing the user and writing content failed due to "context rot." The solution was a multi-agent system in which an "interviewer" agent hands off the full context to a separate "writer" agent, improving performance and reliability.
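One way to picture the handoff is as a structured object passed between two separate calls. This is a sketch under stated assumptions, not Spiral's actual design: the `Handoff` type and the `ask` and `generate` callables (stand-ins for the user interaction and the writer model) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Handoff:
    """Everything the writer needs, collected by the interviewer."""
    topic: str
    answers: dict[str, str] = field(default_factory=dict)


def interviewer(topic: str, questions: list[str], ask: Callable[[str], str]) -> Handoff:
    """Collect answers one question at a time and package them for the writer."""
    handoff = Handoff(topic=topic)
    for question in questions:
        handoff.answers[question] = ask(question)
    return handoff


def writer(handoff: Handoff, generate: Callable[[str], str]) -> str:
    """Build a fresh prompt from the handoff only, then call the writer model."""
    lines = [f"Topic: {handoff.topic}"]
    lines += [f"Q: {q}\nA: {a}" for q, a in handoff.answers.items()]
    return generate("\n".join(lines))
```

The writer starts from a clean prompt built from the handoff, so the back-and-forth of the interview does not accumulate in its context.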
To ensure reliability in healthcare, Zocdoc doesn't give LLMs free rein. It wraps them in a hybrid system where traditional, deterministic code orchestrates the AI's tasks, sets firm boundaries, and knows when to hand off to a human, preventing the 'praying for the best' approach common with direct LLM use.
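A deterministic wrapper of this kind might look like the sketch below. The allowed intents, the confidence threshold, and the `classify` callable (a stand-in for a model that returns an intent and a confidence score) are assumptions for illustration, not details from the source.

```python
from typing import Callable

# Hypothetical set of tasks the automated path is allowed to handle.
ALLOWED_INTENTS = {"book_appointment", "reschedule", "cancel"}


def orchestrate(
    message: str,
    classify: Callable[[str], tuple[str, float]],
    confidence_floor: float = 0.8,
) -> dict[str, str]:
    """Route a message: plain code, not the model, decides when to hand off."""
    intent, confidence = classify(message)
    if intent not in ALLOWED_INTENTS:
        return {"route": "human", "reason": "intent outside allowed set"}
    if confidence < confidence_floor:
        return {"route": "human", "reason": "low confidence"}
    return {"route": "automated", "intent": intent}


def stub_classify(message: str) -> tuple[str, float]:
    """Stand-in for a model call returning (intent, confidence)."""
    text = message.lower()
    if "reschedule" in text:
        return ("reschedule", 0.95)
    if "chest pain" in text:
        return ("medical_advice", 0.90)
    return ("book_appointment", 0.40)
```

The model only proposes an intent; the boundary checks and the handoff decision are ordinary code that can be tested and audited.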
The system ingests a company's knowledge bases to generate an initial "context graph." As the AI operates, it uses LLMs to explore new conversational patterns. Once a pattern becomes frequent, it's codified into the deterministic graph, making the system more efficient and reliable over time.
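The promotion step described above can be sketched as a counter with a threshold. This is a simplified illustration: the `ContextGraph` class here, the `promote_after` parameter, and the `llm_fallback` callable are hypothetical, and a real system would presumably review a pattern before codifying it.

```python
from collections import Counter
from typing import Callable


class ContextGraph:
    """Deterministic lookup that learns frequent patterns from an LLM fallback."""

    def __init__(self, promote_after: int = 3) -> None:
        self.rules: dict[str, str] = {}  # codified, deterministic responses
        self.seen: Counter[str] = Counter()  # how often the fallback handled an intent
        self.promote_after = promote_after

    def handle(self, intent: str, llm_fallback: Callable[[str], str]) -> tuple[str, str]:
        """Return (response, source), where source is 'graph' or 'llm'."""
        if intent in self.rules:
            return self.rules[intent], "graph"
        response = llm_fallback(intent)
        self.seen[intent] += 1
        # Once a pattern is frequent enough, codify it into the graph.
        if self.seen[intent] >= self.promote_after:
            self.rules[intent] = response
        return response, "llm"
```

After promotion, the same intent no longer triggers a model call, which is where the efficiency and reliability gains would come from.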
Relying solely on natural language prompts like 'always do this' is unreliable for enterprise AI because LLMs struggle with deterministic logic. Salesforce developed 'AgentForce Script,' a dedicated language that enforces rules and ensures consistent, repeatable performance for critical business workflows while still drawing on LLM reasoning.
AI agents are simply 'context and actions.' To prevent hallucination and failure, they must be grounded in rich context. This is best provided by a knowledge graph built from the unique data and metadata collected across a platform, creating a powerful, defensible moat.
"Context Engineering" is the critical practice of managing information fed to an LLM, especially in multi-step agents. This includes techniques like context compaction, using sub-agents, and managing memory. Harrison Chase considers this discipline more crucial than prompt engineering for building sophisticated agents.