Frontier AI Models Now Hallucinate Less Than Competent Junior Legal Associates

The once-critical problem of AI hallucinations has been dramatically reduced. Current frontier models are now more reliable in this regard than human junior associates, making them viable for professional legal work, contrary to popular belief.

Related Insights

When discussing AI risks like hallucinations, former Chief Justice McCormack argues the proper comparison isn't a perfect system, but the existing human one. Humans get tired, harbor biases, and make mistakes. The question isn't whether AI is flawless, but whether it's an improvement over that error-prone reality.

While guardrails in prompts are useful, a more effective step to prevent AI agents from hallucinating is careful model selection. For instance, using Google's Gemini models, which are noted to hallucinate less, provides a stronger foundational safety layer than relying solely on prompt engineering with more 'creative' models.
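A minimal sketch of that layering, assuming a hypothetical `complete` client function and a placeholder model id (neither comes from the episode); the point is only that model choice sits underneath the prompt-level guardrails rather than replacing them.

```python
# Layered hallucination defenses: choose a lower-hallucination model first,
# then add prompt guardrails and conservative sampling on top.

GROUNDING_RULES = (
    "Answer only from the provided documents. "
    "If the answer is not in them, reply exactly: 'Not found in sources.'"
)

def ask(complete, question, documents, model="low-hallucination-model"):
    """`complete` is a placeholder for whatever client the agent uses;
    `model` is a placeholder id, not a real product name."""
    prompt = f"Documents:\n{documents}\n\nQuestion: {question}"
    return complete(
        model=model,             # layer 1: model selection (foundational)
        system=GROUNDING_RULES,  # layer 2: prompt-level guardrails
        prompt=prompt,
        temperature=0.0,         # layer 3: suppress 'creative' sampling
    )
```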

While they still make mistakes and lack access to some databases, frontier models like Claude and GPT are already superior to the average human lawyer in terms of pure cognitive ability and legal analysis. The hosts believe this capability gap will only widen.

When multiple AI agents work as an ensemble, they can collectively suppress hallucinations. By referencing a shared knowledge graph as ground truth, the group can form a consensus, effectively ignoring the inaccurate output from one member and improving overall reliability.
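A minimal sketch of the idea, with an invented `KNOWLEDGE_GRAPH` dict standing in for the shared ground truth and simple majority voting as the consensus rule; neither detail is from the episode, which describes the pattern rather than an implementation.

```python
from collections import Counter

# Toy "knowledge graph": (subject, predicate) -> object, used as shared ground truth.
KNOWLEDGE_GRAPH = {
    ("Miranda v. Arizona", "decided_in"): "1966",
    ("Miranda v. Arizona", "court"): "U.S. Supreme Court",
}

def grounded(claim):
    """Accept a claim only if the knowledge graph holds no conflicting fact."""
    known = KNOWLEDGE_GRAPH.get((claim["subject"], claim["predicate"]))
    return known is None or known == claim["object"]

def ensemble_answer(agent_claims, quorum=2):
    """Keep claims that (a) don't contradict the graph and (b) are asserted
    by at least `quorum` agents, so one hallucinating agent is outvoted."""
    votes = Counter()
    for claims in agent_claims:  # one list of claims per agent
        for c in claims:
            if grounded(c):
                votes[(c["subject"], c["predicate"], c["object"])] += 1
    return [claim for claim, n in votes.items() if n >= quorum]

# Agent 3 hallucinates the year; the graph contradicts it and it lacks a
# quorum, so it is dropped from the consensus output.
agents = [
    [{"subject": "Miranda v. Arizona", "predicate": "decided_in", "object": "1966"}],
    [{"subject": "Miranda v. Arizona", "predicate": "decided_in", "object": "1966"}],
    [{"subject": "Miranda v. Arizona", "predicate": "decided_in", "object": "1969"}],
]
print(ensemble_answer(agents))  # [('Miranda v. Arizona', 'decided_in', '1966')]
```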

AI models are brilliant but lack real-world experience, much like new graduates. This framing helps manage expectations by accounting for phenomena like hallucinations, which are akin to a smart but naive person confidently making things up without experiential wisdom.

For applications in banking, insurance, or healthcare, reliability is paramount. Startups that architect their systems from the ground up to prevent hallucinations will have a fundamental advantage over those trying to incrementally reduce errors in general-purpose models.

AI's occasional errors ('hallucinations') should be understood as a characteristic of a new, creative type of computer, not a simple flaw. Users must work with it as they would a talented but fallible human: leveraging its creativity while tolerating its occasional incorrectness and using its capacity for self-critique.
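One way to use that self-critique capacity in practice, sketched with a hypothetical `generate` callable: draft an answer, ask the same model to flag claims it may have invented, then revise once before returning.

```python
def draft_critique_revise(generate, task):
    """Leverage the model's creativity on the draft, then use its own
    self-critique to catch confidently wrong statements before delivery.
    `generate` is a placeholder for any text-completion call."""
    draft = generate(f"Complete this task:\n{task}")
    critique = generate(
        "List any claims in the following answer that may be invented or "
        f"unsupported. If none, reply exactly 'OK'.\n\n{draft}"
    )
    if critique.strip() == "OK":
        return draft
    return generate(
        "Revise the answer below, removing or correcting the flagged claims.\n\n"
        f"Answer:\n{draft}\n\nIssues:\n{critique}"
    )
```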

The tendency for AI models to "make things up," often criticized as hallucination, is functionally the same as creativity. This trait makes computers valuable partners for the first time in domains like art, brainstorming, and entertainment, which were previously inaccessible to hyper-literal machines.

While AI "hallucinations" grab headlines, the more systemic risk is lawyers becoming overly reliant on AI and failing to perform due diligence. The LexisNexis CEO predicts an attorney will eventually lose their license not because the AI failed, but because the human failed to properly review the work.

An OpenAI paper argues hallucinations stem from training systems that reward models for guessing answers. A model saying "I don't know" gets zero points, while a lucky guess gets points. The proposed fix is to penalize confident errors more harshly, effectively training for "humility" over bluffing.
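A back-of-the-envelope illustration of that incentive (the scoring numbers are assumptions, not values from the paper): if wrong answers cost nothing, even a 30%-confident guess beats abstaining, so bluffing always wins; once confident errors are penalized, "I don't know" is the better strategy below a break-even confidence threshold.

```python
def expected_score(p_correct, right=1.0, wrong=0.0, abstain=0.0):
    """Expected reward for answering vs. abstaining, given the model's
    probability of being correct. All payoff values are illustrative."""
    answer = p_correct * right + (1 - p_correct) * wrong
    return {"answer": answer, "abstain": abstain}

p = 0.3  # model is only 30% sure

# Naive grading: a wrong guess costs nothing, so guessing dominates abstaining.
print(expected_score(p, wrong=0.0))   # {'answer': 0.3, 'abstain': 0.0}

# Penalize confident errors: abstaining now beats the low-confidence guess.
print(expected_score(p, wrong=-1.0))  # {'answer': -0.4, 'abstain': 0.0}

# Break-even rule: answer iff p * right + (1 - p) * wrong > abstain,
# which with right=1, wrong=-1, abstain=0 means answer only when p > 0.5.
```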
