Contrary to its reputation for slow tech adoption, the legal industry is rapidly embracing advanced AI agents. The sheer volume of work and potential for efficiency gains are driving swift innovation, with firms even hiring lawyers specifically to help with AI product development.
An AI agent's failure on a complex task like tax preparation often isn't due to a lack of intelligence. Instead, it is blocked by a single, unpredictable "tiny thing," such as misinterpreting two boxes on a W-4 form. This highlights that reliability challenges are granular and not always intuitive.
Anthropic's David Hershey states it's "deeply unsurprising" that AI is great at software engineering because the labs are filled with software engineers. This suggests AI's capabilities are skewed by its creators' expertise, and achieving similar performance in fields like law requires deeper integration with domain experts.
A key advancement in Sonnet 4.5 is its work style. Unlike past models, which had "grand ambitions" and would meander, it pragmatically breaks large projects down into small, manageable chunks. This methodical approach feels more like working with a human colleague and makes the model more reliable for complex tasks.
Widespread adoption of AI for complex tasks like "vibe coding" is limited not just by model intelligence, but by the user interface. Current paradigms like IDE plugins and chat windows are insufficient. Anthropic's team believes a new interface is needed to unlock the full potential of models like Sonnet 4.5 for production-level app building.
New AI models are creating profound moments of realization for their creators. Anthropic's David Hershey describes watching Sonnet 4.5 build a complex app in 12-30 hours that took a human team months. This triggered a "little bit of 'oh my God'" feeling, signaling a fundamental shift in software engineering.
Advanced AI models show a profound mismatch in capabilities, mastering complex, abstract tasks while failing at simple, intuitive ones. An Anthropic team member notes that Claude solves PhD-level math but can't grasp basic spatial concepts such as "left vs. right" or how to navigate around an object in a game, highlighting the alien nature of these models' intelligence.
