Alex Karp differentiates between hard-coded infrastructure and the 'magical' but limited code generated by LLMs. While LLMs excel at creating probabilistic outputs like dashboards, this 'free code' cannot function as a reliable knowledge store for critical enterprise processes. True enterprise solutions require managed, structured code that understands the world, a feat LLMs alone cannot achieve.

Related Insights

Current LLMs are intelligent enough for many tasks but fail because they lack access to complete context, such as emails, Slack messages, and past data. The next step is building products that ingest this real-world context and make it available for the model to act upon.

LLMs shine when acting as a 'knowledge extruder'—shaping well-documented, 'in-distribution' concepts into specific code. They fail when the core task is novel problem-solving where deep thinking, not code generation, is the bottleneck. In these cases, the code is the easy part.

Salesforce's AI Chief warns of "jagged intelligence," where LLMs can perform brilliant, complex tasks but fail at simple common-sense ones. This inconsistency is a significant business risk, as a failure in a basic but crucial task (e.g., loan calculation) can have severe consequences.

Don't give LLMs full control. Use deterministic code for core logic, validation, and enforcing rules. Delegate only tasks requiring flexibility or understanding of unstructured input to the LLM, treating it as a specialized component, not the entire system.
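As a minimal sketch of this split, assume a hypothetical refund workflow in which the LLM only turns a free-text message into JSON, while plain code validates that output and applies the business rule. The names `RefundRequest`, `MAX_REFUND`, `handle`, and `fake_extract` are illustrative assumptions, not part of any real library or of the source insight; `fake_extract` stands in for whatever model call a real system would make.

```python
import json
from dataclasses import dataclass
from typing import Callable

MAX_REFUND = 500.0  # hypothetical business rule, enforced in code only


@dataclass
class RefundRequest:
    order_id: str
    amount: float


def parse_request(raw: str) -> RefundRequest:
    """Deterministically validate the JSON text produced by the LLM."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    order_id = data["order_id"]
    amount = data["amount"]
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("amount must be a number")
    if amount <= 0:
        raise ValueError("amount must be positive")
    return RefundRequest(order_id, float(amount))


def decide(request: RefundRequest) -> str:
    """Core logic: a fixed rule, never delegated to the model."""
    return "approve" if request.amount <= MAX_REFUND else "escalate"


def handle(message: str, extract: Callable[[str], str]) -> str:
    """Use the LLM (passed in as `extract`) only for unstructured input."""
    try:
        request = parse_request(extract(message))
    except (ValueError, KeyError, TypeError):
        # Any output that fails validation goes to a human, not to the rule.
        return "escalate"
    return decide(request)


def fake_extract(message: str) -> str:
    """Stand-in for a real model call; returns a canned response."""
    return '{"order_id": "A-17", "amount": 42.5}'


if __name__ == "__main__":
    print(handle("Please refund my order A-17, it was $42.50", fake_extract))
```

Because the model is passed in as an ordinary function, the rule and the validation can be tested without calling a model at all, and a malformed or out-of-policy response falls back to escalation instead of reaching the core logic.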

Tools are emerging that don't just build an app but run the entire company—managing marketing, bookkeeping, and legal. This evolution shows the value is not in the LLM itself but in the 'harness' built around it to orchestrate complex business functions, creating a new category of fully autonomous company builders.

An 'LLM-first' approach, where the model handles core logic, creates impressive demos but lacks production reliability. A 'code-first' approach, using code for structure and LLMs for specific tasks, is less flashy but proves robust and debuggable in real-world applications.

AI coding's enterprise value is limited because models struggle with legacy systems. Companies run on trillions of lines of mediocre code in old languages like COBOL. Modernizing that code is a problem that requires human intervention over decades, not a simple AI fix, so the immediate real-world impact is constrained.

Building reliable AI agents for finance, where accuracy is critical, requires moving beyond pure LLMs. Xero uses a hybrid system combining LLM-driven workflows with programmatic code and deep domain knowledge to ensure control and reliability that LLMs inherently lack.

AI can generate code, but the real value of enterprise software lies in its integration into complex human workflows, the massive cost of change management required to replace it, and its network effects. These human-centric factors create a durable moat that code generation alone cannot overcome.

Beyond API integrations, LLMs face significant hurdles in enterprise settings. They struggle to follow complex instructions reliably, can't yet interact with legacy graphical UIs effectively, and are stymied by the absence of clean, centralized knowledge bases, instead facing scattered 'tribal knowledge.'