Exposing a full API via the Model Context Protocol (MCP) overwhelms an LLM's context window and reasoning. This forces developers to abandon exposing their entire service and instead manually craft a few highly specific tools, limiting the AI's capabilities and defeating the "do anything" vision of agents.

Related Insights

Making an API usable for an LLM is a novel design challenge, analogous to creating an ergonomic SDK for a human developer. It's not just about technical implementation; it requires a deep understanding of how the model "thinks," which is a difficult new research area.

To avoid overwhelming an LLM's context with hundreds of tools, a dynamic MCP approach offers just three: one to list available API endpoints, one to get details on a specific endpoint, and one to execute it. This scales well but increases latency and complexity due to the multiple turns required for a single action.

When building Spiral, a single large language model trying to both interview the user and write content failed due to "context rot." The solution was a multi-agent system where an "interviewer" agent hands off the full context to a separate "writer" agent, improving performance and reliability.

The early focus on crafting the perfect prompt is obsolete. Sophisticated AI interaction is now about 'context engineering': architecting the entire environment by providing models with the right tools, data, and retrieval mechanisms to guide their reasoning process effectively.

Instead of giving an LLM hundreds of specific tools, a more scalable "cyborg" approach is to provide one tool: a sandboxed code execution environment. The LLM writes code against a company's SDK, which is more context-efficient, faster, and more flexible than multiple API round-trips.

Using a composable, 'plug and play' architecture allows teams to build specialized AI agents faster and with less overhead than integrating a monolithic third-party tool. This approach enables the creation of lightweight, tailored solutions for niche use cases without the complexity of external API integrations, containing the entire workflow within one platform.

Documentation is shifting from a passive reference for humans to an active, queryable context for AI agents. Well-structured docs on internal APIs and class hierarchies become crucial for agent performance, reducing inefficient and slow context window stuffing for faster code generation.

The perceived limits of today's AI are not inherent to the models themselves but to our failure to build the right "agentic scaffold" around them. There's a "model capability overhang" where much more potential can be unlocked with better prompting, context engineering, and tool integrations.

The developer abstraction layer is moving up from the model API to the agent. A generic interface for switching models is insufficient because it creates a 'lowest common denominator' product. Real power comes from tightly binding a specific model to an agentic loop with compute and file system access.

Salesforce's Chief AI Scientist explains that a true enterprise agent comprises four key parts: Memory (RAG), a Brain (reasoning engine), Actuators (API calls), and an Interface. A simple LLM is insufficient for enterprise tasks; the surrounding infrastructure provides the real functionality.

Today's LLMs Can't Handle Full APIs, Forcing Hand-Crafted MCP Tools | RiffOn