The initial assumption of a centralized AI model (large hub, large spoke) is wrong. The new model will involve large foundational hubs, enterprise-specific training hubs, and distributed "spokes" of on-premise hardware for inference. This shift is driven by the need for data control and cost efficiency.

Related Insights

The future of enterprise AI isn't choosing one provider. Instead, companies will use a "composable model" approach, routing queries to a combination of powerful frontier models and their own fine-tuned open-source models. This strategy, dubbed the "council of LLMs," optimizes for cost, performance, and specialization on proprietary data.
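As a rough illustration of the routing idea, the sketch below sends queries that appear to touch proprietary data to a fine-tuned model and everything else to a frontier model. Everything in it is an assumption made for illustration: the route names, the relative costs, the keyword rule, and the stub handlers stand in for real model calls and do not come from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ModelRoute:
    """One model in the hypothetical 'council'."""
    name: str
    relative_cost: float            # illustrative number, not a real price
    handler: Callable[[str], str]   # stand-in for a real model call


def make_stub(name: str) -> Callable[[str], str]:
    """Return a fake handler that just tags the query with the model name."""
    def handler(query: str) -> str:
        return f"[{name}] {query}"
    return handler


# Hypothetical routes: one frontier model, one in-house fine-tuned model.
ROUTES: Dict[str, ModelRoute] = {
    "frontier": ModelRoute("frontier", 1.0, make_stub("frontier")),
    "finetuned": ModelRoute("finetuned", 0.1, make_stub("finetuned")),
}

# Assumed keywords that signal a query about proprietary data.
PROPRIETARY_KEYWORDS = ("invoice", "contract", "customer record")


def choose_route(query: str) -> str:
    """Pick a route name using a deliberately simple keyword rule."""
    lowered = query.lower()
    if any(keyword in lowered for keyword in PROPRIETARY_KEYWORDS):
        return "finetuned"
    return "frontier"


def answer(query: str) -> str:
    """Route the query and return the chosen model's (stubbed) response."""
    route = ROUTES[choose_route(query)]
    return route.handler(query)
```

A production router would more likely use a classifier or cost and latency budgets than a keyword list; the keyword rule is only there to keep the example small.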

The long-standing trend of centralizing all data into a single warehouse is incompatible with the speed of AI. Large-scale data migrations are too slow. The future architecture will involve AI models operating closer to data sources for faster, decentralized operation.

While AI training is data-center-intensive, Cisco's CEO sees the move to AI inference as a massive growth opportunity. Inference will happen at distributed edge locations to be close to users, requiring robust, high-performance networks to connect everything, which plays directly into the company's core strengths.

The idea of a single orchestration hub is outdated. A more effective model is federated, where specialized agents (e.g., an agent that embodies brand guidelines 'as code') are exposed as reusable services. This allows different departments like sales, marketing, and HR to plug into the same expertise.

The current focus on building massive, centralized AI training clusters represents the 'mainframe' era of AI. The next three years will see a shift toward a distributed model, similar to computing's move from mainframes to PCs. This involves pushing smaller, efficient inference models out to a wide array of devices.

The AI hardware market will not be a winner-take-all landscape. Instead, it will evolve into a hybrid model where large, intelligent 'boss' models delegate tasks to smaller, specialized, high-speed 'worker' models. This creates a durable niche for specialized hardware like Cerebras, which can excel at speed-sensitive sub-tasks.

The "agentic revolution" will be powered by small, specialized models. Businesses and public sector agencies don't need a cloud-based AI that can do 1,000 tasks. Because of cost, privacy, and control requirements, they need an on-premise model fine-tuned for 10-20 specific use cases.

While AI training requires massive, centralized data centers, the growth of inference workloads is creating a need for a new architecture: smaller (e.g., 5-megawatt) decentralized clusters located closer to users to reduce latency. This shift affects everything from data center design to the software required to manage these distributed fleets.

The excitement around AI capabilities often masks the real hurdle to enterprise adoption: infrastructure. Success is not determined by the model's sophistication, but by first solving foundational problems of security, cost control, and data integration. This requires a shift from an application-centric to an infrastructure-first mindset.

The primary driver for running AI models on local hardware isn't cost savings or privacy, but maintaining control over your proprietary data and models. This avoids vendor lock-in and prevents a third-party company from owning your organization's 'brain'.

The Future of Enterprise AI is a Hybrid "Distributed Spoke" Infrastructure | RiffOn