The choice of cloud provider for hosting external models (e.g., AWS SageMaker vs. Google Vertex AI) has direct consequences for which ML frameworks are supported. For example, Pega's Vertex AI integration supports XGBoost but not TensorFlow or PyTorch, unlike its broader SageMaker support. This is a critical upfront technical consideration.
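The consequence shows up at deployment time. As a hedged sketch (the project, bucket path, and container tag below are placeholder assumptions, and Pega's connector would sit in front of the resulting endpoint), this is roughly how an XGBoost artifact lands on one of Vertex AI's prebuilt serving containers:

```python
# Sketch: hosting an XGBoost artifact on Vertex AI via Google's Python SDK.
# Project, bucket, and container tag are placeholder assumptions.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Vertex AI ships prebuilt serving containers for XGBoost, which is why an
# XGBoost model is the path of least resistance here; TensorFlow or PyTorch
# support in a given integration is a separate question to verify upfront.
model = aiplatform.Model.upload(
    display_name="churn-xgboost",
    artifact_uri="gs://my-bucket/models/churn/",  # folder holding model.bst
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-7:latest"
    ),
)
endpoint = model.deploy(machine_type="n1-standard-2")
print(endpoint.resource_name)
```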

Related Insights

A new category of "NeoCloud" or "AI-native cloud" providers is emerging, focused specifically on AI training and inference. Unlike general-purpose clouds such as AWS, these platforms are GPU-first, catering to massive AI workloads and addressing the GPU scarcity and workload patterns that hyperscalers were not designed around.

The metadata file in Pega's Prediction Studio does more than describe a model. It defines the runtime contract, linking model inputs to Pega properties, dictating performance metrics (AUC, F-score), and ensuring correct response tracking. This file is critical for runtime correctness and monitoring, not just for setup.
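Pega does not publish the schema in this summary, so the following is a purely hypothetical Python rendering of what such a runtime contract conveys: input-to-property bindings, the output mapping, and the metrics to monitor. Every field name is invented for illustration, not Pega's format.

```python
# Purely hypothetical illustration of the kind of information a Prediction
# Studio metadata file carries; field names are invented, not Pega's schema.
model_contract = {
    "inputs": [
        # Each model input is bound to a Pega property at runtime.
        {"model_field": "age", "pega_property": ".Customer.Age", "type": "integer"},
        {"model_field": "tenure_months", "pega_property": ".Customer.Tenure", "type": "integer"},
    ],
    "output": {"model_field": "score", "pega_property": ".Propensity", "type": "double"},
    "monitoring": {
        # Metrics tracked against captured responses after deployment.
        "metrics": ["AUC", "F-score"],
        "response_tracking": True,
    },
}

def validate_contract(contract: dict) -> None:
    """Fail fast if the contract lacks what runtime correctness depends on."""
    assert contract["inputs"], "at least one input mapping is required"
    assert contract["output"]["pega_property"], "output must map to a Pega property"
    assert contract["monitoring"]["metrics"], "monitoring metrics must be declared"

validate_contract(model_contract)
```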

The top 1% of AI companies making significant revenue don't rely on popular frameworks like Langchain. They gain more control and better performance by making small, direct LLM calls for specific parts of the application, avoiding the black-box abstractions of frameworks that remain common among the other 99% of builders.
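A minimal sketch of the pattern: one plain HTTP call to a model API with no framework in between. The model name and the classification task are illustrative placeholders.

```python
# A direct LLM call with no framework in between: one HTTP request, one
# parsed response. Model name and task are placeholders for illustration.
import os
import requests

def classify_ticket(text: str) -> str:
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-4o-mini",
            "messages": [
                {"role": "system", "content": "Reply with one word: billing, bug, or other."},
                {"role": "user", "content": text},
            ],
            "temperature": 0,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip()
```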

The most effective integrations use external ML models as specialized scoring components within Pega's broader decisioning framework. The model's score should influence outcomes like prioritization and eligibility, but it should operate alongside, not in place of, existing business rules, eligibility criteria, and contact policies.
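A rough sketch of that division of labor, with invented property names and thresholds: hard eligibility and contact-policy rules filter first, and the model score only shapes prioritization among the offers that survive.

```python
# Sketch: the external model's score influences prioritization, but hard
# business rules decide eligibility first. Names and thresholds are invented.
from dataclasses import dataclass

@dataclass
class Offer:
    name: str
    base_priority: float

def is_eligible(customer: dict, offer: Offer) -> bool:
    # Eligibility criteria and contact policies run regardless of the model.
    if customer["age"] < 18:
        return False
    if customer["contacts_this_week"] >= 2:  # contact-policy cap
        return False
    return True

def prioritize(customer: dict, offer: Offer, model_score: float) -> float | None:
    """Return a priority, or None if business rules filter the offer out."""
    if not is_eligible(customer, offer):
        return None
    # The score scales priority; it never overrides the rules above.
    return offer.base_priority * model_score
```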

To get scientists to adopt AI tools, simply open-sourcing a model is not enough. A real product must provide a full-stack solution, including managed infrastructure to run expensive models, optimized workflows, and a UI. This abstracts away the complexity of MLOps, allowing scientists to focus on research.

An AI model's operating environment—its "harness"—is now the primary driver of capability. Benchmarks show the same model achieving vastly different results in different harnesses, evidence that the runtime, tools, and state management are as critical as the model's internal weights.

Relying solely on premium models like Claude Opus can lead to unsustainable API costs ($1M/year projected). The solution is a hybrid approach: use powerful cloud models for complex tasks and cheaper, locally-hosted open-source models for routine operations.
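A sketch of the hybrid pattern, assuming routine prompts go to a local model served through Ollama's HTTP API and complex ones to a hosted premium model; the routing heuristic and model names are illustrative assumptions.

```python
# Hybrid routing sketch: cheap local model (via Ollama's HTTP API) for routine
# prompts, premium hosted model for complex ones. The routing heuristic and
# model names below are assumptions for illustration.
import os
import requests

def call_local(prompt: str) -> str:
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1:8b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]

def call_premium(prompt: str) -> str:
    r = requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
        },
        json={
            "model": "claude-3-opus-20240229",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["content"][0]["text"]

def answer(prompt: str) -> str:
    # Deliberately naive complexity gate; real routing would be richer.
    complex_task = len(prompt) > 2000 or "step by step" in prompt.lower()
    return call_premium(prompt) if complex_task else call_local(prompt)
```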

The host notes that while Gemini 3.0 is available in other IDEs, he achieves higher-quality designs by using the native Google AI Studio directly. This suggests that for maximum performance and feature access, creators should use the first-party platform where the model was developed.

Instead of simply swapping a model behind a stable URL, Pega's platform enables a formal release process. Using Prediction Studio's champion/challenger slots and percentage-based rollouts, teams can safely deploy, monitor, and manage new model versions. This MLOps capability turns model updates into a governed, transparent activity.
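Pega surfaces this through Prediction Studio's configuration rather than code, but the underlying mechanism is roughly the following generic sketch: stable hashing assigns each customer to champion or challenger at a fixed percentage, so the assignment stays consistent across interactions.

```python
# Generic percentage-based champion/challenger split, not Pega's internals.
# Hashing the customer ID keeps each customer's assignment stable.
import hashlib

CHALLENGER_PERCENT = 10  # e.g., route 10% of traffic to the new version

def pick_model(customer_id: str) -> str:
    bucket = int(hashlib.sha256(customer_id.encode()).hexdigest(), 16) % 100
    return "challenger_v2" if bucket < CHALLENGER_PERCENT else "champion_v1"
```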

Pega's CTO advises using the powerful reasoning of LLMs to design processes and marketing offers, then switching at runtime to faster, cheaper, and more consistent predictive models. This avoids the unpredictability, cost, and risk of calling expensive LLMs on every live customer interaction.
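One way to picture the split, with all names below illustrative: the LLM is invoked once, offline, to draft offer variants, while the per-interaction hot path only ever touches a lightweight predictive scorer.

```python
# Illustrative split: LLM reasoning happens offline at design time; the
# runtime hot path calls only a cheap, deterministic predictive model.
import numpy as np
import xgboost as xgb  # assumed runtime scorer; any fast model works

def design_time_draft_offers(llm_call) -> list[str]:
    # Run once by a designer, never per live customer interaction.
    return [llm_call(f"Draft a retention offer, variant {i}") for i in range(3)]

def runtime_score(booster: xgb.Booster, features: np.ndarray) -> float:
    # Milliseconds of fixed cost; the same inputs always score the same.
    return float(booster.predict(xgb.DMatrix(features.reshape(1, -1)))[0])
```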