
Criteo's models moved from manually crafted, extremely high-dimensional sparse vectors (e.g., 2^12 features) fed to linear models, to dense vectors (a few hundred features) computed automatically by deep learning. This shift eliminated manual feature engineering and improved model adaptability.
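A minimal sketch of the contrast between the two representations. The feature names, dimensions, and the random projection standing in for a learned encoder are illustrative assumptions, not Criteo's actual pipeline.

```python
import numpy as np

SPARSE_DIM = 2 ** 12  # hashed, manually crafted feature space (4096 slots)

def sparse_features(raw_features):
    """Hashing trick: each hand-crafted feature flips one of 4096 slots."""
    vec = np.zeros(SPARSE_DIM)
    for feat in raw_features:
        vec[hash(feat) % SPARSE_DIM] = 1.0
    return vec

DENSE_DIM = 256  # a few hundred learned dimensions

def dense_embedding(raw_features):
    """Stand-in for a learned encoder: here just a fixed random projection."""
    rng = np.random.default_rng(0)
    projection = rng.normal(size=(SPARSE_DIM, DENSE_DIM))
    return sparse_features(raw_features) @ projection

user = ["country=US", "device=mobile", "category=shoes"]
print(sparse_features(user).shape)  # (4096,) and mostly zeros
print(dense_embedding(user).shape)  # (256,) and dense
```

The sparse vector is almost entirely zeros and its meaning depends on hand-chosen features; the dense vector packs the same signal into a few hundred learned dimensions.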

Related Insights

Deep learning models can process vast, unstructured datasets directly, unlike traditional machine learning which requires data scientists to pre-select and summarize variables ('features'). This automates a key data science task, freeing up teams for higher-value work.

Unlike traditional machine learning that only learns from ad clicks, deep learning analyzes the entire user population (both exposed and not exposed to ads). This comparison reveals true incremental performance, moving beyond simple conversion attribution.
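The exposed-vs-unexposed comparison can be sketched with simulated data. The population size, baseline conversion rate, and 0.5-point true lift below are made-up numbers chosen only to show the mechanics of measuring incremental performance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated population: roughly half exposed to ads, half held out.
n = 100_000
exposed = rng.random(n) < 0.5

# Baseline conversion of 2%; exposure adds 0.5 percentage points of true lift.
base_rate = 0.02
converted = rng.random(n) < (base_rate + 0.005 * exposed)

# Incrementality: compare conversion across the WHOLE population,
# not just among users who clicked.
rate_exposed = converted[exposed].mean()
rate_control = converted[~exposed].mean()
incremental_lift = rate_exposed - rate_control
print(f"incremental lift: {incremental_lift:.4f}")  # ≈ 0.005
```

A click-based attribution model would credit every post-click conversion to the ad; the holdout comparison recovers only the conversions the ad actually caused.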

Instead of manual categorization, a developer embedded all English Wikipedia articles into a vector space to identify companies. This data-driven approach created a more comprehensive market map, capturing entities beyond Wikipedia's explicit 'company' tags and revealing organic clusters based on semantic similarity.
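A toy version of that idea: score articles by embedding similarity to a seed set of known companies instead of relying on explicit category tags. The embeddings here are random stand-ins for real article vectors, and the centroid/threshold approach is a simplifying assumption about how the developer's clustering might work.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 64

# Pretend this is the centroid of articles already tagged 'company'.
company_centroid = rng.normal(size=dim)

def looks_like_company(article_vec, threshold=0.3):
    """Flag an article as company-like by cosine similarity to the seed centroid."""
    cos = article_vec @ company_centroid / (
        np.linalg.norm(article_vec) * np.linalg.norm(company_centroid))
    return bool(cos > threshold)

# An untagged article whose embedding sits near the centroid gets caught;
# an unrelated article does not.
near_article = company_centroid + 0.1 * rng.normal(size=dim)
random_article = rng.normal(size=dim)
print(looks_like_company(near_article))
print(looks_like_company(random_article))
```

This is how the approach surfaces companies Wikipedia never tagged: semantic proximity, not labels, decides membership.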

Criteo has just milliseconds to respond to an ad request. This extreme speed requirement dictates their AI architecture, forcing them to pre-compute and cache user and product embeddings. Real-time inference is limited to fast operations with only marginal updates for the user's latest action.
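The precompute-and-cache pattern can be sketched as below. The cache layout, the 0.9/0.1 blend for the latest action, and all names are illustrative assumptions; the point is that the online path is reduced to dictionary lookups, one cheap update, and a dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 128

# Offline: user and product embeddings precomputed in batch and cached.
user_cache = {"user_42": rng.normal(size=DIM)}
product_cache = {f"prod_{i}": rng.normal(size=DIM) for i in range(1000)}

def score_request(user_id, product_id, last_action_vec=None):
    """Online path under a millisecond budget: cache lookups plus a dot
    product, with only a marginal update for the user's latest action."""
    user_vec = user_cache[user_id]
    if last_action_vec is not None:
        user_vec = 0.9 * user_vec + 0.1 * last_action_vec  # cheap blend
    return float(user_vec @ product_cache[product_id])

score = score_request("user_42", "prod_7")
```

Nothing expensive (no model forward pass over raw history) happens at request time; the heavy lifting happened offline.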

Marketers no longer need complex, opaque attribution models that require data scientists to configure. By integrating channel data with CRM outcomes, AI can directly interpret what drives pipeline and revenue, providing clear, C-suite-ready insights without the need for convoluted multi-touch models and their debatable assumptions.

Stripe avoids costly system rebuilds by treating its new payments foundation model as a modular component. Its powerful embeddings are simply added as new features to many existing ML classifiers, instantly boosting their performance with minimal engineering effort.
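The "embeddings as new columns" pattern is simple to sketch. The shapes and names below are illustrative, not Stripe's actual setup: the existing classifier just sees a wider feature matrix, so no retraining pipeline or system rebuild is required beyond adding the columns.

```python
import numpy as np

n_samples, n_base_features, emb_dim = 500, 10, 32
rng = np.random.default_rng(0)

X_base = rng.normal(size=(n_samples, n_base_features))  # existing features
X_emb = rng.normal(size=(n_samples, emb_dim))           # foundation-model embeddings

# Modular upgrade: append the embedding columns to the existing matrix.
X_augmented = np.hstack([X_base, X_emb])
print(X_augmented.shape)  # (500, 42)
```

Each downstream classifier keeps its own training loop and serving path; only its input width changes.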

AI's hunger for context is making search a critical but expensive component. As illustrated by Turbo Puffer's origin, a single recommendation feature using vector embeddings can cost tens of thousands per month, forcing companies to find cheaper solutions to make AI features economically viable at scale.

A key surprise in AI development was the non-linear impact of scale. Sebastian Thrun noted that while AI trained on millions of documents is 'fine,' training it on hundreds of billions creates an 'unbelievably smart' system, shocking even its creators and demonstrating data volume as a primary driver of breakthroughs.

The key differentiator for Conative.ai's deep learning approach over traditional methods isn't just a superior algorithm. It's the ability to incorporate a much larger number of input data streams (sales, marketing, inventory, etc.), creating a richer context for the AI to generate more accurate forecasts.

Criteo builds multiple, specialized foundation models (for products, user timelines, etc.) rather than a single monolithic one. The embeddings from these models are made available across the company, serving as a "warm start" to accelerate the development and improve the performance of new AI products.