Binary decisions are brittle. For payments that are neither clearly safe nor clearly fraudulent, Stripe uses a "soft block": it triggers a 3D Secure (3DS) authentication step that legitimate users can complete and fraudsters usually cannot, resolving the ambiguity without giving up the revenue.
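The three-way decision described above can be sketched as two thresholds on a risk score. This is a minimal illustration, not Stripe's implementation: the `Action` enum, the `decide` function, and both threshold values are hypothetical and chosen only for the example.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHALLENGE_3DS = "challenge_3ds"  # the "soft block"
    BLOCK = "block"


# Hypothetical thresholds on a risk score in [0, 1]; real values would be tuned.
CHALLENGE_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.9


def decide(risk_score: float) -> Action:
    """Map a risk score to allow, 3DS challenge, or block."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    if risk_score >= BLOCK_THRESHOLD:
        return Action.BLOCK
    if risk_score >= CHALLENGE_THRESHOLD:
        return Action.CHALLENGE_3DS
    return Action.ALLOW
```

The middle band is the point of the design: instead of forcing an ambiguous payment into allow or block, it gets an extra authentication step.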
For complex cases like "friendly fraud," traditional ground truth labels are often missing. Stripe uses an LLM to act as a judge, evaluating the quality of AI-generated labels for suspicious payments. This creates a proxy for ground truth, enabling faster model iteration.
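One way to picture the judge step is as a filter over AI-generated labels. The sketch below is an assumption-laden illustration: `LabeledPayment`, `accept_labels`, and the 0.8 cutoff are hypothetical, and `toy_judge` is a keyword check standing in for the LLM call, which is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LabeledPayment:
    payment_id: str
    summary: str         # short description of the payment and its context
    proposed_label: str  # AI-generated label, e.g. "friendly_fraud"


# A judge maps a labeled payment to a quality score in [0, 1].
# In practice this would wrap an LLM call; here it is any callable.
Judge = Callable[[LabeledPayment], float]


def accept_labels(
    items: List[LabeledPayment], judge: Judge, min_score: float = 0.8
) -> List[LabeledPayment]:
    """Keep only the labels the judge rates at or above min_score."""
    return [item for item in items if judge(item) >= min_score]


def toy_judge(item: LabeledPayment) -> float:
    """Placeholder for an LLM judge: a keyword check, for illustration only."""
    mentions_dispute = "dispute" in item.summary.lower()
    labeled_friendly = item.proposed_label == "friendly_fraud"
    # High score when the label agrees with the (toy) evidence, low otherwise.
    return 0.9 if labeled_friendly == mentions_dispute else 0.1
```

The accepted labels then serve as the proxy ground truth for evaluating or training a fraud model.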
Stripe avoids costly system rebuilds by treating its new payments foundation model as a modular component. Its embeddings are added as new features to many existing ML classifiers, improving their performance with minimal engineering effort.
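Adding embeddings as features amounts to widening each classifier's input vector. The following is a minimal sketch of that idea, assuming a hypothetical `augment_features` helper and made-up feature values; it is not Stripe's pipeline.

```python
from typing import List, Sequence


def augment_features(
    base_features: Sequence[float], embedding: Sequence[float]
) -> List[float]:
    """Append embedding dimensions to an existing feature vector."""
    return [*base_features, *embedding]


# Hypothetical hand-built features: amount in USD, card age in days, CVC match flag.
base = [42.50, 310.0, 1.0]
# Hypothetical 4-dimensional embedding; real embeddings are typically far wider.
embedding = [0.12, -0.87, 0.33, 0.05]

# The existing classifier keeps its original inputs and gains the new ones.
features = augment_features(base, embedding)
```

Because the original features are untouched, the existing classifier only needs to be retrained on the wider vector rather than redesigned.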
Stripe's AI model processes payments as a distinct data type, not just text. It analyzes transaction sequences across buyers, cards, devices, and merchants to uncover complex fraud patterns that humans would miss, raising the detection rate for card-testing attacks from 59% to 97%.
Stripe’s payments model shows how AI creates powerful data flywheels. Its massive, proprietary transaction dataset trains superior models, which improve the product, attract more customers, and widen the data advantage, making it nearly impossible for new competitors to catch up.
Instead of teams building their own merchant analysis tools, Stripe created a centralized "Merchant Intelligence" service. This AI agent crawls the web, generates merchant embeddings, and serves insights to diverse teams like risk, credit, and sales, eliminating duplicated effort and creating massive internal leverage.
Users distrust "talk to your data" tools they don't understand. Stripe's Sigma product overcomes this by generating a natural language explanation alongside every answer. It details assumptions made, like the specific dates used for "Black Friday," allowing non-technical users to verify the logic.
Purely model-based or rule-based systems have flaws. Stripe combines them for better results. For instance, a transaction with a CVC code mismatch (a rule) is only blocked if its model-generated risk score is also elevated, preventing rejection of good customers who make simple mistakes.
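The CVC example reduces to a conjunction of a rule and a model score. Here is a minimal sketch under stated assumptions: `should_block` and the 0.7 threshold are hypothetical, and the function covers only this one rule, not a full decision system.

```python
# Hypothetical cutoff above which a risk score counts as "elevated".
ELEVATED_RISK_THRESHOLD = 0.7


def should_block(cvc_matches: bool, risk_score: float) -> bool:
    """Block only when the CVC rule fires AND the model score is elevated."""
    rule_fired = not cvc_matches
    score_elevated = risk_score >= ELEVATED_RISK_THRESHOLD
    return rule_fired and score_elevated
```

A customer who mistypes the CVC on an otherwise low-risk payment is not blocked, because the rule alone is not enough to trigger the block.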
Emily Sands advises startups against building their own databases to mirror Stripe's financial data. Instead, they should treat Stripe's highly reliable APIs (six-nines uptime, i.e. 99.9999%) as their system of record. This eliminates complex reconciliation work, freeing up scarce engineering resources for core product development.
