Design a Fraud Detection System

System Design
Hard
Stripe

Design a real-time system to detect fraudulent transactions (e.g., credit card fraud). Focus on feature engineering, low-latency prediction models, and dealing with false positives/negatives.

Why Interviewers Ask This

Interviewers at Stripe ask this to evaluate your ability to balance extremely low latency with high-accuracy machine learning in a distributed environment. They specifically test whether you understand real-time data streaming, feature engineering for fraud patterns, and the critical trade-off between false positives, which frustrate users, and false negatives, which cost money.

How to Answer This Question

1. Clarify Requirements: Immediately define constraints such as latency (sub-100ms), throughput (e.g., thousands of TPS at peak), and the business cost of each type of error. Mention Stripe's focus on developer experience and seamless payments.
2. High-Level Architecture: Propose a Lambda or Kappa architecture that uses Kafka for ingestion and a low-latency serving layer, such as Redis for feature lookups or a specialized ML inference engine for scoring.
3. Feature Engineering Strategy: Discuss computing real-time features, such as velocity checks (transactions per minute) and geolocation anomalies, alongside static user history.
4. Model Selection & Training: Explain using lightweight models (such as gradient-boosted trees) for online scoring, with heavier models such as deep learning reserved for offline batch training. Address class imbalance with techniques like SMOTE or focal loss.
5. Feedback Loop: Describe how to handle false positives by providing an easy appeal process and feeding the outcomes of declined transactions back into the training pipeline as labels for continuous learning.
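The velocity check mentioned in the feature engineering step can be illustrated with a small sliding-window counter. This is a minimal in-memory sketch, not a production design: the class name `VelocityCounter`, the `card_id` key, and the 60-second window are all assumptions made for illustration, and it assumes timestamps arrive in order for each card. In a real deployment this state would more plausibly live in a shared low-latency store such as Redis.

```python
from collections import defaultdict, deque


class VelocityCounter:
    """Counts transactions per card inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        # card_id -> timestamps of recent transactions, oldest first
        self._events = defaultdict(deque)

    def record(self, card_id: str, timestamp: float) -> int:
        """Record one transaction and return how many fall inside the window."""
        timestamps = self._events[card_id]
        timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict timestamps that have aged out of the window.
        while timestamps and timestamps[0] <= cutoff:
            timestamps.popleft()
        return len(timestamps)


counter = VelocityCounter(window_seconds=60.0)
# Four transactions on the same card at t = 0s, 10s, 20s, and 75s.
counts = [counter.record("card_123", t) for t in (0.0, 10.0, 20.0, 75.0)]
print(counts)  # [1, 2, 3, 2]
```

The returned count can be passed to the model as a feature or compared against a rule threshold. The last value drops to 2 because the transactions at 0s and 10s are more than 60 seconds older than the one at 75s.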

Key Points to Cover

  • Demonstrating knowledge of sub-100ms latency constraints typical in payment processing
  • Explaining specific real-time feature engineering strategies like velocity checks
  • Balancing the trade-off between false positives (user friction) and false negatives (financial loss)
  • Proposing a scalable streaming architecture using tools like Kafka and Redis
  • Defining a continuous feedback loop to improve model accuracy over time
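One way to make the false positive versus false negative trade-off concrete is to choose the blocking threshold that minimizes total business cost on labeled data. The sketch below assumes each wrongly blocked legitimate payment costs `fp_cost` and each missed fraud costs `fn_cost`; the function names, scores, labels, and cost values are all hypothetical and chosen only for illustration.

```python
def total_cost(scores, labels, threshold, fp_cost, fn_cost):
    """Cost of blocking every transaction scored at or above the threshold."""
    cost = 0.0
    for score, is_fraud in zip(scores, labels):
        blocked = score >= threshold
        if blocked and not is_fraud:
            cost += fp_cost  # legitimate payment declined (user friction)
        elif not blocked and is_fraud:
            cost += fn_cost  # fraud allowed through (financial loss)
    return cost


def best_threshold(scores, labels, fp_cost, fn_cost):
    """Try each observed score, plus 'block nothing', and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, fp_cost, fn_cost),
    )


# Illustrative model scores and ground-truth labels (1 = fraud).
scores = [0.05, 0.20, 0.40, 0.60, 0.90]
labels = [0, 0, 1, 0, 1]

# Missed fraud is expensive: the threshold moves down to catch more fraud.
print(best_threshold(scores, labels, fp_cost=5, fn_cost=100))  # 0.4
# User friction is expensive: the threshold moves up to block less.
print(best_threshold(scores, labels, fp_cost=50, fn_cost=20))  # 0.9
```

The same model produces different operating points depending on the relative costs, which is why the cost of each error type is worth pinning down early in the interview.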

Sample Answer

To design a fraud detection system for a platform like Stripe, I would prioritize sub-100 millisecond latency while maintaining high recall. First, I'd ingest transaction events via Apache Kafka to ensure scalability. Fo…

Common Mistakes to Avoid

  • Focusing only on offline batch processing and ignoring the strict real-time requirement
  • Suggesting complex deep learning models without considering inference latency and cost
  • Neglecting to discuss how to handle imbalanced datasets where fraud is rare
  • Failing to mention a strategy for handling false positives and customer experience impact
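For the class imbalance point, focal loss is one of the techniques named earlier in this guide. The sketch below is a plain-Python version of the binary focal loss for a single prediction, written only to show how it down-weights easy examples; the function name is hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are illustrative starting values that would need tuning on real fraud data.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one example.

    p is the predicted fraud probability, y is the label (1 = fraud).
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # per-class weight
    # (1 - p_t) ** gamma shrinks the loss of examples the model already gets right.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_legit = focal_loss(p=0.1, y=0)  # confident and correct: tiny loss
missed_fraud = focal_loss(p=0.1, y=1)  # confident and wrong: large loss
print(missed_fraud > easy_legit)  # True
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, so `gamma` controls how strongly training focuses on the hard, rare cases.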
