Design an Image Moderation Service (NSFW Detection)

System Design
Hard
Meta

Design a system that uses machine learning models to automatically detect and flag inappropriate images/videos upon upload. Focus on asynchronous processing and human review queues.

Why Interviewers Ask This

Interviewers at Meta ask this to evaluate your ability to balance high-scale system reliability with critical safety requirements. They specifically assess how you handle asynchronous processing for media-heavy workloads, manage trade-offs between model latency and accuracy, and design robust human-in-the-loop review queues for edge cases that automated systems miss.

How to Answer This Question

1. Clarify Scope: Immediately define constraints such as image resolution, expected throughput (e.g., millions of uploads per day), and the precise definition of "inappropriate" content.
2. High-Level Architecture: Propose a decoupled design in which an object storage layer (like S3) feeds an event queue (Kafka) to absorb spikes in upload traffic.
3. Asynchronous Processing Pipeline: Detail the flow in which workers pull messages from the queue, run lightweight pre-filters before invoking the heavy ML models, and write results back to the database.
4. Human Review Integration: Design a fallback mechanism in which low-confidence predictions or specific sensitive categories open a ticket in a human moderation dashboard with priority queuing.
5. Scaling & Optimization: Discuss horizontal scaling of worker nodes, model versioning, and caching results for frequently re-uploaded content to reduce latency while maintaining strict data-privacy standards.
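The pipeline in steps 2–4 can be sketched as a worker loop. This is a minimal in-memory sketch: `queue.Queue` stands in for a real Kafka consumer, and `prefilter`/`model` are hypothetical stand-ins for a cheap heuristic filter and a model-serving endpoint.

```python
import queue
from dataclasses import dataclass


@dataclass
class ModerationResult:
    image_id: str
    label: str        # e.g. "safe" or "nsfw"
    confidence: float


def moderate(image_id: str, image_bytes: bytes, prefilter, model) -> ModerationResult:
    """Run the cheap pre-filter first; only call the heavy model when needed."""
    if prefilter(image_bytes):  # pre-filter says "obviously safe": skip the expensive model
        return ModerationResult(image_id, "safe", 1.0)
    label, confidence = model(image_bytes)
    return ModerationResult(image_id, label, confidence)


def worker_loop(task_queue: "queue.Queue", results: dict, prefilter, model) -> None:
    """Drain the queue, moderating each upload and storing the verdict (the 'database')."""
    while True:
        try:
            image_id, image_bytes = task_queue.get_nowait()
        except queue.Empty:
            break
        results[image_id] = moderate(image_id, image_bytes, prefilter, model)
```

In production the loop would block on the consumer rather than drain and exit, and `results` would be a database write, but the shape of the flow (pre-filter, then model, then persist) is the part interviewers want to see.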

Key Points to Cover

  • Explicitly separating ingestion, processing, and review layers to prevent bottlenecks
  • Using a message queue to decouple upload traffic from compute-intensive ML inference
  • Implementing confidence-threshold logic to route uncertain cases to human reviewers
  • Addressing the need for horizontal scaling of worker nodes to handle variable load
  • Considering data privacy and model retraining pipelines as part of the lifecycle
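The confidence-threshold routing above reduces to a small decision function. The thresholds and the always-review category names here are illustrative assumptions, not fixed values; in practice they would be tuned per category against precision/recall targets.

```python
# Illustrative: categories that always get human eyes regardless of confidence.
ALWAYS_REVIEW = {"minor_safety", "graphic_violence"}


def route(label: str, confidence: float,
          auto_block_threshold: float = 0.95,
          auto_allow_threshold: float = 0.90) -> str:
    """Act automatically only when the model is confident; otherwise escalate."""
    if label in ALWAYS_REVIEW:
        return "human_review"
    if label == "nsfw" and confidence >= auto_block_threshold:
        return "blocked"
    if label == "safe" and confidence >= auto_allow_threshold:
        return "published"
    return "human_review"  # low-confidence or ambiguous: send to the moderation queue
```

The key design point is the asymmetry: the cost of a false block differs from the cost of a false allow, so the two thresholds should be tuned independently.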

Sample Answer

To design a scalable Image Moderation Service, I would start by defining the core requirement: processing millions of uploads asynchronously without blocking the user experience. The system begins when a user uploads an…

Common Mistakes to Avoid

  • Proposing synchronous processing, which would impose unacceptable latency on users uploading images
  • Ignoring the human review component entirely, assuming AI can achieve 100% accuracy
  • Failing to address how the system handles sudden traffic spikes or bursty upload patterns
  • Overlooking the importance of storing metadata about why an image was flagged for audit trails
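The last mistake above (no audit trail) is cheap to avoid: persist a record of why each image was flagged alongside the decision. A minimal sketch of such a record, with field names that are assumptions rather than any particular schema:

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class FlagAuditRecord:
    image_id: str
    model_version: str   # which model produced the verdict, for later debugging/retraining
    label: str
    confidence: float
    decision: str        # e.g. "blocked" or "human_review"
    flagged_at: float = field(default_factory=time.time)


def to_audit_log(record: FlagAuditRecord) -> str:
    """Serialize the record as JSON for an append-only audit log."""
    return json.dumps(asdict(record))
```

Recording the model version is what makes the trail useful: when a moderation decision is appealed or a model regresses, you can reconstruct exactly which model and score produced the flag.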

