Design the Twitter News Feed

System Design
Medium
Google

Design the system that generates the news feed for Twitter/X. Focus on the fan-out mechanism (push vs. pull), feed ranking, and handling celebrity users (hot spots).

Why Interviewers Ask This

Interviewers ask this to evaluate your ability to balance scalability with real-time performance in high-traffic systems. They specifically test your understanding of fan-out patterns, how you handle hot spots such as celebrity users without overloading the system, and whether you can reason about the trade-off between consistency and availability.

How to Answer This Question

  1. Clarify Requirements: Define scope (read vs. write QPS), latency goals, and scale (e.g., 500M daily active users).
  2. High-Level Architecture: Sketch a flow from Tweet creation to Feed retrieval, identifying core components like APIs, databases, and caches.
  3. Fan-Out Strategy: Debate Push vs. Pull models; recommend Hybrid for Twitter's specific mix of casual and celebrity users.
  4. Hot Spot Handling: Detail how to isolate celebrity feeds using pre-computation or specialized queues to prevent cache stampedes.
  5. Ranking & Refinement: Briefly explain how to integrate ML-based ranking logic post-retrieval.
  6. Trade-offs: Conclude by discussing consistency, storage costs, and failure scenarios.
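
The hybrid fan-out in steps 3 and 4 can be sketched in a few lines. This is a minimal in-memory illustration, not Twitter's actual implementation: the class name `FeedService`, the `celebrity_threshold` parameter, and the logical clock are all assumptions made for the example. Regular authors push each tweet into their followers' timelines at write time; authors at or above the threshold skip the push, and their tweets are pulled and merged at read time.

```python
import heapq
from collections import defaultdict


class FeedService:
    """Toy hybrid fan-out: push for regular authors, pull for celebrities."""

    def __init__(self, celebrity_threshold=3):
        # Hypothetical cutoff; a real system would use a far larger number.
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)       # author -> users following them
        self.following = defaultdict(set)       # user -> authors they follow
        self.timelines = defaultdict(list)      # user -> pushed (ts, tweet_id)
        self.author_tweets = defaultdict(list)  # author -> own (ts, tweet_id)
        self._clock = 0                         # logical timestamp for ordering

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, tweet_id):
        self._clock += 1
        entry = (self._clock, tweet_id)
        self.author_tweets[author].append(entry)
        # Write path: fan out on write only for non-celebrity authors.
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.timelines[follower].append(entry)

    def feed(self, user, limit=10):
        # Read path: start from the precomputed timeline, then pull
        # tweets from any followed celebrities and merge newest-first.
        sources = [self.timelines[user]]
        for author in self.following[user]:
            if self.is_celebrity(author):
                sources.append(self.author_tweets[author])
        merged = heapq.merge(*(reversed(s) for s in sources), reverse=True)
        seen, result = set(), []
        for _, tweet_id in merged:
            # De-duplicate: an author may have crossed the threshold
            # after some tweets were already pushed.
            if tweet_id not in seen:
                seen.add(tweet_id)
                result.append(tweet_id)
                if len(result) == limit:
                    break
        return result
```

In an interview, the point to draw out is that the write cost for a celebrity tweet stays constant, while the read cost grows only with the number of celebrities a user follows, which is typically small.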

Key Points to Cover

  • Propose a Hybrid Fan-Out strategy to balance load between push and pull
  • Explicitly address the 'Celebrity Hot Spot' problem with isolation techniques
  • Differentiate between write optimization and read optimization paths
  • Demonstrate awareness of caching layers (Redis/Memcached) for low latency
  • Articulate clear trade-offs regarding data consistency versus availability
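
To make the caching point concrete, here is a small sketch of a capped, least-recently-used timeline cache. It is an in-memory stand-in for a Redis or Memcached tier, not a client for either; `TimelineCache`, `load_from_store`, `max_users`, and `max_entries` are hypothetical names chosen for this example. Reads fall through to the slower store on a miss, and writes update a timeline only if it is already cached.

```python
from collections import OrderedDict, deque
from itertools import islice


class TimelineCache:
    """LRU cache of capped per-user timelines (newest tweet first)."""

    def __init__(self, load_from_store, max_users=1000, max_entries=800):
        # load_from_store: hypothetical slow-path loader that returns a
        # user's tweet ids, newest first (e.g. a database query).
        self.load_from_store = load_from_store
        self.max_users = max_users
        self.max_entries = max_entries
        self._cache = OrderedDict()  # user -> deque of tweet ids
        self.hits = 0
        self.misses = 0

    def get(self, user, limit=20):
        if user in self._cache:
            self.hits += 1
            self._cache.move_to_end(user)  # mark as most recently used
        else:
            self.misses += 1
            newest = islice(self.load_from_store(user), self.max_entries)
            self._cache[user] = deque(newest, maxlen=self.max_entries)
            # Evict least recently used users once over capacity.
            while len(self._cache) > self.max_users:
                self._cache.popitem(last=False)
        return list(islice(self._cache[user], limit))

    def push(self, user, tweet_id):
        # Write path: only touch timelines that are already cached.
        # appendleft on a full deque drops the oldest entry on the right.
        timeline = self._cache.get(user)
        if timeline is not None:
            timeline.appendleft(tweet_id)
```

Capping each timeline bounds memory per user, and skipping uncached users on the write path avoids spending cache space on accounts that rarely read their feed.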

Sample Answer

To design Twitter's feed, I first clarify that we need sub-second latency for billions of reads while handling massive write spikes. The core challenge is the fan-out mechanism. A pure pull model is too slow for millions…

Common Mistakes to Avoid

  • Ignoring the scale difference between regular users and celebrities, leading to an inefficient all-push or all-pull solution
  • Focusing solely on database schema without explaining the real-time data propagation mechanism
  • Overlooking the impact of network latency when aggregating feeds from multiple sources
  • Failing to define clear metrics for success, such as tail latency requirements or throughput targets
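
A success metric such as tail latency is easier to discuss once it is stated precisely. The sketch below uses the nearest-rank percentile method on a list of measured feed-read latencies; the function names `percentile` and `meets_slo` and the millisecond budget are assumptions for illustration, not part of any specific monitoring tool.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_slo(latencies_ms, p, budget_ms):
    """True if the p-th percentile latency is within the budget."""
    return percentile(latencies_ms, p) <= budget_ms
```

For example, "p99 feed read latency under 200 ms" becomes `meets_slo(latencies_ms, 99, 200)`, which gives the rest of the design a concrete target to be measured against.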
