Design a Twitter Feed (Sorted Set/Time Series)

Data Structures
Medium
Meta

Design the data structures required to maintain a user's chronological Twitter/X feed, supporting billions of posts. Focus on time series databases or Redis sorted sets.

Why Interviewers Ask This

Interviewers ask this to evaluate your ability to model time-series data under extreme scale constraints. They specifically want to see if you understand the trade-offs between push and pull architectures for news feeds, how Redis Sorted Sets handle ranking by timestamp, and your skill in optimizing read-heavy workloads for billions of records.

How to Answer This Question

1. Clarify requirements: Define the read/write ratio, latency goals (e.g., <200ms for a feed read), and consistency needs typical of Meta's high-scale environment.
2. Propose a hybrid architecture: Explain that a 'Push' fan-out on write copies each new post ID into followers' pre-computed timelines, which keeps reads cheap at the cost of extra writes, while accounts with very large follower counts are served by a 'Pull' at read time so that a single post does not trigger millions of writes.
3. Detail the Data Structure: Describe using Redis Sorted Sets where the score is the Unix timestamp and the member is the post ID, giving O(log N) insertion and O(log N + M) retrieval of the M newest entries.
4. Address Edge Cases: Discuss handling user un-follows, spam filtering, and the memory cost of storing a feed per user across many millions of users, which is why each timeline is usually capped to a fixed number of recent entries.
5. Optimize for Scale: Mention caching strategies and sharding approaches (for example, sharding timelines by user ID) to distribute load across clusters, demonstrating awareness of distributed system limitations.
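
The sorted-set step can be illustrated with a small in-memory model. This is only a sketch, not a Redis client: the `Timeline` class, its method names, and the `max_entries` cap of 800 are hypothetical, chosen to mirror what `ZADD`, `ZREVRANGE`, and `ZREMRANGEBYRANK` would do against a real sorted set. It uses a plain Python list with `bisect`, so insertion here is O(N) rather than the O(log N) a Redis sorted set provides.

```python
import bisect


class Timeline:
    """In-memory stand-in for one user's feed: score = timestamp, member = post ID."""

    def __init__(self, max_entries=800):
        self.max_entries = max_entries  # hypothetical cap on stored entries
        self._entries = []              # (timestamp, post_id) tuples, ascending
        self._scores = {}               # post_id -> timestamp, keeps members unique

    def add(self, post_id, timestamp):
        # Mirrors ZADD: re-adding an existing member replaces its score.
        old = self._scores.get(post_id)
        if old is not None:
            self._entries.remove((old, post_id))
        bisect.insort(self._entries, (timestamp, post_id))
        self._scores[post_id] = timestamp
        # Mirrors ZREMRANGEBYRANK key 0 -(max_entries + 1): drop the oldest overflow.
        while len(self._entries) > self.max_entries:
            _, evicted = self._entries.pop(0)
            del self._scores[evicted]

    def latest(self, count, offset=0):
        # Mirrors ZREVRANGE key offset (offset + count - 1): newest first.
        newest_first = self._entries[::-1]
        return [post_id for _, post_id in newest_first[offset:offset + count]]
```

Paging through the feed then amounts to calling `latest(count, offset)` with a growing offset; in an interview it is worth noting that cursor-based paging on the timestamp score is more stable than offsets when new posts keep arriving.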

Key Points to Cover

  • Explicitly choosing between Push vs. Pull architectures based on follower count
  • Leveraging Redis Sorted Sets for O(log N) inserts and O(log N + M) timestamp-ordered range queries
  • Addressing the 'Fan-out' explosion problem for users with millions of followers
  • Defining clear trade-offs between read latency and write throughput
  • Incorporating soft-deletion strategies (marking a post as deleted and filtering it at read time) to handle content removal without rewriting every follower's timeline
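
The push/pull split and soft deletion from the points above can be sketched together. Everything here is an illustrative assumption: the `CELEBRITY_THRESHOLD` value, the in-memory dictionaries standing in for storage, and the function names are hypothetical. Authors below the threshold fan out on write; authors at or above it are merged in at read time, and deleted posts are filtered through a tombstone set rather than removed from every feed.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000      # hypothetical follower cutoff for switching to pull

followers = defaultdict(set)      # author -> users who follow them
following = defaultdict(set)      # user -> authors they follow
pushed_feeds = defaultdict(list)  # user -> [(timestamp, post_id)] filled on write
author_posts = defaultdict(list)  # author -> [(timestamp, post_id)] used for pull
deleted = set()                   # tombstones: soft-deleted post IDs


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author, threshold):
    return len(followers[author]) >= threshold


def publish(author, post_id, timestamp, threshold=CELEBRITY_THRESHOLD):
    author_posts[author].append((timestamp, post_id))
    if not is_celebrity(author, threshold):
        # Push: copy the post reference into each follower's feed.
        for user in followers[author]:
            pushed_feeds[user].append((timestamp, post_id))


def read_feed(user, count, threshold=CELEBRITY_THRESHOLD):
    # Start from the pre-computed feed, then pull celebrity posts at read time.
    sources = [sorted(pushed_feeds[user], reverse=True)]
    for author in following[user]:
        if is_celebrity(author, threshold):
            sources.append(sorted(author_posts[author], reverse=True))
    feed, seen = [], set()
    for _, post_id in heapq.merge(*sources, reverse=True):
        # Skip tombstoned posts and any post reachable via both push and pull.
        if post_id in deleted or post_id in seen:
            continue
        seen.add(post_id)
        feed.append(post_id)
        if len(feed) == count:
            break
    return feed
```

The trade-off this makes visible: a celebrity post costs one write regardless of follower count, but every reader who follows that account pays for an extra merge on each feed read.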

Sample Answer

To design a scalable Twitter feed for billions of posts, I would prioritize low-latency reads over strong write consistency, which aligns with Meta's focus on user experience at scale. The core challenge is balancing the…

Common Mistakes to Avoid

  • Suggesting a pure SQL relational database for real-time feed generation, ignoring performance bottlenecks
  • Ignoring the memory overhead of storing duplicate tweets for every follower in a naive push model
  • Failing to distinguish between active and inactive users when selecting the feed generation strategy
  • Overlooking the need for eventual consistency in a distributed environment
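
The memory-overhead mistake is easiest to see with back-of-envelope arithmetic. The figures below are illustrative assumptions only (100 million stored feeds, 800 entries each, 16 bytes for an ID plus score, roughly 1 KB for a full post body); they are not measurements, and real Redis sorted sets add per-entry overhead on top of the raw payload.

```python
def feed_storage_bytes(users, entries_per_feed, bytes_per_entry):
    """Raw payload size of all pre-computed feeds, ignoring data-structure overhead."""
    return users * entries_per_feed * bytes_per_entry


USERS = 100_000_000        # hypothetical number of stored feeds
ENTRIES_PER_FEED = 800     # hypothetical cap per timeline
ID_ENTRY_BYTES = 16        # assumed: 8-byte post ID + 8-byte timestamp score
FULL_POST_BYTES = 1_000    # assumed: post text plus metadata copied per follower

ids_only = feed_storage_bytes(USERS, ENTRIES_PER_FEED, ID_ENTRY_BYTES)
full_copies = feed_storage_bytes(USERS, ENTRIES_PER_FEED, FULL_POST_BYTES)

print(f"IDs only:    {ids_only / 1e12:.2f} TB")
print(f"Full copies: {full_copies / 1e12:.2f} TB")
```

Under these assumptions, storing only post IDs in each feed and fetching post bodies from a separate store keeps the fan-out data more than an order of magnitude smaller than duplicating full posts per follower.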
