Design a Distributed Unique ID Generator (Snowflake)

System Design
Medium
Google
111.5K views

Design a service to generate globally unique, monotonically increasing IDs (like Twitter's Snowflake). Discuss structure, time component, and high availability.

Why Interviewers Ask This

Interviewers at Google ask this to evaluate your ability to design scalable systems under strict constraints. They specifically test your understanding of clock synchronization, collision avoidance in distributed environments, and how to handle node failures without compromising ID uniqueness or monotonicity.

How to Answer This Question

1. Clarify Requirements: Define the scale (e.g., billions per day), latency needs, and whether IDs must be strictly monotonically increasing or just sortable. 2. Propose Structure: Outline a bit-field layout similar to Snowflake, allocating bits for timestamp, machine ID, and sequence number. 3. Address Time Handling: Discuss using high-resolution clocks and handling clock skew by falling back to sequence numbers if time doesn't advance. 4. Ensure Uniqueness: Explain how to generate unique worker IDs and handle sequence number rollover within a millisecond. 5. High Availability: Detail strategies like pre-allocating IDs or using a centralized service with replication for fault tolerance. 6. Trade-offs: Conclude by comparing your solution against alternatives like UUIDs or database auto-increments, highlighting why your approach fits Google's scale.

Key Points to Cover

  • Explicitly defining the bit-field allocation strategy for timestamp, machine ID, and sequence
  • Demonstrating a concrete strategy for handling clock skew and negative time deltas
  • Explaining how to assign unique worker IDs without creating a single point of failure
  • Discussing the trade-off between strict monotonicity and system availability during clock issues
  • Calculating capacity based on the proposed bit distribution to prove scalability

Sample Answer

To design a globally unique, monotonically increasing ID generator, I would adapt the Snowflake algorithm while addressing its specific challenges. First, I'd define the requirements: we need roughly 10 billion IDs daily…

Common Mistakes to Avoid

  • Ignoring clock synchronization issues entirely, assuming all servers have perfectly synchronized time
  • Failing to explain how to prevent duplicate IDs when the sequence number overflows within a single millisecond
  • Proposing a centralized database for every ID generation request, which creates a bottleneck and violates scalability goals
  • Overlooking the need for a unique identifier per node, leading to potential collisions if multiple services run on the same host

Sound confident on this question in 5 minutes

Answer once and get a 30-second AI critique of your structure, content, and delivery. First attempt is free — no signup needed.

Try it free

Related Interview Questions

Browse all 190 System Design questionsBrowse all 145 Google questions