Design a Distributed Counter Service

System Design
Medium
Google

Design a service to reliably increment/decrement millions of shared counters (e.g., likes, views) across distributed systems. Discuss eventual vs. strong consistency.

Why Interviewers Ask This

Interviewers at Google ask this to evaluate your ability to balance trade-offs between consistency, availability, and partition tolerance in high-scale environments. They specifically want to see if you can design a system that handles millions of concurrent writes without locking bottlenecks, while making informed decisions about eventual versus strong consistency based on business requirements.

How to Answer This Question

1. Clarify requirements immediately: Ask about read-to-write ratios, latency constraints, and whether counters must be accurate in real time or whether eventual consistency is acceptable.
2. Define the scope: Determine whether counters are global or per-user, and estimate the throughput (e.g., millions of QPS).
3. Propose a baseline architecture: Suggest a sharded key-value store where each shard manages a subset of counters to distribute load.
4. Address the core challenge: Discuss how to handle atomic increments using an in-memory store such as Redis (native INCR/DECR commands, or Lua scripts for multi-step logic) or Memcached (atomic incr/decr), or a database with optimistic locking.
5. Resolve consistency conflicts: Explain how updates merge across regions. Last-writer-wins silently drops concurrent increments, so for counters prefer a merge that preserves every update, such as per-replica counts combined CRDT-style (G-Counter/PN-Counter). Vector clocks can detect conflicting versions but do not resolve them on their own.
6. Optimize for scale: Mention caching layers, batched writes to reduce I/O, and asynchronous replication to improve availability.
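
For the cross-region merge in step 5, one option that does not lose concurrent updates is a state-based PN-Counter. The following is a minimal sketch, not a production design: the `PNCounter` class and the replica IDs `us-east` and `eu-west` are hypothetical names chosen for illustration. Each replica only ever raises its own tallies, and a merge takes the per-replica maximum, so merging is safe to repeat and to apply in any order.

```python
from collections import defaultdict


class PNCounter:
    """State-based PN-Counter (hypothetical illustration).

    Each replica keeps a grow-only tally of its own increments and
    its own decrements. The counter value is the difference of sums.
    """

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = defaultdict(int)  # replica_id -> total increments seen
        self.decs = defaultdict(int)  # replica_id -> total decrements seen

    def increment(self, amount=1):
        self.incs[self.replica_id] += amount

    def decrement(self, amount=1):
        self.decs[self.replica_id] += amount

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        # Per-replica max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.incs.items():
            self.incs[rid] = max(self.incs[rid], n)
        for rid, n in other.decs.items():
            self.decs[rid] = max(self.decs[rid], n)


# Two regions accept writes independently, then exchange state.
us = PNCounter("us-east")
eu = PNCounter("eu-west")
us.increment(3)
eu.increment(2)
eu.decrement(1)
us.merge(eu)
eu.merge(us)
```

After the exchange both replicas report the same value, and every increment and decrement from either region is reflected in it. The cost is that state grows with the number of replicas, which is usually acceptable when replicas are regions rather than individual clients.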

Key Points to Cover

  • Explicitly choosing eventual consistency for high-volume metrics to maximize performance
  • Using sharding to distribute write load, with replication so that no shard is a single point of failure
  • Leveraging atomic operations in memory stores like Redis for sub-millisecond latency
  • Explaining the trade-off between data accuracy and system availability clearly
  • Demonstrating knowledge of CAP theorem implications in a real-world scenario
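
The sharding and atomic-increment points can be illustrated with a small in-memory sketch. It assumes a single hot counter split across several sub-shards: each write picks a shard at random and updates it under that shard's lock, and a read sums all shards. The `ShardedCounter` class is hypothetical, and the locks here stand in for what would be an atomic command such as Redis `INCRBY` on a per-shard key in a real deployment.

```python
import random
import threading


class ShardedCounter:
    """One logical counter spread over several sub-shards (illustrative)."""

    def __init__(self, num_shards=8):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def add(self, delta=1):
        # Writers contend only on the shard they happen to pick,
        # not on a single global lock.
        i = random.randrange(len(self._shards))
        with self._locks[i]:
            self._shards[i] += delta

    def value(self):
        # Reads take no locks, so a read that overlaps with writes may be
        # slightly stale; it is exact once writers have finished.
        return sum(self._shards)


def worker(counter, n):
    for _ in range(n):
        counter.add()


likes = ShardedCounter(num_shards=4)
threads = [threading.Thread(target=worker, args=(likes, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = likes.value()
```

More shards reduce write contention but make each read touch more keys, which is the same accuracy-versus-throughput trade-off described in the points above.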

Sample Answer

To design a distributed counter service for millions of operations, I first clarify that for metrics like 'likes' or 'views', eventual consistency is usually sufficient, allowing us to prioritize availability and low lat…

Common Mistakes to Avoid

  • Jumping straight to a SQL database solution without considering write throughput limitations
  • Ignoring the need for sharding, leading to a design that cannot scale horizontally
  • Failing to distinguish between the needs of a counter versus a financial transaction ledger
  • Overcomplicating the solution with complex consensus algorithms like Paxos when simpler methods suffice
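
On the first point above, a common way to relieve write pressure on a durable store is to coalesce increments in memory and flush them as one batch. The sketch below assumes a hypothetical `BatchingBuffer` class and a plain dictionary standing in for the database; the key names are made up for illustration.

```python
import threading
from collections import Counter


class BatchingBuffer:
    """Coalesces per-key deltas and hands them to flush_fn in one batch."""

    def __init__(self, flush_fn):
        self._pending = Counter()
        self._lock = threading.Lock()
        self._flush_fn = flush_fn

    def add(self, key, delta=1):
        with self._lock:
            self._pending[key] += delta

    def flush(self):
        # Swap the pending map out under the lock, then write outside it
        # so slow storage does not block incoming increments.
        with self._lock:
            batch, self._pending = self._pending, Counter()
        if batch:
            self._flush_fn(dict(batch))


store = {}   # stand-in for the durable store
writes = []  # records each batch so the number of store writes is visible


def apply_batch(batch):
    writes.append(batch)
    for key, delta in batch.items():
        store[key] = store.get(key, 0) + delta


buf = BatchingBuffer(apply_batch)
for _ in range(500):
    buf.add("video:42:views")
buf.add("post:7:likes", 3)
buf.add("post:7:likes", -1)
buf.flush()
```

Here 502 logical updates reach the store as a single batch touching two keys. The trade-off is that increments still in the buffer are lost if the process crashes before a flush, which is tolerable for view counts but not for a financial ledger.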
