Design an Image Hosting Service (Instagram/Flickr)

System Design
Medium
Meta

Design a service to store and retrieve billions of images. Focus on file storage (S3/BLOB), image processing/resizing, and using Content Delivery Networks (CDNs).

Why Interviewers Ask This

Interviewers at Meta ask this to evaluate your ability to architect scalable systems handling massive unstructured data. They specifically test your understanding of storage trade-offs, the critical role of CDNs in latency reduction, and how to design efficient image processing pipelines that decouple heavy compute from user requests.

How to Answer This Question

1. Clarify requirements: Define scale (billions of images), the read/write ratio, and specific features such as resizing or filters.
2. High-level architecture: Propose a client that uploads through an API gateway, which stores the original and places a job on a worker queue.
3. Storage strategy: Recommend object storage such as S3 for raw images rather than a database, emphasizing its durability.
4. Processing pipeline: Design an asynchronous workflow in which workers resize each uploaded image into multiple thumbnail sizes.
5. Delivery optimization: Explain how CDNs, combined with a cache invalidation strategy, serve global users quickly.
6. Scalability: Discuss sharding strategies for the metadata database and auto-scaling worker groups to handle the traffic spikes typical of social media platforms.
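
The upload and processing steps above can be sketched in a few lines. This is a minimal illustration rather than a production design: the object store, metadata database, and queue are in-memory stand-ins, and the names `ImageService`, `THUMBNAIL_WIDTHS`, and the key layout are assumptions made for the example. It shows the request path returning as soon as the original is stored and a job is queued, while a separate worker step registers the resized variants.

```python
import hashlib
import queue
from dataclasses import dataclass, field

# Hypothetical thumbnail widths in pixels; a real product would pick its own.
THUMBNAIL_WIDTHS = (150, 640, 1080)


@dataclass
class ImageService:
    """In-memory stand-ins for the object store, metadata DB, and job queue."""
    object_store: dict = field(default_factory=dict)   # key -> bytes
    metadata_db: dict = field(default_factory=dict)    # image_id -> record
    jobs: queue.Queue = field(default_factory=queue.Queue)

    def upload(self, user_id: int, data: bytes, width: int, height: int) -> str:
        """Request path: store the original, record metadata, enqueue a job."""
        image_id = hashlib.sha256(data).hexdigest()[:16]
        raw_key = f"raw/{user_id}/{image_id}"  # assumed key layout
        self.object_store[raw_key] = data
        self.metadata_db[image_id] = {
            "user_id": user_id, "raw_key": raw_key,
            "width": width, "height": height,
            "status": "processing", "variants": {},
        }
        self.jobs.put(image_id)  # resizing happens later, off the request path
        return image_id

    def run_worker_once(self) -> None:
        """Worker path: take one job and register its resized variants."""
        image_id = self.jobs.get()
        record = self.metadata_db[image_id]
        for target_w in THUMBNAIL_WIDTHS:
            if target_w >= record["width"]:
                continue  # never upscale beyond the original
            # Preserve the aspect ratio when computing the new height.
            target_h = round(record["height"] * target_w / record["width"])
            key = f"thumb/{record['user_id']}/{image_id}_{target_w}"
            # A real worker would decode and resize pixels here;
            # this sketch only stores a placeholder payload.
            self.object_store[key] = b"resized-placeholder"
            record["variants"][target_w] = {"key": key, "height": target_h}
        record["status"] = "ready"
        self.jobs.task_done()
```

Calling `upload` leaves the record in the `processing` state until a worker has run, which is the decoupling an interviewer is likely to probe: upload latency does not depend on how many variants are generated.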

Key Points to Cover

  • Explicitly separating raw image storage (Object Store) from metadata storage (Database)
  • Designing an asynchronous processing pipeline using message queues to decouple uploads from resizing
  • Justifying the use of CDNs for reducing latency in a globally distributed user base
  • Addressing scalability through database sharding strategies based on user IDs
  • Discussing cost and performance trade-offs between different image formats and compression levels
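
To make the sharding point concrete, one simple approach is to hash the user ID and take the result modulo the shard count. The sketch below assumes a fixed, hypothetical `NUM_SHARDS` and a helper named `shard_for_user`; neither comes from a specific system. It uses a cryptographic digest only to get a hash that is stable across processes and machines, so every service instance routes a given user to the same shard.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count chosen for illustration


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a metadata shard using a process-independent hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Keeping all of a user's image metadata on one shard makes "list this user's photos" a single-shard query. The trade-off worth mentioning is that plain modulo sharding remaps most keys when the shard count changes; consistent hashing or a lookup table are common alternatives.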

Sample Answer

To design an Instagram-like service, we first clarify that while we store billions of images, reads vastly outnumber writes. We need low-latency retrieval globally. For storage, we should never put raw images in a SQL da…

Common Mistakes to Avoid

  • Storing binary image data directly in a relational database instead of using object storage
  • Forgetting to mention Content Delivery Networks, leading to high latency for international users
  • Attempting to process all image resizing synchronously, causing poor user experience during uploads
  • Ignoring cache invalidation strategies when users update their profile pictures frequently
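
For the cache invalidation point, one widely used option is to avoid explicit invalidation by putting a content-derived version in the URL, so an updated picture gets a new URL and the old one simply ages out of the cache. The sketch below assumes a hypothetical CDN hostname and path layout, and the function name `versioned_url` is made up for the example.

```python
import hashlib

CDN_HOST = "https://cdn.example.com"  # hypothetical CDN hostname


def versioned_url(user_id: int, image_bytes: bytes, width: int) -> str:
    """Build a CDN URL whose path changes whenever the image content changes."""
    # A short content hash acts as the version; new bytes give a new URL.
    version = hashlib.sha256(image_bytes).hexdigest()[:12]
    return f"{CDN_HOST}/profile/{user_id}/{version}_{width}.jpg"
```

Because each URL refers to immutable content, it can be served with a long cache lifetime. The metadata record then only needs to point at the current URL, and explicit purges are reserved for cases such as takedowns.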
