System design interviews can make or break your chances at top tech companies. Unlike coding interviews with clear right answers, system design requires you to navigate ambiguity, make trade-offs, and communicate complex ideas clearly.

This cheat sheet gives you a structured approach and quick references for your next system design interview at Google, Meta, Amazon, Netflix, or any top company. For a deeper dive into system design concepts, see our Complete Guide to System Design.

The 5-Step Framework

Use this framework for every system design interview:

Step 1: Clarify Requirements (3-5 minutes)

Never start designing without asking questions first.

Functional Requirements:

What are the core features?
Who are the users?
What should users be able to do?

Non-Functional Requirements:

What's the expected scale? (users, requests/second)
What latency is acceptable?
What availability is required? (99.9%? 99.99%?)
Any compliance requirements? (GDPR, HIPAA)
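
To make the availability targets above concrete, it can help to translate each percentage into a yearly downtime budget. The sketch below is a rough illustration only; the function name `allowed_downtime_minutes` is made up for this example, and it assumes a 365-day year.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (assumes a non-leap year)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return SECONDS_PER_YEAR * unavailable_fraction / 60


for target in (99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability -> {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.9% allows roughly 8.8 hours of downtime per year, while 99.99% allows roughly 53 minutes, which is why each extra "nine" usually changes the design.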

Example Questions:

"How many daily active users should we design for?"
"What's the read-to-write ratio?"
"Should we prioritize consistency or availability?"
"Is this a global service or regional?"

Step 2: Estimate Scale (2-3 minutes)

Do back-of-envelope calculations to inform your design.

Traffic Estimates:

DAU = 100M users
Avg requests per user = 10/day
Total daily requests = 1B
Requests per second = 1B / 86400 ≈ 12K RPS

Storage Estimates:

New records per day = 10M
Record size = 1KB
Daily storage = 10M × 1KB = 10GB
Annual storage = 10GB × 365 ≈ 3.6TB
5-year storage ≈ 18TB

Memory Estimates (for caching):

Hot data = 20% of one year's data
Cache size = 3.6TB × 0.2 = 720GB
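
The estimates above can be reproduced with a few lines of arithmetic. This is a sketch using the same example inputs; the helper names are invented for illustration, and the exact results differ slightly from the rounded figures in the text (3.65TB rather than 3.6TB, for instance).

```python
SECONDS_PER_DAY = 86_400


def requests_per_second(dau: int, requests_per_user_per_day: int) -> float:
    """Average request rate, assuming traffic is spread evenly over the day."""
    return dau * requests_per_user_per_day / SECONDS_PER_DAY


def storage_bytes(records_per_day: int, record_size_bytes: int, days: int) -> int:
    """Raw storage for new records, ignoring indexes, replication and compression."""
    return records_per_day * record_size_bytes * days


rps = requests_per_second(100_000_000, 10)       # about 11.6K, rounded to 12K above
daily = storage_bytes(10_000_000, 1_000, 1)      # 10 GB
annual = storage_bytes(10_000_000, 1_000, 365)   # 3.65 TB
cache = annual * 0.2                             # hot 20% of one year's data

print(f"RPS ~ {rps:,.0f}")
print(f"Daily storage = {daily / 1e9:.0f} GB")
print(f"Annual storage = {annual / 1e12:.2f} TB")
print(f"Cache size = {cache / 1e9:.0f} GB")
```

In an interview the rounded values are what matter; the point is to show the interviewer where each number comes from.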

Step 3: High-Level Design (10-15 minutes)

Draw the major components and their interactions.

Standard Components:

Clients (web, mobile, API consumers)
Load Balancer
API Gateway
Application Servers
Cache Layer (Redis/Memcached)
Database (primary + replicas)
Message Queue
Background Workers
CDN (for static content)
Monitoring/Logging

Draw the Flow:

Start with the client
Show request path through load balancer
Show application tier
Show data layer
Show async processing (if applicable)

Step 4: Deep Dive (15-20 minutes)

Pick 2-3 critical components and explain them in detail.

For Each Component:

Why this technology choice?
How does it handle scale?
What are the failure modes?
How do you monitor it?

Common Deep Dives:

Database schema and query patterns
Caching strategy and invalidation
API design and authentication
Data partitioning strategy

Step 5: Wrap Up (3-5 minutes)

Address remaining concerns.

Discuss:

Bottlenecks and how to address them
Single points of failure
Future scaling considerations
Monitoring and alerting strategy
Security considerations

Quick Reference: Components

Load Balancers

When to Use: Whenever traffic is spread across more than one server

Options:

Type	Example	Use Case
L4 (Transport)	AWS NLB	High throughput, TCP/UDP
L7 (Application)	AWS ALB, NGINX	HTTP routing, SSL termination

Algorithms:

Round Robin: Simple, equal distribution
Least Connections: Route to least busy server
IP Hash: Session affinity
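
The three algorithms above can be sketched in a few lines each. This is a simplified, single-process illustration, not how any particular load balancer implements them; the class and function names are hypothetical, and server names are plain strings.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Route to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: list) -> str:
    """Map a client IP to the same server every time (session affinity)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that the IP hash mapping changes for many clients when the server list changes, which is one reason session state is often moved to a shared store instead.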

Databases

Relational (SQL):

PostgreSQL, MySQL
ACID transactions
Complex queries, JOINs
Vertical scaling primarily

Document (NoSQL):

MongoDB, DynamoDB
Flexible schemas
Horizontal scaling
Often eventual consistency by default (stronger reads are usually configurable)

Key-Value:

Redis, Memcached
Ultra-fast reads
Simple data model
Great for caching

When to Choose:

Need transactions? → SQL
Need flexibility? → Document
Need speed? → Key-Value
Need relationships? → SQL or Graph

Caching

Cache-Aside Pattern:

1. Check cache
2. If miss, read from DB
3. Write to cache
4. Return data
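
The four steps above might look like this in code. This is a minimal sketch: `TTLCache` is an in-memory stand-in for Redis or Memcached, the database is a plain dict, and `get_user` and the key format are made up for the example.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server, with per-key expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_user(user_id, cache, db, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)                  # 1. Check cache
    if cached is not None:
        return cached
    user = db.get(user_id)                   # 2. If miss, read from DB
    if user is not None:
        cache.set(key, user, ttl_seconds)    # 3. Write to cache
    return user                              # 4. Return data
```

A second call for the same user within the TTL is served from the cache, even if the database row has changed in the meantime. That staleness window is the trade-off the invalidation strategies address.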

Cache Invalidation:

TTL: Simplest, eventual staleness
Write-through: Update cache on every write
Event-based: Invalidate on data change

What to Cache:

Database query results
API responses
Session data
Computed values

Message Queues

When to Use:

Async processing needed
Decouple services
Handle traffic spikes
Improve delivery reliability (retries, at-least-once delivery)

Options:

Queue	Best For
Kafka	High throughput, event streaming
RabbitMQ	Task queues, routing logic
SQS	AWS integration, simplicity
Redis Streams	Low latency, real-time

CDN

What to Cache:

Static assets (images, CSS, JS)
API responses (with appropriate TTL)
HTML pages

Benefits:

Reduce latency (edge locations)
Offload origin traffic
DDoS protection

Quick Reference: Patterns

Database Scaling

Read Replicas:

Create read-only copies
Route reads to replicas
Primary handles writes
Replicas may lag the primary (eventually consistent under asynchronous replication)

Sharding:

Split data across databases
Choose a shard key with high cardinality and evenly distributed access
Avoid cross-shard queries

Sharding Strategies:

Hash-based: hash(user_id) % num_shards
Range-based: users A-M shard 1, N-Z shard 2
Geographic: US shard, EU shard
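
As a rough illustration of the first two strategies, the sketch below routes a key to a shard. The function names are hypothetical, and a stable hash from `hashlib` is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib


def hash_shard(user_id: str, num_shards: int) -> int:
    """Hash-based: stable_hash(user_id) % num_shards, giving a 0-based shard index."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(username: str) -> int:
    """Range-based: names starting with A-M go to shard 1, N-Z to shard 2.

    Assumes the username starts with an ASCII letter.
    """
    return 1 if username[0].upper() <= "M" else 2


print(hash_shard("user-42", 4))
print(range_shard("alice"), range_shard("nina"))
```

One trade-off worth mentioning: with plain modulo hashing, changing `num_shards` remaps most keys, so consistent hashing is a common alternative when shards are added or removed often.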

API Patterns

REST:

GET    /users/123      # Read user
POST   /users          # Create user
PUT    /users/123      # Replace user (PATCH for partial updates)
DELETE /users/123      # Delete user

Pagination:

Offset-based: ?page=2&limit=20
Cursor-based: ?cursor=abc123&limit=20 (preferred for large datasets)
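
Cursor-based pagination can be sketched as follows. This is an illustrative example over an in-memory list sorted by a unique `id`; the cursor format (base64-encoded JSON holding the last seen id) and the function names are assumptions, not a standard.

```python
import base64
import json


def encode_cursor(last_id: int) -> str:
    """Pack the last seen id into an opaque, URL-safe token."""
    payload = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(payload).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["last_id"]


def fetch_page(rows, cursor=None, limit=20):
    """Return (page, next_cursor). `rows` must be sorted by ascending unique id.

    Assumes limit >= 1. In SQL this is roughly:
    WHERE id > :last_id ORDER BY id LIMIT :limit + 1
    """
    last_id = decode_cursor(cursor) if cursor else None
    matching = [r for r in rows if last_id is None or r["id"] > last_id]
    window = matching[: limit + 1]  # fetch one extra row to detect another page
    page = window[:limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(window) > limit else None
    return page, next_cursor
```

Unlike offset-based paging, the query seeks directly to the last seen id, so deep pages stay cheap and rows inserted between requests do not shift the results.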

Rate Limiting:

Token bucket: Smooth traffic
Sliding window: Rolling count

Consistency Patterns

Strong Consistency:

All reads see latest write
Higher latency
Use for: Financial transactions

Eventual Consistency:

Reads may see stale data temporarily
Lower latency, higher availability
Use for: Social feeds, likes, views

Read-Your-Writes:

User sees their own writes immediately
Others may see stale data
Use for: Profile updates

Common Interview Questions

URL Shortener (bit.ly)

Key Points:

Generate unique short codes
Redirect to original URL
Analytics (optional)

Scale:

100M URLs created/month
10B redirects/month (100:1 read ratio)

Components:

Application servers (stateless)
Key-Value store for URL mapping
Counter service for analytics
Cache for hot URLs

Short Code Generation:

Hash + collision handling
Counter + base62 encoding
UUID (longer but simple)
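
The counter + base62 option can be sketched like this. The alphabet ordering and function names are choices made for this example; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```python
import string

# 0-9, a-z, A-Z: 62 characters in total.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the counter value from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(encode_base62(123_456_789))
```

Seven base62 characters cover 62^7, about 3.5 trillion codes, which is far more than 100M new URLs per month would consume. Sequential counters do make codes guessable, which is worth raising as a trade-off.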

Social Media Feed (Twitter)

Key Points:

Timeline generation
Fan-out on write vs. read
Real-time updates

Scale:

500M DAU
12K writes/sec, 600K reads/sec

Approaches:

Fan-out on Write: Pre-compute timelines
- Pro: Fast reads
- Con: Slow, expensive writes for accounts with many followers (celebrities)
Fan-out on Read: Compute on request
- Pro: Fast writes
- Con: Slow reads

Hybrid: Fan-out on write for regular users, fan-out on read for celebrities
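
A toy version of the hybrid approach is sketched below, using in-memory dictionaries in place of real storage. The `FeedService` class, its method names, and the follower threshold are all hypothetical; real systems pick the threshold empirically.

```python
from collections import defaultdict


class FeedService:
    def __init__(self, celebrity_threshold=10_000):
        self.celebrity_threshold = celebrity_threshold  # hypothetical cutoff
        self.followers = defaultdict(set)        # author -> users following them
        self.following = defaultdict(set)        # user -> authors they follow
        self.timelines = defaultdict(list)       # user -> precomputed posts
        self.posts_by_author = defaultdict(list)
        self._clock = 0                          # logical timestamp for ordering

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, text):
        self._clock += 1
        post = (self._clock, author, text)
        self.posts_by_author[author].append(post)
        if not self.is_celebrity(author):
            # Fan-out on write: push into every follower's timeline now.
            for follower in self.followers[author]:
                self.timelines[follower].append(post)

    def timeline(self, user, limit=20):
        merged = set(self.timelines[user])
        # Fan-out on read: pull celebrity posts at request time.
        for author in self.following[user]:
            if self.is_celebrity(author):
                merged.update(self.posts_by_author[author])
        return sorted(merged, reverse=True)[:limit]
```

Regular authors pay the cost once per post, while celebrity posts are merged in only when a follower actually loads their feed. The set also removes duplicates if an author crossed the threshold after some posts were already pushed.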

Chat System (WhatsApp)

Key Points:

Real-time messaging
Delivery guarantees
Online presence

Components:

WebSocket servers (persistent connections)
Message queue (for reliability)
Database (for history)
Presence service

Message Flow:

Client A sends message via WebSocket
Server queues message
Server pushes to Client B (if online)
Store in DB
Acknowledge to Client A
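
The message flow above can be simulated in a single process to make the ordering concrete. This is only a sketch: a list stands in for each WebSocket connection, a `deque` for the message queue, and another list for the database. Holding messages for offline users in `pending` is an assumption added for the example.

```python
from collections import defaultdict, deque


class ChatServer:
    def __init__(self):
        self.online = {}                   # user -> inbox (stand-in for a WebSocket)
        self.queue = deque()               # stand-in for the message queue
        self.history = []                  # stand-in for the database
        self.pending = defaultdict(list)   # assumption: held for offline users

    def connect(self, user):
        inbox = self.pending.pop(user, [])  # deliver anything missed while offline
        self.online[user] = inbox
        return inbox

    def send(self, sender, recipient, text):
        # 1. Client A sends a message (this call stands in for the WebSocket frame)
        message = {"id": len(self.history) + 1, "from": sender,
                   "to": recipient, "text": text}
        self.queue.append(message)          # 2. Server queues message
        self._drain()
        return {"ack": message["id"]}       # 5. Acknowledge to Client A

    def _drain(self):
        while self.queue:
            message = self.queue.popleft()
            inbox = self.online.get(message["to"])
            if inbox is not None:
                inbox.append(message)       # 3. Push to Client B (if online)
            else:
                self.pending[message["to"]].append(message)
            self.history.append(message)    # 4. Store in DB
```

In a real deployment the queue and the database are separate services and the acknowledgement is typically sent once the message is durably queued, but the sequence of responsibilities is the same.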

Video Streaming (Netflix)

Key Points:

Video upload and processing
Adaptive bitrate streaming
Global CDN

Components:

Upload service (chunked uploads)
Transcoding pipeline (multiple resolutions)
CDN (edge caching)
Recommendation service

Transcoding Pipeline:

Upload to blob storage
Queue transcoding job
Generate multiple bitrates (240p to 4K)
Store all versions
Update metadata

Rate Limiter

Key Points:

Limit requests per user/IP
Handle distributed servers

Algorithms:

Token Bucket: Bucket refills at rate R, max capacity B
Sliding Window Log: Track all timestamps, count in window
Sliding Window Counter: Approximate with weighted windows
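
A token bucket with refill rate R and capacity B can be sketched as below. This is a single-process illustration and is not thread-safe; the class name and the injectable `clock` parameter are choices made for the example so the behaviour can be tested without sleeping.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # R: tokens added per second
        self.capacity = capacity    # B: maximum tokens the bucket can hold
        self._clock = clock
        self._tokens = float(capacity)  # start full so initial bursts are allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits, else False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily based on elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)
print(bucket.allow())
```

The bucket permits bursts of up to B requests and then settles to R requests per second. One bucket is kept per user or IP; in a multi-server setup that state has to live somewhere shared.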

Distributed Rate Limiting:

Use Redis for shared state
Lua scripts for atomicity
Accept some inaccuracy (slight over- or under-counting) for performance

Numbers to Know

Latency:

Operation	Time
L1 cache reference	0.5 ns
L2 cache reference	7 ns
RAM reference	100 ns
SSD read	150 μs
HDD read	10 ms
Network round trip (same DC)	0.5 ms
Network round trip (cross-region)	100 ms

Throughput:

Operation	Throughput
SSD sequential read	500 MB/s
HDD sequential read	100 MB/s
1 Gbps network	125 MB/s
10 Gbps network	1.25 GB/s
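
These throughput figures are most useful for quick "how long would it take" estimates. The sketch below uses the table's values directly; the function name is invented, and the result ignores protocol overhead and contention.

```python
THROUGHPUT_MB_PER_S = {
    "SSD sequential read": 500,
    "HDD sequential read": 100,
    "1 Gbps network": 125,
    "10 Gbps network": 1250,
}


def transfer_seconds(size_gb: float, medium: str) -> float:
    """Best-case time to move `size_gb` gigabytes (1 GB = 1,000 MB)."""
    return size_gb * 1000 / THROUGHPUT_MB_PER_S[medium]


for medium in THROUGHPUT_MB_PER_S:
    hours = transfer_seconds(1000, medium) / 3600
    print(f"1 TB via {medium}: {hours:.1f} hours")
```

For example, copying 1 TB over a 1 Gbps link takes at least 8,000 seconds, a little over two hours, which is the kind of figure that justifies incremental replication over bulk copies.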

Capacity:

Unit	Value
1 KB	1,000 bytes
1 MB	1,000 KB
1 GB	1,000 MB
1 TB	1,000 GB
1 PB	1,000 TB

Time:

Period	Seconds
1 day	86,400
1 month	2.6M
1 year	31.5M

Interview Tips

Do:

Ask clarifying questions first
Think out loud
Draw diagrams
Discuss trade-offs
Mention monitoring and security
Be honest about unknowns

Don't:

Jump into design without requirements
Over-engineer the solution
Ignore non-functional requirements
Skip the math
Be defensive about feedback

Communication:

"Let me start by clarifying requirements..."
"Given the scale, I'm thinking..."
"The trade-off here is..."
"One concern with this approach is..."
"If we had more time, we could..."

Practice Resources

Visualize Your Designs: Use InfraSketch to quickly generate architecture diagrams from descriptions. Practice articulating your designs in natural language and see them come to life. Learn more about creating effective diagrams in our Architecture Diagram Best Practices guide.

Practice Problems:

Design Twitter
Design WhatsApp
Design Netflix
Design Uber
Design Google Search
Design Dropbox
Design Instagram
Design Yelp
Design Amazon
Design Zoom

Study Real Systems:

Netflix Tech Blog
Uber Engineering Blog
Meta Engineering Blog
AWS Architecture Center
Google Cloud Architecture Framework

Conclusion

System design interviews test your ability to:

Gather requirements and handle ambiguity
Break down complex problems
Make and justify trade-offs
Communicate technical ideas clearly

Use this cheat sheet as a reference, but remember: practice is what makes the difference. Work through problems, draw diagrams, and get comfortable explaining your reasoning.

Good luck with your interviews!
