5. High-Level Design
This step draws the overall architecture. It shows the major components and how requests move through the system.
Typical Building Blocks
- Clients and edge delivery.
- Load balancers and API gateways.
- Application servers or service tiers.
- Cache layers.
- Databases and replicas.
- Message queues and workers.
- Object storage, search, and analytics systems.
What to Explain
- The read path.
- The write path.
- Where caching happens.
- Where asynchronous processing happens.
- How the system scales horizontally.
- Which parts are stateful and which are stateless.
- Which components are on the critical path for latency.
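The read path, write path, and cache placement listed above can be made concrete with a small cache-aside sketch. This is only an illustration under stated assumptions: two in-memory dictionaries stand in for the cache and the database, and the class name `CacheAsideStore`, the `ttl_seconds` parameter, and the `db_reads` counter are hypothetical names introduced here, not part of any real library.

```python
import time
from typing import Optional


class CacheAsideStore:
    """Toy read/write path. Dicts stand in for the cache and the database (assumption)."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self.db: dict[str, str] = {}                    # stand-in for the primary database
        self.cache: dict[str, tuple[str, float]] = {}   # key -> (value, expiry time)
        self.ttl_seconds = ttl_seconds
        self.db_reads = 0                               # counts how often reads reach the database

    def read(self, key: str) -> Optional[str]:
        """Read path: try the cache first, fall back to the database on a miss."""
        entry = self.cache.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                             # cache hit, database not touched
        self.db_reads += 1                              # miss or expired entry
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = (value, time.monotonic() + self.ttl_seconds)
        return value

    def write(self, key: str, value: str) -> None:
        """Write path: persist first, then invalidate the cached copy."""
        self.db[key] = value
        self.cache.pop(key, None)


store = CacheAsideStore()
store.write("user:1", "Ada")
first = store.read("user:1")    # miss: served from the database, then cached
second = store.read("user:1")   # hit: served from the cache
```

Invalidating on write (rather than updating the cache in place) is one common choice; it keeps the write path simple at the cost of one extra database read after each write.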
Common Diagram Elements
- Clients, load balancers, and API gateways.
- Application servers or services.
- Cache layers and CDN edges.
- Databases, replicas, and partitions.
- Background workers and queues.
- Search or analytics pipelines when relevant.
Typical Scalable Architecture Stack
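A reference layering for many web-scale systems. The arrows show a typical top-to-bottom ordering of layers, not a single request path: application servers usually talk to the cache, database, queue, and object storage directly, and workers consume from the queue.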
Clients
↓
CDN (static assets, media)
↓
Load Balancer (distribute traffic)
↓
API Gateway / Reverse Proxy (auth, rate limiting, routing)
↓
Application Servers (stateless, horizontally scalable)
↓
Cache Layer (Redis/Memcached for hot data)
↓
Primary Database (writes)
↓
Read Replicas (scale read traffic)
↓
Message Queue (async jobs: Kafka, RabbitMQ)
↓
Workers (background processing)
↓
Search Index (Elasticsearch for full-text queries)
↓
Object Storage (S3 for media)
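The gateway layer in the stack above lists rate limiting as one of its jobs. A token bucket is one common way to do that; the sketch below is a minimal, single-process version, not how any particular gateway implements it. The class name `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch, single process only)."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0) -> None:
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained request rate
        self.tokens = float(capacity)               # start with a full bucket
        self.last = now                             # time of the last refill

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0                      # spend one token per request
            return True
        return False                                # bucket empty: reject or queue the request


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

With a capacity of 2 and a refill of one token per second, the first two requests at time 0 pass, the third is rejected, and one more passes a second later. A real gateway would typically keep one bucket per client or API key in a shared store so that all gateway instances see the same counts.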
Component-Based Checklist
For each major component, ask:
| Component | Design Decisions |
|---|---|
| Load Balancer | Round-robin or least-connections? Layer 4 or layer 7 routing? Active-passive or active-active? |
| Reverse Proxy / API Gateway | Centralized auth, rate limiting, request transformation, circuit breaker? |
| Application Servers | Stateless? Horizontal scaling? Auto-scaling triggers? |
| Cache Layer | Redis or Memcached? Which data is hot? TTL and eviction (LRU, LFU)? |
| Primary Database | Relational (SQL) for transactions and joins, or NoSQL for flexible schemas and horizontal write scaling? Sharding key? Replication lag tolerance? |
| Read Replicas | How many? Cross-region? Eventual consistency OK? |
| Search Index | Elasticsearch for text? Inverted indexes? Real-time indexing or batch? |
| Message Queue | Kafka for streams, RabbitMQ for tasks? Ordering guarantees? |
| Object Storage | S3 for media. Versioning, lifecycle policies, CDN integration? |
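To make the load balancer row concrete, here is a small sketch of two selection policies: round-robin and a least-loaded policy, approximated here by the fewest active connections. The class names and server labels are hypothetical, and a real load balancer would also handle health checks, weights, and concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers: list[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Picks the server with the fewest in-flight requests (ties go to the earliest listed)."""

    def __init__(self, servers: list[str]) -> None:
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1      # request starts
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1      # request finished


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
rr_order = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(["app-1", "app-2", "app-3"])
lc_order = [lc.pick(), lc.pick(), lc.pick()]
lc.release("app-2")                   # app-2 finishes early
next_pick = lc.pick()                 # so it receives the next request
```

Round-robin is simplest when requests cost about the same; tracking active connections tends to help when request durations vary widely.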
How to Present the Flow
- Start with the user request.
- Show how it reaches the system.
- Explain the main read or write path.
- Call out any async offloading.
- End with persistence and delivery to the user.
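The async offloading step in the flow above can be sketched with a queue and a background worker. This is a toy, in-process version: Python's standard `queue.Queue` stands in for a broker such as Kafka or RabbitMQ, and the photo-upload scenario and the names `handle_upload`, `worker`, and `run_demo` are hypothetical, chosen only for illustration.

```python
import queue
import threading


def run_demo() -> tuple[dict, list[str]]:
    jobs: queue.Queue = queue.Queue()   # stand-in for a message queue (assumption)
    thumbnails: list[str] = []          # stand-in for results the worker would persist

    def handle_upload(photo_id: str) -> dict:
        """Online path: accept the request, enqueue the slow work, respond at once."""
        jobs.put({"type": "make_thumbnail", "photo_id": photo_id})
        return {"status": 202, "photo_id": photo_id}   # 202 Accepted: work is pending

    def worker() -> None:
        """Offline path: drain the queue until a None sentinel arrives."""
        while True:
            job = jobs.get()
            if job is None:
                return
            thumbnails.append(f"thumb-{job['photo_id']}")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    response = handle_upload("p1")      # returns before the thumbnail exists
    jobs.put(None)                      # tell the worker to stop after draining
    thread.join()                       # wait so the result can be inspected
    return response, thumbnails


response, thumbnails = run_demo()
```

The point to call out when presenting is that the user-facing response does not wait for the worker: only the enqueue is on the critical path for latency.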
Output of This Step
- A block diagram or architecture sketch.
- A request flow explanation.
- The first-pass split between online and offline processing.
- A system-wide view that sets up the deep dive.
Common Mistakes
- Jumping into deep details too early.
- Leaving out one of the major request paths.
- Failing to explain why the chosen components fit the problem.
- Making the diagram too busy to explain clearly.