Use Mermaid diagrams to ace system design interviews

7 min read · The MermaidCreator team

System design interviews are about communication under pressure. The interviewer watches you think, asks what you'd do differently, and probes your trade-offs. A clear diagram is the difference between "I think we need a cache somewhere" and "here's the exact flow when a cache miss happens, and here's why we chose Redis over Memcached."

Mermaid lets you sketch a distributed system in about 90 seconds instead of spending 10 minutes fumbling at a whiteboard.

Why diagrams win in interviews

Speed: You can draw a multi-service architecture faster than you can describe it in words. The interviewer sees the system instantly instead of piecing it together from your explanation.

Precision: "We put a queue in front of the API" is vague. A diagram showing the queue between the load balancer and the API service, with workers consuming from the queue, is exact. Interviewers probe exact designs, not vague ones.

Revision: When an interviewer says "what if we scaled this to 10 million users?", you redraw one piece of the diagram instead of re-explaining the entire system.

Confidence: Drawing from a mental model (not making it up on the fly) signals you've thought about the problem before.

The core pattern: request flow

Most system design interviews start with some version of the same question: "How does a request flow through your system?"

Draw the four components a request passes through (client, load balancer, API service, database):

graph LR
    Client["Client"]
    LB["Load Balancer"]
    API["API Service"]
    DB["Database"]
    
    Client -->|HTTP| LB
    LB -->|Route| API
    API -->|Query| DB
    DB -->|Result| API
    API -->|JSON| LB
    LB -->|Response| Client

This is your baseline. Everything else adds to it. When the interviewer asks about caching, add a cache node. About asynchrony, add a queue. About scaling, add replicas.

Multi-service architecture with data flow

Most interviews ask you to design for scale. Your system probably has multiple services:

graph LR
    Client["Browser"]
    CDN["CDN"]
    LB["Load Balancer"]
    
    API["API Gateway"]
    Auth["Auth Service"]
    Diagram["Diagram Service"]
    User["User Service"]
    
    Cache["Redis Cache"]
    Queue["Message Queue"]
    Search["Search Index"]
    
    DB[(Database)]
    S3["Object Storage"]
    
    Client -->|Static| CDN
    Client -->|API| LB
    LB --> API
    API --> Auth
    API --> Diagram
    API --> User
    Auth --> DB
    Diagram --> Cache
    Diagram --> Queue
    Diagram --> DB
    Diagram --> S3
    User --> DB
    Queue --> Search

This diagram communicates:

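  • Clients fetch static assets from the CDN and send API calls through the load balancer
  • The API Gateway routes each request to the service that owns it (auth, diagram, or user)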
  • Diagram service uses cache for frequently accessed diagrams, queue for async work, and object storage for images
  • Every service ultimately reads from the same database

Now when the interviewer asks "where do you add replication?", point at the DB. "Where do you handle traffic spikes?" Point at the queue. You're not inventing answers; you're reading from the architecture you drew.

Sequence diagram for a complex flow

When the interviewer asks "walk me through an upload," switch to a sequence diagram:

sequenceDiagram
    participant User
    participant API
    participant Queue
    participant Worker
    participant Storage
    participant DB

    User->>API: POST /upload (file, metadata)
    API->>API: Validate schema
    API->>Storage: Store file
    Storage-->>API: File URL
    API->>Queue: Enqueue {fileId, url}
    API-->>User: 202 Accepted {jobId}
    
    Note over Queue,Worker: Async processing
    
    Worker->>Queue: Consume job
    Worker->>Storage: Download file
    Storage-->>Worker: File data
    Worker->>Worker: Process (resize, extract metadata)
    Worker->>DB: INSERT processed_result
    Worker->>Queue: Ack job

This shows:

  • The API doesn't wait for processing (202 Accepted instead of 200)
  • The file lives in object storage, not the database
  • Heavy lifting is done asynchronously by a worker pool
  • The database is only updated after processing succeeds

When the interviewer asks "what if the worker crashes?", you can add a retry strategy to the diagram. If they ask "how do you make this resilient?", add idempotency keys and dead-letter queues.
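A retry path and a dead-letter queue might be added to the worker side like this. This is a sketch rather than part of the upload flow above: the "Dead-Letter Queue" node, the backoff, and the idempotency-key upsert are assumptions about how one could answer the crash question, and it assumes the queue redelivers jobs that were never acknowledged.

```mermaid
graph LR
    Queue["Message Queue"]
    Worker["Worker Pool"]
    DB[(Database)]
    DLQ["Dead-Letter Queue"]

    %% Happy path: process, write once, then acknowledge
    Queue -->|Dequeue| Worker
    Worker -->|Upsert by idempotency key| DB
    Worker -->|Ack on success| Queue

    %% Failure paths are dashed so they read as exceptions
    Worker -.->|Retry with backoff| Queue
    Worker -.->|Max attempts exceeded| DLQ
```

If a worker crashes before acking, the job comes back for another attempt, and the idempotency key keeps that second attempt from writing a duplicate row.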

Common patterns to have ready

Caching layer (Redis):

graph LR
    API["API Service"]
    Cache["Redis Cache"]
    DB[(Database)]
    
    API -->|Check cache| Cache
    Cache -->|Hit| API
    Cache -->|Miss| DB
    DB -->|Cache result| Cache

Queue for async work:

graph LR
    API["API Service"]
    Queue["Message Queue"]
    Worker["Worker Pool"]
    
    API -->|Enqueue| Queue
    Queue -->|Dequeue| Worker
    Worker -->|Process| DB[(Database)]

Database read replicas:

graph LR
    API["API Service"]
    WriteDB[("Primary DB<br/>Writes")]
    ReadDB1[("Read Replica<br/>Region A")]
    ReadDB2[("Read Replica<br/>Region B")]
    
    API -->|Write| WriteDB
    API -->|Read| ReadDB1
    API -->|Read| ReadDB2
    WriteDB -.->|Replicate| ReadDB1
    WriteDB -.->|Replicate| ReadDB2

Sharding by user_id:

graph LR
    API["API Service"]
    Router["Shard Router"]
    
    %% Hash-based routing: each shard owns one remainder of the hash modulo 3
    Shard1[("Shard 1<br/>hash(user_id) % 3 = 0")]
    Shard2[("Shard 2<br/>hash(user_id) % 3 = 1")]
    Shard3[("Shard 3<br/>hash(user_id) % 3 = 2")]
    
    API -->|user_id| Router
    Router -->|"remainder 0"| Shard1
    Router -->|"remainder 1"| Shard2
    Router -->|"remainder 2"| Shard3

Estimating and showing constraints

When you draw, label with capacity numbers:

graph LR
    LB["Load Balancer<br/>10K req/sec"]
    API1["API #1<br/>2K req/sec"]
    API2["API #2<br/>2K req/sec"]
    API3["API #3<br/>2K req/sec"]
    API4["API #4<br/>2K req/sec"]
    API5["API #5<br/>2K req/sec"]
    
    DB[("Database<br/>100K IOPS<br/>500GB RAM")]
    
    LB --> API1
    LB --> API2
    LB --> API3
    LB --> API4
    LB --> API5
    
    API1 --> DB
    API2 --> DB
    API3 --> DB
    API4 --> DB
    API5 --> DB

This tells the interviewer you've thought about the math. You're not just drawing random boxes; you're sizing the infrastructure for the requirements.
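The arithmetic behind those labels can be written out explicitly. A minimal worked version, using the 10K req/sec peak and 2K req/sec per instance from the diagram; the spare instance for failure headroom is an added assumption, not part of the sizing shown above.

```latex
\begin{align*}
N_{\min} &= \left\lceil \frac{R_{\text{peak}}}{R_{\text{instance}}} \right\rceil
          = \left\lceil \frac{10{,}000}{2{,}000} \right\rceil = 5 \\
\text{capacity with one instance down} &= (5 - 1) \times 2{,}000 = 8{,}000 < 10{,}000 \\
N_{\text{with spare}} &= N_{\min} + 1 = 6
\end{align*}
```

Five instances cover the peak exactly, so losing one drops capacity below demand. Pointing out that gap shows the interviewer you have sized for failure as well as for load.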

Handling edge cases

When the interviewer asks about failure modes, add them to the diagram:

graph LR
    API["API Service"]
    Cache["Redis Cache"]
    DB[(Database)]
    
    %% The API drives every step; dashed edges are best-effort or fallback paths
    API -->|Try cache first| Cache
    Cache -->|HIT| API
    API -->|MISS or TIMEOUT| DB
    DB -->|Result| API
    API -.->|Populate cache<br/>skip if cache down| Cache
    Cache -.->|Serve stale<br/>if DB unavailable| API

Add notes about your strategy: circuit breakers, fallbacks, graceful degradation.
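One way to note a circuit breaker is a small state diagram beside the architecture. A minimal sketch of the usual three-state pattern; the trigger wording (failure threshold, cool-down) is illustrative, and the actual values depend on the system being designed.

```mermaid
stateDiagram-v2
    [*] --> Closed
    Closed --> Open: failure threshold reached
    Open --> HalfOpen: cool-down elapsed
    HalfOpen --> Closed: probe request succeeds
    HalfOpen --> Open: probe request fails
```

While the breaker is open, calls skip the failing dependency and take the fallback path, for example serving stale data from the cache.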

Interview day checklist

Before your interview, practice these diagrams:

  • Simple monolithic service (3 boxes: client, API, database)
  • Caching layer added (4 boxes)
  • Async work with a queue (5 boxes)
  • Multi-service architecture (10+ boxes)
  • A sequence diagram for a complex flow (user upload, payment processing, real-time update)

Draw each from scratch 3–5 times. By interview day, you should draw them without thinking.

Tools for practice

Use the MermaidCreator editor to practice. It's faster than a whiteboard or a notepad, and you can save your diagrams to review later. When you're in the actual interview (on a physical whiteboard or shared screen), your muscle memory kicks in and you draw just as fast.

Common mistakes to avoid

Over-complexity too early. Start simple. Draw monolith + database. Add pieces as the interviewer asks. Many candidates over-engineer from the first minute.

Missing labels. "This is the cache" is better than a box with no description. Label every component and every edge.

Wrong direction. Draw request flow left-to-right (client to server). Draw data flow downward. Consistency helps the interviewer follow.
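In Mermaid, the layout direction is set by the keyword after graph: LR runs left to right and TD runs top down. A minimal sketch of a top-down data-flow view, with node names chosen purely for illustration:

```mermaid
graph TD
    API["API Service"]
    Primary[("Primary DB")]
    Replica[("Read Replica")]

    API -->|Write| Primary
    Primary -.->|Replicate| Replica
```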

Unjustified redundancy. Have a reason for every replica, every cache, every queue. Diagram redundancy only where you need it for reliability or scale.

Forgetting the database. Candidates sometimes diagram an entire system and forget to show where the data lives. Always have a database or persistence layer.

FAQ

Should I practice on paper or a computer? Practice on a computer (MermaidCreator, draw.io) because it's faster and closer to the tools used in remote interviews, such as a shared screen or Excalidraw. If the interview turns out to be on a physical whiteboard, the patterns you've practiced carry over.

What if I'm not an artist? You're not being graded on aesthetics. Ugly boxes with clear labels beat pretty boxes you can't explain. Focus on clarity and speed, not visual polish.

How do I recover if I forget a component? Say "I forgot to add X." Redraw. Good engineers catch their own mistakes. Interviewers respect that more than pretending the incomplete system is intentional.

Should I memorize specific architectures? No. Understand the patterns (caching, queuing, sharding, replication) and mix them based on the problem. Every interview is slightly different.

Spend 30 minutes today practicing the basic patterns in the MermaidCreator editor. By your interview, drawing a system will be as natural as writing pseudocode. Try diagramming an architecture in the playground: start with a simple API + database, then add a cache, queue, and load balancer, layer by layer.
