Database sharding strategy

Horizontal scaling with hash-based shard selection.

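Database sharding is how teams scale beyond what a single database instance can handle. Instead of keeping one giant table on one server, you split the data across multiple database instances, and each shard owns a slice. With range-based sharding, a 100M-user system might put users 0–33M on Shard 0, 33M–66M on Shard 1, and 66M–100M on Shard 2. With hash-based sharding, which this template shows, each shard instead owns the user IDs whose hash maps to it.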

The routing logic decides which shard a request hits: hash the user ID, take the result modulo the shard count, and query that shard. The mapping is deterministic, and with a well-distributed hash function it spreads keys roughly evenly across shards. The diagram shows the decision after hashing: the request is routed to exactly one target shard, and the three shard branches converge on the "Aggregate result" node before the response returns to the client.
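The hash-then-modulo step described above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed implementation: the function name `shard_for`, the constant `SHARD_COUNT`, and the choice of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib

SHARD_COUNT = 3  # assumption: matches the three shards drawn in the diagram


def shard_for(user_id, shard_count=SHARD_COUNT):
    """Return the shard index for a user ID using hash-then-modulo."""
    # Use a stable digest instead of Python's built-in hash(), which is
    # randomized per process for strings and would break determinism.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then take the modulo.
    return int.from_bytes(digest[:8], "big") % shard_count
```

Calling `shard_for(42)` returns the same index on every run and on every machine, which is the property the routing layer depends on.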

When to use this template

  • Scale-out planning — document how your team will distribute data when a single primary with read replicas is no longer enough.
  • Onboarding engineers to a sharded system — this diagram makes the request path and shard-selection logic explicit, so new engineers understand why queries go to specific shards.
  • Incident postmortems — if a single shard went down, trace how many users were affected and what the failover looked like.

How to adapt it

Rename "user_id" to your sharding key (customer_id, account_id) and change the shard count to your actual setup. Common extensions:

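  • Add a routing cache between the hash-selection node and the database to avoid repeating the shard lookup on every request. This pays off mainly when shard locations come from a directory service, since the hash itself is cheap to compute.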
  • Layer in replica handling — show read replicas alongside each shard for failover and read scaling.
  • Annotate with latency and capacity — label each shard with its size (GB) and query latency (ms) so teams can see when a shard is getting hot.
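As one way to picture the replica-handling extension, the sketch below sends writes to a shard's primary and rotates reads across its replicas. The `Shard` class, its `target` method, and the host names are hypothetical and exist only for this illustration. Real failover logic, such as health checks and replica promotion, is left out.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Shard:
    primary: str
    replicas: list = field(default_factory=list)

    def __post_init__(self):
        # Round-robin over replicas; fall back to the primary if there are none.
        self._read_cycle = itertools.cycle(self.replicas or [self.primary])

    def target(self, is_write):
        """Writes always go to the primary; reads rotate across replicas."""
        return self.primary if is_write else next(self._read_cycle)


# Hypothetical host names for one shard with two read replicas.
shard0 = Shard("shard0-primary", ["shard0-replica-a", "shard0-replica-b"])
```

In the diagram, this corresponds to splitting each "Query Shard N DB" node into a primary node and one node per replica.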

Visual edits regenerate clean Mermaid code, so you can sketch your topology and adapt it as your infrastructure grows.

Mermaid code

Copy it anywhere Mermaid is supported — GitHub, Notion, or your docs.

flowchart TD
    A[Incoming request with user_id] --> B[Compute hash of user_id]
    B --> C[Modulo by shard count]
    C --> D{Which shard?}
    D -->|Shard 0| E[Query Shard 0 DB]
    D -->|Shard 1| F[Query Shard 1 DB]
    D -->|Shard 2| G[Query Shard 2 DB]
    E --> H[Aggregate result]
    F --> H
    G --> H
    H --> I[Return to client]

Frequently asked questions

What is database sharding and why do teams use it?
Sharding splits a large dataset across multiple database instances so no single database becomes a bottleneck. Each shard owns a slice of data (e.g. users 0–33M on Shard 0, 33M–66M on Shard 1). It allows a team to scale to billions of rows without hitting single-server limits.
How do you decide which shard a request goes to?
The standard approach is hash-based: compute hash(user_id) mod shard_count. This is deterministic, so the same user always maps to the same shard, and with a uniform hash function it spreads data roughly evenly. Other strategies include range-based sharding (user_id 0–1M on Shard 0) and geographic sharding (US on Shard 0, EU on Shard 1).
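For comparison, the range-based and geographic strategies mentioned above might look like the sketch below. The boundary values, the region table, and the function names are hypothetical. The ranges are assumed to be half-open, so user_id 1,000,000 falls on Shard 1.

```python
import bisect

# Hypothetical exclusive upper bounds: Shard 0 holds IDs below 1M, and so on.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]

# Hypothetical region-to-shard directory.
REGION_SHARDS = {"US": 0, "EU": 1}


def range_shard(user_id):
    """Range-based: pick the first shard whose upper bound exceeds user_id."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if user_id < 0 or index >= len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"user_id {user_id} is outside every shard range")
    return index


def geo_shard(region):
    """Geographic: look the region up in a static directory."""
    return REGION_SHARDS[region]
```

Both variants keep related keys together, which helps range scans and data residency, but they can produce hot shards when activity clusters in one range or region.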
What happens when you add or remove a shard?
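Adding or removing a shard changes the modulus, so most keys map to a different shard: going from N to N+1 shards with plain modulo moves about N/(N+1) of the keys. That means an expensive data migration. Teams handle this with consistent hashing, which cuts the remapped share to roughly 1/N of the keys (N being the new shard count), or they accept a planned migration window. Plan the shard count generously to minimize reshuffling.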
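To make the difference concrete, the sketch below measures how many keys change shards when growing from 3 to 4 shards, first with plain modulo and then with a simple consistent-hash ring. The `HashRing` class, the virtual-node count of 100, and the key range are assumptions chosen for illustration. Production systems typically rely on a tested library or the database's own resharding tools.

```python
import bisect
import hashlib


def stable_hash(value):
    """Map a string to a stable 64-bit integer."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_shard(key, shard_count):
    return stable_hash(str(key)) % shard_count


class HashRing:
    """Consistent-hash ring with `vnodes` points per shard (assumed: 100)."""

    def __init__(self, shard_count, vnodes=100):
        points = sorted(
            (stable_hash(f"shard-{shard}-vnode-{v}"), shard)
            for shard in range(shard_count)
            for v in range(vnodes)
        )
        self._positions = [pos for pos, _ in points]
        self._shards = [shard for _, shard in points]

    def shard_for(self, key):
        # Walk clockwise to the next point, wrapping around at the end.
        i = bisect.bisect_right(self._positions, stable_hash(str(key)))
        return self._shards[i % len(self._shards)]


def moved_fraction(old_route, new_route, keys):
    """Fraction of keys whose shard differs between two routing functions."""
    moved = sum(1 for k in keys if old_route(k) != new_route(k))
    return moved / len(keys)


keys = range(20_000)
# Plain modulo, 3 -> 4 shards: about three quarters of the keys move.
modulo_moved = moved_fraction(
    lambda k: modulo_shard(k, 3), lambda k: modulo_shard(k, 4), keys
)
# Consistent hashing, 3 -> 4 shards: only keys claimed by the new shard move.
ring_moved = moved_fraction(HashRing(3).shard_for, HashRing(4).shard_for, keys)
```

With the ring, every moved key lands on the newly added shard, so the existing shards only hand data off and never exchange it with each other.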
What are the tradeoffs of sharding?
Pros: horizontal scalability, and each shard stays small and fast. Cons: more complex application logic (routing), cross-shard transactions and joins that are hard or unsupported, and operational overhead (backups per shard). Start with a single database, add caching to reduce queries, and shard only when a single primary with read replicas can no longer keep up.
