Mermaid Sankey diagrams: visualize data flow and dependencies
When you need to show how much of something moves from A to B to C, a regular flowchart falls flat. Sankey diagrams fix this: the width of each arrow represents the volume flowing through it. They're the standard way to visualize energy flows, cost allocation, user journeys, and data dependencies — anywhere the magnitude matters as much as the path.
Why Sankey diagrams work
A Sankey diagram assigns a numeric value to each connection. Readers immediately see where the bulk of traffic or resources goes, and where bottlenecks form. Compare reading "the company allocated $500k to infrastructure, $300k to engineering, and $200k to sales" to a single Sankey showing those flows and how they break down further.
Sankey diagrams excel at:
- Dependency mapping — trace which components depend on what infrastructure
- Cost/budget breakdowns — show where money flows through an organization
- User journey analysis — see what fraction of users take each path
- Data pipeline tracing — visualize how records flow through ETL stages
- Energy/resource flows — the original use case, still unbeaten
Sankey syntax in Mermaid
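A Sankey diagram in Mermaid starts with the sankey-beta keyword (the beta label is honest: the feature works, but it is newer than most diagram types). Each following line is a CSV record with exactly three fields: source, target, and a numeric value. There is no header row, because the third field of every record is read as a number; use a %% comment if you want to label the columns:
sankey-beta
%% source,target,value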
Product Sales,Gross Margin,10000
Product Sales,Cost of Goods,6000
Gross Margin,Operating Expenses,4000
Gross Margin,Net Income,6000
Operating Expenses,Salaries,2500
Operating Expenses,Infrastructure,1200
Operating Expenses,Marketing,300
The diagram automatically sizes flows proportionally and arranges nodes to minimize crossings. You can chain flows (A → B → C) and use any text for node labels.
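To make the three-column rule concrete, here is a minimal Python sketch that reads Sankey text into tuples and rejects records Mermaid could not size. It is a simplified reading of the format, not Mermaid's own parser, and the name `parse_sankey` is hypothetical.

```python
import csv
import io
import math


def parse_sankey(text):
    """Parse Mermaid sankey text into (source, target, value) tuples."""
    body = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip the diagram keyword, blank lines, and %% comments.
        if not stripped or stripped == "sankey-beta" or stripped.startswith("%%"):
            continue
        body.append(stripped)

    flows = []
    # csv.reader handles quoted labels that contain commas.
    for record in csv.reader(io.StringIO("\n".join(body))):
        if len(record) != 3:
            raise ValueError(f"expected 3 columns, got {len(record)}: {record}")
        source, target, raw_value = (field.strip() for field in record)
        try:
            value = float(raw_value)
        except ValueError:
            raise ValueError(f"value is not numeric: {raw_value!r}") from None
        if not math.isfinite(value) or value < 0:
            raise ValueError(f"value must be finite and non-negative: {raw_value!r}")
        flows.append((source, target, value))
    return flows


example = """sankey-beta
Product Sales,Gross Margin,10000
Product Sales,Cost of Goods,6000
Gross Margin,Operating Expenses,4000
Gross Margin,Net Income,6000
"""

flows = parse_sankey(example)
print(flows[0])
```

A record whose third field is not a number raises a `ValueError`, which is a cheap way to catch spreadsheet headers or stray text before rendering.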
A practical example: cloud infrastructure costs
Let's map how a SaaS company's cloud bill breaks down from total spend through services to specific cost drivers:
sankey-beta
Total Cloud Spend,Compute,35000
Total Cloud Spend,Storage,15000
Total Cloud Spend,Networking,8000
Total Cloud Spend,Database,12000
Compute,EC2 On-Demand,20000
Compute,EC2 Reserved,10000
Compute,Lambda,5000
Storage,S3,12000
Storage,Backup,3000
Database,RDS Multi-AZ,10000
Database,DynamoDB,2000
Networking,Data Transfer,5000
Networking,CloudFront,3000
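This immediately shows that EC2 On-Demand is the largest single cost driver (20,000 of the 70,000 total, roughly 29%), which makes it the biggest lever for cost optimization, while the entire database tier (12,000) matters much less. The Sankey makes that obvious without a spreadsheet.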
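The same ranking can be checked numerically. The sketch below, with the hypothetical helper `leaf_shares`, sums the flow into every node that has no outgoing flow and reports each one's share of the total. It assumes the flows balance, as they do in the example above.

```python
from collections import defaultdict

flows = [
    ("Total Cloud Spend", "Compute", 35000),
    ("Total Cloud Spend", "Storage", 15000),
    ("Total Cloud Spend", "Networking", 8000),
    ("Total Cloud Spend", "Database", 12000),
    ("Compute", "EC2 On-Demand", 20000),
    ("Compute", "EC2 Reserved", 10000),
    ("Compute", "Lambda", 5000),
    ("Storage", "S3", 12000),
    ("Storage", "Backup", 3000),
    ("Database", "RDS Multi-AZ", 10000),
    ("Database", "DynamoDB", 2000),
    ("Networking", "Data Transfer", 5000),
    ("Networking", "CloudFront", 3000),
]


def leaf_shares(flows):
    """Return (node, value, share) for nodes with no outgoing flow, largest first."""
    inflow = defaultdict(float)
    sources = set()
    for source, target, value in flows:
        inflow[target] += value
        sources.add(source)
    # A leaf receives flow but never appears as a source.
    leaves = {node: total for node, total in inflow.items() if node not in sources}
    grand_total = sum(leaves.values())
    ranked = sorted(leaves.items(), key=lambda item: item[1], reverse=True)
    return [(node, total, total / grand_total) for node, total in ranked]


ranked = leaf_shares(flows)
for node, total, share in ranked:
    print(f"{node:<15} {total:>8.0f} {share:6.1%}")
```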
Building Sankey diagrams from data
If your data lives in a spreadsheet, you can generate Mermaid Sankey syntax programmatically:
Python example:
def to_sankey(rows, source_col, target_col, value_col):
    # Mermaid has no header row, so the column names go in a %% comment.
    lines = ["sankey-beta", "", f"%% {source_col},{target_col},{value_col}"]
    for row in rows:
        lines.append(f"{row[source_col]},{row[target_col]},{row[value_col]}")
    return "\n".join(lines)
JavaScript example:
const toSankey = (rows, sourceName, targetName, valueName) => {
  // Mermaid has no header row, so the column names go in a %% comment.
  const lines = ["sankey-beta", "", `%% ${sourceName},${targetName},${valueName}`];
  rows.forEach(row => {
    lines.push(`${row[sourceName]},${row[targetName]},${row[valueName]}`);
  });
  return lines.join("\n");
};
This lets you keep the underlying data in a spreadsheet or database and regenerate the diagram whenever the data changes.
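Real spreadsheet data is messier than the helpers above assume: labels may contain commas, and the same source and target pair may appear on several rows. The following is a more defensive sketch under those assumptions. The names `quote_field` and `to_sankey_safe` are hypothetical, and merging duplicate pairs is a choice made here for readability, not something Mermaid requires.

```python
def quote_field(label):
    """Wrap a label in double quotes when it contains a comma or a quote."""
    text = str(label).strip()
    if "," in text or '"' in text:
        # CSV convention: double any embedded quotes inside a quoted field.
        return '"' + text.replace('"', '""') + '"'
    return text


def to_sankey_safe(rows, source_col, target_col, value_col):
    """Build sankey text, summing rows that share a source and target."""
    totals = {}
    for row in rows:
        key = (row[source_col], row[target_col])
        totals[key] = totals.get(key, 0.0) + float(row[value_col])

    lines = ["sankey-beta", ""]
    for (source, target), value in totals.items():
        # Print whole numbers without a trailing ".0".
        shown = int(value) if value.is_integer() else value
        lines.append(f"{quote_field(source)},{quote_field(target)},{shown}")
    return "\n".join(lines)


rows = [
    {"from": "Sales, EMEA", "to": "Revenue", "amount": 100},
    {"from": "Sales, EMEA", "to": "Revenue", "amount": 50.5},
    {"from": "A", "to": "B", "amount": "7"},
]
print(to_sankey_safe(rows, "from", "to", "amount"))
```

No header row is emitted, since every record's third field has to be numeric.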
Common Sankey patterns
Multi-level breakdowns: Use intermediate nodes to show progressive detail:
sankey-beta
Revenue,Product A,50000
Revenue,Product B,30000
Product A,Engineering,30000
Product A,Sales,20000
Product B,Engineering,18000
Product B,Sales,12000
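In a multi-level breakdown like this one, each intermediate node usually passes on exactly what it receives (Product A takes in 50000 and sends out 30000 + 20000). Mermaid does not enforce that, so a quick sanity check can catch data-entry slips. This is a sketch with a hypothetical helper name, `unbalanced_nodes`:

```python
from collections import defaultdict


def unbalanced_nodes(flows, tolerance=1e-9):
    """Return {node: inflow - outflow} for intermediate nodes that do not balance."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in flows:
        outflow[source] += value
        inflow[target] += value

    mismatches = {}
    # Only nodes with both incoming and outgoing flow can be checked.
    for node in inflow.keys() & outflow.keys():
        difference = inflow[node] - outflow[node]
        if abs(difference) > tolerance:
            mismatches[node] = difference
    return mismatches


flows = [
    ("Revenue", "Product A", 50000),
    ("Revenue", "Product B", 30000),
    ("Product A", "Engineering", 30000),
    ("Product A", "Sales", 20000),
    ("Product B", "Engineering", 18000),
    ("Product B", "Sales", 12000),
]
print(unbalanced_nodes(flows))
```

A positive difference means the node receives more than it passes on, and a negative one means it sends out more than it receives.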
Circular flows / feedback loops: Sankey doesn't handle true cycles (A → B → A), but you can show flows that reconverge at a node. If you need actual loops, a state diagram or directed graph may fit better.
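If the flow list is generated from data, it can be worth checking for cycles before rendering. The sketch below uses Kahn's topological sort: if some nodes can never be reached with all their inputs resolved, the graph contains a cycle. The function name `has_cycle` is hypothetical.

```python
from collections import defaultdict, deque


def has_cycle(flows):
    """Return True if the flows contain a directed cycle (including self-loops)."""
    successors = defaultdict(set)
    indegree = defaultdict(int)
    nodes = set()
    for source, target, _value in flows:
        nodes.update((source, target))
        # Count each distinct edge once, even if it appears on several rows.
        if target not in successors[source]:
            successors[source].add(target)
            indegree[target] += 1

    queue = deque(node for node in nodes if indegree[node] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited != len(nodes)


reconverging = [("A", "B", 5), ("A", "C", 5), ("B", "D", 5), ("C", "D", 5)]
looping = [("A", "B", 5), ("B", "A", 2)]
print(has_cycle(reconverging), has_cycle(looping))
```

Flows that split and reconverge pass the check; only a path that leads back to an earlier node fails it.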
Filtering data for clarity: If you have hundreds of flows, show only the top 10–15 by value. Group the rest into an "Other" node. This keeps the diagram scannable.
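One way to apply this filter is sketched below: keep the largest flows and fold the remainder into one "Other" target per source. The helper name `top_flows_with_other` is hypothetical, and the approach assumes the dropped flows end at leaf nodes. If a dropped target had its own outgoing flows, those would need handling too.

```python
def top_flows_with_other(flows, keep=10, other_label="Other"):
    """Keep the `keep` largest flows; merge the rest into an 'Other' node per source."""
    ranked = sorted(flows, key=lambda flow: flow[2], reverse=True)
    kept, rest = ranked[:keep], ranked[keep:]

    other_totals = {}
    for source, _target, value in rest:
        other_totals[source] = other_totals.get(source, 0) + value

    merged = [(source, other_label, total) for source, total in other_totals.items()]
    return kept + merged


flows = [
    ("Site", "Home", 500),
    ("Site", "Pricing", 300),
    ("Site", "Blog", 120),
    ("Site", "Careers", 40),
    ("Site", "Legal", 10),
]
for flow in top_flows_with_other(flows, keep=3):
    print(",".join(str(part) for part in flow))
```

Because the small flows are summed rather than discarded, each source's total stays the same and the widths remain honest.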
Sankey vs. other diagrams
| Use case | Sankey | Flowchart | Class/ER Diagram |
|---|---|---|---|
| Show volume or magnitude | ✓ | ✗ | ✗ |
| Trace a single path | ✓ | ✓✓ | ✗ |
| Map relationships | ✓ | ✓ | ✓✓ |
| Handle large datasets | ✗ | ✓ | ✓ |
| Show hierarchies | ✗ | ✓✓ | ✓ |
Use Sankey when magnitude matters. For pure structure, stick to flowcharts or entity diagrams.
Tips for readable Sankey diagrams
Order nodes consistently. If your Sankey spans a timeline (left to right) or hierarchy (top to bottom), arrange nodes so they follow a logical sequence. Mermaid tries to minimize line crossings, but you can influence it by ordering the source lines.
Use names that fit. Long node labels cause the diagram to expand or overlap. If a label is longer than 20 characters, shorten it and add explanation in adjacent prose.
Limit flow depth. A Sankey that goes 5 levels deep becomes unreadable. If you need that level of detail, split into separate diagrams — one for the top 3 levels, another for the detail below.
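Both of the last two tips can be checked mechanically before you render. The sketch below flags labels over a length limit and counts the number of node levels on the longest path. The helper names and the 20-character default are illustrative, and `flow_depth` assumes the flows contain no cycles.

```python
from collections import defaultdict
from functools import lru_cache


def long_labels(flows, max_len=20):
    """Return the node labels longer than max_len characters, sorted."""
    labels = {label for source, target, _value in flows for label in (source, target)}
    return sorted(label for label in labels if len(label) > max_len)


def flow_depth(flows):
    """Return the number of node levels on the longest path (assumes no cycles)."""
    successors = defaultdict(list)
    for source, target, _value in flows:
        successors[source].append(target)

    @lru_cache(maxsize=None)
    def depth(node):
        # A node with no outgoing flow counts as one level.
        return 1 + max((depth(child) for child in successors.get(node, [])), default=0)

    nodes = {node for source, target, _value in flows for node in (source, target)}
    return max((depth(node) for node in nodes), default=0)


flows = [
    ("Revenue", "Product A", 50000),
    ("Product A", "Engineering", 30000),
    ("Product A", "Customer Acquisition Channels", 20000),
]
print(flow_depth(flows), long_labels(flows))
```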
Highlight the interesting flows. If the diagram is meant to show "engineering costs are underestimated," add a callout in the text. The diagram shows the fact; the prose explains the implication.
Building and exporting
Draft your Sankey in the MermaidCreator editor — paste the CSV-like syntax and watch the diagram render. Adjust node order and labels until it's clear, then copy the Mermaid code into your repo, documentation, or presentation.
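Like other Mermaid diagrams, Sankey renders natively on platforms with built-in Mermaid support, such as GitHub, GitLab, and Notion, provided the Mermaid version they bundle is recent enough to include sankey-beta (added in v10.3.0). No screenshot exports are needed: the diagram lives alongside the code and updates with it.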
FAQ
Q: Can I use Sankey for qualitative flows (no numbers)?
A: Sankey requires numeric values for the width of each arrow. If you want to show pure paths without magnitude, use a flowchart instead.
Q: How do I handle negative flows (refunds, reversals)?
A: Mermaid Sankey doesn't support negative values. Model refunds as a separate flow from the main stream (e.g., Sales → Returns, Returns → Refunds).
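Q: Can I color code the flows?
A: Not per flow in the CSV syntax itself. Mermaid's Sankey configuration does include a linkColor setting (source, target, gradient, or a fixed hex color) that controls how links are colored across the whole diagram. Check the Mermaid docs for the latest on theming.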
Try building a Sankey in the editor — grab some real data and see what flows pop out.