Mermaid Sankey diagrams: visualize data flow and dependencies

5 min read · The MermaidCreator team

When you need to show how much of something moves from A to B to C, a regular flowchart falls flat. Sankey diagrams fix this: the width of each arrow represents the volume flowing through it. They're the standard way to visualize energy flows, cost allocation, user journeys, and data dependencies — anywhere the magnitude matters as much as the path.

Why Sankey diagrams work

A Sankey diagram assigns a numeric value to each connection. Readers immediately see where the bulk of traffic or resources goes, and where bottlenecks form. Compare reading "the company allocated $500k to infrastructure, $300k to engineering, and $200k to sales" to a single Sankey showing those flows and how they break down further.

Sankey diagrams excel at:

  • Dependency mapping — trace which components depend on what infrastructure
  • Cost/budget breakdowns — show where money flows through an organization
  • User journey analysis — see what fraction of users take each path
  • Data pipeline tracing — visualize how records flow through ETL stages
  • Energy/resource flows — the original use case, still unbeaten

Sankey syntax in Mermaid

A Sankey diagram in Mermaid starts with sankey-beta. The beta label is honest: Sankey support arrived in Mermaid v10.3.0 and is still marked experimental, so the syntax may change. Each line maps a source, target, and numeric value in CSV form:

sankey-beta

%% Source,Target,Value
Product Sales,Gross Margin,10000
Product Sales,Cost of Goods,6000
Gross Margin,Operating Expenses,4000
Gross Margin,Net Income,6000
Operating Expenses,Salaries,2500
Operating Expenses,Infrastructure,1200
Operating Expenses,Marketing,300

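Mermaid sizes each flow in proportion to its value and arranges nodes to reduce crossings. You can chain flows (A → B → C) and use any text for node labels; wrap a label in double quotes if it contains a comma. Column names belong in a %% comment line, because every non-comment line is read as a flow and its third field must be a number.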
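
To make the three-column format concrete, here is a minimal Python sketch that reads Sankey source text back into (source, target, value) tuples. The function name parse_sankey is hypothetical, and the rules it applies (skip the sankey-beta keyword, blank lines, and %% comment lines; treat everything else as CSV) are an assumption based on the format described above, not Mermaid's own parser.

```python
import csv

def parse_sankey(text):
    """Parse Mermaid-style Sankey source into (source, target, value) tuples."""
    flows = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip the diagram keyword, blank lines, and %% comments
        if not stripped or stripped == "sankey-beta" or stripped.startswith("%%"):
            continue
        # csv.reader handles double-quoted labels that contain commas
        fields = next(csv.reader([stripped]))
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        source, target, value = fields
        # float() raises ValueError if the third field is not numeric
        flows.append((source, target, float(value)))
    return flows

example = """sankey-beta

Product Sales,Gross Margin,10000
Gross Margin,Net Income,6000
"""
print(parse_sankey(example))
```

Running a check like this before rendering catches rows with a missing column or a non-numeric value early.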

A practical example: cloud infrastructure costs

Let's map how a SaaS company's cloud bill breaks down from total spend through services to specific cost drivers:

sankey-beta

Total Cloud Spend,Compute,35000
Total Cloud Spend,Storage,15000
Total Cloud Spend,Networking,8000
Total Cloud Spend,Database,12000
Compute,EC2 On-Demand,20000
Compute,EC2 Reserved,10000
Compute,Lambda,5000
Storage,S3,12000
Storage,Backup,3000
Database,RDS Multi-AZ,10000
Database,DynamoDB,2000
Networking,Data Transfer,5000
Networking,CloudFront,3000

This immediately shows that EC2 On-Demand, at 20,000 of the 70,000 total, is the biggest single lever for cost optimization. The next largest drivers, S3 at 12,000 and RDS Multi-AZ at 10,000, are each about half its size. The Sankey makes that obvious without a spreadsheet.
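
The same ranking can be checked numerically. The sketch below, using a hypothetical helper named leaf_shares, copies the flows from the diagram above and computes each final cost driver's share of total spend. It assumes the total is the sum of flows leaving the root node (a node that never appears as a target).

```python
flows = [
    ("Total Cloud Spend", "Compute", 35000),
    ("Total Cloud Spend", "Storage", 15000),
    ("Total Cloud Spend", "Networking", 8000),
    ("Total Cloud Spend", "Database", 12000),
    ("Compute", "EC2 On-Demand", 20000),
    ("Compute", "EC2 Reserved", 10000),
    ("Compute", "Lambda", 5000),
    ("Storage", "S3", 12000),
    ("Storage", "Backup", 3000),
    ("Database", "RDS Multi-AZ", 10000),
    ("Database", "DynamoDB", 2000),
    ("Networking", "Data Transfer", 5000),
    ("Networking", "CloudFront", 3000),
]

def leaf_shares(flows):
    """Return (target, value, share of total) for leaf flows, largest first."""
    sources = {s for s, _, _ in flows}
    targets = {t for _, t, _ in flows}
    roots = sources - targets  # nodes with no incoming flow
    total = sum(v for s, _, v in flows if s in roots)
    # A leaf flow ends at a node that has no outgoing flows
    leaves = [(t, v) for _, t, v in flows if t not in sources]
    leaves.sort(key=lambda item: item[1], reverse=True)
    return [(t, v, v / total) for t, v in leaves]

for name, value, share in leaf_shares(flows):
    print(f"{name:15} {value:>6} {share:6.1%}")
```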

Building Sankey diagrams from data

If your data lives in a spreadsheet, you can generate Mermaid Sankey syntax programmatically:

Python example:

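def to_sankey(rows, source_col, target_col, value_col):
    def quote(text):
        # CSV-style quoting for labels that contain commas or double quotes
        text = str(text)
        needs_quotes = "," in text or '"' in text
        return '"' + text.replace('"', '""') + '"' if needs_quotes else text
    # Column names go in a %% comment; a bare header row would be read as a flow
    lines = ["sankey-beta", "", f"%% {source_col},{target_col},{value_col}"]
    for row in rows:
        lines.append(f"{quote(row[source_col])},{quote(row[target_col])},{row[value_col]}")
    return "\n".join(lines)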

JavaScript example:

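// CSV-style quoting for labels that contain commas or double quotes
const quote = (text) => {
  const s = String(text);
  return /[",]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
};

const toSankey = (rows, sourceName, targetName, valueName) => {
  // Column names go in a %% comment; a bare header row would be read as a flow
  const lines = ["sankey-beta", "", `%% ${sourceName},${targetName},${valueName}`];
  rows.forEach(row => {
    lines.push(`${quote(row[sourceName])},${quote(row[targetName])},${row[valueName]}`);
  });
  return lines.join("\n");
};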

This lets you keep the flow data in a spreadsheet or database and regenerate the diagram source whenever the data changes.
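
Raw exports often contain several rows for the same source and target pair. Here is a minimal sketch that sums those duplicates before the diagram text is generated. It assumes the rows are dictionaries; the helper name aggregate_flows and the column names "from", "to", and "amount" are hypothetical.

```python
from collections import defaultdict

def aggregate_flows(rows, source_col, target_col, value_col):
    """Sum values for rows that share the same (source, target) pair."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row[source_col], row[target_col])] += float(row[value_col])
    # Dictionaries preserve insertion order, so first-seen order is kept
    return [
        {source_col: s, target_col: t, value_col: v}
        for (s, t), v in totals.items()
    ]

rows = [
    {"from": "Compute", "to": "Lambda", "amount": "3000"},
    {"from": "Compute", "to": "Lambda", "amount": "2000"},
    {"from": "Storage", "to": "S3", "amount": "12000"},
]
print(aggregate_flows(rows, "from", "to", "amount"))
```

The result has the same row shape as the input, so it can be passed straight to a generator function like the ones above.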

Common Sankey patterns

Multi-level breakdowns: Use intermediate nodes to show progressive detail:

sankey-beta

Revenue,Product A,50000
Revenue,Product B,30000
Product A,Engineering,30000
Product A,Sales,20000
Product B,Engineering,18000
Product B,Sales,12000
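
In a multi-level breakdown, each intermediate node usually passes on exactly what it receives. Here Product A takes in 50,000 and sends out 30,000 + 20,000. Mermaid does not enforce this, so the sketch below is only a sanity check on the data. The helper name unbalanced_nodes is hypothetical.

```python
from collections import defaultdict

flows = [
    ("Revenue", "Product A", 50000),
    ("Revenue", "Product B", 30000),
    ("Product A", "Engineering", 30000),
    ("Product A", "Sales", 20000),
    ("Product B", "Engineering", 18000),
    ("Product B", "Sales", 12000),
]

def unbalanced_nodes(flows):
    """Return {node: (inflow, outflow)} for intermediate nodes where they differ."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in flows:
        outflow[source] += value
        inflow[target] += value
    # Only nodes with both incoming and outgoing flows are intermediate
    return {
        node: (inflow[node], outflow[node])
        for node in inflow.keys() & outflow.keys()
        if inflow[node] != outflow[node]
    }

print(unbalanced_nodes(flows))  # {} for the balanced example above
```

A non-empty result often points to a missing flow or a typo in a value, although some diagrams are unbalanced on purpose.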

Circular flows / feedback loops: Sankey doesn't handle true cycles (A → B → A), but you can show flows that reconverge at a node. If you need actual loops, a state diagram or directed graph may fit better.
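
If the flows come from real data, it helps to check for cycles before rendering. The sketch below uses a depth-first search over (source, target, value) tuples. The function name find_cycle is hypothetical, and recursion keeps it short on the assumption that the graph is small.

```python
def find_cycle(flows):
    """Return a list of nodes forming a cycle, or None if the flows are acyclic."""
    graph = {}
    for source, target, _ in flows:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: the cycle runs from nxt to the current node and back
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        color[node] = BLACK
        path.pop()
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Flows that reconverge at D are fine; A -> B -> A is a true cycle
print(find_cycle([("A", "B", 1), ("A", "C", 1), ("B", "D", 1), ("C", "D", 1)]))
print(find_cycle([("A", "B", 1), ("B", "A", 1)]))
```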

Filtering data for clarity: If you have hundreds of flows, show only the top 10–15 by value. Group the rest into an "Other" node. This keeps the diagram scannable.
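
One way to apply this filter is sketched below. The function name top_flows is hypothetical, and folding the remaining flows into one "Other" target per source is a design assumption. It works best when the flows being folded end at leaf nodes, because anything downstream of a folded target would lose its input.

```python
from collections import defaultdict

def top_flows(flows, n, other_label="Other"):
    """Keep the n largest flows; fold the rest into one 'Other' flow per source."""
    ranked = sorted(flows, key=lambda flow: flow[2], reverse=True)
    kept, rest = ranked[:n], ranked[n:]
    other = defaultdict(float)
    for source, _, value in rest:
        other[source] += value
    return kept + [(source, other_label, value) for source, value in other.items()]

flows = [("Total", "A", 50), ("Total", "B", 30), ("Total", "C", 5), ("Total", "D", 3)]
print(top_flows(flows, 2))
```

The total leaving each source stays the same, so the remaining flows keep their proportions.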

Sankey vs. other diagrams

Use case | Sankey | Flowchart | Class/ER Diagram
Show volume or magnitude
Trace a single path✓✓
Map relationships✓✓
Handle large datasets
Show hierarchies✓✓

Use Sankey when magnitude matters. For pure structure, stick to flowcharts or entity diagrams.

Tips for readable Sankey diagrams

Order nodes consistently. Mermaid draws Sankey flows from left to right, so if your data follows a timeline or a hierarchy, list the flows in that sequence. Mermaid tries to minimize line crossings, but you can influence the result by ordering the source lines.

Use names that fit. Long node labels cause the diagram to expand or overlap. If a label is longer than 20 characters, shorten it and add explanation in adjacent prose.

Limit flow depth. A Sankey that goes 5 levels deep becomes unreadable. If you need that level of detail, split into separate diagrams — one for the top 3 levels, another for the detail below.
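
A quick way to measure depth before deciding whether to split is sketched below. The function name flow_depth is hypothetical. It counts the node levels on the longest path, so Revenue → Product A → Engineering counts as 3, and it assumes the flows contain no cycles.

```python
from functools import lru_cache

def flow_depth(flows):
    """Return the number of node levels on the longest path (assumes no cycles)."""
    children = {}
    for source, target, _ in flows:
        children.setdefault(source, []).append(target)

    @lru_cache(maxsize=None)
    def levels_from(node):
        # A node with no outgoing flows counts as one level
        return 1 + max((levels_from(c) for c in children.get(node, [])), default=0)

    return max((levels_from(node) for node in children), default=0)

flows = [
    ("Revenue", "Product A", 50000),
    ("Product A", "Engineering", 30000),
    ("Product A", "Sales", 20000),
]
print(flow_depth(flows))  # 3
```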

Highlight the interesting flows. If the diagram is meant to show "engineering costs are underestimated," add a callout in the text. The diagram shows the fact; the prose explains the implication.

Building and exporting

Draft your Sankey in the MermaidCreator editor — paste the CSV-like syntax and watch the diagram render. Adjust node order and labels until it's clear, then copy the Mermaid code into your repo, documentation, or presentation.

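GitHub, GitLab, and Notion all render Mermaid code blocks natively, so no screenshot exports are needed: the diagram lives alongside the code and updates with it. Sankey is a newer diagram type, so confirm that the platform's bundled Mermaid version supports sankey-beta before relying on it.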

FAQ

Q: Can I use Sankey for qualitative flows (no numbers)? A: Sankey requires numeric values for the width of each arrow. If you want to show pure paths without magnitude, use a flowchart instead.

Q: How do I handle negative flows (refunds, reversals)? A: Mermaid Sankey doesn't support negative values. Model refunds as a separate flow from the main stream (e.g., Sales → Returns, Returns → Refunds).

Q: Can I color code the flows? A: Not per flow in the syntax itself. Mermaid's Sankey configuration does include a linkColor setting (source, target, gradient, or a hex color) that applies to every link in the diagram. Check the Mermaid docs for the latest on theming.

Try building a Sankey in the editor — grab some real data and see what flows pop out.
