Question 1

What is an ETL pipeline diagram?

Accepted Answer

An ETL pipeline diagram shows how data flows from source systems through extract, validation, transform, and quality-check stages into a warehouse, then fans out to consumers such as dashboards and ML features. The key detail is the pair of gates, schema validation and the quality check, that share one dead-letter queue: they document that bad records are quarantined for inspection rather than silently dropped or loaded.
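The flow described in this answer could be sketched in Mermaid roughly as follows. This is an illustrative reconstruction, not the template's actual code: only `A[(Source systems)]` and `D[[Dead-letter queue]]` are quoted elsewhere on this page, and every other node id, label, and edge label is an assumption.

```mermaid
flowchart LR
    %% A and D follow the notation quoted in the FAQ; all other ids and labels are assumed.
    A[(Source systems)] --> B[Extract]
    B --> C{Schema valid?}
    C -- yes --> E[Transform]
    C -- no --> D[[Dead-letter queue]]
    E --> F{Quality checks pass?}
    F -- yes --> G[(Warehouse)]
    F -- no --> D
    %% Fan-out from the warehouse to consumers
    G --> H[Dashboards]
    G --> I[ML features]
```

Both failure edges end at the same `D` node, which is what makes the shared quarantine path visible at a glance.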

Question 2

What do the different node shapes mean in this Mermaid flowchart?

Accepted Answer

The cylinder shape, written as A[(Source systems)], is Mermaid's database notation and marks data stores — here the sources and the warehouse. The double-bracket subroutine shape, D[[Dead-letter queue]], marks a distinct subsystem with its own handling process. Using shapes consistently lets readers distinguish storage, processing, and decisions at a glance without reading every label.
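As a quick reference, the shapes mentioned in this answer can be declared side by side. The cylinder and subroutine nodes are taken from the answer. The rectangle and diamond nodes, including their ids and labels, are assumptions added here to illustrate processing steps and decisions.

```mermaid
flowchart TD
    %% Cylinder: data store (from the answer)
    A[(Source systems)]
    %% Subroutine: distinct subsystem (from the answer)
    D[[Dead-letter queue]]
    %% Rectangle: processing step (assumed id and label)
    T[Transform]
    %% Diamond: decision gate (assumed id and label)
    Q{Quality checks pass?}
```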

Question 3

Why route failures to a dead-letter queue instead of dropping them?

Accepted Answer

Dropped records disappear; dead-lettered records leave evidence. When schema validation or quality checks fail, quarantining the record preserves it for debugging, replay after a fix, and volume monitoring. A sudden spike in dead-letter traffic is often the first signal that an upstream system changed its schema. Routing both gates into one queue keeps that monitoring in a single place.
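If it helps to make the uses of the queue explicit, the quarantine path could be extended with the three activities named in this answer. This is a sketch under assumptions: the inspect, monitor, and replay nodes and the replay edge back into schema validation are hypothetical additions, not part of the template as described.

```mermaid
flowchart LR
    %% Both gates feed one queue, as the answer describes; gate ids are assumed.
    V{Schema valid?} -- no --> D[[Dead-letter queue]]
    Q{Quality checks pass?} -- no --> D
    %% Hypothetical downstream uses of the quarantined records
    D --> I[Inspect and debug]
    D --> M[Monitor volume]
    I -- fix applied --> R[Replay]
    R --> V
```

Replaying into the first gate rather than straight into the warehouse means repaired records still pass the same checks as fresh ones.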

Question 4

Does this diagram work for ELT and streaming pipelines too?

Accepted Answer

Yes, with small edits. For ELT, move the transform node after the warehouse to reflect in-warehouse transformation with a tool like dbt. For streaming, relabel extract as the ingestion topic and add a stream processor before the quality gate — the dead-letter pattern and the fan-out to dashboards and ML features carry over unchanged.
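The two adaptations might look like the sketch below, drawn as separate subgraphs for comparison. All node ids and most labels are assumptions. In the ELT variant, the fan-out from the in-warehouse transform is a guess at how consumers would be wired. In the streaming variant, keeping a separate transform node ahead of the stream processor is likewise a guess.

```mermaid
flowchart LR
    %% Node ids and most labels are assumptions for illustration.
    subgraph elt [ELT variant]
        A1[(Source systems)] --> B1[Extract]
        B1 --> C1{Schema valid?}
        C1 -- yes --> F1{Quality checks pass?}
        C1 -- no --> D1[[Dead-letter queue]]
        F1 -- yes --> G1[(Warehouse)]
        F1 -- no --> D1
        %% Transform moved after the warehouse, e.g. run with dbt
        G1 --> E1[Transform in warehouse]
        E1 --> H1[Dashboards]
        E1 --> I1[ML features]
    end
    subgraph streaming [Streaming variant]
        %% Extract relabeled as the ingestion topic
        A2[(Source systems)] --> B2[Ingestion topic]
        B2 --> C2{Schema valid?}
        C2 -- yes --> E2[Transform]
        C2 -- no --> D2[[Dead-letter queue]]
        %% Stream processor added before the quality gate
        E2 --> S2[Stream processor]
        S2 --> F2{Quality checks pass?}
        F2 -- yes --> G2[(Warehouse)]
        F2 -- no --> D2
        G2 --> H2[Dashboards]
        G2 --> I2[ML features]
    end
```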

Data pipeline (ETL)

When to use this template

How to adapt it

Mermaid code

Frequently asked questions
