Database migration flow
Safe schema changes with validation, rollback, and production cutover.
Database migrations are a common reason on-call engineers get paged at night. This flowchart traces a safe migration from writing the script through dev, staging, the production backup, cutover, and verification, and it includes the rollback path every team hopes never to use but must always be ready to take.
The diagram makes explicit what separates a boring Monday deployment from an incident: the backup before production, the verification queries after cutover, and a clear decision point where you either commit to the new schema or restore the old one. It also shows why rushing past staging is so expensive — catching a constraint violation or slow query there costs minutes; discovering it in production costs hours and a whole lot of adrenaline.
When to use this template
- Migration planning: walk the team through the intended path before the maintenance window, so everyone knows their role and the rollback trigger.
- Runbooks and playbooks: drop this diagram into your incident response wiki with your actual backup procedure and recovery time objective (RTO) filled in.
- Onboarding engineers: new database developers should see this before their first migration so they understand why each step matters.
How to adapt it
Customize the nodes to match your actual tooling and constraints:
- Replace the "Run on dev environment" node with one node per environment you actually use (local, CI, integration-test cluster).
- Add a performance test node after staging if you need to validate query plans against realistic data volumes.
- Insert a notification node naming the actual channels or teams (Slack, PagerDuty, email) that get alerted at each gate.
Visual edits regenerate clean Mermaid code as you drag and rename nodes, so you can turn this template into your actual runbook by editing the diagram directly in the editor.
Mermaid code
Copy it anywhere Mermaid is supported — GitHub, Notion, or your docs.
flowchart TD
    %% Develop and test
    A[Write migration script] --> B[Run on dev environment]
    B --> C{Tests pass?}
    C -->|No| D[Fix migration]
    D --> B
    C -->|Yes| E[Run on staging]
    E --> F{Data integrity OK?}
    F -->|No| G[Debug schema]
    G --> D
    %% Prepare for production
    F -->|Yes| H[Notify on-call team]
    H --> I[Schedule maintenance window]
    I --> J[Create database backup]
    %% Cutover and restore path
    J --> K[Run on production]
    K --> L{Migration succeeded?}
    L -->|No| M[Restore from backup]
    M --> N[Investigate failure]
    %% Verification and close-out
    L -->|Yes| O[Verify with read queries]
    O --> P{Data correct?}
    P -->|No| Q[Rollback]
    Q --> M
    P -->|Yes| R[Monitor for issues]
    R --> S[Close migration ticket]
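The production half of the flowchart (nodes J through P, plus the restore at M) can be sketched in Python. This is a minimal illustration rather than a production tool. It assumes SQLite, through the standard `sqlite3` module, as a stand-in for your real database, and it keeps the backup in memory. The names `snapshot`, `migrate_with_rollback`, and `two_users_are_active` are hypothetical.

```python
import sqlite3
from typing import Callable


def snapshot(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the live database into a separate in-memory database (the backup)."""
    backup = sqlite3.connect(":memory:")
    conn.backup(backup)
    return backup


def migrate_with_rollback(
    conn: sqlite3.Connection,
    migration_sql: str,
    verify: Callable[[sqlite3.Connection], bool],
) -> str:
    """Run one migration and return "committed" or "restored"."""
    backup = snapshot(conn)                  # J: create database backup
    try:
        conn.executescript(migration_sql)    # K: run on production
        ok = verify(conn)                    # O/P: verify with read queries
    except sqlite3.Error:
        ok = False                           # L: the migration or a check raised
    if ok:
        return "committed"
    if conn.in_transaction:                  # clear any half-finished transaction
        conn.rollback()
    backup.backup(conn)                      # M: restore from backup
    return "restored"


def two_users_are_active(conn: sqlite3.Connection) -> bool:
    """Hypothetical verification query: both seeded rows must be active."""
    rows = conn.execute(
        "SELECT COUNT(*) FROM users WHERE status = 'active'"
    ).fetchall()
    return rows[0][0] == 2


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);"
        "INSERT INTO users (email) VALUES ('a@example.com'), ('b@example.com');"
    )
    good = "ALTER TABLE users ADD COLUMN status TEXT NOT NULL DEFAULT 'active';"
    bad = "DELETE FROM users;"  # passes as SQL, fails verification
    print(migrate_with_rollback(db, good, two_users_are_active))
    print(migrate_with_rollback(db, bad, two_users_are_active))
```

In a real system the backup would more likely be a provider snapshot or a dump, and the restore would follow your documented procedure. The sketch only shows the order of operations: back up, migrate, verify, then either commit or restore.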
Frequently asked questions
- Why do database migrations require this many steps?
- Because a failed migration can take your entire service offline for hours. The dev, staging, and verification steps exist to catch breakage before it reaches customers, and the backup makes recovery possible when something slips through anyway. The rollback path is explicitly drawn because not planning for failure is how you end up at 3 AM without a restore plan.
- What should a database backup step include?
- A full snapshot of all tables, indexes, and constraints, taken immediately before the migration runs. Test the restore on a separate instance to confirm it works: a backup you can't restore is worse than no backup, because it gives you false confidence. Most teams use cloud provider snapshots, write-ahead log (WAL) backups, or logical dumps, depending on database size and RTO.
- How long should I wait after a production migration before considering it safe?
- Monitor for at least 30 minutes of normal traffic to catch edge cases your staging tests missed — for example, queries that were slow but didn't fail, or race conditions in application code that only show up under production load. High-transaction databases need longer; low-traffic systems can move faster.
- When should I roll back a production migration?
- Roll back if post-deployment queries return wrong data, if error rates spike after cutover, or if the migration runs past your maintenance window. Rollback is your circuit breaker: restoring and re-planning is faster than chasing issues in a corrupted state.
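The rollback triggers and the monitoring window from the answers above can be written down as an explicit check, so the decision is not improvised during the maintenance window. This is a minimal sketch. The names `CutoverStatus`, `rollback_reasons`, and `is_safe_to_close` are hypothetical, and the 2x error-rate spike factor is an illustrative assumption, not a recommended threshold. The 30-minute default comes from the monitoring answer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CutoverStatus:
    data_correct: bool          # did the post-deployment read queries pass?
    error_rate: float           # fraction of failed requests since cutover
    baseline_error_rate: float  # same metric measured before the migration
    elapsed_minutes: float      # how long the migration has run
    window_minutes: float       # agreed maintenance window
    soak_minutes: float = 0.0   # minutes of normal traffic observed so far


def rollback_reasons(s: CutoverStatus, spike_factor: float = 2.0) -> List[str]:
    """Return every rollback trigger that currently applies (empty if none)."""
    reasons = []
    if not s.data_correct:
        reasons.append("verification queries returned wrong data")
    # Assumed definition of a spike: more than spike_factor times the baseline.
    if s.error_rate > s.baseline_error_rate * spike_factor:
        reasons.append("error rate spiked after cutover")
    if s.elapsed_minutes > s.window_minutes:
        reasons.append("migration overran the maintenance window")
    return reasons


def is_safe_to_close(s: CutoverStatus, min_soak_minutes: float = 30.0) -> bool:
    """True once no trigger applies and the monitoring window has elapsed."""
    return not rollback_reasons(s) and s.soak_minutes >= min_soak_minutes


if __name__ == "__main__":
    status = CutoverStatus(
        data_correct=True,
        error_rate=0.05,
        baseline_error_rate=0.01,
        elapsed_minutes=20,
        window_minutes=60,
    )
    for reason in rollback_reasons(status):
        print("Roll back:", reason)
```

Tune the spike factor and the soak time to your own traffic. High-transaction databases may warrant a longer `min_soak_minutes` than low-traffic systems.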