Question 1

What is an incident response runbook diagram?

Accepted Answer

It is the decision tree an on-call engineer follows when an alert fires: confirm it is real, assess severity, page and open an incident if it is high, mitigate, escalate if mitigation stalls, then fix the root cause and run a post-mortem. Having it as a diagram means a stressed responder at 3 a.m. can follow arrows instead of re-reading paragraphs.
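As a rough illustration, the steps above could be sketched in Mermaid like this. It is a minimal sketch, not the template's actual source: the node names and labels are assumptions, and the "Silence or tune the alert" branch for false alarms is an added guess, since the answer does not say what happens when the alert is not real.

```mermaid
flowchart TD
    %% Hypothetical node IDs and labels; adapt to your own process.
    A[Alert fires] --> B{Is it real?}
    %% Assumption: the source does not specify the false-alarm branch.
    B -->|No| C["Silence or tune the alert"]
    B -->|Yes| D{Severity?}
    D -->|Low| E["Create sprint ticket"]
    D -->|High| F["Page on-call and open incident"]
    F --> G["Mitigate: roll back, fail over, scale up"]
    G --> H{Impact stopped?}
    %% Escalate when mitigation stalls, then try again.
    H -->|No| I[Escalate]
    I --> G
    H -->|Yes| J["Fix root cause"]
    J --> K["Run post-mortem"]
```

Each diamond is a single yes/no or low/high question, so a responder only ever has to answer one thing at a time.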

Question 2

Why separate mitigation from the root cause fix?

Accepted Answer

Because they have different goals and different clocks. Mitigation stops customer impact fast (roll back, fail over, scale up) even if you do not yet understand the bug. The root cause fix comes afterwards, calmly, once the pressure is off. Runbooks that conflate the two encourage engineers to debug in production while users are down, which tends to lengthen incidents.
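One way to make that separation visible in a diagram is to put each phase in its own subgraph. The sketch below is only an illustration: the subgraph names, titles, and the steps inside the root cause phase are assumptions, not part of the template.

```mermaid
flowchart LR
    %% Hypothetical grouping: one subgraph per phase, each with its own clock.
    subgraph mitigation["Mitigation: stop customer impact fast"]
        M1["Roll back"]
        M2["Fail over"]
        M3["Scale up"]
    end
    subgraph rootcause["Root cause fix: once the pressure is off"]
        R1["Investigate the bug"] --> R2["Ship the fix"]
        R2 --> R3["Run post-mortem"]
    end
    %% Root cause work starts only after mitigation has stopped the impact.
    mitigation --> rootcause
```

Keeping the two boxes apart makes it harder to slip debugging steps into the part of the runbook that is followed while users are still affected.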

Question 3

How do severity levels fit into an incident flowchart?

Accepted Answer

This template uses a single Low/High split to keep the triage decision fast: low severity becomes a sprint ticket, high severity pages the on-call. If your organization uses SEV1–SEV4, replace the diamond's two branches with one per level, each routing to its own response — but resist adding levels that do not change who gets paged or how fast.
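For example, a four-level version of the severity diamond might look like the sketch below. The response attached to each level is purely hypothetical; substitute whatever your organization's policy says for who gets paged and how fast.

```mermaid
flowchart TD
    %% Hypothetical responses per level; only the SEV1-SEV4 names come from the text.
    D{Severity?}
    D -->|SEV1| S1["Page on-call and incident lead immediately"]
    D -->|SEV2| S2["Page on-call"]
    D -->|SEV3| S3["Ticket for next business day"]
    D -->|SEV4| S4["Sprint ticket"]
```

If two branches end up with the same responder and the same urgency, that is a sign the levels could be merged.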

Question 4

Where should an incident runbook diagram live?

Accepted Answer

Wherever your on-call looks first under pressure: the alert annotation itself, your monitoring tool's runbook link, or the top of the incident channel topic. Keep the Mermaid source in version control next to your alerting config so that changes to the process go through review, and render the published diagram from that source so the two do not silently diverge.

Incident response runbook

