Log aggregation pipeline

Collect, parse, enrich, and store logs from multiple services.

Production outages live in the logs, but only if you can find them. This template maps how logs travel from your services to a searchable store: a collection agent gathers logs from multiple sources, the pipeline parses and validates them, enriches them with context (service name, deployment, user), and finally writes them to storage where they can be queried for debugging and alerting.

The critical detail is the dead letter queue: malformed logs — from code bugs, config changes, or corrupted data — are separated from the main pipeline so parsing errors don't block the flow or lose data. This separation makes it safe to fix and replay logs without taking down the entire system.
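The template itself stops at alerting and the error index; it does not draw the replay path. As a rough sketch of how that path could be added, the fragment below routes dead-lettered logs back into the parser once the cause is fixed. The nodes "Fix parser or source format" and "Replay from dead letter queue" are illustrative assumptions, not part of the original diagram.

```mermaid
flowchart TD
    C[Parse log format] --> D{Valid log?}
    D -->|Yes| F[Extract metadata]
    D -->|No| E[Send to dead letter queue]
    %% Hypothetical replay path, not in the original template
    E --> R[Fix parser or source format]
    R --> S[Replay from dead letter queue]
    S --> C
```

Because replayed entries re-enter at the parse step, they pass through the same validation and enrichment as fresh logs.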

When to use this template

  • Observability architecture review — walk the team through how logs flow from your apps to your log store, identify where logs might drop, and discuss retention and sampling trade-offs.
  • Log volume & cost optimization — annotate each step with entry counts and storage costs. Sampling and dead letter processing are the levers to pull when log bills climb.
  • Incident response runbook — document where to query logs after an incident and how long they're retained. Dead letter queue investigation often uncovers the root cause others missed.

How to adapt it

Customize the collector, parser, and storage to your stack:

  • Replace the collector: instead of the generic "Log collector agent", name your tool, such as Fluentd, Fluent Bit, Logstash, or the CloudWatch agent.
  • Add filtering before enrichment: insert an "Apply sampling policy?" decision to drop verbose logs (debug traces, health checks) and keep high-value ones (errors, slow requests).
  • Extend enrichment: add steps for PII scrubbing, request tracing (trace ID injection), or correlation IDs so logs from a single user request stay linked.

Visual edits regenerate clean Mermaid code so you can map your real pipeline without syntax overhead.
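As one possible adaptation, the fragment below names a concrete collector and extends enrichment with two extra steps. The choice of Fluent Bit and the node labels "Scrub PII fields" and "Attach trace ID" are assumptions for illustration; the validity check and storage steps are omitted here for brevity and stay as in the template.

```mermaid
flowchart TD
    A[App logs] --> B[Fluent Bit]
    B --> C[Parse log format]
    C --> F[Extract metadata]
    %% Hypothetical enrichment steps added for this example
    F --> P[Scrub PII fields]
    P --> T[Attach trace ID]
    T --> G[Add context tags]
    G --> H[Enrich with service info]
```

Placing PII scrubbing ahead of tagging means sensitive fields are removed before any later step copies them into tags or storage.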

Mermaid code

Copy it anywhere Mermaid is supported — GitHub, Notion, or your docs.

flowchart TD
    A[App 1 logs] --> B[Log collector agent]
    A1[App 2 logs] --> B
    A2[App 3 logs] --> B
    B --> C[Parse log format]
    C --> D{Valid log?}
    D -->|No| E[Send to dead letter queue]
    D -->|Yes| F[Extract metadata]
    F --> G[Add context tags]
    G --> H[Enrich with service info]
    H --> I[Compress log entry]
    I --> J[Write to storage]
    J --> K{Archive old logs?}
    K -->|Yes| L[Move to cold storage]
    K -->|No| O[Keep in hot storage]
    E --> M[Alert on parsing errors]
    M --> N[Send to error index]

Frequently asked questions

What is a log aggregation pipeline?
It's the complete journey of a log entry from creation in your app to searchable storage. Logs flow from multiple services through collection, parsing, validation, enrichment with context (service name, environment, user), compression, and finally indexing or long-term storage. Without a clear pipeline, logs stay scattered across servers and are hard to search when you need to troubleshoot.
Why show the dead letter queue and error branches?
Log collection is only as reliable as its ability to handle malformed or unexpected logs. A dead letter queue catches logs that fail parsing so you can investigate corruption or format changes without losing data or halting the entire pipeline.
How do I add sampling or filtering to reduce log volume?
After 'Extract metadata', insert a decision diamond: 'Include by sampling policy?' If yes, continue to enrichment; if no, drop and count the filtered entry. This lets you keep high-value logs (errors, slow requests) at full fidelity while downsampling verbose debug logs.
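A minimal sketch of that change, assuming the node names from the template above; the node IDs S and X are arbitrary placeholders:

```mermaid
flowchart TD
    F[Extract metadata] --> S{Include by sampling policy?}
    S -->|Yes| G[Add context tags]
    S -->|No| X[Drop and count filtered entry]
```

Keeping an explicit "drop and count" node makes the filtered volume visible on the diagram, which helps when reviewing how much the sampling policy actually removes.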
What tools implement log aggregation pipelines?
Common stacks: Fluentd/Fluent Bit (collector) → Elasticsearch (indexing) → Kibana (search/viz); or Datadog/New Relic (all-in-one); or cloud-native: CloudWatch Logs → S3 (AWS), Cloud Logging → BigQuery (Google), Application Insights (Azure). The pipeline structure — collect, parse, enrich, store — is the same.
