Microservice communication patterns in Mermaid

6 min read · The MermaidCreator team

Microservice architectures thrive on clear communication patterns. Without a visual contract, ambiguity creeps in: Is the Orders service waiting for Payment, or is the call async? Does Inventory talk directly to Shipping, or through a queue? How does the system handle cascading failures? Mermaid diagrams make these patterns explicit and easy to review.

The three core communication patterns

flowchart LR
    subgraph Sync["Synchronous<br/>(Request-Reply)"]
        A1["Service A"] -->|HTTP/gRPC| B1["Service B"]
        B1 -->|Response| A1
    end
    
    subgraph Async["Asynchronous<br/>(Fire-and-Forget)"]
        A2["Service A"] -->|Message| Queue["Message Queue"]
        Queue -->|Consume| B2["Service B"]
    end
    
    subgraph Hybrid["Choreography<br/>(Event-Driven)"]
        A3["Service A"] -->|Event| EventBus["Event Bus"]
        EventBus -->|Subscribe| B3["Service B"]
        EventBus -->|Subscribe| C3["Service C"]
    end

Each has tradeoffs. This guide shows how to diagram them and choose wisely.

Pattern 1: Synchronous (request-reply)

Best for: Queries, lookups, and operations requiring immediate feedback.

Diagram:

sequenceDiagram
    actor Client
    participant OrderAPI as Order API
    participant PaymentSvc as Payment Service
    participant InventorySvc as Inventory Service
    
    Client->>OrderAPI: POST /orders
    OrderAPI->>InventorySvc: Check stock (item_id)
    InventorySvc-->>OrderAPI: 1000 in stock ✓
    OrderAPI->>PaymentSvc: Charge $99.99
    PaymentSvc-->>OrderAPI: Charged (tx_id: 42)
    OrderAPI-->>Client: Order created (201)

Pros:

  • Simple to reason about: A calls B, waits for reply
  • Built-in error handling: HTTP status codes, timeouts
  • Easy to test: deterministic request-reply

Cons:

  • Blocking: Each request holds an Order API thread or connection while it waits for Payment, which caps throughput under load
  • Tight coupling: If Payment is down, the Order API fails immediately
  • Cascading failures: Slow Payment → slow Order API → slow Client

When to use: User-facing APIs, real-time queries, authorization checks.
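
The request-reply flow in the sequence diagram can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `create_order`, `DownstreamError`, and the `check_stock` and `charge` callables are hypothetical stand-ins for real HTTP/gRPC clients, and the 409 and 503 status codes are assumptions (only the 201 comes from the diagram).

```python
from typing import Callable


class DownstreamError(Exception):
    """Raised by a stand-in client when a downstream call fails or times out."""


def create_order(
    item_id: str,
    amount_cents: int,
    check_stock: Callable[[str], int],   # hypothetical Inventory client
    charge: Callable[[int], str],        # hypothetical Payment client
) -> tuple[int, dict]:
    """Call Inventory, then Payment, and only then answer the client."""
    try:
        in_stock = check_stock(item_id)  # blocks until Inventory replies
        if in_stock < 1:
            return 409, {"error": "out of stock"}
        tx_id = charge(amount_cents)     # blocks until Payment replies
    except DownstreamError as exc:
        # Tight coupling: a failed dependency fails the whole request.
        return 503, {"error": str(exc)}
    return 201, {"item_id": item_id, "tx_id": tx_id}
```

Because each step waits for the previous one, the latency the client sees is roughly the sum of both downstream calls, which is the cascading-slowness risk listed in the cons.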

Pattern 2: Asynchronous (queued messaging)

Best for: Non-blocking work, resilience to downstream failures.

Diagram:

flowchart LR
    Client["Client"] -->|POST /orders| OrderAPI["Order API"]
    OrderAPI -->|Enqueue| Queue["Message Queue<br/>(RabbitMQ, Kafka)"]
    Queue -->|Dequeue| PaymentWorker["Payment Worker"]
    Queue -->|Dequeue| InventoryWorker["Inventory Worker"]
    PaymentWorker -->|Write| DB1["Payment DB"]
    InventoryWorker -->|Write| DB2["Inventory DB"]
    OrderAPI -->|Return 202| Client

Pros:

  • Non-blocking: Order API returns immediately (202 Accepted)
  • Resilient: If Payment is down, message stays in queue; workers pick it up when they recover
  • Scalable: Add more workers to handle load spikes
  • Decoupled: Services don't know about each other

Cons:

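  • Eventual consistency: The order is not fulfilled immediately; the client must poll or be notified when it completes
  • Harder to debug: Message flow is spread across time and services
  • Duplicate delivery: Queues typically deliver at least once, so exactly-once effects are tricky to guarantee without idempotency keys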

When to use: Background jobs (email, reports), durable workflows, non-time-sensitive operations.
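
The idempotency-key point above can be made concrete with a small sketch. It assumes each message carries a hypothetical `idempotency_key` field and keeps processed keys in memory; a real worker would more likely store them in a durable database, ideally in the same transaction as the charge.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentWorker:
    """Consumes charge messages; safe to call twice with the same message."""

    processed: dict[str, str] = field(default_factory=dict)  # key -> tx_id
    charges_made: int = 0

    def handle(self, message: dict) -> str:
        key = message["idempotency_key"]  # hypothetical field name
        if key in self.processed:
            # Redelivery: return the earlier result instead of charging again.
            return self.processed[key]
        self.charges_made += 1            # stand-in for the real charge
        tx_id = f"tx-{self.charges_made}"
        self.processed[key] = tx_id
        return tx_id
```

If the queue redelivers a message after a worker crash or a missed acknowledgement, the second `handle` call returns the stored transaction ID and the customer is charged once.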

Pattern 3: Event-driven choreography

Best for: Many services reacting to a single event; complex workflows.

Diagram:

graph LR
    OrderAPI["Order API<br/>publishes"] -->|Emit| OrderCreated["OrderCreated<br/>Event"]
    
    OrderCreated --> EventBus["Event Bus<br/>(pub-sub)"]
    
    EventBus -->|Subscribe| PaymentSvc["Payment Service<br/>charges customer"]
    EventBus -->|Subscribe| InventorySvc["Inventory Service<br/>reserves stock"]
    EventBus -->|Subscribe| NotifSvc["Notification Service<br/>sends confirmation"]
    EventBus -->|Subscribe| AnalyticsSvc["Analytics Service<br/>logs conversion"]
    
    PaymentSvc -->|PaymentSucceeded| EventBus
    InventorySvc -->|StockReserved| EventBus

Pros:

  • Loose coupling: Services emit and listen to events; no direct calls
  • Extensible: New services subscribe to events without changing existing code
  • Observable: If the bus retains events, you get a full event history for audit and replay
  • Parallel: Multiple subscribers handle an event concurrently

Cons:

  • Distributed debugging: Tracing a workflow across services is complex
  • Implicit contracts: Subscribers must know the event schema; nothing enforces it at compile time unless schemas are shared and validated
  • Saga complexity: Compensating for failures in a choreographed workflow is hard

When to use: E-commerce flows (order → payment → inventory → shipping), user lifecycle events, data sync.
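
To show the mechanics of choreography, here is a minimal in-memory pub-sub sketch. The `EventBus` class and the handler names are hypothetical, and delivery here is synchronous and in-process; a real event bus would deliver over the network, usually concurrently.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """In-memory pub-sub: every subscriber of an event type gets each event."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Copy the list so handlers may subscribe during delivery.
        for handler in list(self._subscribers[event_type]):
            handler(payload)


bus = EventBus()
log: list[str] = []


def payment_on_order_created(order: dict) -> None:
    """Payment reacts to OrderCreated and emits its own follow-up event."""
    log.append(f"payment charged {order['order_id']}")
    bus.publish("PaymentSucceeded", order)


bus.subscribe("OrderCreated", payment_on_order_created)
bus.subscribe("OrderCreated", lambda o: log.append(f"stock reserved {o['order_id']}"))
bus.subscribe("PaymentSucceeded", lambda o: log.append(f"receipt sent {o['order_id']}"))

# The Order API only publishes; it never calls Payment or Inventory directly.
bus.publish("OrderCreated", {"order_id": "o-1"})
```

Adding a new subscriber, such as analytics, takes one more `subscribe` call and no change to the publisher, which is the extensibility benefit listed above.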

Pattern 4: Orchestration (state machine)

When choreography gets complex, introduce an orchestrator—a service that coordinates the workflow:

stateDiagram-v2
    [*] --> Created
    
    Created --> PendingPayment: Process Order
    
    PendingPayment --> PaymentSucceeded: Charge succeeds
    PendingPayment --> PaymentFailed: Charge fails
    
    PaymentSucceeded --> ReservingInventory: Deduct stock
    ReservingInventory --> InventoryReserved: Stock available
    ReservingInventory --> OutOfStock: Insufficient stock
    
    OutOfStock --> Compensating: Refund payment
    Compensating --> Cancelled
    
    PaymentFailed --> Cancelled
    InventoryReserved --> Shipped
    Shipped --> Completed
    
    Cancelled --> [*]
    Completed --> [*]

The orchestrator (e.g., an Order Saga service) calls each service in sequence and decides what to do on failure.

Pros:

  • Explicit workflow: All steps visible in one place
  • Easy rollback: Orchestrator knows how to compensate each step
  • Debuggable: Watch the state machine's progress

Cons:

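  • Central dependency: The orchestrator is a potential bottleneck and single point of failure, so it must be highly available
  • Coupling: Orchestrator knows about all services
  • Testing: State machines can have many branches, and each one (including compensation paths) needs test coverage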

When to use: Critical multi-step workflows (payment + inventory + shipping), sagas with compensating transactions.
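
One way to keep an orchestrator honest is to encode the state diagram as a transition table and reject anything not in it. The sketch below is illustrative only: the state names come from the diagram, while the event names for the unlabeled edges (`refunded`, `cancel`, `ship`, `deliver`) and the `run_saga` helper are assumptions.

```python
# Allowed transitions, copied from the state diagram: (state, event) -> next state.
# Event names for edges the diagram leaves unlabeled are hypothetical.
TRANSITIONS = {
    ("Created", "process_order"): "PendingPayment",
    ("PendingPayment", "charge_succeeded"): "PaymentSucceeded",
    ("PendingPayment", "charge_failed"): "PaymentFailed",
    ("PaymentSucceeded", "deduct_stock"): "ReservingInventory",
    ("ReservingInventory", "stock_available"): "InventoryReserved",
    ("ReservingInventory", "insufficient_stock"): "OutOfStock",
    ("OutOfStock", "refund_payment"): "Compensating",
    ("Compensating", "refunded"): "Cancelled",
    ("PaymentFailed", "cancel"): "Cancelled",
    ("InventoryReserved", "ship"): "Shipped",
    ("Shipped", "deliver"): "Completed",
}


def run_saga(events: list[str], state: str = "Created") -> list[str]:
    """Apply events in order and return every state visited."""
    visited = [state]
    for event in events:
        if (state, event) not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited
```

Because the table is plain data, each branch of the diagram, including the refund path through `Compensating`, can be covered by a short test that feeds in a list of events.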

Handling failure modes

Diagram what happens when services fail. This forces you to make assumptions explicit:

flowchart TD
    A["Order API calls<br/>Payment Service"] -->|Timeout| B{Retry?}
    B -->|Yes, after 3s| C["Retry 1"]
    C -->|Timeout| D["Retry 2"]
    D -->|Timeout| E["Retry 3"]
    E -->|Still timing out| F["Give up<br/>Return 503"]
    B -->|No| F
    F -->|Client<br/>retries| A
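
The retry flow in the diagram maps onto a small helper. This is a sketch under stated assumptions: `call_with_retries` is a hypothetical name, the same 3-second delay is applied before every retry (the diagram only labels the first), and the `sleep` parameter exists so the wait can be replaced in tests.

```python
import time
from typing import Callable


def call_with_retries(
    call: Callable[[], str],
    retries: int = 3,
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Try once, then retry up to `retries` times on timeout; 503 if all fail."""
    for attempt in range(1 + retries):
        try:
            return 200, call()
        except TimeoutError:
            if attempt < retries:
                sleep(delay_seconds)  # wait before the next retry
    return 503, "payment service unavailable"
```

With the defaults, a client can wait through four attempts and three delays before seeing the 503, which is worth writing on the diagram as an explicit worst-case latency.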

Or, visualize the happy path vs. error paths:

| Scenario | Flow | Risk |
|---|---|---|
| Happy path | Order → Payment → Inventory → Ship | 1 in 1,000 orders (0.1% failure) |
| Payment timeout | Retry, then fallback to async | Client waits 10+ seconds |
| Inventory out-of-stock | Refund payment, notify customer | Manual intervention needed |
| Shipping service down | Queue order in backlog; retry hourly | Orders ship 24h late; customer upset |

Real-world example: E-commerce checkout

Combining patterns for a resilient flow:

graph LR
    subgraph Sync["Synchronous (real-time)"]
        Client -->|1. POST /checkout| OrderAPI["Order API"]
        OrderAPI -->|2. GET /inventory| Inventory["Inventory Service"]
        Inventory -->|3. In stock| OrderAPI
    end
    
    subgraph Async["Asynchronous (eventual)"]
        OrderAPI -->|4. Enqueue| Queue["Task Queue"]
        Queue -->|5. Dequeue| PaymentWorker["Payment Worker"]
        PaymentWorker -->|6. Charge| StripeAPI["Stripe API"]
        StripeAPI -->|7. webhook| Webhook["Webhook Handler"]
        Webhook -->|8. Emit| EventBus["Event Bus"]
    end
    
    subgraph Subscribers["Event Subscribers"]
        EventBus -->|9a. Subscribe| ShippingSvc["Shipping Service<br/>(queue shipping label)"]
        EventBus -->|9b. Subscribe| EmailSvc["Email Service<br/>(send receipt)"]
    end
    
    OrderAPI -->|Return 202| Client
    Client -->|Poll| OrderAPI

Flow:

  1. Sync call to check inventory (fast, fail-fast)
  2. Async payment processing (resilient, non-blocking)
  3. Event-driven fulfillment (extensible, scalable)

Diagramming best practices

  • Use sequence diagrams for request-reply (who talks to whom, in order)
  • Use flowcharts for async workflows (what happens in parallel, what's next)
  • Use state machines for orchestration (start → middle states → end)
  • Label edges with timing: OrderAPI -->|100ms timeout| PaymentSvc
  • Color-code by criticality: Sync calls (red), async (blue), fallbacks (orange)
  • Show retry and fallback logic: Don't hide the edge cases
  • Separate happy path from failure paths: Use decision diamonds for explicit branches
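
If the edge list already lives in code or config, the timing labels and color-coding can be generated rather than typed by hand. The sketch below is one possible approach: `render_flowchart` and the `COLORS` mapping are hypothetical, and it relies on Mermaid's `linkStyle` statement, which styles flowchart edges by their declaration order starting at 0.

```python
# Stroke colors per call style; the palette follows the list above.
COLORS = {"sync": "red", "async": "blue", "fallback": "orange"}


def render_flowchart(edges: list[tuple[str, str, str, str]]) -> str:
    """Render (source, target, label, kind) edges as Mermaid flowchart text."""
    lines = ["flowchart LR"]
    for source, target, label, _kind in edges:
        lines.append(f"    {source} -->|{label}| {target}")
    # linkStyle indexes edges in the order they were declared, starting at 0.
    for index, (_source, _target, _label, kind) in enumerate(edges):
        lines.append(f"    linkStyle {index} stroke:{COLORS[kind]}")
    return "\n".join(lines)


print(render_flowchart([
    ("OrderAPI", "PaymentSvc", "100ms timeout", "sync"),
    ("OrderAPI", "Queue", "Enqueue", "async"),
]))
```

The printed text can be pasted into any Mermaid renderer; regenerating it whenever a timeout changes keeps the diagram in step with the configuration.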

FAQ

Should all my services talk async? No—async is great for non-urgent work, but synchronous calls are simpler and faster when immediate feedback is needed. Use sync for the critical path, async for side effects.

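How do I diagram a circuit breaker? Use a state machine with three states (Closed, Open, Half-Open) sitting between Service A and Service B. Add edges for the failure threshold being reached (Closed → Open), the recovery timeout elapsing (Open → Half-Open), and the trial call succeeding (Half-Open → Closed) or failing (Half-Open → Open).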
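
For readers who want to see the three circuit breaker states in code before drawing them, here is a minimal sketch. The `CircuitBreaker` class, its parameter names, and the default thresholds are all assumptions for illustration; production libraries add features such as rolling failure windows and per-exception rules.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Closed -> Open after repeated failures; Open -> Half-Open after a wait."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.state = "Closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func: Callable[[], str]) -> str:
        if self.state == "Open":
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            self.state = "Half-Open"  # allow one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "Half-Open" or self.failures >= self.failure_threshold:
                self.state = "Open"
                self.opened_at = self.clock()
            raise
        self.state = "Closed"         # success closes the circuit
        self.failures = 0
        return result
```

Each assignment to `self.state` corresponds to one edge in the state diagram, so the code and the diagram can be reviewed side by side.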

Can I combine patterns? Absolutely—most systems use all three. Sync for user-facing APIs, async for background work, events for data sync across services. Diagram each pattern separately, then show how they connect.

What if I have no async infrastructure yet? Start with synchronous calls (easiest to implement), then add async when you hit scale or resilience issues. Diagram the current state and the target state, then plan the migration.

Ready to visualize your service architecture? Use MermaidCreator's editor to sketch your patterns, share them with your team, and iterate before building.
