CQRS + event sourcing — plane tickets from multiple vendors
A single-file in-browser visualizer of how a CQRS + event-sourcing ticketing platform actually behaves under load and under failure. Rails on the command and query sides, PostgreSQL as the event store and read DB, RabbitMQ between them, Redis in front, three GDS-style vendors (Iberia / Lufthansa / Delta on NDC, Amadeus and Sabre respectively), a saga orchestrator wired through explicit compensations.
The default script walks the canonical edge cases — the kind of thing you discover the hard way in production:
- boot → 5 projectors, 3 vendors, all queues empty
- search (cache miss) → Redis single-flight populator → vendor fan-out → cache hit
- hold seat → HoldSeat → Booking aggregate v0 → v1 SeatHeld
- pay → capture → issue → v1 → v2 → v3, projection lag visible briefly
- concurrent buyer race → two HoldSeats for the last seat; one wins by version
- vendor timeout → 5 s injected on V2 → saga timeout (2.5 s) → release hold
- payment OK, issue fails → V3 5xx → refund saga (initiate → complete)
- out-of-order delivery → projector buffers the late event then applies
- hold expires mid-payment → timer fires SeatHoldExpired before pay → reject
- at-least-once redelivery → RabbitMQ dups → idempotency log skips on (agg, ver)
- multi-vendor saga → V1 leg holds, V2 leg fails → compensate V1
- Redis cache stampede → 200 concurrent gets → 1 populator (single-flight)
- read-your-own-writes → Redis write-through serves the just-booked customer
- GDPR tombstone delete → no destructive delete; PII redacted in projections
- projection replay → drop CustomerBookings, rebuild from seq 0
- burst load → 50 cmd/s; event store grows, lag balloons + drains
What this actually is (for someone who’s never built CQRS)
A regular CRUD web app stores the current state of every record. If a customer changes their address, you UPDATE the address column — the old value is gone. That’s fine for boring apps. It is not fine for a plane-ticket platform where you have to answer questions like “why did this booking end up refunded?” and “did the customer pay before or after the vendor confirmed the seat?” a year later. UPDATE loses the story.
Event sourcing says: don’t store the current state, store the history of
events that led to it. SeatHeld, PaymentAuthorized, PaymentCaptured,
TicketIssued — each one append-only, immutable, timestamped, signed. The
current state is just a function over the history: fold(events) → state.
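As a rough illustration of `fold(events) → state`, here is a minimal sketch in JavaScript. The event type names come from the paragraph above; the payload fields (`seat`, `pnr`), the status strings and the `apply` / `fold` helper names are assumptions made for this example, not the simulator's actual code.

```javascript
// Hypothetical starting point for a booking that has no events yet.
const initialState = { status: "none", version: 0 };

// Pure reducer: (state, event) -> next state. No I/O, no mutation.
function apply(state, event) {
  const next = { ...state, version: state.version + 1 };
  switch (event.type) {
    case "SeatHeld":
      return { ...next, status: "held", seat: event.seat };
    case "PaymentAuthorized":
      return { ...next, status: "authorized" };
    case "PaymentCaptured":
      return { ...next, status: "paid" };
    case "TicketIssued":
      return { ...next, status: "issued", pnr: event.pnr };
    default:
      return next; // unknown event types still advance the version
  }
}

// fold(events) -> state: the current state is derived, never stored.
function fold(events) {
  return events.reduce(apply, initialState);
}

const history = [
  { type: "SeatHeld", seat: "12A" },
  { type: "PaymentAuthorized" },
  { type: "PaymentCaptured" },
  { type: "TicketIssued", pnr: "ABC123" },
];
const state = fold(history);
console.log(state.status, state.version); // issued 4
```

Because `apply` is pure, folding any prefix of the history gives the state as it was at that point, which is what makes "why did this booking end up refunded?" answerable later.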
CQRS (Command Query Responsibility Segregation) then says: the path you write on and the path you read on don’t have to be the same shape. Writes go through aggregates — tiny consistency boundaries (one booking at a time) that enforce business rules and append events. Reads go through projections — denormalised views, rebuilt by replaying events off a bus. The hard part isn’t the pattern; the hard part is everything that goes wrong between “I appended an event” and “the read side knows about it”:
- The event-store commit succeeds but the publish to the bus fails. (Outbox pattern.)
- The bus delivers the event twice. (Idempotency log keyed on `(aggregate_id, version)`.)
- The bus delivers events out of order across multiple aggregates. (For per-aggregate-ordered projections, buffer and reapply.)
- A long-running business transaction (book the outbound leg on Iberia, the inbound on Delta) needs to partially undo on failure. (Saga with explicit compensations — there are no distributed locks.)
- The read side is “eventually” consistent. The customer just clicked Buy and refreshes their My Bookings page; nothing’s there yet. (Read your own writes via a write-through Redis layer.)
- A flash-sale hits the cache the millisecond it expires. 200 requests miss simultaneously. (Single-flight populator lock — only one of them runs the vendor query.)
- GDPR says “delete this customer.” But the event log is the source of truth and you can’t delete history. (Append a `GDPRTombstone` event: projectors redact PII, the audit trail keeps the event hash.)
This simulator surfaces every one of those situations as a scripted scenario, with the relevant numbers visible. Toggle the failure-injection knobs in the toolbar to fire any of them on demand.
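Two of the items above (duplicate delivery and out-of-order delivery) can be handled in the same place on the read side. The following is a minimal sketch, assuming events carry `aggregateId` and `version` fields and versions start at 1; `makeProjector`, `handle` and the returned status strings are hypothetical names for illustration, not the simulator's internals.

```javascript
// Hypothetical projector for one read model. applyToView is whatever
// updates the denormalised view for a single event.
function makeProjector(applyToView) {
  const seen = new Set();        // idempotency log keyed on "aggregateId:version"
  const lastApplied = new Map(); // aggregateId -> highest version applied
  const buffered = new Map();    // aggregateId -> Map(version -> event)

  function applyInOrder(event) {
    applyToView(event);
    lastApplied.set(event.aggregateId, event.version);
    // Drain any buffered successors now that the gap is closed.
    const pending = buffered.get(event.aggregateId);
    let nextVersion = event.version + 1;
    while (pending && pending.has(nextVersion)) {
      const next = pending.get(nextVersion);
      pending.delete(nextVersion);
      applyToView(next);
      lastApplied.set(event.aggregateId, nextVersion);
      nextVersion += 1;
    }
  }

  function handle(event) {
    const key = `${event.aggregateId}:${event.version}`;
    if (seen.has(key)) return "duplicate"; // at-least-once redelivery
    seen.add(key);

    const last = lastApplied.get(event.aggregateId) ?? 0;
    if (event.version !== last + 1) {
      // Arrived ahead of a missing version: park it.
      if (!buffered.has(event.aggregateId)) {
        buffered.set(event.aggregateId, new Map());
      }
      buffered.get(event.aggregateId).set(event.version, event);
      return "buffered";
    }
    applyInOrder(event);
    return "applied";
  }

  return { handle };
}

const applied = [];
const projector = makeProjector((e) => applied.push(`${e.aggregateId}@v${e.version}`));
const results = [
  projector.handle({ aggregateId: "b1", version: 1, type: "SeatHeld" }),
  projector.handle({ aggregateId: "b1", version: 3, type: "TicketIssued" }),    // early
  projector.handle({ aggregateId: "b1", version: 1, type: "SeatHeld" }),        // redelivered
  projector.handle({ aggregateId: "b1", version: 2, type: "PaymentCaptured" }), // fills the gap
];
console.log(results.join(",")); // applied,buffered,duplicate,applied
console.log(applied.join(","));  // b1@v1,b1@v2,b1@v3
```

In a real deployment the idempotency log would presumably live in the read DB and be written in the same transaction as the view update; an in-memory `Set` is used here only to keep the sketch self-contained.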
Views
| View | What it shows |
|---|---|
| System | Top-down architecture diagram: Customer → Rails Command API → Booking Aggregate → PostgreSQL event store → Outbox → RabbitMQ → 5 projectors → Read DB. Query path: Customer → Rails Query API → Redis → Read DB. Saga orchestrator + 3 vendor adapters across the bottom. Particles flow between boxes on every command, event, projection. Hover anything for an explanation. |
| Command | All aggregates with current version and state (status, vendor, flight, customer, PNR). Command log on the right with inflight / ok / conflict / rejected outcomes. Saga state-machine list with current step and log preview. |
| Events | The append-only event store: seq, aggregate id, version, type, payload preview, timestamp. Click any event for the full payload, vendor / saga refs, and which projectors have processed it. |
| Queries | Each projector with its queue, lag (events behind), last-applied seq, processed count, idempotency log size. Per-projection view-state row count. Redis entries below with TTL, populating-flag, and waiter-count (single-flight). |
| Vendors | Per-vendor (V1 Iberia / V2 Lufthansa / V3 Delta) inventory chips coloured by seat count. Saga timeline panel showing the latest sagas with their per-step log entries (ok, fail, compensate ok) — watch a multi-vendor booking unwind. |
What’s “precise” about it
- Real append-only event store. Every state change is an event with `(aggregate_id, version)` as the primary key. `version` is `expected + 1`; a conflict means “someone else got there first”, the same outcome as an optimistic-lock `UPDATE ... WHERE version = ?` matching zero rows in PostgreSQL.
- Real outbox pattern. Events are written to the event store and a publish queue in the same transaction. A separate pumper drains the queue to RabbitMQ. This is the standard way to avoid a “wrote but didn’t publish” or “published but didn’t write” gap.
- Real fan-out at the bus. RabbitMQ here is a topic exchange with one queue per projector. Each queue is independent, so slow projectors don’t block fast ones. At-least-once delivery means duplicates happen; the idempotency log catches them on `(aggregate_id, version)`.
- Real projector ordering. AvailableFlights and CustomerBookings tolerate out-of-order delivery. BookingDetail does not: it buffers events whose version is “ahead” until the missing version arrives.
- Real saga compensations. Each step has an explicit `compensate(...)` function. A failure at step n runs the compensations of n−1, n−2, …, 0 in reverse order. Compensations are idempotent, so a saga that crashed halfway can resume.
- Real Redis single-flight. A `populating: true` flag on the cache entry plus a waiter list means concurrent misses collapse to one populator call; the rest receive the result when the populator returns. Without this, a hot key + flash sale = vendor API meltdown.
- Real GDPR tombstone. No `DELETE`s anywhere. A `GDPRTombstone` event appended at the end of the aggregate stream causes the `CustomerBookings` projector to drop the row and the `BookingDetail` projector to redact `customer` while keeping the per-event hash for audit.
- Real projection replay. The defining capability of event sourcing: drop the materialised view, replay events from `seq 0`, end up with the same view. The simulator does this for `CustomerBookings` on the “Replay” knob.
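The saga bullet above describes a small algorithm: run steps in order, and on failure undo the completed ones most-recent-first. A minimal synchronous sketch follows. The `{ name, run, compensate }` step shape, the `runSaga` helper and the log strings are assumptions for illustration; a real orchestrator would be asynchronous and would persist its progress so it can resume after a crash.

```javascript
// Hypothetical saga runner: executes steps in order; if step n throws,
// runs compensate() for steps n-1 .. 0 in reverse order.
function runSaga(steps, ctx) {
  const log = [];
  const completed = [];
  for (const step of steps) {
    try {
      step.run(ctx);
      completed.push(step);
      log.push(`${step.name}: ok`);
    } catch (err) {
      log.push(`${step.name}: fail (${err.message})`);
      // The failed step never completed, so only earlier steps are undone.
      for (const done of completed.reverse()) {
        done.compensate(ctx);
        log.push(`${done.name}: compensate ok`);
      }
      return { status: "compensated", log };
    }
  }
  return { status: "completed", log };
}

// Multi-vendor booking: the V1 leg holds, the V2 leg fails, V1 is released.
const holds = new Set();
const steps = [
  {
    name: "hold V1 leg",
    run: () => holds.add("V1"),
    compensate: () => holds.delete("V1"), // idempotent: deleting twice is harmless
  },
  {
    name: "hold V2 leg",
    run: () => { throw new Error("vendor timeout"); },
    compensate: () => holds.delete("V2"),
  },
  {
    name: "capture payment",
    run: (c) => { c.paid = true; },
    compensate: (c) => { c.paid = false; },
  },
];
const ctx = { paid: false };
const result = runSaga(steps, ctx);
console.log(result.status); // compensated
console.log(result.log.join(" | "));
// hold V1 leg: ok | hold V2 leg: fail (vendor timeout) | hold V1 leg: compensate ok
```

Writing each compensation so that it is safe to run twice (here, `Set.delete`) is what lets a restarted saga replay its compensations without tracking exactly where it stopped.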
Controls
| Key / mouse | Action |
|---|---|
| `space` | Run / pause the script |
| `s` | Step one engine tick (50 ms simulated) |
| `shift+s` | Step ×10 |
| `r` | Reset (clear event store, projections, vendors, sagas) |
| `v` | Cycle System → Command → Events → Queries → Vendors |
| click an event | Pin its full payload + processing status in the detail pane |
| hover anything | Context-aware explanation (HTML tooltips on toolbar / canvas tooltips on every region) |
| toolbar knobs | Slow V2 · Fail issue · Dup deliver · Stampede · Burst · Replay |
References
The patterns here are old and well-documented:
- Event sourcing + CQRS: Greg Young’s original CQRS docs; Vaughn Vernon, Implementing Domain-Driven Design; Microsoft’s CQRS Journey.
- Sagas: Hector Garcia-Molina & Kenneth Salem, Sagas (1987), and the modern “long-lived business transaction with explicit compensations” formulation from Caitie McCaffrey and Pat Helland.
- Outbox pattern: Chris Richardson, Microservices Patterns, ch. 3.
- At-least-once + idempotency: Martin Kleppmann, Designing Data-Intensive Applications, ch. 8 + 11.
- Single-flight: the `singleflight` Go package, but the technique predates it.
- Optimistic concurrency: Bernstein, Hadzilacos, Goodman, Concurrency Control and Recovery in Database Systems (1987), chapter 5.
- Travel-industry context: IATA NDC v19.2 schemas, Amadeus Self-Service APIs, Sabre SOAP/REST hybrid, the “GDS” mental model that the vendor adapter layer normalises away.
How it works under the hood
The simulator is a single HTML file with embedded CSS and JavaScript — no
build, no server, no toolchain. The engine state (aggregates, event store,
RabbitMQ queues, projectors, Redis, vendors, sagas, command log, timers)
lives inside a small IIFE that exposes send({ cmd }) and getState(). The
visualizer is a requestAnimationFrame loop wrapped in try/catch so a single
regression can’t lock up the UI: every frame it reads the engine state and
routes to one of five drawX() functions over a 1180 × 700 logical canvas.
Particles on the System view are spawned by the engine’s event subscription — every command, event publish and projection apply triggers one or two particles along the appropriate wire — giving you a literal picture of the read/write paths.
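To make the engine shape concrete, here is a heavily reduced sketch of an IIFE that exposes `send({ cmd })` and `getState()` plus an event subscription. Only those two function names come from the description above; the command fields (`name`, `aggregateId`, `expectedVersion`), the `emits` table, the `subscribe` function and the outcome strings are assumptions for this example and will differ from the real file.

```javascript
// Hypothetical miniature engine: private state inside an IIFE,
// a narrow public surface outside it.
const engine = (function () {
  const events = [];          // append-only store, ordered by seq
  const versions = new Map(); // aggregateId -> current version
  const subscribers = [];     // called once per appended event
  const emits = { HoldSeat: "SeatHeld", IssueTicket: "TicketIssued" };

  function send({ cmd }) {
    const type = emits[cmd.name];
    if (!type) return { outcome: "rejected" };

    const current = versions.get(cmd.aggregateId) ?? 0;
    // Optimistic concurrency: the caller states which version it saw.
    if (cmd.expectedVersion !== current) {
      return { outcome: "conflict", current };
    }
    const event = {
      seq: events.length,
      aggregateId: cmd.aggregateId,
      version: current + 1,
      type,
    };
    events.push(event);
    versions.set(cmd.aggregateId, event.version);
    subscribers.forEach((fn) => fn(event)); // e.g. spawn a particle
    return { outcome: "ok", event };
  }

  function getState() {
    // Hand out copies so a renderer cannot mutate engine state.
    return { events: events.slice(), versions: new Map(versions) };
  }

  function subscribe(fn) {
    subscribers.push(fn);
  }

  return { send, getState, subscribe };
})();

// Two buyers race for the same booking; both saw version 0.
const particles = [];
engine.subscribe((e) => particles.push(e.type));
const first = engine.send({ cmd: { name: "HoldSeat", aggregateId: "b1", expectedVersion: 0 } });
const second = engine.send({ cmd: { name: "HoldSeat", aggregateId: "b1", expectedVersion: 0 } });
console.log(first.outcome, second.outcome); // ok conflict
```

Keeping the renderer on the far side of `getState()` is one way to get the property described above: a drawing bug can throw inside a frame without corrupting the event store.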