CQRS and Event Sourcing in Rails

2026-05-07

Notes on running an order/checkout-style system on rails_event_store and aggregate_root. What’s worth repeating, what’s worth skipping, and the traps that show up in practice.

The two patterns aren’t the same thing

CQRS — separate the write model from the read model. Two paths, often two tables. Event Sourcing — the write model is an append-only event log; current state is reconstructed by replaying events.

You can do CQRS without ES (CRUD writes + an events table feeding read models). Most teams should. Event-sourcing the whole app is the textbook anti-pattern. Apply ES to one bounded context with rich behaviour — Ordering, say — and keep the rest as plain ActiveRecord. That single decision is what tends to save the project.

The signal that a context wants ES: complex state transitions, audit pressure, multiple stakeholders arguing about what happened. Pure CRUD admin screens never need it.

Events are facts, in past tense, kept small

Names: OrderSubmitted, OrderConfirmed, OrderShipped, StockReserved, PaymentCaptured. Past tense, business language, no Order.update. If you find yourself writing OrderUpdated, that’s a sign the actual business event hasn’t been named yet.

Payload rules that hold up over time:

Identifiers + the minimum facts. OrderShipped { order_id, tracking_number }. Not the whole order. The aggregate already has the rest.
Internal vs public events are different things. Events used to source aggregate state are private to the bounded context. Events crossing context boundaries are contracts and should change reluctantly.
correlation_id and causation_id in metadata, always. Without them, the event store browser is unreadable. With them, you can click from a SubmitOrder command through every event and command it caused.

Schema evolution is where this stops being theoretical. Three options, in order of preference:

Strategy	When	Cost
Upcasting at read time (`RubyEventStore::Mappers::Transformation::Upcast`)	Adding optional fields, renaming, splitting one event into two	Free at write time, transformation cost on every read
Versioned event types (`OrderSubmittedV2`)	Breaking semantic change	Two `apply` clauses in the aggregate forever
In-place mutation via `event_store.overwrite`	Hot fix for poisoned data	Rewrites history; only as a last resort

In-place mutation is genuinely a last resort. Upcasting transformations live next to the event class and are trivial to test.
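To make the read-time idea concrete, here is a minimal plain-Ruby sketch of upcasting. It deliberately does not use the RubyEventStore mapper API. The `UPCASTERS` table, the `upcast` helper, and the old `:number` field are hypothetical names chosen for illustration, and stored records are assumed to be simple hashes.

```ruby
# Plain-Ruby illustration of upcasting at read time (not the RubyEventStore API).
# Stored records are never rewritten; old shapes are converted as they are read.

# Hypothetical upcasters: event type => lambda that returns data in the current shape.
UPCASTERS = {
  "OrderSubmitted" => ->(data) {
    if data.key?(:number)
      # Assumed change: the field :number was renamed to :order_number.
      data.reject { |key, _| key == :number }.merge(order_number: data.fetch(:number))
    else
      data # already in the current shape
    end
  }
}.freeze

# Return a new record with upcasted data; unknown event types pass through untouched.
def upcast(record)
  upcaster = UPCASTERS.fetch(record.fetch(:event_type)) { ->(data) { data } }
  record.merge(data: upcaster.call(record.fetch(:data)))
end

stored  = { event_type: "OrderSubmitted", data: { order_id: "o1", number: "X" } }
current = upcast(stored)
```

The stored hash is left as it was, which is the point: history stays intact and only the reader pays for the change.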

Aggregates: AggregateRoot + a state machine + one stream per instance

module Ordering
  class Order
    include AggregateRoot

    AlreadySubmitted = Class.new(StandardError)
    NotSubmitted     = Class.new(StandardError)
    NotConfirmed     = Class.new(StandardError)

    def initialize(id)
      @id, @state = id, :draft
    end

    def submit(order_number:, customer_id:)
      raise AlreadySubmitted unless @state == :draft
      apply OrderSubmitted.new(data: {
        order_id: @id, order_number:, customer_id:
      })
    end

    def confirm
      raise NotSubmitted unless @state == :submitted
      apply OrderConfirmed.new(data: { order_id: @id })
    end

    def ship(tracking_number:)
      raise NotConfirmed unless @state == :confirmed
      apply OrderShipped.new(data: { order_id: @id, tracking_number: })
    end

    on OrderSubmitted do |ev|
      @state, @order_number, @customer_id =
        :submitted, ev.data.fetch(:order_number), ev.data.fetch(:customer_id)
    end
    on OrderConfirmed do |_ev|
      @state = :confirmed
    end
    on OrderShipped do |ev|
      @state, @tracking_number = :shipped, ev.data.fetch(:tracking_number)
    end
  end
end

Two things to internalise:

Public methods enforce invariants and apply events. on EventX blocks mutate state. No getters. The aggregate is not a data container — it’s the only place protecting business rules.
One stream per aggregate instance. Convention: Ordering::Order$#{order_id}. The $ is what the event store browser uses to group streams. Optimistic concurrency is automatic — AggregateRoot::Repository#with_aggregate reads the stream’s current version and writes back with expected_version. Conflicting writers get WrongExpectedEventVersion.
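The expected-version check behind that optimistic concurrency can be sketched in a few lines of plain Ruby. This is an illustration of the idea only: `InMemoryStreams` and `VersionConflict` are hypothetical stand-ins, not RubyEventStore classes, and events are reduced to symbols.

```ruby
# Plain-Ruby illustration of expected-version checks on a per-aggregate stream.
class VersionConflict < StandardError; end

class InMemoryStreams
  def initialize
    @streams = Hash.new { |hash, name| hash[name] = [] }
  end

  # Position of the last event in the stream, or -1 when the stream is empty.
  def version_of(stream_name)
    @streams[stream_name].size - 1
  end

  # Append only if nobody else has written since the caller loaded the stream.
  def append(stream_name, events, expected_version:)
    actual = version_of(stream_name)
    unless actual == expected_version
      raise VersionConflict, "expected #{expected_version}, got #{actual}"
    end
    @streams[stream_name].concat(events)
  end
end

store  = InMemoryStreams.new
stream = "Ordering::Order$o1"

# Two writers load the same empty stream and both see version -1.
seen_by_a = store.version_of(stream)
seen_by_b = store.version_of(stream)

store.append(stream, [:order_submitted], expected_version: seen_by_a)

# The second writer is working from stale state and must be rejected.
conflict =
  begin
    store.append(stream, [:order_submitted], expected_version: seen_by_b)
    false
  rescue VersionConflict
    true
  end
```

The losing writer typically reloads the aggregate and retries the command, or surfaces the conflict to the user.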

Reach for snapshots (AggregateRoot::SnapshotRepository) once a stream passes ~100 events. Below that, the replay cost is negligible.

Figure (not reproduced): same structure as the classical CQRS diagram, with the Event Store standing in for the write database (this post is event-sourced).

Commands and a 10-line bus

The bus is just a Hash that maps each command class to its handler. There’s a published gem for it; rolling your own stays under 10 lines and makes the dispatch path obvious:

class CommandBus
  def initialize = @handlers = {}
  def register(command_class, handler) = @handlers[command_class] = handler
  def call(command)
    @handlers.fetch(command.class) { raise "no handler for #{command.class}" }
      .call(command)
  end
end

Then wrap it in RubyEventStore::CorrelatedCommands so events emitted inside a handler inherit the command’s correlation_id and causation_id automatically. That one wrapper is the difference between a readable event store browser and an untraceable mess.
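What such a wrapper does can be shown without the library. The sketch below is a plain-Ruby approximation under stated assumptions: `RecordingStore`, `CorrelatingBus`, and the `Command` struct are hypothetical, and the real wrapper may differ in detail. The idea is only that metadata is set around the handler call and stamped onto every event published inside it.

```ruby
require "securerandom"

# Hypothetical stand-ins, not the RubyEventStore classes.
Command = Struct.new(:name, :message_id, :correlation_id, keyword_init: true)

class RecordingStore
  attr_reader :published
  attr_accessor :current_metadata

  def initialize
    @published = []
    @current_metadata = {}
  end

  # Stamp each event with whatever metadata is active at publish time.
  def publish(event_name)
    @published << { name: event_name, metadata: current_metadata.dup }
  end
end

class CorrelatingBus
  def initialize(store, handlers)
    @store = store
    @handlers = handlers
  end

  def call(command)
    previous = @store.current_metadata
    @store.current_metadata = {
      # A command that starts a conversation correlates to itself.
      correlation_id: command.correlation_id || command.message_id,
      causation_id: command.message_id
    }
    @handlers.fetch(command.name).call(command)
  ensure
    # Restore the outer metadata so nothing leaks past this command.
    @store.current_metadata = previous
  end
end

store    = RecordingStore.new
handlers = { "SubmitOrder" => ->(_command) { store.publish("OrderSubmitted") } }
bus      = CorrelatingBus.new(store, handlers)
command  = Command.new(name: "SubmitOrder", message_id: SecureRandom.uuid)

bus.call(command)
event = store.published.first
```

Every event now carries enough to answer "which command caused this?" without the handler doing anything.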

Commands are dry-struct contracts. Validation lives on the command (shape, types, presence). Business rules live on the aggregate (AlreadySubmitted, NotConfirmed). Don’t mix the two — dry-validation for “is this a well-formed request”, aggregate exceptions for “is this allowed right now”.
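The split between the two kinds of failure is easier to see side by side. This is a minimal sketch with stand-ins: a hand-rolled `SubmitOrder` replaces the dry-struct command, the `Order` here is a trimmed-down aggregate without AggregateRoot, and `InvalidCommand` and `error_class_of` are hypothetical names.

```ruby
# Hypothetical stand-ins for illustration only.
InvalidCommand   = Class.new(StandardError)
AlreadySubmitted = Class.new(StandardError)

class SubmitOrder
  attr_reader :order_id, :customer_id

  # Shape checks only: presence and type, with no knowledge of order state.
  def initialize(order_id:, customer_id:)
    [order_id, customer_id].each do |value|
      unless value.is_a?(String) && !value.empty?
        raise InvalidCommand, "ids must be non-empty strings"
      end
    end
    @order_id = order_id
    @customer_id = customer_id
  end
end

class Order
  def initialize
    @state = :draft
  end

  # Business rule: whether the request is allowed depends on current state.
  def submit
    raise AlreadySubmitted unless @state == :draft
    @state = :submitted
  end
end

# Small helper that reports which error class a block raised, if any.
def error_class_of
  yield
  nil
rescue StandardError => error
  error.class
end

# A malformed request fails before any aggregate is loaded.
malformed = error_class_of { SubmitOrder.new(order_id: "", customer_id: "c1") }

# A well-formed request can still be rejected by the aggregate.
order = Order.new
order.submit
disallowed = error_class_of { order.submit }
```

In a controller, the first kind usually maps to a 4xx validation response and the second to a domain-specific message.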

Read models live in the app, not the domain

The standard term: denormalizer. A subscriber that listens to events and writes a purpose-built ActiveRecord table tailored to one screen.

class OrdersListDenormalizer < ApplicationJob
  prepend RailsEventStore::AsyncHandler

  def perform(event)
    case event
    when Ordering::OrderSubmitted
      OrdersList.create!(order_id: event.data.fetch(:order_id), state: "submitted", ...)
    when Ordering::OrderConfirmed
      OrdersList.where(order_id: event.data.fetch(:order_id)).update_all(state: "confirmed")
    end
  end
end

Each read model gets its own table, owned by the application layer, not the bounded context. Querying is direct AR — no joins to write-side tables, no scopes that pretend to know the domain.

Sync vs async: default async via AfterCommitAsyncDispatcher, which schedules the job only after the surrounding ActiveRecord::Base.transaction commits. This avoids the ugliest bug in event-driven Rails: queueing an order-confirmation email for an order that got rolled back. Use sync only when the read model has to be visible in the same request that produced the command.
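The after-commit behaviour boils down to buffering. Here is a plain-Ruby sketch of that idea, with no Rails involved: `AfterCommitBuffer` and `Rollback` are hypothetical stand-ins for the dispatcher and for a rolled-back transaction, and jobs are reduced to symbols.

```ruby
# Plain-Ruby illustration of "schedule only after commit".
Rollback = Class.new(StandardError)

class AfterCommitBuffer
  attr_reader :enqueued

  def initialize
    @enqueued = [] # jobs actually handed to the queue
    @pending  = nil # jobs waiting on the open transaction, nil when none is open
  end

  def transaction
    @pending = []
    yield
    @enqueued.concat(@pending) # commit: release the buffered jobs
  rescue Rollback
    nil # rollback: buffered jobs are discarded and never reach the queue
  ensure
    @pending = nil
  end

  def schedule(job)
    if @pending
      @pending << job
    else
      @enqueued << job # no transaction open, so enqueue immediately
    end
  end
end

buffer = AfterCommitBuffer.new

# This order is rolled back, so its email must never be queued.
buffer.transaction do
  buffer.schedule(:confirmation_email_for_o1)
  raise Rollback
end

# This order commits, so its email is released on commit.
buffer.transaction { buffer.schedule(:confirmation_email_for_o2) }
```

An immediate dispatcher would have queued both jobs, which is exactly the rolled-back-order email bug.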

Read models are disposable. Truncate, replay events, rebuild. Worth a short rake task wired in early, because it gets used constantly:

namespace :read_models do
  task rebuild: :environment do
    OrdersList.delete_all
    Rails.configuration.event_store.read.of_type([
      Ordering::OrderSubmitted, Ordering::OrderConfirmed, Ordering::OrderShipped
    ]).each { |e| OrdersListDenormalizer.new.perform(e) }
  end
end

Process managers: the only place cross-context coordination lives

When OrderSubmitted needs to wait for StockReserved and PaymentCaptured before issuing ConfirmOrder, that orchestration goes in a process manager — not in any of the three contexts.

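class OrderConfirmationProcess
  def initialize(event_store:, command_bus:)
    @event_store, @command_bus = event_store, command_bus
  end

  def call(event)
    order_id = event.data.fetch(:order_id)
    stream   = "OrderConfirmation$#{order_id}"
    # Link (not copy) the triggering event into this process's own stream.
    @event_store.link([event.event_id], stream_name: stream, expected_version: :any)

    # build_state_from (not shown) replays the linked events into a state object.
    state = build_state_from(stream)
    @command_bus.call(Ordering::ConfirmOrder.new(order_id:)) if state.ready?
  end
end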

Process manager state lives in an event stream, not an AR row. Linking is cheap; the browser shows the whole conversation.

Two patterns to consider, in order:

State in an ActiveRecord row. Migration-heavy. Hard to debug. Skip.
State in its own event stream, built by linking the triggering events into an OrderConfirmation$#{order_id} stream and rebuilding the state object on each call. This is the one that holds up. The browser shows the whole conversation in one place.
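Rebuilding the state object on each call is just a fold over the linked events. A minimal sketch, assuming events are represented by plain symbols rather than event classes: `ConfirmationState` and this version of `build_state_from` are hypothetical, and the version here takes a list of event types rather than a stream name.

```ruby
# Hypothetical state object for the OrderConfirmation process manager.
class ConfirmationState
  REQUIRED = %i[order_submitted stock_reserved payment_captured].freeze

  def initialize
    @seen = []
  end

  # Record which kinds of facts have arrived; duplicates are harmless.
  def apply(event_type)
    @seen |= [event_type]
    self
  end

  # Ready once every required fact has been seen, in any order.
  def ready?
    (REQUIRED - @seen).empty?
  end
end

# Fold every event linked into the process stream into a fresh state object.
def build_state_from(linked_event_types)
  linked_event_types.reduce(ConfirmationState.new) { |state, type| state.apply(type) }
end

waiting  = build_state_from(%i[order_submitted stock_reserved])
complete = build_state_from(%i[payment_captured order_submitted stock_reserved])
```

Because the state is recomputed from the stream every time, arrival order and redelivered events do not need special handling here.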

The moment an event handler in Ordering calls a method on Inventory, extract a process manager. Rule: direct cross-context method calls are forbidden. Events go out, commands come in, the process manager is the bridge.

Bounded contexts are folders

One Rails monolith, three top-level folders. Zeitwerk disabled inside domains/ so the dependency graph stays explicit.

One Rails monolith. One RailsEventStore::Client. Each context exposes events, commands, aggregates, and a Configuration class that registers handlers in to_prepare. Cross-context contact = published events only.

Disable Zeitwerk inside domains/ and use explicit require_relative. This sounds heretical until you realise it makes the dependency graph between contexts visible in plain text. Zeitwerk’s autoload hides exactly the coupling you’re trying to control.

Microservices are tempting and almost always a mistake here. Folder-level boundaries deliver 95% of the architectural benefit at 5% of the operational cost.

Testing

Three layers, all using ruby_event_store-rspec (published as rails_event_store-rspec before the 2.0 release):

# 1. Aggregate — given/when/then
RSpec.describe Ordering::Order do
  it "ships only after confirm" do
    order = described_class.new("o1")
    order.submit(order_number: "X", customer_id: "c1")
    expect { order.ship(tracking_number: "T") }
      .to raise_error(Ordering::Order::NotConfirmed)
  end
end

# 2. Command handler — InMemoryRepository, full slice
RSpec.describe Ordering::OnSubmitOrder do
  let(:event_store) { RailsEventStore::Client.new(repository: RubyEventStore::InMemoryRepository.new) }
  it "publishes OrderSubmitted on the order's stream" do
    described_class.new(repository: Ordering::OrderRepository.new(event_store:))
      .call(Ordering::SubmitOrder.new(order_id: "o1", customer_id: "c1"))
    expect(event_store).to have_published(an_event(Ordering::OrderSubmitted))
      .in_stream("Ordering::Order$o1")
  end
end

# 3. Denormalizer — arrange events, act, assert AR state
RSpec.describe OrdersListDenormalizer do
  it "creates a row on submit" do
    described_class.new.perform(
      Ordering::OrderSubmitted.new(data: { order_id: "o1", order_number: "X", customer_id: "c1" })
    )
    expect(OrdersList.find_by(order_id: "o1").state).to eq("submitted")
  end
end

InMemoryRepository makes aggregate and handler specs run in milliseconds. have_applied for unit tests of aggregates, have_published for integration tests through the event store. Mutation testing (mutant) on the domain code only — not on read models or controllers, where it’s noise.

What goes wrong

Mistake	Symptom	Fix
God Order aggregate	Every change touched the same class; merge conflicts daily	Split into Ordering, Pricing, Shipping aggregates communicating via events
All denormalizers sync	p95 latency on `POST /orders` over a second	`prepend RailsEventStore::AsyncHandler`, sync only the one read model needed for the redirect
`ImmediateAsyncDispatcher`	Confirmation emails sent for rolled-back orders	Switch to `AfterCommitAsyncDispatcher`
Mutating event schemas in place	Production data interpreted by the wrong code path	Upcasting transformations
No correlation IDs	“Why did this email send?” took an hour each time	`RubyEventStore::CorrelatedCommands` wrapper, `LinkByCorrelationId` subscriber
Process manager state in AR	Migration-coupled to a state machine that kept changing	Event-sourced PM via `event_store.link`
Anemic aggregates with public state	Business logic leaked into handlers and controllers	Aggregates expose only domain methods; state is private
Registering handlers outside `to_prepare`	Stale class references in dev after reload	Move every `subscribe` and `register` into `config.to_prepare`

When not to bother

CQRS+ES is genuinely heavy. A reasonable decision matrix:

Use it when you have rich business workflows, audit obligations, or temporal queries (“what was this order’s state on March 12?”). Orders, invoices, claims, bookings, learning processes.
Use only CQRS (read models from CRUD writes) when reads diverge from writes but the write side is mostly form-fills.
Use neither for reference data, admin screens, single-team CRUD, or anything where the team isn’t ready to commit to past-tense event names and ubiquitous language. The patterns punish half-measures.
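The temporal-query case is worth a quick illustration, since it is the one thing a CRUD table cannot do after the fact. A minimal plain-Ruby sketch, assuming events arrive in recorded order: the `Recorded` struct, the `TRANSITIONS` table, `state_as_of`, and the sample timestamps are all invented for the example.

```ruby
# Hypothetical event records: the event type plus the time the fact was recorded.
Recorded = Struct.new(:type, :at)

# Which state each event type moves the order into.
TRANSITIONS = {
  order_submitted: :submitted,
  order_confirmed: :confirmed,
  order_shipped:   :shipped
}.freeze

# Replay only the events recorded at or before the given moment.
def state_as_of(events, moment)
  events
    .select { |event| event.at <= moment }
    .reduce(:draft) { |state, event| TRANSITIONS.fetch(event.type, state) }
end

# Invented sample history for one order.
history = [
  Recorded.new(:order_submitted, Time.utc(2026, 3, 10, 9, 0)),
  Recorded.new(:order_confirmed, Time.utc(2026, 3, 11, 14, 30)),
  Recorded.new(:order_shipped,   Time.utc(2026, 3, 14, 8, 15))
]

state_on_march_12 = state_as_of(history, Time.utc(2026, 3, 12, 23, 59, 59))
```

With this sample history the order was confirmed but not yet shipped on March 12, and that answer comes from the log alone.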

TL;DR: Apply ES to one bounded context, not the whole app. Events are past-tense facts with small payloads. Aggregates own invariants and emit events. Read models are disposable AR tables built by async denormalizers. Process managers — and only process managers — coordinate across contexts. Mount the browser, propagate correlation IDs, dispatch async after commit, never mutate events. Test with InMemoryRepository and have_published. The complexity buys you audit, replay, and clarity; if your domain doesn’t need those, stay with ActiveRecord.
