CQRS and Event Sourcing in Rails

Notes on running an order/checkout-style system on rails_event_store and aggregate_root. What’s worth repeating, what’s worth skipping, and the traps that show up in practice.

The two patterns aren’t the same thing

CQRS — separate the write model from the read model. Two paths, often two tables. Event Sourcing — the write model is an append-only event log; current state is reconstructed by replaying events.

You can do CQRS without ES (CRUD writes + an events table feeding read models). Most teams should. Event-sourcing the whole app is the textbook anti-pattern. Apply ES to one bounded context with rich behaviour — Ordering, say — and keep the rest as plain ActiveRecord. That single decision is what tends to save the project.

The signal that a context wants ES: complex state transitions, audit pressure, multiple stakeholders arguing about what happened. Pure CRUD admin screens never need it.

Events are facts, in past tense, kept small

Names: OrderSubmitted, OrderConfirmed, OrderShipped, StockReserved, PaymentCaptured. Past tense, business language, no Order.update. If you find yourself writing OrderUpdated, that’s a sign the actual business event hasn’t been named yet.

Payload rules that hold up over time:

Schema evolution is where this stops being theoretical. Three options, in order of preference:

StrategyWhenCost
Upcasting at read time (RubyEventStore::Mappers::Transformation::Upcast)Adding optional fields, renaming, splitting one event into twoFree at write time, transformation cost on every read
Versioned event types (OrderSubmittedV2)Breaking semantic changeTwo apply clauses in the aggregate forever
In-place mutation via event_store.overwriteHot fix for poisoned dataRewrites history; only as a last resort

In-place mutation is genuinely a last resort. Upcasting transformations live next to the event class and are trivial to test.

Aggregates: AggregateRoot + a state machine + one stream per instance

module Ordering
  class Order
    include AggregateRoot

    AlreadySubmitted = Class.new(StandardError)
    NotSubmitted     = Class.new(StandardError)
    NotConfirmed     = Class.new(StandardError)

    def initialize(id)
      @id, @state = id, :draft
    end

    def submit(order_number:, customer_id:)
      raise AlreadySubmitted unless @state == :draft
      apply OrderSubmitted.new(data: {
        order_id: @id, order_number:, customer_id:
      })
    end

    def confirm
      raise NotSubmitted unless @state == :submitted
      apply OrderConfirmed.new(data: { order_id: @id })
    end

    def ship(tracking_number:)
      raise NotConfirmed unless @state == :confirmed
      apply OrderShipped.new(data: { order_id: @id, tracking_number: })
    end

    on OrderSubmitted do |ev|
      @state, @order_number, @customer_id =
        :submitted, ev.data.fetch(:order_number), ev.data.fetch(:customer_id)
    end
    on OrderConfirmed do |_ev|
      @state = :confirmed
    end
    on OrderShipped do |ev|
      @state, @tracking_number = :shipped, ev.data.fetch(:tracking_number)
    end
  end
end

Two things to internalise:

  1. Public methods enforce invariants and apply events. on EventX blocks mutate state. No getters. The aggregate is not a data container — it’s the only place protecting business rules.
  2. One stream per aggregate instance. Convention: Ordering::Order$#{order_id}. The $ is what the event store browser uses to group streams. Optimistic concurrency is automatic — AggregateRoot::Repository#with_aggregate reads the stream’s current version and writes back with expected_version. Conflicting writers get WrongExpectedEventVersion.

Snapshots (AggregateRoot::SnapshotRepository) once a stream passes ~100 events. Below that, the replay cost is invisible.

CQRS+ES architecture — Command Handler as the central hub, query path closed at the user User Command Query Read Model Command Bus Read DB Abstraction Command Repository Command Handler Event Bus Event Handler Event Event Event Store (write log) Domain Object Aggregates Read Database command path (left) · event flow (centre, →) · query path (right) ↔ = synchronous bidirectional · → = event publication
Same structure as the classical CQRS diagram, with the Event Store standing in for the write database (this post is event-sourced).

Commands and a 10-line bus

The bus is a Hash[CommandClass] = handler. There’s a published gem for it; rolling your own stays under 10 lines and makes the dispatch path obvious:

class CommandBus
  def initialize = @handlers = {}
  def register(command_class, handler) = @handlers[command_class] = handler
  def call(command)
    @handlers.fetch(command.class) { raise "no handler for #{command.class}" }
      .call(command)
  end
end

Then wrap it in RubyEventStore::CorrelatedCommands so events emitted inside a handler inherit the command’s correlation_id and causation_id automatically. That one wrapper is the difference between a readable event store browser and an untraceable mess.

Commands are dry-struct contracts. Validation lives on the command (shape, types, presence). Business rules live on the aggregate (AlreadySubmitted, NotConfirmed). Don’t mix the two — dry-validation for “is this a well-formed request”, aggregate exceptions for “is this allowed right now”.

Read models live in the app, not the domain

The standard term: denormalizer. A subscriber that listens to events and writes a purpose-built ActiveRecord table tailored to one screen.

class OrdersListDenormalizer < ApplicationJob
  prepend RailsEventStore::AsyncHandler

  def perform(event)
    case event
    when Ordering::OrderSubmitted
      OrdersList.create!(order_id: event.data.fetch(:order_id), state: "submitted", ...)
    when Ordering::OrderConfirmed
      OrdersList.where(order_id: event.data.fetch(:order_id)).update_all(state: "confirmed")
    end
  end
end

Each read model gets its own table, owned by the application layer, not the bounded context. Querying is direct AR — no joins to write-side tables, no scopes that pretend to know the domain.

Sync vs async: default async via AfterCommitAsyncDispatcher, which schedules the job only after the surrounding ActiveRecord::Base.transaction commits. This avoids the ugliest bug in event-driven Rails: queueing an order-confirmation email for an order that got rolled back. Use sync only when the read model has to be visible in the same request that produced the command.

Read models are disposable. Truncate, replay events, rebuild. Worth a one-line rake task wired in early — it gets used constantly:

namespace :read_models do
  task rebuild: :environment do
    OrdersList.delete_all
    Rails.configuration.event_store.read.of_type([
      Ordering::OrderSubmitted, Ordering::OrderConfirmed, Ordering::OrderShipped
    ]).each { |e| OrdersListDenormalizer.new.perform(e) }
  end
end

Process managers: the only place cross-context coordination lives

When OrderSubmitted needs to wait for StockReserved and PaymentCaptured before issuing ConfirmOrder, that orchestration goes in a process manager — not in any of the three contexts.

class OrderConfirmationProcess
  def call(event)
    order_id = event.data.fetch(:order_id)
    stream   = "OrderConfirmation$#{order_id}"
    @event_store.link([event.event_id], stream_name: stream, expected_version: :any)

    state = build_state_from(stream)
    @command_bus.call(Ordering::ConfirmOrder.new(order_id:)) if state.ready?
  end
end
Process manager state — events linked from three streams into one source streams derived stream Ordering::Order$o1 OrderSubmitted { order_id, customer_id } Inventory::Reservation$r9 StockReserved { order_id, sku, qty } Payments::Charge$p3 PaymentCaptured { order_id, amount } event_store.link() OrderConfirmation$o1 OrderSubmitted linked from Ordering StockReserved linked from Inventory PaymentCaptured linked from Payments PM rebuilds state on every call command_bus.call(ConfirmOrder) when state.ready? — back into Ordering
Process manager state lives in an event stream, not an AR row. Linking is cheap; the browser shows the whole conversation.

Two patterns to consider, in order:

  1. State in an ActiveRecord row. Migration-heavy. Hard to debug. Skip.
  2. State in its own event stream, built by linking the triggering events into a OrderConfirmation$#{order_id} stream and rebuilding the state object on each call. This is the one that holds up. The browser shows the whole conversation in one place.

The moment an event handler in Ordering calls a method on Inventory, extract a process manager. Rule: direct cross-context method calls are forbidden. Events go out, commands come in, the process manager is the bridge.

Bounded contexts are folders

Bounded-context folder layout — domains, infra, app domains/ — one folder per bounded context ordering/ events · commands · aggregates · Configuration inventory/ stock reservations payments/ authorization · capture pricing/ discounts · totals shipping/ labels · tracking crm/ customer profile infra/ — shared event/command base classes event.rb command.rb types.rb app/ — Rails-side, owned by the application layer read_models/ denormalizers + AR tables, one per screen processes/ orchestration that crosses context lines Cross-context contact = published events only. No domain calls another domain directly.
One Rails monolith, three top-level folders. Zeitwerk disabled inside domains/ so the dependency graph stays explicit.

One Rails monolith. One RailsEventStore::Client. Each context exposes events, commands, aggregates, and a Configuration class that registers handlers in to_prepare. Cross-context contact = published events only.

Disable Zeitwerk inside domains/ and use explicit require_relative. This sounds heretical until you realise it makes the dependency graph between contexts visible in plain text. Zeitwerk’s autoload hides exactly the coupling you’re trying to control.

Microservices are tempting and almost always a mistake here. Folder-level boundaries deliver 95% of the architectural benefit at 5% of the operational cost.

Testing

Three layers, all using rails_event_store-rspec:

# 1. Aggregate — given/when/then
RSpec.describe Ordering::Order do
  it "ships only after confirm" do
    order = described_class.new("o1")
    order.submit(order_number: "X", customer_id: "c1")
    expect { order.ship(tracking_number: "T") }
      .to raise_error(Ordering::Order::NotConfirmed)
  end
end

# 2. Command handler — InMemoryRepository, full slice
RSpec.describe Ordering::OnSubmitOrder do
  let(:event_store) { RailsEventStore::Client.new(repository: RubyEventStore::InMemoryRepository.new) }
  it "publishes OrderSubmitted on the order's stream" do
    described_class.new(repository: Ordering::OrderRepository.new(event_store:))
      .call(Ordering::SubmitOrder.new(order_id: "o1", customer_id: "c1"))
    expect(event_store).to have_published(an_event(Ordering::OrderSubmitted))
      .in_stream("Ordering::Order$o1")
  end
end

# 3. Denormalizer — arrange events, act, assert AR state
RSpec.describe OrdersListDenormalizer do
  it "creates a row on submit" do
    described_class.new.perform(
      Ordering::OrderSubmitted.new(data: { order_id: "o1", order_number: "X", customer_id: "c1" })
    )
    expect(OrdersList.find_by(order_id: "o1").state).to eq("submitted")
  end
end

InMemoryRepository makes aggregate and handler specs run in milliseconds. have_applied for unit tests of aggregates, have_published for integration tests through the event store. Mutation testing (mutant) on the domain code only — not on read models or controllers, where it’s noise.

What goes wrong

MistakeSymptomFix
God Order aggregateEvery change touched the same class; merge conflicts dailySplit into Ordering, Pricing, Shipping aggregates communicating via events
All denormalizers syncp95 latency on POST /orders over a secondprepend RailsEventStore::AsyncHandler, sync only the one read model needed for the redirect
ImmediateAsyncDispatcherConfirmation emails sent for rolled-back ordersSwitch to AfterCommitAsyncDispatcher
Mutating event schemas in placeProduction data interpreted by the wrong code pathUpcasting transformations
No correlation IDs”Why did this email send?” took an hour each timeRubyEventStore::CorrelatedCommands wrapper, LinkByCorrelationId subscriber
Process manager state in ARMigration-coupled to a state machine that kept changingEvent-sourced PM via event_store.link
Anemic aggregates with public stateBusiness logic leaked into handlers and controllersAggregates expose only domain methods; state is private
Registering handlers outside to_prepareStale class references in dev after reloadMove every subscribe and register into config.to_prepare

When not to bother

CQRS+ES is genuinely heavy. A reasonable decision matrix:

TL;DR: Apply ES to one bounded context, not the whole app. Events are past-tense facts with small payloads. Aggregates own invariants and emit events. Read models are disposable AR tables built by async denormalizers. Process managers — and only process managers — coordinate across contexts. Mount the browser, propagate correlation IDs, dispatch async after commit, never mutate events. Test with InMemoryRepository and have_published. The complexity buys you audit, replay, and clarity; if your domain doesn’t need those, stay with ActiveRecord.

← event sourcing