CQRS and Event Sourcing in Rails
Notes on running an order/checkout-style system on rails_event_store and aggregate_root. What’s worth repeating, what’s worth skipping, and the traps that show up in practice.
The two patterns aren’t the same thing
CQRS — separate the write model from the read model. Two paths, often two tables. Event Sourcing — the write model is an append-only event log; current state is reconstructed by replaying events.
You can do CQRS without ES (CRUD writes + an events table feeding read models). Most teams should. Event-sourcing the whole app is the textbook anti-pattern. Apply ES to one bounded context with rich behaviour — Ordering, say — and keep the rest as plain ActiveRecord. That single decision is what tends to save the project.
The signal that a context wants ES: complex state transitions, audit pressure, multiple stakeholders arguing about what happened. Pure CRUD admin screens never need it.
Events are facts, in past tense, kept small
Names: OrderSubmitted, OrderConfirmed, OrderShipped, StockReserved, PaymentCaptured. Past tense, business language, no Order.update. If you find yourself writing OrderUpdated, that’s a sign the actual business event hasn’t been named yet.
Payload rules that hold up over time:
- Identifiers + the minimum facts. OrderShipped { order_id, tracking_number }. Not the whole order. The aggregate already has the rest.
- Internal vs public events are different things. Events used to source aggregate state are private to the bounded context. Events crossing context boundaries are contracts and should change reluctantly.
- correlation_id and causation_id in metadata, always. Without them, the event store browser is unreadable. With them, you can click from a SubmitOrder command through every event and command it caused.
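The payload rules above can be sketched in plain Ruby, with no gem involved. The Event struct and the new_event helper are hypothetical stand-ins for what an event class from the library would provide; only the shape of data and metadata is the point here.

```ruby
require "securerandom"

# Hypothetical stand-in for an event: type, small payload, metadata.
Event = Struct.new(:event_id, :type, :data, :metadata, keyword_init: true)

# Builds an event and stamps the two tracing ids into its metadata.
def new_event(type, data, correlation_id:, causation_id:)
  Event.new(
    event_id: SecureRandom.uuid,
    type: type,
    data: data.freeze,
    metadata: { correlation_id: correlation_id, causation_id: causation_id }.freeze
  )
end

# The id of the SubmitOrder command starts the chain (assumed to be a UUID).
command_id = SecureRandom.uuid

submitted = new_event(
  "OrderSubmitted",
  { order_id: "o1", order_number: "X", customer_id: "c1" },
  correlation_id: command_id,
  causation_id: command_id
)

# A follow-up event keeps the correlation_id and points at its direct cause.
# Its payload carries identifiers plus the one new fact, not the whole order.
shipped = new_event(
  "OrderShipped",
  { order_id: "o1", tracking_number: "T" },
  correlation_id: submitted.metadata[:correlation_id],
  causation_id: submitted.event_id
)
```

Following correlation_id gives the whole conversation; following causation_id gives the direct parent of each message.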
Schema evolution is where this stops being theoretical. Three options, in order of preference:
| Strategy | When | Cost |
|---|---|---|
| Upcasting at read time (RubyEventStore::Mappers::Transformation::Upcast) | Adding optional fields, renaming, splitting one event into two | Free at write time, transformation cost on every read |
| Versioned event types (OrderSubmittedV2) | Breaking semantic change | Two apply clauses in the aggregate forever |
| In-place mutation via event_store.overwrite | Hot fix for poisoned data | Rewrites history; only as a last resort |
In-place mutation is genuinely a last resort. Upcasting transformations live next to the event class and are trivial to test.
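To make the upcasting idea concrete, here is a minimal sketch in plain Ruby. It does not use the library's Upcast transformation; the record hashes, the old field name number, and the channel field are all assumptions invented for illustration. What it shows is the principle: stored data stays untouched and old shapes are translated on the way out.

```ruby
# Hypothetical stored records, as they might come out of storage.
STORED = [
  { type: "OrderSubmitted", data: { order_id: "o1", number: "X" } },                      # old shape
  { type: "OrderSubmitted", data: { order_id: "o2", order_number: "Y", channel: "web" } } # current shape
].freeze

# One upcaster per event type; each returns a record in the current shape.
UPCASTERS = {
  "OrderSubmitted" => lambda do |record|
    data = record[:data].dup
    data[:order_number] = data.delete(:number) if data.key?(:number) # renamed field
    data[:channel] ||= "unknown"                                      # new optional field
    record.merge(data: data)
  end
}.freeze

# Applied on every read; types without an upcaster pass through unchanged.
def upcast(record)
  UPCASTERS.fetch(record[:type], ->(r) { r }).call(record)
end

current = STORED.map { |record| upcast(record) }
```

Because each upcaster is a pure function from one hash to another, a spec for it needs no event store at all.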
Aggregates: AggregateRoot + a state machine + one stream per instance
module Ordering
  class Order
    include AggregateRoot

    AlreadySubmitted = Class.new(StandardError)
    NotSubmitted = Class.new(StandardError)
    NotConfirmed = Class.new(StandardError)

    def initialize(id)
      @id, @state = id, :draft
    end

    # Public methods: check the invariant, then record the fact.
    def submit(order_number:, customer_id:)
      raise AlreadySubmitted unless @state == :draft
      apply OrderSubmitted.new(data: {
        order_id: @id, order_number:, customer_id:
      })
    end

    def confirm
      raise NotSubmitted unless @state == :submitted
      apply OrderConfirmed.new(data: { order_id: @id })
    end

    def ship(tracking_number:)
      raise NotConfirmed unless @state == :confirmed
      apply OrderShipped.new(data: { order_id: @id, tracking_number: })
    end

    # Event handlers: the only place state changes, for new events and on replay.
    on OrderSubmitted do |ev|
      @state, @order_number, @customer_id =
        :submitted, ev.data.fetch(:order_number), ev.data.fetch(:customer_id)
    end

    on OrderConfirmed do |_ev|
      @state = :confirmed
    end

    on OrderShipped do |ev|
      @state, @tracking_number = :shipped, ev.data.fetch(:tracking_number)
    end
  end
end
Two things to internalise:
- Public methods enforce invariants and apply events. The on EventX blocks mutate state. No getters. The aggregate is not a data container; it’s the only place protecting business rules.
- One stream per aggregate instance. Convention: Ordering::Order$#{order_id}. The $ is what the event store browser uses to group streams. Optimistic concurrency is automatic: AggregateRoot::Repository#with_aggregate reads the stream’s current version and writes back with expected_version. Conflicting writers get WrongExpectedEventVersion.
Snapshots (AggregateRoot::SnapshotRepository) once a stream passes ~100 events. Below that, the replay cost is invisible.
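The optimistic concurrency check is easier to trust once you have seen it in miniature. The sketch below is a hypothetical in-memory store in plain Ruby, not the library's repository; the TinyStreams class and its version numbering (-1 for an empty stream) are assumptions made for the example.

```ruby
# The error name mirrors the one mentioned above; the class itself is a stand-in.
WrongExpectedEventVersion = Class.new(StandardError)

class TinyStreams
  def initialize
    @streams = Hash.new { |hash, name| hash[name] = [] }
  end

  # Version of a stream: index of its last event, -1 when empty.
  def version(stream)
    @streams[stream].size - 1
  end

  def read(stream)
    @streams[stream].dup
  end

  # Appends only if nobody has written since the caller read the stream.
  def append(stream, events, expected_version:)
    raise WrongExpectedEventVersion unless version(stream) == expected_version
    @streams[stream].concat(events)
  end
end

store  = TinyStreams.new
stream = "Ordering::Order$o1"
store.append(stream, [:order_submitted], expected_version: -1)

# Two writers load the aggregate at the same version.
seen_by_a = store.version(stream)
seen_by_b = store.version(stream)

store.append(stream, [:order_confirmed], expected_version: seen_by_a) # first writer wins

conflict =
  begin
    store.append(stream, [:order_confirmed], expected_version: seen_by_b)
    false
  rescue WrongExpectedEventVersion
    true # second writer has to reload the aggregate and retry
  end
```

The losing writer reloads, re-runs the command against fresh state, and usually hits a business exception instead of writing a duplicate event.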
Commands and a 10-line bus
The bus is a Hash[CommandClass] = handler. There’s a published gem for it; rolling your own stays under 10 lines and makes the dispatch path obvious:
class CommandBus
  def initialize = @handlers = {}
  def register(command_class, handler) = @handlers[command_class] = handler

  def call(command)
    @handlers.fetch(command.class) { raise "no handler for #{command.class}" }
      .call(command)
  end
end
Then wrap it in RubyEventStore::CorrelatedCommands so events emitted inside a handler inherit the command’s correlation_id and causation_id automatically. That one wrapper is the difference between a readable event store browser and an untraceable mess.
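What such a wrapper does can be sketched without the gem. Everything below is a hypothetical stand-in written for illustration (TinyEventStore, CorrelatingBus, and a command that carries its own message_id); it is not the library's implementation, only the mechanism: set ambient metadata around the dispatch so every event published inside inherits it.

```ruby
require "securerandom"

# Hypothetical command that carries its own id.
SubmitOrder = Struct.new(:message_id, :order_id, keyword_init: true)

# Hypothetical event store that stamps ambient metadata onto published events.
class TinyEventStore
  attr_reader :published

  def initialize
    @published = []
    @ambient   = {}
  end

  # Metadata is in effect only for the duration of the block.
  def with_metadata(metadata)
    previous, @ambient = @ambient, @ambient.merge(metadata)
    yield
  ensure
    @ambient = previous
  end

  def publish(type, data)
    @published << { type: type, data: data, metadata: @ambient.dup }
  end
end

# Wraps anything responding to #call and sets the ids around each dispatch.
class CorrelatingBus
  def initialize(event_store, bus)
    @event_store = event_store
    @bus = bus
  end

  def call(command)
    ids = { correlation_id: command.message_id, causation_id: command.message_id }
    @event_store.with_metadata(ids) { @bus.call(command) }
  end
end

event_store = TinyEventStore.new
handler     = ->(cmd) { event_store.publish("OrderSubmitted", { order_id: cmd.order_id }) }
bus         = CorrelatingBus.new(event_store, handler)

command = SubmitOrder.new(message_id: SecureRandom.uuid, order_id: "o1")
bus.call(command)
```

The handler itself never mentions the ids, which is why the approach scales: tracing is a property of the wiring, not of each handler.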
Commands are dry-struct contracts. Validation lives on the command (shape, types, presence). Business rules live on the aggregate (AlreadySubmitted, NotConfirmed). Don’t mix the two — dry-validation for “is this a well-formed request”, aggregate exceptions for “is this allowed right now”.
Read models live in the app, not the domain
The standard term: denormalizer. A subscriber that listens to events and writes a purpose-built ActiveRecord table tailored to one screen.
class OrdersListDenormalizer < ApplicationJob
  prepend RailsEventStore::AsyncHandler

  def perform(event)
    case event
    when Ordering::OrderSubmitted
      OrdersList.create!(order_id: event.data.fetch(:order_id), state: "submitted", ...)
    when Ordering::OrderConfirmed
      OrdersList.where(order_id: event.data.fetch(:order_id)).update_all(state: "confirmed")
    end
  end
end
Each read model gets its own table, owned by the application layer, not the bounded context. Querying is direct AR — no joins to write-side tables, no scopes that pretend to know the domain.
Sync vs async: default async via AfterCommitAsyncDispatcher, which schedules the job only after the surrounding ActiveRecord::Base.transaction commits. This avoids the ugliest bug in event-driven Rails: queueing an order-confirmation email for an order that got rolled back. Use sync only when the read model has to be visible in the same request that produced the command.
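The rollback bug and its fix fit in a few lines of plain Ruby. The AfterCommitQueue class and the Rollback error below are hypothetical stand-ins for the dispatcher and for ActiveRecord::Rollback; the sketch ignores nested transactions and only shows why buffering until commit matters.

```ruby
# Plays the role of ActiveRecord::Rollback in this sketch.
Rollback = Class.new(StandardError)

# Hypothetical dispatcher: buffers jobs while a transaction is open.
class AfterCommitQueue
  attr_reader :enqueued

  def initialize
    @enqueued = []
    @pending  = nil # nil means no transaction is open
  end

  def transaction
    @pending = []
    yield
    @enqueued.concat(@pending) # commit: release the buffered jobs
  rescue Rollback
    nil                        # rollback: buffered jobs are dropped
  ensure
    @pending = nil
  end

  # Inside a transaction the job waits; outside it is enqueued at once.
  def dispatch(job)
    (@pending || @enqueued) << job
  end
end

queue = AfterCommitQueue.new

queue.transaction { queue.dispatch(:confirmation_email_o1) }

queue.transaction do
  queue.dispatch(:confirmation_email_o2)
  raise Rollback # the order never existed as far as the database is concerned
end
```

An immediate dispatcher would have enqueued both emails; here only the committed order produces one.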
Read models are disposable. Truncate, replay events, rebuild. Worth a one-line rake task wired in early — it gets used constantly:
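namespace :read_models do
  task rebuild: :environment do
    OrdersList.delete_all
    # Caveat: with AsyncHandler prepended, #perform may expect the serialized
    # payload rather than an event object. Check your rails_event_store version,
    # or move the projection into a plain method that both paths can call.
    Rails.configuration.event_store.read.of_type([
      Ordering::OrderSubmitted, Ordering::OrderConfirmed, Ordering::OrderShipped
    ]).each { |e| OrdersListDenormalizer.new.perform(e) }
  end
end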
Process managers: the only place cross-context coordination lives
When OrderSubmitted needs to wait for StockReserved and PaymentCaptured before issuing ConfirmOrder, that orchestration goes in a process manager — not in any of the three contexts.
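class OrderConfirmationProcess
  def initialize(event_store:, command_bus:)
    @event_store = event_store
    @command_bus = command_bus
  end

  def call(event)
    order_id = event.data.fetch(:order_id)
    stream = "OrderConfirmation$#{order_id}"
    # Link (not copy) the triggering event into the process's own stream.
    @event_store.link([event.event_id], stream_name: stream, expected_version: :any)
    # build_state_from is not shown: it reads the stream and folds it into a state object.
    state = build_state_from(stream)
    @command_bus.call(Ordering::ConfirmOrder.new(order_id:)) if state.ready?
  end
end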
Two patterns to consider, in order:
- State in an ActiveRecord row. Migration-heavy. Hard to debug. Skip.
- State in its own event stream, built by linking the triggering events into an OrderConfirmation$#{order_id} stream and rebuilding the state object on each call. This is the one that holds up. The browser shows the whole conversation in one place.
The moment an event handler in Ordering calls a method on Inventory, extract a process manager. Rule: direct cross-context method calls are forbidden. Events go out, commands come in, the process manager is the bridge.
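The process manager above leans on build_state_from and ready?, which are not shown. A minimal sketch of what they might look like, assuming the state is folded from event type names (the real version would read the linked stream from the event store; ConfirmationState is a hypothetical name):

```ruby
# Hypothetical state object for the confirmation process.
class ConfirmationState
  def initialize
    @submitted = @stock_reserved = @payment_captured = false
  end

  # Each linked event flips one flag; unknown types are ignored.
  def apply(event_type)
    case event_type
    when "OrderSubmitted"  then @submitted = true
    when "StockReserved"   then @stock_reserved = true
    when "PaymentCaptured" then @payment_captured = true
    end
    self
  end

  def ready?
    @submitted && @stock_reserved && @payment_captured
  end
end

# Rebuilds the state from scratch on every call, so arrival order does not matter.
# Takes a list of type names here; the original takes a stream name.
def build_state_from(event_types)
  event_types.reduce(ConfirmationState.new) { |state, type| state.apply(type) }
end

waiting = build_state_from(["OrderSubmitted", "PaymentCaptured"])
done    = build_state_from(["PaymentCaptured", "OrderSubmitted", "StockReserved"])
```

Since nothing is stored besides the linked events, changing the rules of the process is a code change, not a migration.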
Bounded contexts are folders
Each context is a folder under domains/, so the dependency graph stays explicit. One Rails monolith. One RailsEventStore::Client. Each context exposes events, commands, aggregates, and a Configuration class that registers handlers in to_prepare. Cross-context contact = published events only.
Disable Zeitwerk inside domains/ and use explicit require_relative. This sounds heretical until you realise it makes the dependency graph between contexts visible in plain text. Zeitwerk’s autoload hides exactly the coupling you’re trying to control.
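A rough sketch of the per-context Configuration idea, runnable on its own. FakeBus and FakeEventStore are hypothetical stand-ins so the example needs no gems, and the handlers are placeholders; in the app each module would sit in its own file under domains/ and be loaded with require_relative.

```ruby
# Hypothetical stand-ins for the command bus and the event store client.
class FakeBus
  attr_reader :handlers

  def initialize
    @handlers = {}
  end

  def register(command_class, handler)
    @handlers[command_class] = handler
  end
end

class FakeEventStore
  attr_reader :subscriptions

  def initialize
    @subscriptions = Hash.new { |hash, type| hash[type] = [] }
  end

  def subscribe(handler, to:)
    to.each { |type| @subscriptions[type] << handler }
  end
end

module Ordering
  SubmitOrder    = Class.new # placeholder command
  OrderSubmitted = Class.new # placeholder event

  class Configuration
    # Registers this context's command handlers; nothing else reaches inside.
    def call(_event_store, command_bus)
      command_bus.register(SubmitOrder, ->(_command) { :handled_by_ordering })
    end
  end
end

module Inventory
  class Configuration
    # Inventory touches Ordering only through its published event.
    def call(event_store, _command_bus)
      event_store.subscribe(->(_event) { :reserve_stock }, to: [Ordering::OrderSubmitted])
    end
  end
end

# Composition root: in Rails this loop would run inside config.to_prepare.
event_store = FakeEventStore.new
command_bus = FakeBus.new
[Ordering::Configuration, Inventory::Configuration].each do |configuration|
  configuration.new.call(event_store, command_bus)
end
```

The only constant Inventory references from another context is an event class, which is exactly the coupling the folder layout is meant to make visible.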
Microservices are tempting and almost always a mistake here. Folder-level boundaries deliver 95% of the architectural benefit at 5% of the operational cost.
Testing
Three layers, all using rails_event_store-rspec:
# 1. Aggregate: given/when/then
RSpec.describe Ordering::Order do
  it "ships only after confirm" do
    order = described_class.new("o1")
    order.submit(order_number: "X", customer_id: "c1")

    expect { order.ship(tracking_number: "T") }
      .to raise_error(Ordering::Order::NotConfirmed)
  end
end

# 2. Command handler: InMemoryRepository, full slice
RSpec.describe Ordering::OnSubmitOrder do
  let(:event_store) { RailsEventStore::Client.new(repository: RubyEventStore::InMemoryRepository.new) }

  it "publishes OrderSubmitted on the order's stream" do
    described_class.new(repository: Ordering::OrderRepository.new(event_store:))
      .call(Ordering::SubmitOrder.new(order_id: "o1", customer_id: "c1"))

    expect(event_store).to have_published(an_event(Ordering::OrderSubmitted))
      .in_stream("Ordering::Order$o1")
  end
end

# 3. Denormalizer: arrange events, act, assert AR state
RSpec.describe OrdersListDenormalizer do
  it "creates a row on submit" do
    described_class.new.perform(
      Ordering::OrderSubmitted.new(data: { order_id: "o1", order_number: "X", customer_id: "c1" })
    )

    expect(OrdersList.find_by(order_id: "o1").state).to eq("submitted")
  end
end
InMemoryRepository makes aggregate and handler specs run in milliseconds. have_applied for unit tests of aggregates, have_published for integration tests through the event store. Mutation testing (mutant) on the domain code only — not on read models or controllers, where it’s noise.
What goes wrong
| Mistake | Symptom | Fix |
|---|---|---|
| God Order aggregate | Every change touched the same class; merge conflicts daily | Split into Ordering, Pricing, Shipping aggregates communicating via events |
| All denormalizers sync | p95 latency on POST /orders over a second | prepend RailsEventStore::AsyncHandler, sync only the one read model needed for the redirect |
| ImmediateAsyncDispatcher | Confirmation emails sent for rolled-back orders | Switch to AfterCommitAsyncDispatcher |
| Mutating event schemas in place | Production data interpreted by the wrong code path | Upcasting transformations |
| No correlation IDs | “Why did this email send?” took an hour each time | RubyEventStore::CorrelatedCommands wrapper, LinkByCorrelationId subscriber |
| Process manager state in AR | Migration-coupled to a state machine that kept changing | Event-sourced PM via event_store.link |
| Anemic aggregates with public state | Business logic leaked into handlers and controllers | Aggregates expose only domain methods; state is private |
| Registering handlers outside to_prepare | Stale class references in dev after reload | Move every subscribe and register into config.to_prepare |
When not to bother
CQRS+ES is genuinely heavy. A reasonable decision matrix:
- Use it when you have rich business workflows, audit obligations, or temporal queries (“what was this order’s state on March 12?”). Orders, invoices, claims, bookings, learning processes.
- Use only CQRS (read models from CRUD writes) when reads diverge from writes but the write side is mostly form-fills.
- Use neither for reference data, admin screens, single-team CRUD, or anything where the team isn’t ready to commit to past-tense event names and ubiquitous language. The patterns punish half-measures.
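The temporal query mentioned in the first bullet is the cheapest ES payoff to demonstrate. A minimal sketch in plain Ruby, where the event list, the timestamps, and the year are invented for illustration:

```ruby
# Hypothetical stored events for one order, oldest first.
EVENTS = [
  { type: "OrderSubmitted", at: Time.utc(2024, 3, 10, 9, 0) },
  { type: "OrderConfirmed", at: Time.utc(2024, 3, 11, 14, 30) },
  { type: "OrderShipped",   at: Time.utc(2024, 3, 15, 8, 0) }
].freeze

# Which state each event type leads to.
TRANSITIONS = {
  "OrderSubmitted" => :submitted,
  "OrderConfirmed" => :confirmed,
  "OrderShipped"   => :shipped
}.freeze

# Replays only the events recorded up to the given moment.
def state_at(events, moment)
  events
    .take_while { |event| event[:at] <= moment }
    .reduce(:draft) { |state, event| TRANSITIONS.fetch(event[:type], state) }
end

on_march_12 = state_at(EVENTS, Time.utc(2024, 3, 12))
```

With a CRUD table holding only the latest state column, the same question needs an audit log that someone remembered to build in advance.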
TL;DR: Apply ES to one bounded context, not the whole app. Events are past-tense facts with small payloads. Aggregates own invariants and emit events. Read models are disposable AR tables built by async denormalizers. Process managers, and only process managers, coordinate across contexts. Mount the browser, propagate correlation IDs, dispatch async after commit, never mutate events. Test with InMemoryRepository and have_published. The complexity buys you audit, replay, and clarity; if your domain doesn’t need those, stay with ActiveRecord.