PostgreSQL on-disk filesystem visualizer
A single-file visualizer of how PostgreSQL turns SQL into bytes on disk.
On the left, a psql terminal pre-loaded with a script that exercises
every interesting on-disk structure. On the right, the live PGDATA tree
— directories appear, files spring into existence, pages extend, line
pointers march forward and the WAL records pile up — all driven by the
same op stream the terminal is replaying.
The script walks through:
CREATE DATABASE shop; → mkdir base/16384/, clone template1
\c shop → switch session
CREATE TABLE customers ... → alloc heap relfilenode + _fsm + _vm + pkey idx
INSERT ×3 → page 0 extends; LPs grow; xmin set; WAL append
SELECT * FROM customers → Seq Scan via buffer cache
UPDATE id = 1 → HOT update: t_xmax on old, new ctid chained
DELETE id = 2 → tuple marked dead, storage NOT yet freed
CREATE INDEX (email) → new relfilenode, B-tree build
VACUUM customers → reclaim dead tuples, set VM all-visible
CHECKPOINT → fsync all dirty pages, advance pg_control.redo
CREATE TABLE orders ... PARTITION BY RANGE(ts)
+ 2 partitions (q1, q2) → parent.relkind='p' has no storage
INSERT INTO orders ... → ExecPartitionRouting routes to correct child
SELECT join → Index Scan + Append over partitions
Views
| View | What it shows |
|---|---|
| filesystem | The live PGDATA tree. Per-file metadata (pages, free bytes, last LSN, partition bound), with new files green-flashed and currently-written files orange-flashed. Click any node to inspect. |
| page | A scaled byte-strip of the selected 8 KB heap page — PageHeaderData (24 B), the line-pointer array growing from the bottom, free space, and tuples filling from the top. Below it, the HeapTupleHeaderData field tree (xmin / xmax / ctid / infomask / hoff) for the first live tuple. |
| WAL | Every record appended to pg_wal/ with its LSN, resource manager, tag and length. Click for xl_tot_len / xl_xid / xl_prev / xl_crc plus prose notes on what that record means. |
| B-tree | Meta page → root → leaf pages of every index, with (key → ctid) pairs visible at the leaves. |
What’s “precise” about it
-
Real OIDs and relfilenodes. Catalog files in
global/are named with the hard-coded relfilenodes pg actually uses (1262forpg_database,1260forpg_authid,1213forpg_tablespace). New user objects allocate fromFirstNormalObjectId = 16384upward. -
Real fork names. Each heap relation creates three files:
<relfilenode>(main),<relfilenode>_fsm(free-space map) and<relfilenode>_vm(visibility map). Indexes only have a main fork. The init fork (_init) is shown but unused — it’s the template for unlogged tables. -
Real page layout. Each page is exactly 8192 B, with a 24-byte
PageHeaderData, then an upward-growingItemId[]array of 4-byte line pointers (lp_off,lp_len,lp_flags), free space, downward- growing tuple area, and an optional special-space region (used by index AMs). Updating a tuple in place is forbidden — pg always inserts a new version, then chains the old viat_ctid. -
Real MVCC semantics. Every tuple carries
t_xmin(inserting xid) andt_xmax(deleting/updating xid).DELETEonly setst_xmax; storage is not reclaimed untilVACUUM, which markslp_flags = LP_DEADand returns the bytes to the FSM. AnUPDATEalso chainst_ctidfrom the old version to the new (a HOT update if both fit on the same page). -
Real WAL record types.
Heap/INSERT,Heap/HOT_UPDATE,Heap/DELETE,Heap2/PRUNE,Heap2/VISIBLE,Btree/INSERT_LEAF,Btree/NEWROOT,Storage/CREATE,XLOG/CHECKPOINT_ONLINE,XLOG/FPI,Transaction/COMMIT,DBASE/CREATE— all real resource-manager + info-tag pairs fromsrc/include/access/xlog_internal.h. -
Real CLOG layout.
pg_xact/stores 2 bits per xid (IN_PROGRESS / COMMITTED / ABORTED / SUB_COMMITTED); an 8 KB segment holds 32 768 xids. New segments are spun up as0000,0001, … as the xid counter advances. -
Real partitioning. The parent in
CREATE TABLE ... PARTITION BY RANGE(ts)hasrelkind = 'p'and gets no relfile. Each child partition allocates its own heap + _fsm + _vm;INSERTinto the parent walksExecPartitionRoutingto pick the right child by the range bound recorded inpg_class.relpartbound. -
Real B-tree shape. A meta page (block 0) points at the current root; the root holds (separator-key, downlink) pairs; the leaves hold (key, heap-tid) pairs and form a doubly-linked sibling list at the bottom.
Controls
| Key / mouse | Action |
|---|---|
space | Run / pause the script |
s | Step one operation |
shift+s | Step ×10 |
r | Reset (re-init PGDATA, clear WAL/CLOG) |
v | Cycle filesystem → page → WAL → B-tree |
| click on a file | Open the inspector (path, OID, columns, fork, pages, LSNs) |
| click on a page | Jump to the page view |
| click on a WAL record | Show its xl_* header + payload |
What the data flow looks like
Every command produces a chain of low-level operations. For an INSERT:
psql: INSERT INTO customers VALUES (1, 'alice@…');
↓
parser: raw query → parse tree
planner: Insert on customers (heap)
↓
WAL append: rmgr=Heap tag=INSERT rfn=16385 blk=0 off=1 len=78 B
↓
heap mutate: page 0: line-pointer [1] added · upper -= tup_size · t_xmin = 743
↓
FSM update: block 0 free space = 8120 B
VM clear: block 0 no longer all-visible
↓
btree leaf: rmgr=Btree tag=INSERT_LEAF key=1 tid=(0,1)
↓
commit: WAL rmgr=Transaction tag=COMMIT xid=743 · CLOG bit ← COMMITTED · fsync
Each of those steps shows up as a notification at the bottom of the simulator, with the affected file flashing in the tree.