Command Palette

Search for a command to run...

Level 2 · 25 min

MVCC

PostgreSQL's MVCC (Multi-Version Concurrency Control) enables readers and writers to operate concurrently without blocking each other. Understanding dead tuples, VACUUM, and transaction IDs is essential for avoiding subtle performance and correctness issues.

How MVCC Works

In MVCC, each row version has two system columns: xmin (the transaction ID that created this row version) and xmax (the transaction ID that deleted or updated this row version, 0 if current). When a transaction reads a row, it checks visibility rules: is xmin committed and <= my snapshot transaction ID? Is xmax 0, or not yet committed, or > my snapshot? If yes, the row is visible. Each transaction gets a snapshot at its start: an array of in-progress transaction IDs. This snapshot is immutable for the duration of the transaction (in READ COMMITTED, a new snapshot is taken at each statement). This means readers never block writers and writers never block readers.

Dead Tuples and Table Bloat

When a row is updated, PostgreSQL does NOT modify the row in place. Instead: it marks the old row version as deleted (sets xmax) and inserts a new row version. Both versions exist in the heap until VACUUM removes the old one. This means UPDATE on 1 row = 2 heap entries temporarily. DELETE also leaves a dead tuple. Dead tuples consume disk space and slow down sequential scans (must skip dead tuples). Autovacuum automatically reclaims dead tuple space — it runs when the dead tuple count exceeds a threshold (autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup). Very high UPDATE rates can outpace autovacuum, causing table bloat. From PostgreSQL 14 Internals (Rogov): each row version header contains xmin (the transaction ID that inserted it) and xmax (the transaction ID that deleted/updated it). The infomask bits encode commit/abort status, enabling visibility checks without consulting pg_clog on every access. An optimization called Heap-Only Tuple (HOT) avoids creating a new index entry when the updated column is not part of any index — the index still points to the first version, and a chain of ctid pointers in the heap links to successive versions within the same page. HOT dramatically reduces index bloat on frequently-updated tables. Transaction IDs are 32-bit integers and wrap around at approximately 2.1 billion; the freezing mechanism (vacuum_freeze_min_age default 50M, autovacuum_freeze_max_age default 200M) replaces old xmin values with a special 'frozen' marker before wraparound occurs, preventing the database from making all visible rows appear to be in the future.

VACUUM and Transaction ID Wraparound

VACUUM has two jobs: reclaim dead tuple space (allow the space to be reused), and freeze old transaction IDs to prevent wraparound. PostgreSQL transaction IDs are 32-bit integers — they wrap around after ~2.1 billion transactions. VACUUM FREEZE marks old tuples with a special 'frozen' xmin that is always visible, preventing them from appearing invisible after wraparound. Transaction ID wraparound is the most catastrophic PostgreSQL failure: the database shuts down all writes to prevent data corruption. Monitor: age(datfrozenxid) in pg_database. When it approaches 1.6B transactions, emergency VACUUM FREEZE is required. Autovacuum handles this automatically in healthy databases but can be blocked by long-running transactions.

Key Takeaways

  • MVCC means readers and writers don't block each other. But it creates dead tuples on every UPDATE/DELETE — VACUUM is needed to reclaim this space.
  • Long-running transactions prevent VACUUM from cleaning dead tuples (they hold the oldest snapshot, and VACUUM can't remove tuples visible to that snapshot). Monitor pg_stat_activity for long transactions.
  • Transaction ID wraparound is a critical risk. Monitor age(datfrozenxid) in pg_database. Autovacuum handles it automatically, but aggressive autovacuum tuning is required for high-write tables.

Code example

-- Check table bloat (dead tuples)\nSELECT relname, n_dead_tup, n_live_tup,\n  round(n_dead_tup::numeric/nullif(n_live_tup,0)*100, 1) AS dead_pct\nFROM pg_stat_user_tables\nORDER BY n_dead_tup DESC;\n\n-- Check transaction ID age (wraparound risk)\nSELECT datname, age(datfrozenxid) FROM pg_database\nORDER BY age(datfrozenxid) DESC;