Command Palette

Search for a command to run...

Level 3 · 25 min

Replication

PostgreSQL streaming replication provides high availability and read scaling. Understanding WAL-based replication, replication slots, and failover tools is essential for building production PostgreSQL HA deployments.

Streaming Replication

Streaming replication works by streaming WAL (Write-Ahead Log) segments from primary to standby in near-real-time. The standby applies WAL to keep in sync. Asynchronous mode (default): primary does not wait for standby to confirm WAL receipt — lowest latency, but some transactions may be lost if primary fails before standby receives them (RPO > 0). Synchronous mode (synchronous_commit = on): primary waits for at least one standby to confirm WAL receipt before acknowledging the write — zero data loss (RPO = 0), but adds latency (one RTT to standby). synchronous_standby_names specifies which standbys must confirm. Hot Standby: standbys serve read-only queries while replicating. replication lag: delay between primary WAL position and standby position — monitor with pg_stat_replication.

Replication Slots

Replication slots guarantee that the primary retains WAL segments until the slot consumer (standby or logical replication client) has consumed them. Without slots, the primary can recycle WAL segments before the standby reads them — causing the standby to need a full resync. With slots, WAL is retained until consumed. Critical risk: if a slot's consumer falls offline for a long time, WAL accumulates indefinitely, filling disk. Monitor pg_replication_slots for slot lag (confirmed_flush_lsn vs wal_lsn). Drop inactive slots that won't reconnect: SELECT pg_drop_replication_slot('slot_name'). Logical replication slots are used by logical replication and change data capture (Debezium, pgoutput). From PostgreSQL 14 Internals (Rogov): WAL is organized into 16MB segment files by default (wal_segment_size). Checkpoints flush all dirty shared_buffers pages to disk and write a checkpoint record in WAL; the checkpoint_completion_target parameter (default 0.9) spreads the dirty-buffer writes over 90% of the interval between checkpoints to avoid I/O spikes. The max_wal_size parameter (default 1GB) sets a soft cap on WAL accumulation between checkpoints — exceeding it triggers an extra checkpoint. Setting wal_level = replica (the default) enables streaming replication; wal_level = logical additionally logs sufficient information for logical decoding. The max_wal_senders parameter (default 10) caps concurrent replication connections; setting it to 0 with wal_level=minimal disables replication entirely and reduces WAL volume.

Failover and HA Tools

When the primary fails, a standby must be promoted to become the new primary. Manual promotion: pg_promote() or touch the trigger file. Patroni: a popular HA solution using etcd/Consul/ZooKeeper for distributed consensus. Patroni manages: automatic failover (promotes the most up-to-date standby), fencing (ensures the old primary cannot accept writes after demotion — prevents split-brain), configuration management, and connection re-routing via HAProxy. repmgr: older HA solution with simpler setup. PgBouncer: connection pooler that reduces connection overhead and enables zero-downtime failover (re-routes connections to the new primary). After failover, old primary must resync as a new standby using pg_basebackup or pg_rewind.

Key Takeaways

  • Asynchronous replication = low latency, some data loss risk. Synchronous replication = no data loss, higher latency. Choose based on RPO requirements.
  • Replication slots prevent the primary from discarding WAL before the standby reads it. The risk is disk fill if the standby is offline for too long. Always monitor slot lag.
  • Patroni handles automatic failover and fencing. Fencing is critical: without it, a failover scenario can leave two primaries (split-brain), corrupting data.

Code example

-- Check replication status\nSELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,\n  (sent_lsn - replay_lsn) AS lag_bytes\nFROM pg_stat_replication;\n\n-- Check replication slot lag\nSELECT slot_name, active, pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes\nFROM pg_replication_slots;