Level 2 · 25 min

Index Types

PostgreSQL supports multiple index types optimized for different data patterns and query operators. Choosing the right index type is the difference between microsecond and second-level query times.

B-tree and When Not to Use It

B-tree is the default index type, supporting =, <, >, <=, >=, BETWEEN, IN, LIKE 'prefix%', and ORDER BY. B-tree is not useful for: contains operators (JSONB @>, array @>), full-text search (@@ operator), geometric operators. For a boolean column (true/false, 50/50 distribution), a B-tree index has near-zero selectivity — the planner will ignore it and choose Seq Scan. Partial indexes solve this: CREATE INDEX idx_active ON users(id) WHERE active = true indexes only the minority value, making it highly selective. Expression indexes index the result of a function: CREATE INDEX ON users(lower(email)) to enable case-insensitive email lookups.

GIN, GiST, and BRIN

GIN (Generalized Inverted Index): optimized for composite values (arrays, JSONB, tsvector). Stores an inverted index: for each element in the array/document, stores which rows contain it. Ideal for JSONB @> (containment), array && (overlap), and tsvector @@ (full-text). GIN has slower write performance (index maintenance for arrays) but very fast reads for containment queries. GiST (Generalized Search Tree): extensible index for complex data types — geographic data (PostGIS), ranges, full-text (with ranking). GiST supports nearest-neighbor searches. BRIN (Block Range Index): stores the min/max value per block range (e.g., every 128 pages). Tiny index size, perfect for naturally ordered large columns (created_at, log_id) where values increase over time. Not useful for randomly ordered data. From PostgreSQL 14 Internals (Rogov): GIN indexes allow deferred insertion via the fastupdate storage parameter — new entries are quickly appended to an unordered pending list, then bulk-merged into the main tree structure in the background, making writes far cheaper. The tradeoff: queries must also scan the pending list, adding overhead. The pending list size is capped by gin_pending_list_limit (default 4MB); when full, it is flushed synchronously during an INSERT, creating a latency spike. Covering indexes (CREATE INDEX ... INCLUDE (col)) add non-key columns to the index leaf pages, enabling Index Only Scan without touching the heap — critical for hot read paths where even a single heap fetch per row adds up under high QPS.

Partial and Covering Indexes

Partial indexes (WHERE clause) index only a subset of rows: CREATE INDEX ON orders(created_at) WHERE status = 'pending'. This is smaller than a full index and avoids indexing rows that are never queried with this condition. Covering indexes (INCLUDE clause, PostgreSQL 11+) store additional columns in the index leaf pages without including them in the B-tree key: CREATE INDEX ON orders(user_id) INCLUDE(status, amount). The INCLUDE columns are not used for index scans but enable Index Only Scans when the query projects only those columns, avoiding the heap fetch entirely. Index maintenance has write cost: each INSERT/UPDATE/DELETE must update all indexes. Many indexes on a write-heavy table can significantly degrade write throughput.

Key Takeaways

B-tree is for equality, range, and sort. GIN is for arrays and JSONB containment. GiST is for geographic and complex types. BRIN is for large ordered time-series columns.
Partial indexes are small and fast for sparse conditions. Covering indexes (INCLUDE) enable Index Only Scans and avoid heap fetches for the projected columns.
Every index has a write cost. A table with 20 indexes can have 10x the write overhead of a table with 2 indexes. Regularly review pg_stat_user_indexes for unused indexes and drop them.

Code example

-- GIN for JSONB containment\nCREATE INDEX idx_metadata_gin ON events USING GIN(metadata);\nSELECT * FROM events WHERE metadata @> {"type": "click"};\n\n-- BRIN for timestamp (naturally ordered)\nCREATE INDEX idx_created_brin ON logs USING BRIN(created_at);\n\n-- Partial + covering for active orders\nCREATE INDEX idx_pending ON orders(user_id, created_at DESC)\nINCLUDE(status, amount)\nWHERE status = 'pending';