Level 3 · 30 min

Query Optimization

Query optimization in MongoDB requires understanding how the query planner selects execution plans, how to read explain() output, and when queries are served from indexes vs full collection scans.

explain() and Execution Plans

The explain() method reveals how MongoDB executes a query. Three verbosity levels: queryPlanner (plan selection, no execution), executionStats (runs the query, returns stats), allPlansExecution (runs all candidate plans, expensive). Key stages to read: COLLSCAN means the query performed a full collection scan — no usable index. IXSCAN means an index was used — check keysExamined vs nReturned: if keysExamined >> nReturned, the index has low selectivity or the query can't fully use it. FETCH appears after IXSCAN when documents must be loaded to retrieve projected fields (covered queries avoid FETCH). SORT appears when the sort cannot be satisfied by index order — this is an in-memory sort that can be slow and memory-intensive.

Query Planner and Cache

When multiple indexes could satisfy a query, the query planner runs candidate plans in parallel (trial period) and selects the winner based on which plan returns the first 101 results fastest. The winning plan is cached by query shape (the structure of the query, without literal values). The plan cache is invalidated when: an index is added or dropped, collection statistics change significantly, mongod restarts. The query shape determines cache hit — queries with the same operators but different values reuse the cached plan. Cache entries have a state: active (winning plan) or inactive (needs re-evaluation). Use db.collection.getPlanCache().list() to inspect cached plans. Key insight from MongoDB: The Definitive Guide (3rd ed.): the authors demonstrate with a real explain() output that a COLLSCAN on a 1M-document collection examining all 1,000,000 documents for a single result ('nReturned: 1, totalDocsExamined: 1,000,000') took 419ms — dropping to under 1ms with a proper index (IXSCAN with keysExamined: 1). The ratio of totalDocsExamined to nReturned is the key signal: a ratio near 1:1 indicates an efficient, highly selective index; a ratio of 1,000,000:1 indicates no usable index.

Covered Queries and Projections

A covered query is satisfied entirely from index data — MongoDB never reads the actual documents. Requirements: all query filter fields are in the index, all projected fields are in the index (and _id is either excluded or in the index). When covered, explain() shows no FETCH stage. Covered queries are significantly faster, especially for large collections where documents don't fit in RAM. Index hint forces the planner to use a specific index: db.collection.find(query).hint(indexName). Useful for testing or when the planner makes a wrong choice. Natural hint (hint 1) forces a collection scan — use for profiling comparisons.

Key Takeaways

COLLSCAN means no index used — always the first thing to fix. IXSCAN means an index was used, but check keysExamined vs nReturned for selectivity.
The query planner caches winning plans by query shape. Adding a new index can invalidate the cache and change which plan wins for all queries of that shape.
Covered queries (no FETCH stage) are the fastest — all data returned from the index. Design indexes to cover your most critical read queries.

Code example

// Basic explain\ndb.users.find({email: 'user@example.com}).explain('executionStats')\n\n// Key fields to check in explain output:\n// executionStats.totalDocsExamined vs totalKeysExamined vs nReturned\n// queryPlanner.winningPlan.stage: COLLSCAN or IXSCAN\n// queryPlanner.winningPlan.inputStage.stage: FETCH (not covered)\n\n// Force a specific index (for testing)\ndb.users.find({email: 'user@example.com}).hint({email: 1})