Level 3 · 30 min

Cluster Management

Elasticsearch is a distributed system. Understanding node roles, master election, cluster health, and recovery mechanisms is essential for designing resilient clusters and diagnosing outages.

Node Roles and Architecture

Every Elasticsearch node can have one or more roles. Master-eligible nodes participate in master election, and the elected master manages cluster state (index creation, shard allocation, node joins). Dedicated master nodes should not hold data shards: they need short, predictable JVM garbage-collection pauses and low CPU load so that cluster-state updates are not delayed. Data nodes store shards and handle search and indexing. Coordinating-only nodes (nodes with an empty role list) route requests, merge shard results, and offload work from data nodes, which is useful for aggregation-heavy workloads. Ingest nodes run ingest pipelines (pre-processing before indexing).

In a hot-warm-cold architecture, hot nodes (SSD, powerful CPUs) hold recent, actively written data; warm nodes (HDD, less CPU) hold older data that is queried less often; cold nodes hold rarely accessed data, typically as searchable snapshots backed by a repository on object storage.
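The role descriptions above can be sketched as a small classifier. This is a minimal illustration, not Elasticsearch code: it assumes each node is described by the list of role names it would carry in its node.roles setting, and the node names in the example layout are hypothetical.

```python
def describe_node(roles):
    """Classify a node from its list of role names (as set in node.roles)."""
    role_set = set(roles)
    if not role_set:
        # An empty role list means the node only routes and merges requests.
        return "coordinating-only"
    if role_set == {"master"}:
        # Master-eligible and nothing else: holds no data shards.
        return "dedicated master"
    labels = []
    if "master" in role_set:
        labels.append("master-eligible")
    # Tier-specific data roles (data_hot, data_warm, data_cold) count as data.
    if any(r == "data" or r.startswith("data_") for r in role_set):
        labels.append("data")
    if "ingest" in role_set:
        labels.append("ingest")
    return " + ".join(labels) if labels else "other"


# Hypothetical cluster layout used only for illustration.
cluster = {
    "master-1": ["master"],
    "hot-1": ["data_hot", "ingest"],
    "warm-1": ["data_warm"],
    "coord-1": [],
}

for name, roles in cluster.items():
    print(f"{name}: {describe_node(roles)}")
```

A layout like this keeps cluster-state work, indexing, and request coordination on separate machines, which is the separation the paragraph above argues for.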

Master Election and Split-Brain

Since 7.0, Elasticsearch uses a quorum-based consensus protocol for master election (similar in spirit to Raft, though it is Elasticsearch's own implementation). When the current master becomes unavailable, master-eligible nodes elect a new master. The quorum requirement prevents split-brain: with N master-eligible nodes, the quorum is floor(N/2) + 1. A 3-node master cluster tolerates 1 failure (quorum = 2). A 5-node master cluster tolerates 2 failures (quorum = 3). In ES 6 and earlier, the discovery.zen.minimum_master_nodes setting (= N/2 + 1, rounded down) was critical: misconfiguring it allowed split-brain, where two masters existed simultaneously and cluster state diverged, risking data loss. ES 7+ maintains the voting quorum automatically. Avoid an even number of master-eligible nodes: 4 nodes need a quorum of 3 and still tolerate only 1 failure, so the extra node adds cost without adding fault tolerance.

Gormley and Tong describe cluster health as "the single most important statistic in an Elasticsearch cluster, which reports a status of either green, yellow, or red." (Clinton Gormley & Zachary Tong, Elasticsearch: The Definitive Guide). Green: all primary and replica shards are active. Yellow: all primaries are active but at least one replica is unassigned (data is safe but redundancy is reduced). Red: at least one primary shard is unassigned (the data in those shards is unavailable, and is lost if no copy can be recovered).

Shard allocation is controlled by cluster-level settings: cluster.routing.allocation.enable (all, primaries, new_primaries, or none) lets operators restrict shard allocation during rolling restarts. Setting it to primaries before stopping a node prevents the cluster from eagerly allocating replacement replicas on other nodes, avoiding a wave of shard copying that would be immediately reversed when the node rejoins. Set it back to all once the node has rejoined.
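The quorum rule from the master-election discussion above is simple enough to check by hand, but a short sketch makes the "even numbers do not help" point concrete. The function names here are illustrative, not part of any Elasticsearch API.

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of the master-eligible nodes: floor(N/2) + 1."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a quorum remains."""
    return master_eligible - quorum(master_eligible)


for n in range(1, 7):
    print(
        f"{n} master-eligible node(s): quorum = {quorum(n)}, "
        f"tolerates {tolerated_failures(n)} failure(s)"
    )
```

Going from 3 to 4 nodes (or from 5 to 6) raises the quorum by one but leaves the number of tolerated failures unchanged, which is why odd counts are recommended.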

Snapshot and Restore

Snapshots are the primary backup mechanism for Elasticsearch. They are incremental: each snapshot copies only the segment files that are not already stored in the repository by an earlier snapshot. Repository types include fs (shared filesystem), s3, gcs, and azure. SLM (Snapshot Lifecycle Management) automates snapshot scheduling and retention policies. On restore, you can select individual indices, rename them, and restore into a different cluster. Cross-cluster replication (CCR) replicates indices in near-real-time to a follower cluster, for geo-redundancy or for offloading read traffic. Searchable snapshots mount indices directly from the repository without full restoration, enabling cold/frozen tiers.
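The incremental behaviour described above can be modelled with plain sets. This is a deliberately simplified sketch, not how Elasticsearch implements snapshots internally: it assumes the repository and the index are each just a set of segment file names, and the file names are hypothetical.

```python
def take_snapshot(repository: set, index_segments: set):
    """Return (segments that must be uploaded, repository contents afterwards).

    Only segments missing from the repository are copied; segments that an
    earlier snapshot already stored are reused.
    """
    to_upload = index_segments - repository
    return to_upload, repository | index_segments


repo = set()

# First snapshot: nothing is in the repository yet, so everything is copied.
uploaded, repo = take_snapshot(repo, {"_0", "_1", "_2"})
print("snapshot 1 uploads:", sorted(uploaded))

# After more indexing and a merge, the index holds a different set of segments.
uploaded, repo = take_snapshot(repo, {"_2", "_3", "_4"})
print("snapshot 2 uploads:", sorted(uploaded))
```

Because segment files are immutable once written, the cost of a snapshot in this model depends on how much has changed since the previous one rather than on total index size.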

Key Takeaways

Use an odd number of dedicated master nodes (3 or 5, never 2 or 4) so the cluster keeps a quorum when one fails and cannot split-brain. Dedicated master nodes should not hold data shards; they need stable heap usage and short GC pauses.
ES 7+ manages the master-election quorum automatically with a Raft-like consensus protocol. In ES 6 and earlier, misconfiguring minimum_master_nodes was a common cause of split-brain and the resulting cluster-state corruption.
Snapshots are incremental. Set up SLM for automated backups. Use CCR (leader and read-only follower indices) for geo-redundancy or offloading read traffic.

Code example

// Check cluster health
GET /_cluster/health

// Check shard allocation
GET /_cluster/allocation/explain

// Create snapshot repository (S3)
PUT /_snapshot/my_backup
{"type": "s3", "settings": {"bucket": "my-es-snapshots"}}

// Trigger manual snapshot
PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
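As a follow-up to the health check above, the sketch below interprets a cluster health response using the green/yellow/red definitions from this lesson. It assumes the JSON body has already been fetched; the sample response is made up for illustration and shows only a few of the fields the API returns.

```python
import json

# Meaning of each status, as defined earlier in this lesson.
STATUS_MEANING = {
    "green": "all primary and replica shards are active",
    "yellow": "all primaries are active, at least one replica is unassigned",
    "red": "at least one primary shard is unassigned",
}


def summarize_health(response_text: str) -> str:
    """Turn a _cluster/health JSON body into a one-line summary."""
    health = json.loads(response_text)
    status = health["status"]
    if status not in STATUS_MEANING:
        raise ValueError(f"unexpected cluster status: {status!r}")
    unassigned = health.get("unassigned_shards", 0)
    return (
        f"{health['cluster_name']}: {status} "
        f"({STATUS_MEANING[status]}; {unassigned} unassigned shard(s))"
    )


# Hypothetical response body, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{
  "cluster_name": "my-cluster",
  "status": "yellow",
  "number_of_nodes": 3,
  "unassigned_shards": 2
}
"""

print(summarize_health(SAMPLE_RESPONSE))
```

When the status is yellow or red, the allocation explain request from the example above is the usual next step for finding out why a shard is unassigned.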