Fix common cluster issues

Stack ECH ECK ECE Self-Managed

Use these topics to fix common issues with Elasticsearch clusters.

Simplify monitoring with AutoOps

Use AutoOps in your Elastic Cloud Hosted, ECE, ECK, or self-managed deployments.

AutoOps is a monitoring tool that simplifies cluster management through performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. For more information, refer to AutoOps.

Watermark errors: Fix watermark errors that occur when a data node is critically low on disk space and has reached the flood-stage disk usage watermark.
Circuit breaker errors: Elasticsearch uses circuit breakers to prevent nodes from running out of JVM heap memory. If Elasticsearch estimates an operation would exceed a circuit breaker, it stops the operation and returns an error.
Symptom: High CPU usage: The most common causes of high CPU usage and their solutions.
High JVM memory pressure: High JVM memory usage can degrade cluster performance and trigger circuit breaker errors.
Red or yellow cluster health status: A red or yellow cluster status indicates one or more shards are missing or unallocated. These unassigned shards increase your risk of data loss and can degrade cluster performance.
Rejected requests: When Elasticsearch rejects a request, it stops the operation and returns an error with a 429 response code.
Task queue backlog: A backlogged task queue can prevent tasks from completing and put the cluster into an unhealthy state.
Mapping explosion: A mapping explosion occurs when an index or index pattern has accumulated a very high number of mapped fields, which causes lookup performance issues for Elasticsearch and Kibana.
Hot spotting: Hot spotting may occur in Elasticsearch when resource utilization is unevenly distributed across nodes.
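
As a rough illustration of two of the symptoms listed above, the following Python sketch interprets an already-parsed cluster health response and retries a request that was rejected with a 429 response code. It is a minimal sketch, not an official client pattern: the function names `summarize_health` and `send_with_backoff`, the `send` callable, and the retry defaults are all hypothetical, and exponential backoff is an assumed client-side strategy rather than something this page prescribes. The `status` and `unassigned_shards` keys are fields of the cluster health API response.

```python
import time
from typing import Callable, Tuple


def summarize_health(health: dict) -> str:
    """Turn a parsed cluster health response into a one-line triage hint.

    `health` is assumed to be the JSON body of a cluster health request,
    already decoded into a dict.
    """
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all shards are assigned"
    if status in ("yellow", "red"):
        # Yellow or red means one or more shards are missing or unallocated.
        return f"{status}: {unassigned} unassigned shard(s)"
    return f"unrecognized status: {status}"


def send_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,       # hypothetical default
    base_delay: float = 0.5,     # hypothetical default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable that performs one request and
    returns (http_status_code, response_body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status_code, body = send()
        if status_code != 429:
            return status_code, body
        if attempt < max_attempts - 1:
            # Double the wait after each rejection: 0.5s, 1s, 2s, ...
            sleep(base_delay * (2 ** attempt))
    # Still rejected after the final attempt; return the last response.
    return status_code, body
```

Retrying only masks a rejection; if 429 responses persist, the rejected requests and task queue backlog topics above are the place to look for the underlying cause.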

Additional resources