Fix common cluster issues
Stack ECH ECK ECE Self-Managed
Use these topics to fix common issues with Elasticsearch clusters.
Simplify monitoring with AutoOps
Use AutoOps in your Elastic Cloud Hosted, ECE, ECK, or self-managed deployments.
AutoOps is a monitoring tool that simplifies cluster management through performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. For more information, refer to AutoOps.
- Watermark errors
- Fix watermark errors that occur when a data node is critically low on disk space and has reached the flood-stage disk usage watermark.
- Circuit breaker errors
- Elasticsearch uses circuit breakers to prevent nodes from running out of JVM heap memory. If Elasticsearch estimates an operation would exceed a circuit breaker, it stops the operation and returns an error.
- Symptom: High CPU usage
- Learn the most common causes of high CPU usage and how to resolve them.
- High JVM memory pressure
- High JVM memory usage can degrade cluster performance and trigger circuit breaker errors.
- Red or yellow cluster health status
- A red or yellow cluster status indicates one or more shards are missing or unallocated. These unassigned shards increase your risk of data loss and can degrade cluster performance.
- Rejected requests
- When Elasticsearch rejects a request, it stops the operation and returns an error with a 429 response code.
- Task queue backlog
- A backlogged task queue can prevent tasks from completing and put the cluster into an unhealthy state.
- Mapping explosion
- A mapping explosion occurs when an index or index pattern accumulates a very high number of mapped fields, which causes lookup performance issues for Elasticsearch and Kibana.
- Hot spotting
- Hot spotting can occur in Elasticsearch when resource utilization is unevenly distributed across nodes.
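
As a starting point for triage, the issues listed above can be paired with the Elasticsearch REST endpoints commonly used to investigate them. The following is a minimal sketch, not an official tool: the dictionary keys and the `triage_health` function are hypothetical names chosen for illustration, and the sample response is made up. The sketch only inspects an already-fetched `GET /_cluster/health` response body and suggests where to look next. It does not contact a cluster.

```python
from __future__ import annotations

# Hypothetical lookup table: the keys are illustrative labels for the topics
# above; the values are standard Elasticsearch REST paths.
DIAGNOSTIC_ENDPOINTS = {
    "watermark": "/_cat/allocation?v",
    "circuit_breaker": "/_nodes/stats/breaker",
    "high_cpu": "/_nodes/hot_threads",
    "jvm_memory_pressure": "/_nodes/stats/jvm",
    "cluster_health": "/_cluster/allocation/explain",
    "rejected_requests": "/_cat/thread_pool?v&h=node_name,name,active,queue,rejected",
    "task_queue_backlog": "/_cluster/pending_tasks",
    "mapping_explosion": "/_mapping",
    "hot_spotting": "/_cat/nodes?v&h=name,heap.percent,disk.used_percent,cpu",
}


def triage_health(health: dict) -> list[str]:
    """Suggest follow-up endpoints from a GET /_cluster/health response body."""
    suggestions = []
    # A red or yellow status, or any unassigned shards, points at allocation.
    if health.get("status") in ("red", "yellow") or health.get("unassigned_shards", 0) > 0:
        suggestions.append(DIAGNOSTIC_ENDPOINTS["cluster_health"])
    # Pending cluster-level tasks hint at a task queue backlog.
    if health.get("number_of_pending_tasks", 0) > 0:
        suggestions.append(DIAGNOSTIC_ENDPOINTS["task_queue_backlog"])
    return suggestions


# Made-up sample response, trimmed to the fields the sketch reads.
sample = {"status": "yellow", "unassigned_shards": 2, "number_of_pending_tasks": 0}
print(triage_health(sample))
```

A healthy response (green status, no unassigned shards, no pending tasks) yields an empty list, in which case the other entries in the table are the places to look for node-level symptoms such as high CPU or memory pressure.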