I'm using Amazon Elasticsearch Service, and my domain is performing poorly, or my clusters have a red status. What infrastructure should I use, and how do I scale up?

Issues with poor Amazon ES domain or cluster performance are usually caused by a lack of storage, overutilized CPUs, insufficient memory, or a combination of these factors. The effects of insufficient capacity can include:

  • Poor performance
  • Red cluster status
  • Blocked writes or index creation
  • Failed snapshot creation
  • Data loss

Identify the issue

To identify the source of these issues, check the Amazon CloudWatch metrics for your Amazon ES domain, especially the following metrics:

  • FreeStorageSpace: This metric measures the number of free megabytes of storage for the nodes in your cluster. If you have low or no free storage space on your cluster, adding more storage space can help resolve the issue.
  • CPUUtilization: This metric measures the percentage of CPU resources used by all data nodes in the cluster. If this metric is consistently at or above 70 percent, adding more CPU resources can help resolve the issue.
  • JVMMemoryPressure: This metric measures the maximum percentage of the Java heap used for all data nodes in the cluster. If this metric is consistently above 75 percent, adding more memory can help resolve the issue. Additionally, a domain that is low on memory can consume more CPU resources, so increasing the amount of memory available to your domain might also help decrease CPU utilization.
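
The checks above can be sketched as a small Python function. This is only an illustration: the function name `diagnose`, the 1,024 MB storage floor, and the reading of "consistently" as "every sample in the window" are assumptions, not part of the Amazon ES guidance.

```python
# Thresholds for CPU and JVM pressure come from the guidance above.
CPU_THRESHOLD_PERCENT = 70.0
JVM_THRESHOLD_PERCENT = 75.0
# Assumption: a hypothetical floor for "low" free storage; tune it for your domain.
MIN_FREE_STORAGE_MB = 1024.0


def diagnose(samples):
    """Map metric samples (dict of metric name -> list of numbers) to suggested actions."""
    suggestions = []
    free = samples.get("FreeStorageSpace", [])
    cpu = samples.get("CPUUtilization", [])
    jvm = samples.get("JVMMemoryPressure", [])

    # Any sample below the floor means the cluster ran low on storage at some point.
    if free and min(free) < MIN_FREE_STORAGE_MB:
        suggestions.append("add storage")
    # Assumption: "consistently" means every sample is at or above the threshold.
    if cpu and all(value >= CPU_THRESHOLD_PERCENT for value in cpu):
        suggestions.append("add CPU")
    if jvm and all(value > JVM_THRESHOLD_PERCENT for value in jvm):
        suggestions.append("add memory")
    return suggestions


if __name__ == "__main__":
    print(diagnose({
        "FreeStorageSpace": [500.0, 2000.0],
        "CPUUtilization": [72.0, 90.0],
        "JVMMemoryPressure": [60.0, 80.0],
    }))
```

The sample lists would typically come from CloudWatch, for example through the `get_metric_statistics` call on a boto3 `cloudwatch` client with the `AWS/ES` namespace; that retrieval step is left out here so the sketch runs without AWS credentials.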

When you have identified the potential causes of the issues with your Amazon ES domain or cluster, consider one or more of the following strategies:

Reduce the number of resources your application uses

Tune the amount of memory used by filters, caches, and aggregations, and consider using doc values. You might also consider reducing the overall number of shards in your cluster, or simplifying your index mapping. These strategies can reduce the amount of memory and CPU resources required to run your application, and they can potentially eliminate the need to add more resources to your domain or cluster.

Add more storage

If you are using Amazon Elastic Block Store (Amazon EBS), increase the size of the EBS volumes used by your domain or cluster.

If you are not using Amazon EBS, add additional nodes to your cluster configuration.
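
For an EBS-backed domain, the volume size can be changed programmatically as well as in the console. The following is a minimal sketch that assumes the boto3 `es` client and its `update_elasticsearch_domain_config` call; the helper names, the domain name, and the `gp2` default are illustrative assumptions.

```python
def build_ebs_resize_request(domain_name, volume_size_gib, volume_type="gp2"):
    """Build the request arguments for resizing a domain's EBS volumes.

    The helper name and the default volume type are assumptions for illustration.
    """
    if volume_size_gib <= 0:
        raise ValueError("volume_size_gib must be positive")
    return {
        "DomainName": domain_name,
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": volume_type,
            "VolumeSize": volume_size_gib,  # size per data node, in GiB
        },
    }


def apply_domain_update(request):
    """Send the request to Amazon ES. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    return boto3.client("es").update_elasticsearch_domain_config(**request)


if __name__ == "__main__":
    # "my-domain" is a placeholder domain name.
    print(build_ebs_resize_request("my-domain", 100))
```

Keeping the request builder separate from the API call makes it possible to review the change before applying it. The allowed volume sizes depend on the instance type, so check the service limits before choosing a value.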

Add more CPU resources

To add more CPU capacity, switch to a larger instance size in your clusters, or add more nodes to your cluster. Also consider adding dedicated master nodes, if you do not have them already.

Add more memory

To add more memory, consider switching to an instance type with more memory in your clusters, or add more nodes to your cluster.
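
Changing the instance type, adding nodes, and enabling dedicated master nodes all go through the same cluster configuration. The sketch below builds that configuration for the boto3 `es` client's `update_elasticsearch_domain_config` call; the helper name, the instance types shown, and the choice of three dedicated master nodes are assumptions for illustration.

```python
def build_cluster_config_request(domain_name, instance_type, instance_count,
                                 dedicated_master_type=None, dedicated_master_count=3):
    """Build request arguments for changing instance type, node count, and dedicated masters.

    The helper name and defaults are hypothetical; pick values that fit your workload.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    cluster_config = {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "DedicatedMasterEnabled": dedicated_master_type is not None,
    }
    # Only include dedicated master settings when a master instance type was given.
    if dedicated_master_type is not None:
        cluster_config["DedicatedMasterType"] = dedicated_master_type
        cluster_config["DedicatedMasterCount"] = dedicated_master_count
    return {
        "DomainName": domain_name,
        "ElasticsearchClusterConfig": cluster_config,
    }


if __name__ == "__main__":
    # Placeholder values: a memory-optimized data node type with dedicated masters.
    print(build_cluster_config_request(
        "my-domain", "r4.large.elasticsearch", 4,
        dedicated_master_type="m4.large.elasticsearch",
    ))
```

The resulting dictionary can be passed as keyword arguments to `update_elasticsearch_domain_config`. A configuration change like this usually triggers a redeployment of the domain, so it is worth scheduling it for a period of low traffic.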

Configure Amazon CloudWatch alarms

As a preventative measure, consider configuring CloudWatch alarms to notify you when one or more of the preceding metrics are above certain thresholds (for example, if CPU utilization is above 80%).
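
As an illustration, the following sketch builds the arguments for a CPU utilization alarm that could be passed to the `put_metric_alarm` call of a boto3 `cloudwatch` client. The helper name, alarm name, period, number of evaluation periods, and the SNS topic are assumptions; only the 80 percent threshold comes from the example above.

```python
def build_cpu_alarm(domain_name, account_id, sns_topic_arn, threshold_percent=80.0):
    """Build put_metric_alarm arguments for a CPUUtilization alarm on an Amazon ES domain.

    Amazon ES metrics are published in the AWS/ES namespace with the
    DomainName and ClientId (AWS account ID) dimensions.
    """
    return {
        "AlarmName": f"{domain_name}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/ES",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "Statistic": "Average",
        "Period": 300,           # assumption: 5-minute periods
        "EvaluationPeriods": 3,  # assumption: 3 consecutive periods before alarming
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    # The account ID and topic ARN below are placeholders.
    alarm = build_cpu_alarm(
        "my-domain", "123456789012",
        "arn:aws:sns:us-east-1:123456789012:es-alerts",
    )
    print(alarm["AlarmName"], alarm["Threshold"])
```

Similar alarms for FreeStorageSpace and JVMMemoryPressure would change the metric name, threshold, and, for free storage, the comparison operator (for example, `LessThanThreshold`).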

Consider updating your Amazon ES version

Upgrading might require some work, such as preparing for a schema change on your domain. However, it's a best practice to use the current version of Amazon ES whenever possible, because newer versions usually offer significantly better performance. For more information, see Migrating to a Different Elasticsearch Version.

Published: 2016-11-11

Updated: 2017-11-20