Kriti explains how the number of shards might affect an Amazon ES cluster

My Amazon Elasticsearch Service domain is stuck in the Processing state after I made a configuration change. How do I resolve this?

Note: There are many possible causes of this problem, including high CPU utilization and JVM memory pressure due to large query loads, disk problems, node failures, etc. Only one of the possible root causes, too many shards, is discussed here.

When your cluster has a large number of shards, you might experience slow response times, high CPU usage on the master node, snapshot failures, or stuck configuration changes. For more information about calculating the number of shards that you need, see Choosing the Number of Shards.

Check shards

The output of the following operations indicates whether your cluster has a large number of shards, whether any shards are unallocated, and whether the nodes in your cluster are unable to store all of the shards because of high disk usage or high JVM memory pressure. Any of these problems can keep a cluster in the Processing state for a long time, and they can usually be solved by reducing the number of shards in your cluster.

Call the following API operations to determine if shards are migrating, and if so, how quickly. Amazon ES migrates shards during snapshot recoveries, certain domain configuration changes, replication level changes, node failures, and node startup. When you have a large number of shards, migrations are slow and can cause your cluster to become stuck in the Processing state.

$ curl -XGET "<ES_Endpoint>/_cat/recovery?v&active_only"

Wait a few minutes, and then run the above command again. If there is a change in the output, shards are still migrating.
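
To make the comparison less manual, you can count the rows that the recovery call returns on each run. The following is a minimal sketch and not part of Amazon ES: the function name count_active_recoveries and the ES_ENDPOINT variable are assumptions for illustration, and the helper assumes that the header row printed by ?v is present.

```shell
# Hypothetical helper: counts the data rows in `_cat/recovery?v&active_only` output.
# Reads the cat output on stdin and skips the first line (the ?v header row).
count_active_recoveries() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Example usage (ES_ENDPOINT is an assumed variable that holds your domain endpoint):
#   curl -s -XGET "$ES_ENDPOINT/_cat/recovery?v&active_only" | count_active_recoveries
```

A count that falls toward 0 between runs suggests that migrations are finishing. A count that stays the same does not prove that nothing is moving, so also compare the per-shard rows.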

Call the following API operation to see how many shards are allocated to each data node and how much disk space they are using:

$ curl -XGET "<ES_Endpoint>/_cat/allocation?v"
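
If the cluster has many nodes, a small filter can pick out the ones that are running short of disk space. The following sketch assumes that you request only the shards, disk.percent, and node columns by using the h parameter of the _cat API. The function name nodes_over_disk_limit, the ES_ENDPOINT variable, and the default threshold of 80 percent are assumptions for illustration, not Amazon ES values.

```shell
# Hypothetical helper: prints "node shards disk.percent" for nodes above a limit.
# Expects stdin rows of "shards disk.percent node", for example from
#   _cat/allocation?h=shards,disk.percent,node
# Rows without a numeric disk.percent (such as an UNASSIGNED row) are skipped.
nodes_over_disk_limit() {
  awk -v limit="${1:-80}" \
    'NF >= 3 && $2 ~ /^[0-9]+$/ && $2 + 0 > limit + 0 { print $3, $1, $2 }'
}

# Example usage (ES_ENDPOINT is an assumed variable that holds your domain endpoint):
#   curl -s -XGET "$ES_ENDPOINT/_cat/allocation?h=shards,disk.percent,node" \
#     | nodes_over_disk_limit 85
```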

Check the number of shards and indices in the cluster by calling the following _cat API operations:

$ curl -XGET <ES_Endpoint>/_cat/indices
$ curl -XGET <ES_Endpoint>/_cat/shards
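
The shard listing can run to thousands of rows, so it can help to summarize it by shard state. The following sketch assumes that you request the index, shard, prirep, and state columns by using the h parameter. The function name shard_state_counts and the ES_ENDPOINT variable are assumptions for illustration.

```shell
# Hypothetical helper: tallies shards by state (for example STARTED or UNASSIGNED).
# Expects stdin rows of "index shard prirep state", for example from
#   _cat/shards?h=index,shard,prirep,state
shard_state_counts() {
  awk 'NF >= 4 { count[$4]++ } END { for (s in count) print s, count[s] }' | sort
}

# Example usage (ES_ENDPOINT is an assumed variable that holds your domain endpoint):
#   curl -s -XGET "$ES_ENDPOINT/_cat/shards?h=index,shard,prirep,state" | shard_state_counts
```

The sum of the counts is the total number of shards in the cluster, and a nonzero UNASSIGNED count points to shards that no node is currently holding.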

Consider an architecture that minimizes the number of shards in a domain, while still fitting your use case. For more information, see Get Started with Amazon Elasticsearch Service: How Many Shards Do I Need?

Reduce the number of shards in the cluster

To reduce the number of shards in your cluster and reduce overhead, remove indices that you no longer need:

$ curl -XDELETE <ES_Endpoint>/oldindex1,oldindex2

If you have index rotation configured, each rotation creates a new set of shards. By default, Amazon ES creates five primary shards per index and one replica shard for every primary shard, so every new index adds 10 shards to the cluster. Over time, index rotation can therefore lead to an overloaded master node. If your use case allows, change your index rotation strategy and consider using an index template to control future shard growth.
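
To see how quickly rotation adds up, multiply the number of indices that you retain by the shards in each index. The following sketch uses a hypothetical function name, total_shards, and an assumed retention of 90 daily indices. Only the five primary shards and one replica come from the defaults described above.

```shell
# Hypothetical helper: estimates the shard count that index rotation produces.
# total = indices retained * primary shards per index * (1 + replicas per primary)
total_shards() {
  echo $(( $1 * $2 * (1 + $3) ))
}

# 90 retained daily indices (assumed), 5 primary shards, 1 replica per primary
total_shards 90 5 1   # prints 900
```

With the same assumed retention, one primary shard per index gives 180 shards, which is why the shard settings in an index template have such a large effect.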

Shard count can only be specified when you create an index or re-index your data. Determine how many shards you need before indexing your first document.

To specify the number of shards in a new index, run a command similar to the following. For more information, see Introduction to Indexing.

$ curl -XPUT <ES_Endpoint>/index-name -H 'Content-Type: application/json' -d'
{
      "settings": {
            "index": {
                  "number_of_shards": 3,
                  "number_of_replicas": 1
            }
      }
}'

Elasticsearch cannot change the number of primary shards in an existing index. To restructure an existing index to have a different number of shards, complete the following steps:

1.    Define an index template, which specifies the number of shards for all new indices that will be created in the cluster:

$ curl -XPUT <ES_Endpoint>/_template/template_1 -H 'Content-Type: application/json' -d'
{
      "index_patterns": ["*"],
      "settings": {
            "number_of_shards": 3,
            "number_of_replicas": 1     
      }
}'

2.    Re-index by calling the _reindex API operation. This copies your data to a new index that has the number of shards that you specified in your index template. The old index is left in place until you delete it.

$ curl -XPOST <ES_Endpoint>/_reindex -H 'Content-Type: application/json' -d '
{
      "source": {   
            "index": "old_index"
      },
      "dest": {
            "index": "new_index"
      }   
}'

3.    Verify that the new index and the old index have the same number of documents:

$ curl -XGET "<ES_Endpoint>/_cat/indices/old_index,new_index?v"

4.    If the new index and the old index have the same number of documents, run a command similar to the following to delete the old index:

$ curl -XDELETE <ES_Endpoint>/old_index

5.    Retry your configuration change.
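
The document count comparison in step 3 can be scripted so that the old index is deleted only when the counts agree. The following is a minimal sketch, assuming that you request the index and docs.count columns by using the h parameter. The function name doc_counts_match and the ES_ENDPOINT variable are assumptions for illustration.

```shell
# Hypothetical helper: prints "match" only if every row reports the same docs.count.
# Expects stdin rows of "index docs.count", for example from
#   _cat/indices/old_index,new_index?h=index,docs.count
# Prints "mismatch" and returns a nonzero status otherwise (including when
# fewer than two indices are listed).
doc_counts_match() {
  awk 'NF >= 2 { seen[$2] = 1; rows++ }
       END {
         n = 0
         for (c in seen) n++
         if (rows >= 2 && n == 1) { print "match" } else { print "mismatch"; exit 1 }
       }'
}

# Example usage (ES_ENDPOINT is an assumed variable that holds your domain endpoint):
#   curl -s -XGET "$ES_ENDPOINT/_cat/indices/old_index,new_index?h=index,docs.count" \
#     | doc_counts_match && curl -XDELETE "$ES_ENDPOINT/old_index"
```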


Published: 2018-11-06