Why is my Amazon Elasticsearch Service (Amazon ES) domain stuck in the "Processing" state?

Last updated: 2020-08-21

My Amazon Elasticsearch Service (Amazon ES) domain is stuck in the "Processing" state after I made a configuration change. Why is my domain stuck, and what can I do to prevent it from happening again?

Short description

Your Amazon ES domain might be stuck in the "Processing" state for several reasons:

  • Failed cluster nodes
  • Uneven distribution of nodes
  • Shard reallocation (using the index.routing.allocation.require parameter)
  • High CPU utilization
  • JVM memory pressure (from large query loads, disk problems, and node failures)
  • A stuck snapshot

It's a best practice to check the number of shards and shard assignments before performing additional troubleshooting on your Amazon ES domain.

For more information about shard reallocation and updating its required parameters, see Index-level shard allocation filtering on the Elasticsearch website.

Resolution

Uneven distribution of nodes

Amazon ES migrates shards during snapshot recoveries, domain configuration changes, replication level changes, node failures, and node startup. If there are too many shards on the cluster during a blue/green deployment, shard migration can slow down. High CPU utilization and high JVM memory pressure can also slow down the migration, causing the cluster to get stuck in the "Processing" state. As a result, shards can become "Unassigned" when they can't be distributed evenly across the available nodes. For more information about blue/green deployment, see Configuration changes.

If your Elasticsearch cluster gets stuck in the "Processing" state, consider reducing the number of shards in your cluster. For more information about designating the appropriate number of shards, see Choosing the number of shards.

To determine whether shards are migrating, run the following command:

$ curl -XGET "ES_Endpoint/_cat/recovery?v&active_only"

Note: Wait a few minutes before running the command again. If there's a change in the output, then the shards are still being migrated.
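
As a rough sketch, the recovery check above can be scripted by counting the rows that the command returns. The helper name count_active_recoveries is hypothetical, and the sketch assumes that the first line of output is the header row added by the v parameter:

```shell
# Count active shard recoveries from `_cat/recovery?v&active_only` output
# read on stdin. Assumes line 1 is the header row that the `v` parameter adds.
count_active_recoveries() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Hypothetical usage (ES_Endpoint is a placeholder for your domain endpoint):
#   curl -s -XGET "ES_Endpoint/_cat/recovery?v&active_only" | count_active_recoveries
```

A count that stays above zero but changes between runs suggests that shards are still migrating.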

Then, check the number of shards allocated to each node and the amount of disk space in use:

$ curl -XGET "ES_Endpoint/_cat/allocation?v"

Note: The output of this command can also help you determine whether a cluster node failed because of high disk usage or JVM memory pressure.
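
The allocation output can be filtered to surface the nodes that deserve a closer look. The following is a minimal sketch, not a definitive implementation: flag_full_nodes is a hypothetical helper, the default threshold of 80 percent is an arbitrary illustrative value, and the sketch assumes that the header row contains the shards, disk.percent, and node columns:

```shell
# Flag nodes at or above a disk-usage threshold, reading `_cat/allocation?v`
# output on stdin. Column positions are looked up from the header row.
# $1 is the threshold in percent (80 is an assumed, illustrative default).
flag_full_nodes() {
  awk -v t="${1:-80}" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Unassigned shards appear as a row that has only a count and "UNASSIGNED".
    $NF == "UNASSIGNED" { print "unassigned shards: " $1; next }
    $(col["disk.percent"]) + 0 >= t + 0 {
      print $(col["node"]) " shards=" $(col["shards"]) " disk.percent=" $(col["disk.percent"])
    }'
}

# Hypothetical usage (ES_Endpoint is a placeholder for your domain endpoint):
#   curl -s -XGET "ES_Endpoint/_cat/allocation?v" | flag_full_nodes 85
```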

Unassigned shards

To check the number of shards and indices in your cluster, run the following commands:

$ curl -XGET ES_Endpoint/_cat/indices
$ curl -XGET ES_Endpoint/_cat/shards
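
To pick out only the unassigned shards, the _cat/shards output can be filtered on its state column. This sketch assumes the default column order (index, shard, prirep, state, ...), and list_unassigned_shards is a hypothetical helper name:

```shell
# Print index, shard number, and primary/replica flag for every shard whose
# state (4th column of default `_cat/shards` output) is UNASSIGNED.
list_unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Hypothetical usage (ES_Endpoint is a placeholder for your domain endpoint):
#   curl -s -XGET ES_Endpoint/_cat/shards | list_unassigned_shards
```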

After you identify the unassigned shards, minimize the number of unnecessary shards in your domain. For more information about shard calculations, see Get started with Amazon Elasticsearch Service: How many shards do I need?

Reduce the number of shards in the cluster

To reduce the number of shards in your cluster and the overhead that they add, delete old or outdated indices:

$ curl -XDELETE ES_Endpoint/oldindex1,oldindex2

If you have an index rotation configured, each rotation creates a new set of shards. By default, Amazon ES creates five primary shards per index and one replica shard for every primary shard. Over time, this index rotation can lead to an overloaded leader node. To avoid the overload and to control future shard growth, consider changing your index rotation strategy or using an index template.
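
To see how quickly rotation adds up, multiply the primary shard count by one plus the replica count, and then by the number of indices that you keep. The following sketch uses the defaults from the paragraph above; the 30 retained indices (for example, a daily rotation kept for 30 days) is an assumed figure for illustration, and estimate_shards is a hypothetical helper:

```shell
# Estimate the total shard count produced by a rotating index pattern:
#   total = primaries * (1 + replicas) * retained_indices
# Arguments: $1 = primary shards per index, $2 = replicas per primary,
#            $3 = number of indices retained (assumed value, not from Amazon ES).
estimate_shards() {
  echo $(( $1 * (1 + $2) * $3 ))
}

estimate_shards 5 1 30   # prints 300
```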

Note: You can specify shard count only when you create a new index or re-index your existing data. Before indexing your document, choose the number of shards.

To specify the number of shards in a new index, run the following command:

$ curl -XPUT ES_Endpoint/index-name -H 'Content-Type: application/json' -d'
{
      "settings": {
            "index": {
                  "number_of_shards": 3,
                  "number_of_replicas": 1
            }
      }
}'

Note: Amazon ES can't change the number of primary shards in an existing index. For more information about indexing data in Amazon ES, see Introduction to indexing.

To restructure an existing index, perform the following tasks:

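1.    Create and define a new index template. The following syntax specifies the number of shards for new indices created in the cluster. Because the index pattern is "*", the template applies to every index that is created afterward: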

$ curl -XPUT ES_Endpoint/_template/template_1 -H 'Content-Type: application/json' -d'
{
      "index_patterns": ["*"],
      "settings": {
            "number_of_shards": 3,
            "number_of_replicas": 1     
      }
}'

2.    Re-index by calling the _reindex API operation (Amazon ES versions 5.1 and later). The following syntax copies your data into a new index, which is created with the number of shards specified in your index template:

$ curl -XPOST ES_Endpoint/_reindex -H 'Content-Type: application/json' -d '
{
      "source": {   
            "index": "old_index"
      },
      "dest": {
            "index": "new_index"
      }   
}'

Important: If your access policy includes AWS Identity and Access Management (IAM) users or roles, you must sign HTTP requests to the Amazon Elasticsearch Service APIs.

3.    Verify that the new and old indices have the same number of documents:

$ curl -XGET "ES_Endpoint/_cat/indices/old_index,new_index?v"
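
The comparison can also be scripted so that the old index is deleted only when the counts agree. This is a minimal sketch under stated assumptions: doc_counts_match is a hypothetical helper, and it expects the header row (from the v parameter) followed by exactly two index rows:

```shell
# Succeed (exit status 0) only when the two indices listed in
# `_cat/indices/old_index,new_index?v` output on stdin report the same
# docs.count. The column position is looked up from the header row.
doc_counts_match() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "docs.count") c = i; next }
    NF > 0  { counts[++n] = $c }
    END     { exit !(n == 2 && counts[1] == counts[2]) }'
}

# Hypothetical usage (ES_Endpoint is a placeholder for your domain endpoint):
#   curl -s -XGET "ES_Endpoint/_cat/indices/old_index,new_index?v" | doc_counts_match \
#     && echo "document counts match"
```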

4.    When the new and old indices show the same number of documents, you can then delete the old index:

$ curl -XDELETE ES_Endpoint/old_index

5.    Re-run your configuration update. For more information about configuration changes, see About configuration changes.