How do I rebalance the uneven shard distribution in my Amazon Elasticsearch Service cluster?

Last updated: 2020-08-07

The disk space in my Amazon Elasticsearch Service (Amazon ES) domain is unevenly distributed across the nodes. As a result, the disk usage is heavily skewed. How do I rebalance my node distribution?

Short description

Disk usage can be heavily skewed because of the following reasons:

  • Uneven shard sizes in a cluster. Although Amazon ES evenly distributes the number of shards across nodes, varying shard sizes can require different amounts of disk space.
  • Available disk space on a node. (For more information, see Disk-based shard allocation on the Elasticsearch website.)
    Incorrect shard allocation strategy. (For more information, see Demistifying Elasticsearch shard allocation.)

To rebalance the shard allocation in your Elasticsearch cluster, consider the following approaches:

  • Check the shard allocation, shard sizes, and index sharding strategy.
  • Be sure that shards are of equal size across the indices.
  • Keep shard sizes between 10 GB to 50 GB for better performance.
  • Add more data nodes to your Elasticsearch cluster.
  • Update your sharding strategy.
  • Delete the old or unused indices to free up disk space.

Resolution

Check the shard allocation, shard sizes, and index sharding strategy

To check the number of shards allocated to each node and the amount of disk space used on each node, use the following API:

$ curl -XGET ES_Endpoint/_cat/allocation?v

To check the shards allocated to each node and the size of each shard, use the following API:

$ curl -XGET ES_Endpoint/_cat/shards?v

Note: This API shows that the size of shards can vary for different indices.

The uneven sharding strategy for indices can cause data skewness, where shards of bigger indices reside on only a few nodes. Check the sharding strategy for indices using the following API:

$ curl -XGET ES_Endpoint/_cat/indices?v

Be sure that shards are of equal size across the indices

If the index size varies significantly, use the rollover index API to create a new index when certain index sizes are reached. Or, you can use the Index State Management (ISM) to create a new index for Amazon ES versions 7.1 and later. For more information about rolling an alias using ISM, see rollover on the Elasticsearch website.

Keep shard sizes between 10 GB to 50 GB for better performance

If you have a large class of instances, use the Petabyte scale for Amazon Elasticsearch Service to determine shard sizes. For example, an Amazon ES domain with several i3.16xlarge.elasticsearch instances can support shard sizes of up to 100 GB because there are more resources available. For more information about sharding strategy, see Choosing the number of shards.

Add more data nodes to your Elasticsearch cluster

If your Elasticsearch cluster has reached high disk usage levels, then add more data nodes to your cluster. The addition of data nodes also adds more resources to improve cluster performance.

Note: Amazon ES doesn't automatically rebalance the cluster if there's a lack of available storage space. As a result, if a data node runs out of free storage space, the cluster blocks any writes. For more information about disk space management, see How do I add storage space to an Amazon Elasticsearch Service (Amazon ES) domain?

Update your sharding strategy

By default, Amazon ES has a sharding strategy of 5:1, where each index is divided into five primary shards. Within each index, each primary shard also has its own replica. Amazon ES automatically assigns primary shards and replica shards to separate data nodes, making sure that there's a backup in case of failure.

To modify Amazon ES default behavior, design your indices so that shards are distributed equally by size:

  • For existing indices, use the reindex API to change the number of primary shards. The _reindex API can be used to merge smaller indices into a bigger index, or it can be used to split up the bigger index. When the bigger index is split into more primary shards, the shard sizes are decreased.
  • For new indices, use the template API to define the number of primary and replica shards.

Then, update the indices settings for your shards. For more information, see Update indices settings on the Elasticsearch websites.

Delete the old or unused indices to free up disk space

Amazon ES versions 7.1 and later support Index State Management. With ISM, you can define custom management policies so that old or unused indices are deleted after an established duration.