How do I rebalance the uneven shard distribution in my Amazon OpenSearch Service cluster?

Last updated: 2021-09-23

The disk usage in my Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) domain is heavily skewed across the nodes. How do I rebalance the shard distribution?

Short description

Disk usage can be heavily skewed for the following reasons:

  • Uneven shard sizes in a cluster. Although OpenSearch Service evenly distributes the number of shards across nodes, varying shard sizes can require different amounts of disk space.
  • Available disk space on a node. (For more information, see Disk-based shard allocation on the Elasticsearch website.)
  • Incorrect shard allocation strategy. (For more information, see Demystifying OpenSearch Service shard allocation.)

To rebalance the shard allocation in your OpenSearch Service cluster, consider the following approaches:

  • Check the shard allocation, shard sizes, and index sharding strategy.
  • Be sure that shards are of equal size across the indices.
  • Keep shard sizes between 10 GB and 50 GB for better performance.
  • Add more data nodes to your OpenSearch Service cluster.
  • Update your sharding strategy.
  • Delete the old or unused indices to free up disk space.

Resolution

Check the shard allocation, shard sizes, and index sharding strategy

To check the number of shards allocated to each node and the amount of disk space used on each node, use the following API:

$ curl -XGET "ES_Endpoint/_cat/allocation?v"
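To put a number on the skew, the disk.percent column can be compared across nodes. The following is a minimal sketch, not an official tool: the disk_spread function name is hypothetical, and it assumes that the output was requested with h=disk.percent,node so that each row holds a percentage followed by a node name.

```shell
# Hypothetical helper: reads "disk.percent node" rows on stdin and reports
# the least-full and most-full data nodes. Rows without a numeric percentage
# (for example, the UNASSIGNED row) are skipped.
disk_spread() {
  awk '$1 ~ /^[0-9]+$/ {
         pct = $1 + 0
         if (n == 0 || pct < min) { min = pct; min_node = $2 }
         if (n == 0 || pct > max) { max = pct; max_node = $2 }
         n++
       }
       END {
         if (n > 0) {
           printf "min=%d%% (%s) max=%d%% (%s) spread=%d\n",
                  min, min_node, max, max_node, max - min
         }
       }'
}

# Example with sample rows (not real cluster output):
printf '80 nodeA\n40 nodeB\n65 nodeC\n' | disk_spread
# prints: min=40% (nodeB) max=80% (nodeA) spread=40
```

Against a live domain, the same function could be fed with: curl -s -XGET "ES_Endpoint/_cat/allocation?h=disk.percent,node" | disk_spread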

To check the shards allocated to each node and the size of each shard, use the following API:

$ curl -XGET "ES_Endpoint/_cat/shards?v"

Note: This API shows that the size of shards can vary for different indices.
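Because nodes can hold the same number of shards but very different amounts of data, it can help to total the shard sizes per node. The following sketch assumes that the output was requested with h=index,shard,prirep,store,node and bytes=mb. The store_per_node function name is hypothetical.

```shell
# Hypothetical helper: reads "index shard prirep store node" rows on stdin
# (store in MB) and prints the shard count and total store size per node.
# Rows with a different field count (unassigned or relocating shards) are skipped.
store_per_node() {
  awk 'NF == 5 && $4 ~ /^[0-9.]+$/ { total[$5] += $4; count[$5]++ }
       END {
         for (node in total) {
           printf "%s shards=%d store_mb=%d\n", node, count[node], total[node]
         }
       }' | sort
}

# Example with sample rows (not real cluster output):
printf 'logs-1 0 p 40000 nodeA\nlogs-2 0 p 500 nodeB\nlogs-3 0 p 30000 nodeA\nlogs-4 0 p 700 nodeB\n' | store_per_node
# prints:
# nodeA shards=2 store_mb=70000
# nodeB shards=2 store_mb=1200
```

Against a live domain: curl -s -XGET "ES_Endpoint/_cat/shards?h=index,shard,prirep,store,node&bytes=mb" | store_per_node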

An uneven sharding strategy across indices can cause data skew, where the shards of larger indices reside on only a few nodes. To check the sharding strategy for your indices, use the following API:

$ curl -XGET "ES_Endpoint/_cat/indices?v"
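The pri and pri.store.size columns of this API give a rough average primary shard size for each index, which makes oversized or undersized shards easier to spot. The following sketch assumes that the output was requested with h=index,pri,pri.store.size and bytes=mb. The avg_primary_shard_size function name is hypothetical.

```shell
# Hypothetical helper: reads "index pri pri.store.size" rows on stdin (size in MB)
# and prints the average primary shard size per index, truncated to whole MB.
# Rows with missing columns (for example, closed indices) are skipped.
avg_primary_shard_size() {
  awk 'NF == 3 && $2 ~ /^[0-9]+$/ && $2 > 0 {
         printf "%s pri=%d avg_pri_shard_mb=%d\n", $1, $2, $3 / $2
       }'
}

# Example with sample rows (not real cluster output):
printf 'logs-big 5 250000\nlogs-small 5 500\n' | avg_primary_shard_size
# prints:
# logs-big pri=5 avg_pri_shard_mb=50000
# logs-small pri=5 avg_pri_shard_mb=100
```

Against a live domain: curl -s -XGET "ES_Endpoint/_cat/indices?h=index,pri,pri.store.size&bytes=mb" | avg_primary_shard_size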

Be sure that shards are of equal size across the indices

If index sizes vary significantly, use the rollover index API to create a new index when the current index reaches a certain size. Or, for OpenSearch Service versions 7.1 and later, use Index State Management (ISM) to roll over to a new index. For more information about rolling an alias using ISM, see rollover on the Open Distro website.
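As an illustration of a size-based rollover request, the following sketch builds a body with the max_size condition, which is evaluated against the total size of the index's primary shards. The function names, the ES_ENDPOINT variable, and the logs-write alias in the usage note are hypothetical. The sketch also assumes that the alias already points to a write index whose name ends in a number (for example, logs-000001).

```shell
ES_ENDPOINT="${ES_ENDPOINT:-ES_Endpoint}"  # placeholder; set to your domain endpoint

# Builds the request body for a size-based rollover.
rollover_body() {  # usage: rollover_body <max-size, for example 50gb>
  printf '{"conditions":{"max_size":"%s"}}\n' "$1"
}

# Sends the rollover request for a write alias. The index rolls over only
# if the condition is already met when the request is made.
rollover() {  # usage: rollover <write-alias> <max-size>
  curl -s -XPOST "$ES_ENDPOINT/$1/_rollover" \
       -H 'Content-Type: application/json' \
       -d "$(rollover_body "$2")"
}
```

For example, rollover logs-write 50gb creates a new index behind the alias only when the primary shards of the current write index total at least 50 GB. Because the condition is checked at request time, the call is typically run on a schedule.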

Keep shard sizes between 10 GB and 50 GB for better performance

If you use large instance types, see Petabyte scale for Amazon OpenSearch Service to determine shard sizes. For example, an OpenSearch Service domain with several i3.16xlarge.search instances can support shard sizes of up to 100 GB because each node has more resources available. For more information about sharding strategy, see Choosing the number of shards.
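A common rule of thumb for estimating the primary shard count is (source data + room to grow) x (1 + indexing overhead) / desired shard size. The following sketch applies that arithmetic and rounds up. The 10% indexing overhead is an assumption, and the primary_shard_count function name is hypothetical. Adjust both inputs to your own workload.

```shell
# Hypothetical helper: estimates the number of primary shards for an index.
# Assumes a 10% indexing overhead on top of the source data.
primary_shard_count() {  # usage: primary_shard_count <source_gb> <growth_gb> <target_shard_gb>
  awk -v src="$1" -v grow="$2" -v target="$3" 'BEGIN {
    n = (src + grow) * 1.1 / target
    c = int(n)
    if (c < n) { c++ }   # round up to a whole shard
    print c
  }'
}

# 66 GB of data today, 198 GB of expected growth, 30 GB target shard size:
primary_shard_count 66 198 30
# prints: 10
```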

Add more data nodes to your OpenSearch Service cluster

If your OpenSearch Service cluster has reached high disk usage levels, then add more data nodes to your cluster. The addition of data nodes also adds more resources to improve cluster performance.

Note: OpenSearch Service doesn't automatically rebalance the cluster if there's a lack of available storage space. As a result, if a data node runs out of free storage space, the cluster blocks any writes. For more information about disk space management, see How do I add storage space to an Amazon OpenSearch Service domain?

Update your sharding strategy

By default, OpenSearch Service has a sharding strategy of 5:1, where each index is divided into five primary shards. Within each index, each primary shard also has its own replica. OpenSearch Service automatically assigns primary shards and replica shards to separate data nodes, making sure that there's a backup in case of failure.

To change this default behavior, design your indices so that shards are distributed equally by size:

  • For existing indices, use the _reindex API to change the number of primary shards. You can use the _reindex API to merge smaller indices into a bigger index, or to split a bigger index into several indices. When a bigger index is reindexed into more primary shards, each shard becomes smaller.
  • For new indices, use the template API to define the number of primary and replica shards.
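The two approaches above can be sketched as follows. This is an illustration rather than a prescribed procedure: the function names and the ES_ENDPOINT variable are hypothetical, and the template call assumes the legacy _template API with the index_patterns field.

```shell
ES_ENDPOINT="${ES_ENDPOINT:-ES_Endpoint}"  # placeholder; set to your domain endpoint

# Body for an index template that sets shard counts for matching new indices.
template_body() {  # usage: template_body <index-pattern> <primaries> <replicas>
  printf '{"index_patterns":["%s"],"settings":{"number_of_shards":%d,"number_of_replicas":%d}}\n' \
         "$1" "$2" "$3"
}

# Body for a reindex request that copies all documents from one index to another.
reindex_body() {  # usage: reindex_body <source-index> <dest-index>
  printf '{"source":{"index":"%s"},"dest":{"index":"%s"}}\n' "$1" "$2"
}

put_template() {  # usage: put_template <template-name> <index-pattern> <primaries> <replicas>
  curl -s -XPUT "$ES_ENDPOINT/_template/$1" \
       -H 'Content-Type: application/json' \
       -d "$(template_body "$2" "$3" "$4")"
}

reindex() {  # usage: reindex <source-index> <dest-index>
  curl -s -XPOST "$ES_ENDPOINT/_reindex" \
       -H 'Content-Type: application/json' \
       -d "$(reindex_body "$1" "$2")"
}
```

When reindexing, create the destination index with the desired number of primary shards first (directly or through a matching template). Otherwise, the destination index is created with default settings.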

Then, update the indices settings for your shards. For more information, see Update indices settings on the Elasticsearch website.
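As a small example of a settings update, the following sketch changes the replica count of an existing index. The function names and the ES_ENDPOINT variable are hypothetical. Note that number_of_replicas can be changed on a live index, but number_of_shards is fixed when the index is created, so changing the primary shard count requires a new index.

```shell
ES_ENDPOINT="${ES_ENDPOINT:-ES_Endpoint}"  # placeholder; set to your domain endpoint

# Body for an index settings update that changes the replica count.
replicas_body() {  # usage: replicas_body <replica-count>
  printf '{"index":{"number_of_replicas":%d}}\n' "$1"
}

set_replicas() {  # usage: set_replicas <index-name> <replica-count>
  curl -s -XPUT "$ES_ENDPOINT/$1/_settings" \
       -H 'Content-Type: application/json' \
       -d "$(replicas_body "$2")"
}
```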

Delete the old or unused indices to free up disk space

OpenSearch Service versions 7.1 and later support Index State Management. With ISM, you can define custom management policies so that old or unused indices are deleted after an established duration.
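For illustration, an ISM policy that deletes an index after a set age could look like the following sketch. The state names, function names, and the ES_ENDPOINT and ISM_PATH variables are hypothetical. The sketch assumes the _plugins/_ism path used by OpenSearch. Domains that run Elasticsearch use _opendistro/_ism instead.

```shell
ES_ENDPOINT="${ES_ENDPOINT:-ES_Endpoint}"  # placeholder; set to your domain endpoint
ISM_PATH="${ISM_PATH:-_plugins/_ism}"      # use _opendistro/_ism on Elasticsearch domains

# Prints a policy with two states: indices start in "hot" and move to
# "delete" once they are older than the given age.
delete_policy_body() {  # usage: delete_policy_body <min-index-age, for example 30d>
  cat <<EOF
{
  "policy": {
    "description": "Delete indices older than $1",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "$1" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ]
  }
}
EOF
}

put_delete_policy() {  # usage: put_delete_policy <policy-id> <min-index-age>
  curl -s -XPUT "$ES_ENDPOINT/$ISM_PATH/policies/$1" \
       -H 'Content-Type: application/json' \
       -d "$(delete_policy_body "$2")"
}
```

A policy takes effect only on the indices that it is attached to, so the indices to clean up must have the policy applied after it is created.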