Increase availability for Amazon OpenSearch Service by deploying in three Availability Zones
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service.
Today, Amazon OpenSearch Service announced support for deploying your domains across three Availability Zones (AZs). This feature is available in all AWS Regions that support at least three Availability Zones. With this new feature, you can spread your master and data nodes across zones to gain better tolerance for Availability Zone failures.
Additionally, the AWS Management Console now provides a streamlined experience that helps you tailor your domain to your use case. You can now specify the kind of deployment you want while creating your Amazon OpenSearch domain and have the service automatically pick the appropriate configurations based on your selection, as shown following.
When you choose Production, Amazon OpenSearch automatically provisions your nodes across multiple Availability Zones and prompts you to configure dedicated master instances, which are essential for providing better availability for production workloads. Similarly, when you choose Development and testing, your domain is configured to use one Availability Zone. You can choose the Custom option if you want to view all the available configurations and select the options that you need for your domain.
Regardless of the options that you pick while creating the domain, you can change the configuration of the domain at any time. Simply choose Configure cluster on the console, and use the wizard to pick from all the available options to fine-tune your domain, as shown following.
Choosing a three-Availability Zone deployment can affect the distribution of both master and data nodes in your cluster. Let’s take a deeper look into each of these.
Master nodes play a key role in cluster stability, and that’s why we recommend using a minimum of three dedicated master nodes for production clusters. In the past, these three master nodes were spread across two Availability Zones, which meant that one Availability Zone contained one master node, and the other contained two. If the Availability Zone with two master nodes fails, only one of the three master nodes remains. Because quorum requires a majority (two of three), the Amazon OpenSearch domain cannot achieve quorum, which blocks the cluster from operating normally.
One of the key benefits of deploying across three Availability Zones is that the master nodes are equally distributed across all the zones, as shown in the following figure. That way, even in the rare event of an Availability Zone disruption, you still have quorum because two master nodes are available.
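The quorum reasoning above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming quorum means a strict majority of the configured dedicated master nodes. The function names are hypothetical and are not part of any Amazon OpenSearch API.

```python
from typing import Sequence


def quorum_size(total_masters: int) -> int:
    """Smallest strict majority of the configured master nodes."""
    return total_masters // 2 + 1


def survives_any_single_az_failure(masters_per_az: Sequence[int]) -> bool:
    """True if losing any one zone still leaves a majority of master nodes.

    masters_per_az holds the number of dedicated master nodes in each zone.
    """
    total = sum(masters_per_az)
    needed = quorum_size(total)
    return all(total - lost >= needed for lost in masters_per_az)


# Three masters over two zones (2 + 1): losing the larger zone breaks quorum.
print(survives_any_single_az_failure([2, 1]))      # False
# Three masters over three zones (1 + 1 + 1): any single zone can fail.
print(survives_any_single_az_failure([1, 1, 1]))   # True
```

With three master nodes the quorum size is two, so the one-per-zone layout is the only one of the two layouts that tolerates the loss of any single zone.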
Amazon OpenSearch automatically deploys master nodes into three Availability Zones when the following conditions are met:
- You pick the 2-AZ or 3-AZ deployment option.
- The Region physically has three Availability Zones available.
- The instance type is available in the respective zones.
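If you create domains from a script rather than the console, the same choices can be expressed as a cluster configuration. The following sketch only builds a dictionary in the shape of the `ClusterConfig` structure used by the Amazon OpenSearch Service configuration API, and it does not call AWS. The helper name, the node count, and the instance type strings are illustrative assumptions, so check them against the current API reference before use.

```python
def three_az_cluster_config(
    data_node_count: int,
    data_instance_type: str = "r6g.large.search",    # example value, an assumption
    master_instance_type: str = "m6g.large.search",  # example value, an assumption
) -> dict:
    """Build a ClusterConfig-style dictionary for a three-AZ domain.

    Hypothetical helper: it requests zone awareness across three
    Availability Zones and three dedicated master nodes.
    """
    return {
        "InstanceType": data_instance_type,
        "InstanceCount": data_node_count,
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": master_instance_type,
        "DedicatedMasterCount": 3,
        "ZoneAwarenessEnabled": True,
        "ZoneAwarenessConfig": {"AvailabilityZoneCount": 3},
    }


config = three_az_cluster_config(data_node_count=6)
print(config["ZoneAwarenessConfig"])
```

A dictionary like this would typically be passed as the `ClusterConfig` argument when creating or updating a domain through an AWS SDK. Whether the master nodes actually land in three zones still depends on the conditions listed above.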
Deploying your data nodes into three Availability Zones can also improve the availability of your domain. When you create a new domain, consider the following factors:
- Number of Availability Zones
- Number of replicas
- Number of nodes
With the right combination of these three factors, you can achieve better levels of failure tolerance and availability. Let’s look at a few recommended deployment options and their corresponding benefits. For the sake of simplicity, let’s assume that you have one index with three primary shards.
Before we discuss the benefits of three Availability Zones, let’s look at the current state of the world. Until now, we’ve been recommending that customers deploy across two Availability Zones and enable one replica (shown following).
With zone awareness enabled, Amazon OpenSearch ensures that each primary shard and its corresponding replica are allocated in different Availability Zones. This pattern provides better failure tolerance and availability because even if one zone is not available, you still have a complete copy of the data in the other zone. However, you will only have 50 percent of the instances available to process your workload.
Recommended for higher availability requirements: three Availability Zones, one replica
By deploying the data nodes across three Availability Zones with one replica enabled, your shards are distributed across three Availability Zones, leading to increased tolerance for Availability Zone failures.
In the event of a single Availability Zone failure, you lose about 33 percent of your nodes. The remaining two-thirds of your data nodes then have to process the same number of read and write requests to your cluster. To prevent the domain from becoming overloaded because of the reduced capacity, Amazon OpenSearch automatically adds more nodes to replace the lost ones.
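To compare the two-zone and three-zone cases side by side, the capacity figures can be computed directly. This is a minimal sketch that assumes data nodes are spread evenly across zones and that load is uniform across nodes. The function names are hypothetical.

```python
from fractions import Fraction


def remaining_capacity(az_count: int) -> Fraction:
    """Fraction of data nodes still available after one zone fails."""
    return Fraction(az_count - 1, az_count)


def load_per_surviving_node(az_count: int) -> Fraction:
    """Relative load on each surviving node, where 1 is the normal load."""
    return Fraction(az_count, az_count - 1)


for zones in (2, 3):
    print(
        f"{zones} zones: {float(remaining_capacity(zones)):.0%} of nodes remain, "
        f"each handling {float(load_per_surviving_node(zones)):.2f}x its normal load"
    )
```

Under these assumptions, a two-zone domain keeps half of its nodes and each survivor carries twice its usual load, while a three-zone domain keeps two-thirds of its nodes and each survivor carries one and a half times its usual load until replacement nodes are added.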
Get more for less with three-Availability Zone deployments
Let’s assume that you require 100 percent data redundancy even in the case of a single zone failure. Let’s look at how you can achieve this with an index that has three primary shards.
To achieve this with two zones, you need three replicas for each shard. This leads to a total of 12 shards.
To achieve this with three zones, you need only two replicas for each shard, reducing the total number of shards to just nine.
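The replica and shard counts above follow from simple arithmetic. The sketch below assumes that "100 percent data redundancy" means at least two copies of every shard must survive a zone failure, and that shard copies are spread evenly across zones. Both are interpretations for illustration, and the function names are hypothetical.

```python
import math


def replicas_needed(zones: int, copies_after_failure: int = 2) -> int:
    """Replicas per primary so that enough copies survive losing one zone."""
    # The surviving (zones - 1) zones must hold the required copies between them.
    copies_per_zone = math.ceil(copies_after_failure / (zones - 1))
    total_copies = zones * copies_per_zone
    return total_copies - 1  # one of the copies is the primary itself


def total_shards(primaries: int, replicas: int) -> int:
    """Primary shards plus all of their replicas."""
    return primaries * (1 + replicas)


for zones in (2, 3):
    replicas = replicas_needed(zones)
    print(f"{zones} zones: {replicas} replicas, {total_shards(3, replicas)} shards")
```

For three primary shards this gives three replicas and 12 shards with two zones, and two replicas and nine shards with three zones, which matches the figures in the text.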
This new deployment feature in Amazon OpenSearch provides higher availability and failure tolerance, and it gives you the flexibility to select a deployment type that suits your use case. We recommend that anyone who is running production workloads consider deploying across three Availability Zones.
About the Author
Anoop Sunke is a senior solutions architect focused on Big Data Solutions at AWS. As an SA, his primary job is to help customers solve tough problems with the right technologies. In his spare time, Anoop likes to travel and explore different cuisines around the globe.