AWS Big Data Blog

Improve your Amazon OpenSearch Service performance with OpenSearch Optimized Instances

Amazon OpenSearch Service has introduced OpenSearch Optimized Instances (OR1), which deliver a price-performance improvement over existing instances. The newly introduced OR1 instances are ideally suited to indexing-heavy use cases such as log analytics and observability workloads.

OR1 instances use both a local store and a remote store. Local storage uses Amazon Elastic Block Store (Amazon EBS) gp3 or io1 volumes, and remote storage uses Amazon Simple Storage Service (Amazon S3). For more details about OR1 instances, refer to Amazon OpenSearch Service Under the Hood: OpenSearch Optimized Instances (OR1).

In this post, we conduct experiments using OpenSearch Benchmark to demonstrate how the OR1 instance family improves indexing throughput and overall domain performance.

Getting started with OpenSearch Benchmark

OpenSearch Benchmark, a tool provided by the OpenSearch Project, comprehensively gathers performance metrics from OpenSearch clusters, including indexing throughput and search latency. Whether you’re tracking overall cluster performance, informing upgrade decisions, or assessing the impact of workflow changes, this utility proves invaluable.

In this post, we compare the performance of two clusters: one powered by memory-optimized instances and the other by OR1 instances. The dataset comprises HTTP server logs from the 1998 World Cup website. With the OpenSearch Benchmark tool, we conduct experiments to assess various performance metrics, such as indexing throughput, search latency, and overall cluster efficiency. Our aim is to determine the most suitable configuration for our specific workload requirements.

You can install OpenSearch Benchmark directly on a host running Linux or macOS, or you can run OpenSearch Benchmark in a Docker container on any compatible host.
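As a minimal sketch, assuming Python 3 and pip (or Docker) are already available on your host, the installation looks like the following:

```bash
# Option 1: install OpenSearch Benchmark as a Python package
python3 -m pip install opensearch-benchmark
opensearch-benchmark --help

# Option 2: pull the official Docker image and run the CLI from the container
docker pull opensearchproject/opensearch-benchmark:latest
docker run --rm opensearchproject/opensearch-benchmark:latest --help
```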

OpenSearch Benchmark includes a set of workloads that you can use to benchmark your cluster's performance. Workloads contain descriptions of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains indexes, data files, and operations that are invoked when the workload runs.
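To see which workloads ship with the tool before choosing one, you can list them; a quick sketch (the output format varies by version):

```bash
# List the benchmark workloads available from the default workload repository
opensearch-benchmark list workloads
```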

When assessing your cluster’s performance, it is recommended to use a workload similar to your cluster’s use cases, which can save you time and effort. Consider the following criteria to determine the best workload for benchmarking your cluster:

  • Use case – Selecting a workload that mirrors your cluster’s real-world use case is essential for accurate benchmarking. By simulating heavy search or indexing tasks typical for your cluster, you can pinpoint performance issues and optimize settings effectively. This approach makes sure benchmarking results closely match actual performance expectations, leading to more reliable optimization decisions tailored to your specific workload needs.
  • Data – Use a data structure similar to that of your production workloads. OpenSearch Benchmark provides example documents within each workload so you can review the mapping and compare it with your own data mapping and structure. Each benchmark workload is composed of a set of directories and files that you can examine to compare data types and index mappings (see the sketch after this list).
  • Query types – Understanding your query pattern is crucial for detecting the most frequent search query types within your cluster. Employing a similar query pattern for your benchmarking experiments is essential.
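For example, here is a hedged sketch of inspecting the http_logs workload definition from the public workloads repository; file names such as index.json and workload.json reflect the repository's current layout and may change between versions:

```bash
# Clone the public workloads repository and review the http_logs workload definition
git clone https://github.com/opensearch-project/opensearch-benchmark-workloads.git
cat opensearch-benchmark-workloads/http_logs/index.json       # index settings and field mappings
less opensearch-benchmark-workloads/http_logs/workload.json   # corpora, operations, and test procedures
```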

Solution overview

The following diagram illustrates how OpenSearch Benchmark connects to your OpenSearch Service domain to run workload benchmarks.

[Diagram: Scope of solution]

The workflow comprises the following steps:

  1. The first step involves running OpenSearch Benchmark with a specific workload from the workloads repository. The invocation collects data about the performance of your OpenSearch cluster according to the selected workload.
  2. OpenSearch Benchmark ingests the workload dataset into your OpenSearch Service domain.
  3. OpenSearch Benchmark runs a set of predefined test procedures to capture OpenSearch Service performance metrics.
  4. When the workload is complete, OpenSearch Benchmark outputs all related metrics to measure the workload performance. Metric records are by default stored in memory, or you can set up an OpenSearch Service domain to store the generated metrics and compare multiple workload executions.
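For step 4, the metrics datastore is configured in the OpenSearch Benchmark configuration file. The following is a sketch only: the section and key names follow the pattern documented for the tool's metrics store, and the endpoint and credentials are placeholders, so verify both against the documentation for your installed version:

```bash
# Point OpenSearch Benchmark at an OpenSearch domain so that metric records
# from multiple workload executions are stored and can be compared later.
cat >> ~/.benchmark/benchmark.ini <<'EOF'
[results_publishing]
datastore.type = opensearch
datastore.host = <metrics-domain-endpoint>
datastore.port = 443
datastore.secure = true
datastore.user = <username>
datastore.password = <password>
EOF
```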

In this post, we used the http_logs workload to conduct performance benchmarking. The dataset comprises 247 million documents designed for ingestion and offers a set of sample queries for benchmarking. Follow the steps outlined in the OpenSearch Benchmark User Guide to deploy OpenSearch Benchmark and run the http_logs workload.
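As a reference, the invocation looks roughly like the following sketch; the domain endpoint and credentials are placeholders, and the exact client options depend on how access control is configured on your domain:

```bash
# Run the http_logs workload against an existing OpenSearch Service domain.
# --pipeline=benchmark-only tells OpenSearch Benchmark not to provision a cluster itself.
opensearch-benchmark execute-test \
  --workload=http_logs \
  --pipeline=benchmark-only \
  --target-hosts="https://<your-domain-endpoint>:443" \
  --client-options="basic_auth_user:<master-user>,basic_auth_password:<master-password>" \
  --test-mode   # remove --test-mode to ingest the full 247 million document corpus
```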

Prerequisites

For this walkthrough, you need an OpenSearch Service domain to benchmark and a host where you can install and run OpenSearch Benchmark.

In this post, we deployed OpenSearch Benchmark on an AWS Cloud9 host running Amazon Linux 2 on an m6i.2xlarge instance with 8 vCPUs, 32 GiB of memory, and 512 GiB of storage.

Performance analysis using the OR1 instance type in OpenSearch Service

In this post, we conducted a performance comparison between two different configurations of OpenSearch Service:

  • Configuration 1 – Cluster manager nodes and three data nodes of memory-optimized r6g.large instances
  • Configuration 2 – Cluster manager nodes and three data nodes of or1.large instances

In both configurations, we use the same number and type of cluster manager nodes: three c6g.xlarge.

You can set up different configurations with the supported instance types in OpenSearch Service to run performance benchmarks.
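For reference, here is a minimal sketch of creating the OR1-based configuration with the AWS CLI. The domain name is a placeholder, and companion settings such as access policies, VPC options, and fine-grained access control are omitted; check the OpenSearch Service documentation for the full OR1 prerequisites before running it:

```bash
# Create a domain with three or1.large data nodes, three c6g.xlarge cluster manager
# nodes, and Multi-AZ with standby across three Availability Zones.
aws opensearch create-domain \
  --domain-name or1-benchmark-domain \
  --engine-version OpenSearch_2.11 \
  --cluster-config 'InstanceType=or1.large.search,InstanceCount=3,DedicatedMasterEnabled=true,DedicatedMasterType=c6g.xlarge.search,DedicatedMasterCount=3,ZoneAwarenessEnabled=true,ZoneAwarenessConfig={AvailabilityZoneCount=3},MultiAZWithStandbyEnabled=true' \
  --ebs-options 'EBSEnabled=true,VolumeType=gp3,VolumeSize=200' \
  --encryption-at-rest-options Enabled=true \
  --node-to-node-encryption-options Enabled=true
```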

The following table summarizes our OpenSearch Service configuration details.

                                  Configuration 1   Configuration 2
Number of cluster manager nodes   3                 3
Type of cluster manager nodes     c6g.xlarge        c6g.xlarge
Number of data nodes              3                 3
Type of data node                 r6g.large         or1.large
Data node EBS volume size (gp3)   200 GB            200 GB
Multi-AZ with standby enabled     Yes               Yes

Now let’s examine the performance details between the two configurations.

Performance benchmark comparison

The http_logs dataset contains HTTP server logs from the 1998 World Cup website between April 30, 1998 and July 26, 1998. Each request consists of a timestamp field, client ID, object ID, size of the request, method, status, and more. The uncompressed size of the dataset is 31.1 GB with 247 million JSON documents. The amount of load sent to both domain configurations is identical. The following table displays the amount of time taken to run various aspects of an OpenSearch workload on our two configurations.

Category             Metric name                                  Configuration 1              Configuration 2              Performance
                                                                  (3x r6g.large data nodes)    (3x or1.large data nodes)    difference
Indexing             Cumulative indexing time of primary shards   207.93 min                   142.50 min                   31%
Indexing             Cumulative flush time of primary shards      21.17 min                    2.31 min                     89%
Garbage Collection   Total Young Gen GC time                      43.14 sec                    24.57 sec                    43%
                     bulk-index-append p99 latency                10857.2 ms                   2455.12 ms                   77%
                     query-Mean Throughput                        29.76 ops/sec                36.24 ops/sec                22%
                     query-match_all(default) p99 latency         40.75 ms                     32.99 ms                     19%
                     query-term p99 latency                       7675.54 ms                   4183.19 ms                   45%
                     query-range p99 latency                      59.5316 ms                   51.2864 ms                   14%
                     query-hourly_aggregation p99 latency         5308.46 ms                   2985.18 ms                   44%
                     query-multi_term_aggregation p99 latency     8506.4 ms                    4264.44 ms                   50%

The benchmarks show a notable improvement across various performance metrics. Specifically, or1.large data nodes demonstrate a 31% reduction in cumulative indexing time for primary shards compared to r6g.large data nodes, a 43% improvement in garbage collection efficiency, and significant gains in query performance, including term, range, and aggregation queries.

The extent of improvement depends on the workload. Therefore, make sure to run custom workloads that reflect your production environment in terms of indexing throughput, types of search queries, and concurrent requests.
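If none of the bundled workloads match your production pattern, OpenSearch Benchmark can also generate a custom workload from an existing index. The following is a hedged sketch; the workload name, index, endpoint, and credentials are placeholders, and flag availability depends on your OpenSearch Benchmark version:

```bash
# Generate a reusable workload from documents already stored in one of your indexes
opensearch-benchmark create-workload \
  --workload=my-custom-workload \
  --indices=my-production-index \
  --target-hosts="https://<your-domain-endpoint>:443" \
  --client-options="basic_auth_user:<user>,basic_auth_password:<password>" \
  --output-path="$HOME/custom-workloads"
```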

Migration journey to OR1

The OR1 instance family is available in OpenSearch Service version 2.11 or higher. Usually, if you're using OpenSearch Service and want to benefit from newly released features in a specific version, you would follow the supported upgrade paths to upgrade your domain.

However, to use the OR1 instance type, you need to create a new domain with OR1 instances and then migrate your existing data to the new domain. The migration journey to an OpenSearch Service domain that uses OR1 instances is similar to a typical OpenSearch Service migration scenario. Critical aspects include determining the appropriate size for the target environment, selecting suitable data migration methods, and devising a seamless cutover strategy. Getting these right provides optimal performance, a smooth data transition, and minimal disruption throughout the migration.

To migrate data to a new OR1 domain, you can use the snapshot and restore option or use Amazon OpenSearch Ingestion to migrate the data from your source.
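As an illustration of the snapshot route, the following sketch registers an S3 snapshot repository and restores a snapshot into the new OR1 domain. Requests to OpenSearch Service must be SigV4-signed (shown here with curl's --aws-sigv4 option); the endpoints, Region, bucket, and IAM role ARN are placeholders, and the role needs the permissions described in the manual snapshot documentation. If you use temporary credentials, you also need to pass the session token as the x-amz-security-token header.

```bash
# 1. Register the same S3 repository on the source domain and on the new OR1 domain
curl -XPUT "https://<domain-endpoint>/_snapshot/migration-repo" \
  --aws-sigv4 "aws:amz:<region>:es" --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"type":"s3","settings":{"bucket":"<snapshot-bucket>","region":"<region>","role_arn":"<snapshot-role-arn>"}}'

# 2. Take a snapshot on the source domain
curl -XPUT "https://<source-endpoint>/_snapshot/migration-repo/migration-1" \
  --aws-sigv4 "aws:amz:<region>:es" --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY"

# 3. Restore the snapshot on the target OR1 domain
curl -XPOST "https://<or1-domain-endpoint>/_snapshot/migration-repo/migration-1/_restore" \
  --aws-sigv4 "aws:amz:<region>:es" --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"indices":"*","include_global_state":false}'
```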

For instructions on migration, refer to Migrating to Amazon OpenSearch Service.

Clean up

To avoid incurring continued AWS usage charges, make sure you delete all the resources you created as part of this post, including your OpenSearch Service domain.

Conclusion

In this post, we ran a benchmark to review the performance of the OR1 instance family compared to the memory-optimized r6g instance. We used OpenSearch Benchmark, a comprehensive tool for gathering performance metrics from OpenSearch clusters.

Learn more about how OR1 instances work and experiment with OpenSearch Benchmark to make sure your OpenSearch Service configuration matches your workload demand.


About the Authors

Jatinder Singh is a Senior Technical Account Manager at AWS and finds satisfaction in aiding customers in their cloud migration and innovation endeavors. Beyond his professional life, he relishes spending moments with his family and indulging in hobbies such as reading, culinary pursuits, and playing chess.

Hajer Bouafif is an Analytics Specialist Solutions Architect at Amazon Web Services. She focuses on Amazon OpenSearch Service and helps customers design and build well-architected analytics workloads in diverse industries. Hajer enjoys spending time outdoors and discovering new cultures.

Puneetha Kumara is a Senior Technical Account Manager at AWS, with over 15 years of industry experience, including roles in cloud architecture, systems engineering, and container orchestration.

Manpreet Kour is a Senior Technical Account Manager at AWS and is dedicated to ensuring customer satisfaction. Her approach involves a deep understanding of customer objectives, aligning them with software capabilities, and effectively driving customer success. Outside of her professional endeavors, she enjoys traveling and spending quality time with her family.