AWS Big Data Blog

Use CI/CD best practices to automate Amazon OpenSearch Service cluster management operations

Quick and reliable access to information is crucial for making smart business decisions. That’s why companies are turning to Amazon OpenSearch Service to power their search and analytics capabilities. OpenSearch Service makes it straightforward to deploy, operate, and scale search systems in the cloud, enabling use cases like log analysis, application monitoring, and website search.

Efficiently managing OpenSearch Service indexes and cluster resources can lead to significant improvements in performance, scalability, and reliability – all of which directly impact a company’s bottom line. However, the industry lacks built-in and well-documented solutions to automate these important operational tasks.

Applying continuous integration and continuous deployment (CI/CD) practices to managing OpenSearch index resources can help close that gap. For instance, storing index configurations in a source repository allows for better tracking, collaboration, and rollback. Using infrastructure as code (IaC) tools can help automate resource creation, providing consistency and reducing manual work. Finally, using a CI/CD pipeline can automate deployments and streamline the workflow.

In this post, we discuss two options to achieve this: the Terraform OpenSearch provider and the Evolution library. Which one is best suited to your use case depends on the tooling you are familiar with, your language of choice, and your existing pipeline.

Solution overview

Let’s walk through a straightforward implementation. For this use case, we use the AWS Cloud Development Kit (AWS CDK) to provision the relevant infrastructure as shown in the architecture diagram that follows, AWS Lambda to trigger Evolution scripts, and AWS CodeBuild to apply Terraform files. You can find the code for the entire solution in the GitHub repo.

Solution Architecture Diagram

Prerequisites

To follow along with this post, you need to have the following:

  • Familiarity with Java and OpenSearch
  • Familiarity with the AWS CDK, Terraform, and the command line
  • The following software versions installed on your machine: Python 3.12, Node.js 20, and AWS CDK 2.170.0 or higher
  • An AWS account, with an AWS Identity and Access Management (IAM) role configured with the relevant permissions

Build the solution

To build an automated solution for OpenSearch Service cluster management, follow these steps:

  1. Enter the following commands in a terminal to download the solution code; build the Java application; build the required Lambda layer; create an OpenSearch domain, two Lambda functions, and a CodeBuild project; and deploy the code:
git clone https://github.com/aws-samples/opensearch-automated-cluster-management
cd opensearch-automated-cluster-management
cd app/openSearchMigration
mvn package
cd ../../lambda_layer
chmod a+x create_layer.sh
./create_layer.sh
cd ../infra
npm install
npx cdk bootstrap
aws iam create-service-linked-role --aws-service-name es.amazonaws.com
npx cdk deploy --require-approval never
  2. Wait 15 to 20 minutes for the infrastructure to finish deploying, then check that your OpenSearch domain is up and running and that the Lambda functions and CodeBuild project have been created, as shown in the following screenshots.

OpenSearch domain provisioned successfully
OpenSearch Migration Lambda function created successfully
OpenSearchQuery Lambda function created successfully
CodeBuild project created successfully

Before you use automated tools to create index templates, you can verify that none already exist using the OpenSearchQuery Lambda function.

  1. On the Lambda console, navigate to the OpenSearchQuery function.
  2. On the Test tab, choose Test.

The function should return the message “No index patterns created by Terraform or Evolution,” as shown in the following screenshot.

Check that no index patterns have been created
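
If you want to run the same kind of verification from your own code, one option is to query the index template API directly. The following is an illustrative Java sketch, not part of the sample code; it assumes restClient is a SigV4-signed low-level REST client such as the one built in the Evolution section later in this post:

// Illustrative sketch: list any index templates whose names match the CI/CD naming convention.
Request request = new Request("GET", "/_index_template/cicd_*");
Response response = restClient.performRequest(request);
System.out.println(EntityUtils.toString(response.getEntity()));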

Apply Terraform files

First, you use Terraform with CodeBuild. The code is ready for you to test, but before you do, let’s look at a few important pieces of configuration:

  1. Define the required variables for your environment:
variable "OpenSearchDomainEndpoint" {
  type = string
  description = "OpenSearch domain URL"
}

variable "IAMRoleARN" {
  type = string
  description = "IAM Role ARN to interact with OpenSearch"
}
  2. Define and configure the provider:
terraform {
  required_providers {
    opensearch = {
      source = "opensearch-project/opensearch"
      version = "2.3.1"
    }
  }
}

provider "opensearch" {
  url = "https://${var.OpenSearchDomainEndpoint}"
  aws_assume_role_arn = var.IAMRoleARN
}

NOTE: As of the publication date of this post, there is a bug in the Terraform OpenSearch provider that is triggered when launching your CodeBuild project and prevents successful execution. Until it is fixed, use the following provider configuration instead:

terraform {
  required_providers {
    opensearch = {
      source = "gnuletik/opensearch"
      version = "2.7.0"
    }
  }
}
  3. Create an index template:
resource "opensearch_index_template" "template_1" {
  name = "cicd_template_terraform"
  body = <<EOF
{
  "index_patterns": ["terraform_index_*"],
  "template": {
    "settings": {
      "number_of_shards": "1"
    },
    "mappings": {
        "_source": {
            "enabled": false
        },
        "properties": {
            "host_name": {
                "type": "keyword"
            },
            "created_at": {
                "type": "date",
                "format": "EEE MMM dd HH:mm:ss Z YYYY"
            }
        }
    }
  }
}
EOF
}

You are now ready to test.

  4. On the CodeBuild console, navigate to the relevant project and choose Start build.

The build should complete successfully, and you should see the following lines in the logs:

opensearch_index_template.template_1: Creating...
opensearch_index_template.template_1: Creation complete after 0s (id=cicd_template_terraform)
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

You can check that the index template has been properly created by using the same Lambda function as earlier; you should see the following results.

Terraform index template properly created

Run Evolution scripts

In the next step, you use the Evolution library. The code is ready for you to test, but before you do, let’s look at a few important pieces of code and configuration:

  1. To begin with, you need to add the latest version of the Evolution core library and the AWS SDK as Maven dependencies. The full pom.xml file is available in the GitHub repository; to check the Evolution library’s compatibility with different OpenSearch versions, refer to the Evolution documentation.
<dependency>
    <groupId>com.senacor.elasticsearch.evolution</groupId>
    <artifactId>elasticsearch-evolution-core</artifactId>
    <version>0.6.0</version><!--check the latest version-->
</dependency>
<dependency>
   <groupId>software.amazon.awssdk</groupId>
   <artifactId>auth</artifactId>
</dependency>
  2. Create the Evolution bean and an AWS interceptor (which implements HttpRequestInterceptor).

Interceptors are open-ended mechanisms in which the SDK calls code that you write to inject behavior into the request and response lifecycle. The role of the AWS interceptor is to hook into the execution of API requests and produce an AWS signed request using the proper IAM role credentials. You can use the following code as a starting point for your own implementation to sign all the requests made to OpenSearch within AWS.
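
The following is a minimal sketch of such an interceptor, based on the AWS SDK for Java 2.x Aws4Signer. It assumes the code runs in Lambda, where the AWS_REGION environment variable and the execution role credentials are provided automatically; error handling and header filtering are simplified, and the complete implementation is available in the GitHub repo:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.URISyntaxException;
import org.apache.http.HttpEntityEnclosingRequest;
import org.apache.http.HttpException;
import org.apache.http.HttpHost;
import org.apache.http.HttpRequest;
import org.apache.http.HttpRequestInterceptor;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.protocol.HttpContext;
import org.apache.http.protocol.HttpCoreContext;
import org.apache.http.util.EntityUtils;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.auth.signer.Aws4Signer;
import software.amazon.awssdk.auth.signer.params.Aws4SignerParams;
import software.amazon.awssdk.http.SdkHttpFullRequest;
import software.amazon.awssdk.http.SdkHttpMethod;
import software.amazon.awssdk.regions.Region;

public class AwsRequestSigningInterceptor implements HttpRequestInterceptor {

    private final Aws4Signer signer = Aws4Signer.create();

    @Override
    public void process(HttpRequest request, HttpContext context) throws HttpException, IOException {
        // The Apache client stores the target host (scheme and host name) in the request context.
        HttpHost host = (HttpHost) context.getAttribute(HttpCoreContext.HTTP_TARGET_HOST);

        URIBuilder uriBuilder;
        try {
            uriBuilder = new URIBuilder(request.getRequestLine().getUri());
        } catch (URISyntaxException e) {
            throw new IOException("Invalid request URI", e);
        }

        // Rebuild the request in the shape the AWS signer expects.
        SdkHttpFullRequest.Builder signable = SdkHttpFullRequest.builder()
            .method(SdkHttpMethod.fromValue(request.getRequestLine().getMethod()))
            .protocol(host.getSchemeName())
            .host(host.getHostName())
            .encodedPath(uriBuilder.getPath());
        uriBuilder.getQueryParams()
            .forEach(p -> signable.appendRawQueryParameter(p.getName(), p.getValue()));

        // Include the body (if any) so the payload hash is covered by the signature.
        if (request instanceof HttpEntityEnclosingRequest
                && ((HttpEntityEnclosingRequest) request).getEntity() != null) {
            byte[] body = EntityUtils.toByteArray(((HttpEntityEnclosingRequest) request).getEntity());
            signable.contentStreamProvider(() -> new ByteArrayInputStream(body));
        }

        // Sign for the OpenSearch Service ("es") endpoint with the credentials of the execution role.
        Aws4SignerParams params = Aws4SignerParams.builder()
            .awsCredentials(DefaultCredentialsProvider.create().resolveCredentials())
            .signingName("es")
            .signingRegion(Region.of(System.getenv("AWS_REGION")))
            .build();
        SdkHttpFullRequest signed = signer.sign(signable.build(), params);

        // Copy the signature headers (Authorization, X-Amz-Date, and so on) back onto the outgoing request.
        signed.headers().forEach((name, values) -> request.setHeader(name, values.get(0)));
    }
}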

  3. Create your own OpenSearch client to manage automatic creation of indexes, mappings, templates, and aliases.

The default Elasticsearch client that comes bundled as part of the Maven dependency can’t be used to make PUT calls to the OpenSearch cluster. Therefore, you need to bypass the default REST client instance and add a callback that registers the AwsRequestSigningInterceptor.

The following is a sample implementation:

private RestClient getOpenSearchEvolutionRestClient() {
    return RestClient.builder(getHttpHost())
        .setHttpClientConfigCallback(hacb -> 
            hacb.addInterceptorLast(getAwsRequestSigningInterceptor()))
        .build();
}
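
The getHttpHost() and getAwsRequestSigningInterceptor() helpers aren’t shown above. A minimal sketch might look like the following; the environment variable name is illustrative, and the interceptor is the one sketched in the previous step:

private HttpHost getHttpHost() {
    // The CDK stack can pass the domain endpoint to the function through an environment variable (name is illustrative).
    return new HttpHost(System.getenv("OPENSEARCH_DOMAIN_ENDPOINT"), 443, "https");
}

private HttpRequestInterceptor getAwsRequestSigningInterceptor() {
    // The request-signing interceptor sketched in the previous step.
    return new AwsRequestSigningInterceptor();
}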
  4. Use the Evolution bean to call your migrate method, which initiates the migration of the scripts defined on the classpath or a file path:
public void executeOpensearchScripts() {
    ElasticsearchEvolution opensearchEvolution = ElasticsearchEvolution.configure()
        .setEnabled(true) // Set to false to skip migrations entirely.
        .setLocations(Arrays.asList("classpath:opensearch_migration/base",
            "classpath:opensearch_migration/dev")) // List of all locations where scripts are located.
        .setHistoryIndex("opensearch_changelog") // Tracker index to store the history of executed scripts.
        .setValidateOnMigrate(false) // Skip re-validation of already executed scripts on each run.
        .setOutOfOrder(true) // Allow scripts with lower versions to run after higher ones have been applied.
        .setPlaceholders(Collections.singletonMap("env", "dev")) // Placeholders that are replaced in the scripts during execution.
        .load(getOpenSearchEvolutionRestClient());
    opensearchEvolution.migrate();
}
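
The Lambda function that drives the migration only needs to call this method from its handler. The following is an illustrative sketch (the class name and return value are hypothetical, and the actual handler is in the GitHub repo); it assumes executeOpensearchScripts() and the other methods shown above are members of the handler class:

import java.util.Map;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class MigrationHandler implements RequestHandler<Map<String, Object>, String> {
    @Override
    public String handleRequest(Map<String, Object> event, Context context) {
        // Run all pending Evolution migration scripts against the OpenSearch domain.
        executeOpensearchScripts();
        return "Migration completed";
    }
}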
  5. An Evolution migration script represents a REST call to the OpenSearch API (for example, PUT /_index_template/cicd_template_evolution), where you define index patterns, settings, and mappings in JSON format. Evolution interprets these scripts, manages their versioning, and provides ordered execution. See the following example:
PUT /_index_template/cicd_template_evolution
Content-Type: application/json

{
  "index_patterns": ["evolution_index_*"],
  "template": {
    "settings": {
      "number_of_shards": "1"
    },
    "mappings": {
        "_source": {
            "enabled": false
        },
        "properties": {
            "host_name": {
                "type": "keyword"
            },
            "created_at": {
                "type": "date",
                "format": "EEE MMM dd HH:mm:ss Z YYYY"
            }
        }
    }
  }
}

The first two lines (the request line and the Content-Type header) must be followed by a blank line. Evolution also supports comment lines in its migration scripts: every line starting with # or // is interpreted as a comment line. Comment lines aren’t sent to OpenSearch; they are filtered out by Evolution.
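For example, a hypothetical line such as # initial template for the CI/CD demo at the top of a script would be stripped before the request is sent to OpenSearch.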

Migration script file names must follow this pattern:

  • Start with the esMigrationPrefix, which is V by default and can be changed using the esMigrationPrefix configuration option
  • Followed by a version number, which must be numeric and can be structured by separating the version parts with a period (.)
  • Followed by the versionDescriptionSeparator: __ (the double underscore symbol)
  • Followed by a description, which can be any text your filesystem supports
  • End with one of the esMigrationSuffixes, which is .http by default, configurable, and case insensitive
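
For instance, a hypothetical script file named V1.0__create_cicd_template.http follows this convention: the prefix V, the version 1.0, the separator __, the description create_cicd_template, and the suffix .http.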

You’re now ready to execute your first automated change. An example migration script has already been created for you, shown earlier in this section; it creates an index template named cicd_template_evolution.

  1. On the Lambda console, navigate to your function.
  2. On the Test tab, choose Test.

After a few seconds, the function should complete successfully. You can review the log output in the Details section, as shown in the following screenshots.

Migration function finished successfully

The index template now exists, and you can check that its configuration is indeed in line with the script, as shown in the following screenshot.

Evolution index template properly created
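
Evolution also records every script it has executed in the history index configured earlier (opensearch_changelog). If you want to inspect that history programmatically, a minimal, illustrative sketch using the same low-level REST client might look like this (not part of the sample code):

// Illustrative sketch: list the migration history that Evolution keeps in its tracker index.
Request request = new Request("GET", "/opensearch_changelog/_search");
Response response = getOpenSearchEvolutionRestClient().performRequest(request);
System.out.println(EntityUtils.toString(response.getEntity()));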

Clean up

To clean up the resources that were created as part of this post, run the following command (from the infra folder):

npx cdk destroy --all

Conclusion

In this post, we demonstrated how to automate OpenSearch index templates using CI/CD practices and tools such as Terraform or the Evolution library.

To learn more about OpenSearch Service, refer to the Amazon OpenSearch Service Developer Guide. To further explore the Evolution library, refer to the documentation. To learn more about the Terraform OpenSearch provider, refer to the documentation.

We hope this detailed guide and accompanying code will help you get started. Try it out, let us know your thoughts in the comments section, and feel free to reach out to us for questions!


About the Authors

Camille Birbes is a Senior Solutions Architect with AWS and is based in Hong Kong. He works with major financial institutions to design and build secure, scalable, and highly available solutions in the cloud. Outside of work, Camille enjoys any form of gaming, from board games to the latest video game.

Sriharsha Subramanya Begolli works as a Senior Solutions Architect with AWS, based in Bengaluru, India. His primary focus is assisting large enterprise customers in modernizing their applications and developing cloud-based systems to meet their business objectives. His expertise lies in the domains of data and analytics.