AWS Big Data Blog

Transition from Amazon CloudSearch to Amazon OpenSearch Service

At AWS, we are constantly innovating and evolving our services to meet the ever-changing needs of our customers. In this post, we want to help you understand the differences between Amazon CloudSearch and Amazon OpenSearch Service, and how you can transition to OpenSearch Service.

Comparing Amazon CloudSearch and Amazon OpenSearch Service

CloudSearch is a fully managed service in the cloud that makes it straightforward to set up, manage, and scale a search solution for your website or application. With CloudSearch, you can search large collections of data such as webpages, document files, forum posts, or product information. You can quickly add search capabilities without having to become a search expert or worry about hardware provisioning, setup, and maintenance. As your volume of data and traffic fluctuates, CloudSearch scales to meet your needs. CloudSearch is internally powered by a customized version of Apache Solr, and supports features such as full-text search, Boolean search, prefix search, term boosting, faceting, hit highlighting, and auto-complete suggestions.

OpenSearch Service is a managed service that makes it seamless to deploy, operate, and scale OpenSearch, a popular open source search and analytics engine. OpenSearch provides best-in-class search capabilities, providing you with all the search features of CloudSearch plus a vector engine supporting semantic search on vector embeddings, and support for both dense and sparse vectors. In addition, with OpenSearch Service, you get advanced security with fine-grained access control, the ability to store and analyze log data for observability and security, along with dashboarding and alerting. You’ll have all of CloudSearch’s capabilities and more.

With OpenSearch Serverless, you get improved, out-of-the-box, hands-free operation. Like CloudSearch, OpenSearch Serverless lets you deploy and use OpenSearch through a REST endpoint. You send your documents to OpenSearch Serverless, which indexes them for search using the OpenSearch REST API. If you want deeper control over your infrastructure for cost and latency optimization, you can choose OpenSearch Service’s managed clusters deployment option. With managed clusters, you get granular control over the instances you would like to use, indexing and data-sharding strategy, and more. OpenSearch Service brings with it the flexibility and extensibility of open source, provides powerful querying and analytics capabilities, and enables cost-effective scalability for growing workloads, with high availability and durability. For more information on the capabilities and benefits of using OpenSearch Service, see Amazon OpenSearch Service.

Transitioning to OpenSearch Service

When transitioning from CloudSearch to OpenSearch Service, you need to re-ingest and index your data into OpenSearch Service. Because OpenSearch Service uses a REST API, numerous methods exist for indexing documents. You can use standard clients like curl or any programming language that can send HTTP requests. To further simplify the process of interacting with it, OpenSearch Service has clients for many programming languages. We recommend that you use Amazon OpenSearch Ingestion to ingest data. OpenSearch Ingestion is a fully managed data collector built within OpenSearch Service that can route data to an OpenSearch Service domain or an OpenSearch Serverless collection. OpenSearch Ingestion can ingest data from a wide variety of sources, such as Amazon Simple Storage Service (Amazon S3) buckets and HTTP endpoints, and has a rich ecosystem of built-in processors to take care of your most complex data transformation needs. OpenSearch Ingestion is serverless in nature and will scale automatically to meet the requirements of your most demanding workloads, helping you focus on your business logic while abstracting away the complexity of managing complex data pipelines for your ingestion use cases. For more information about how to ingest a document into an OpenSearch Serverless collection or a managed cluster using OpenSearch ingestion, see Getting started with Amazon OpenSearch Ingestion. For detailed information on using OpenSearch Ingestion to ingest data into OpenSearch Service, refer to Amazon OpenSearch Ingestion.

Summary

AWS continues to support CloudSearch and continues to invest in security and availability improvements. However, with the advancements in OpenSearch, we recommend that you explore OpenSearch Service to get the latest search capabilities and to meet the rapid evolution of search experience users have come to expect in the machine learning age.


About the Authors

Arvind Mahesh is a Senior Manager-Product at Amazon Web Services for Amazon OpenSearch Service. He has close to two decades of technology experience across a variety of domains such as Analytics, Search, Cloud, Network Security, and Telecom.

Jon Handler is a Senior Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with OpenSearch and Amazon OpenSearch Service, providing help and guidance to a broad range of customers who have search and log analytics workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included 4 years of coding a large-scale, ecommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master of Science and a PhD in Computer Science and Artificial Intelligence from Northwestern University.