This Guidance shows you how to integrate Amazon DynamoDB with Amazon OpenSearch Service for real-time search. Most applications should use the Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service. For applications with requirements that do not align with zero-ETL integration, this Guidance demonstrates how to perform an initial load of data from DynamoDB into OpenSearch Service through parallel AWS Lambda functions and how to replicate new data into OpenSearch Service as it is written. By keeping data in both places, you can target queries to the database best suited to your requirements: DynamoDB powers fixed access patterns that require performance and scalability, while OpenSearch Service powers access patterns that require flexible searching and filtering.
Architecture Diagram
Initial Load
Step 1
To process existing data, an AWS Lambda function is invoked to describe the Amazon DynamoDB table and split it into segments based on the returned item count. The function writes one message to an Amazon Simple Queue Service (Amazon SQS) queue for each segment.
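The following is a minimal sketch of such a splitter function in Python, assuming hypothetical TABLE_NAME and QUEUE_URL environment variables and an assumed per-segment item budget; the deployed Guidance may size and name things differently:

```python
import json
import os

import boto3

dynamodb = boto3.client("dynamodb")
sqs = boto3.client("sqs")

TABLE_NAME = os.environ["TABLE_NAME"]  # hypothetical variable names
QUEUE_URL = os.environ["QUEUE_URL"]
ITEMS_PER_SEGMENT = 10_000             # assumed sizing heuristic

def handler(event, context):
    # DescribeTable returns an approximate ItemCount, refreshed roughly
    # every six hours, which is accurate enough for sizing segments.
    item_count = dynamodb.describe_table(TableName=TABLE_NAME)["Table"]["ItemCount"]
    total_segments = max(1, (item_count + ITEMS_PER_SEGMENT - 1) // ITEMS_PER_SEGMENT)

    # One SQS message per segment; each message fans out to a worker invocation.
    for segment in range(total_segments):
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"segment": segment, "total_segments": total_segments}),
        )
    return {"total_segments": total_segments}
```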
Step 2
Amazon SQS acts as an event source for Lambda, invoking the worker function for each batch of messages in the queue so that segments of the DynamoDB table are processed in parallel.
Step 3
The Lambda function uses a parallel scan to read the segment of the DynamoDB table listed in the source event from Amazon SQS.
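A sketch of that worker, assuming the message shape produced by the splitter above; index_batch is the hypothetical bulk writer shown in Step 4:

```python
import json
import os

import boto3

dynamodb = boto3.client("dynamodb")
TABLE_NAME = os.environ["TABLE_NAME"]  # hypothetical variable name

def handler(event, context):
    # Each SQS record carries one segment assignment from the splitter function.
    for record in event["Records"]:
        body = json.loads(record["body"])
        scan_kwargs = {
            "TableName": TABLE_NAME,
            "Segment": body["segment"],
            "TotalSegments": body["total_segments"],
        }
        # Page through the segment; a single Scan call returns at most 1 MB.
        while True:
            page = dynamodb.scan(**scan_kwargs)
            index_batch(page["Items"])  # hand off to the bulk writer (Step 4)
            if "LastEvaluatedKey" not in page:
                break
            scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```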
Step 4
The function then writes the data retrieved from DynamoDB into Amazon OpenSearch Service in batches through the bulk-create operation.
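A minimal sketch of that bulk writer using the opensearch-py client; the endpoint, credentials, index name, and the string partition key named pk are all assumptions for illustration:

```python
from opensearchpy import OpenSearch, helpers

# Hypothetical endpoint and credentials; the deployed Guidance keeps the
# real admin secret in AWS Secrets Manager.
client = OpenSearch(
    hosts=[{"host": "search-example.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("admin", "example-password"),
    use_ssl=True,
)

INDEX_NAME = "table-data"  # assumed index name

def index_batch(items):
    # "create" fails on documents that already exist instead of overwriting
    # them, so anything written by the streaming path is never replaced by
    # the older data read during the initial scan.
    actions = (
        {
            "_op_type": "create",
            "_index": INDEX_NAME,
            "_id": item["pk"]["S"],  # assumes a string partition key named pk
            "_source": item,
        }
        for item in items
    )
    helpers.bulk(client, actions, raise_on_error=False)
```

Using "create" rather than "index" here is what backs the drift-prevention point under the Reliability pillar below.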
Streaming Changes
Step 5
Items inserted or updated in DynamoDB are captured as item-level changes by DynamoDB Streams.
Step 6
DynamoDB Streams sends the item-level modifications captured from DynamoDB to the Lambda streaming-update function.
Step 7
The Lambda function writes that data in batches to OpenSearch Service through the bulk index operation. Track ingested documents with the SearchableDocuments metric in Amazon CloudWatch.
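A sketch of that streaming-update handler, under the same assumptions as the bulk writer above (hypothetical client setup, a string partition key named pk, and a stream view type that includes NEW_IMAGE):

```python
from opensearchpy import OpenSearch, helpers

# Same hypothetical client setup as in the initial load.
client = OpenSearch(
    hosts=[{"host": "search-example.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("admin", "example-password"),
    use_ssl=True,
)

INDEX_NAME = "table-data"  # assumed index name

def handler(event, context):
    # Each record is one item-level change captured by DynamoDB Streams.
    actions = []
    for record in event["Records"]:
        doc_id = record["dynamodb"]["Keys"]["pk"]["S"]  # assumed key name
        if record["eventName"] in ("INSERT", "MODIFY"):
            actions.append({
                "_op_type": "index",  # "index" upserts, so newer data wins
                "_index": INDEX_NAME,
                "_id": doc_id,
                "_source": record["dynamodb"]["NewImage"],
            })
        elif record["eventName"] == "REMOVE":
            actions.append({"_op_type": "delete", "_index": INDEX_NAME, "_id": doc_id})
    helpers.bulk(client, actions, raise_on_error=False)
```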
Get Started
Deploy this Guidance
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
AWS Cloud Development Kit (AWS CDK) defines the infrastructure for the solution as code, helping you achieve consistent deployments. Lambda divides the workload into smaller units, each handled by a function responsible for a single task. These single-task functions reduce human error and support small, incremental changes that are easier to reverse if they fail.
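For illustration, a minimal AWS CDK (Python) sketch of the queue-to-worker wiring from Steps 1 and 2; the construct names, runtime, and asset path are assumptions, not the Guidance's actual code:

```python
from aws_cdk import Stack, aws_lambda as lambda_, aws_sqs as sqs
from aws_cdk.aws_lambda_event_sources import SqsEventSource
from constructs import Construct

class InitialLoadStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Queue of table-segment assignments produced by the splitter function.
        segment_queue = sqs.Queue(self, "SegmentQueue")

        # Worker function that scans one DynamoDB segment per message.
        worker = lambda_.Function(
            self, "SegmentWorker",
            runtime=lambda_.Runtime.PYTHON_3_12,
            handler="worker.handler",
            code=lambda_.Code.from_asset("lambda"),
        )

        # The SQS event source drives the parallel invocations in Step 2.
        worker.add_event_source(SqsEventSource(segment_queue, batch_size=1))
```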
Security
Where applicable, this Guidance launches services in private Amazon Virtual Private Cloud (Amazon VPC) networks rather than public. Private networking through Amazon VPC supports security at all layers by letting you control how data is accessed. Additionally, the use of single-purpose, least-privilege AWS Identity and Access Management (IAM) policies helps you prevent permission changes from having broader, unanticipated consequences and reduces the risk of users mishandling sensitive data. AWS Secrets Manager generates and securely stores admin secrets, preventing users from storing credentials in code or environment variables where they are at risk of exposure.
Reliability
Amazon SQS provides an automatic retry mechanism if a portion of the import fails, helping you quickly recover from failures. As the system of record, DynamoDB uses point-in-time recovery for continuous backup, enabling recovery to any second within the last 35 days. Using the "create" operation for the initial data load prevents drift between the two databases, because older scan data can never overwrite newer data already written by the streaming path. OpenSearch Service is set to use a single-node cluster, but you can change this to a multi-Availability Zone cluster to maintain availability in production.
Performance Efficiency
Lambda enables you to parallelize workloads: reads from DynamoDB go through segmented parallel scans split across multiple Lambda function invocations. This parallelization enables significantly higher throughput than a single thread could manage.
Cost Optimization
Lambda reads DynamoDB items together in a batch rather than as individual GetItem requests, so this Guidance consumes fewer read capacity units. Batching also reduces overhead such as connection initialization, which cuts compute time and the number of Lambda invocations and therefore your compute costs. Additionally, OpenSearch Service batch operations are efficient, helping you reduce the overall cost of compute resources.
Sustainability
Lambda only invokes functions when data needs to be moved into OpenSearch Service and does not run while idle. This helps you maximize your utilization of compute resources. Additionally, as a serverless, managed service, DynamoDB helps reduce inefficiencies and decrease the total power consumed by your workloads.
Related Content
Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service is now available
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.