
Guidance for Integrated, Scalable Search for Amazon DocumentDB

Overview

This Guidance shows how to run advanced search and analytics on Amazon DocumentDB data using Amazon OpenSearch Service. For example, consider a large e-commerce company that stores product reviews as JSON documents in Amazon DocumentDB. To improve the customer experience, the company wants a feature that helps customers find product reviews relevant to their interests. With this Guidance, the company can build a solution that matches reviews not only on exact keywords but also on synonyms and context, and can analyze the same data in more depth for better insights.
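As a rough illustration of synonym-aware review search, the sketch below builds an OpenSearch index body with a synonym token filter and a matching full-text query as plain Python dictionaries. The field name `review_text`, the analyzer name `review_analyzer`, the filter name `review_synonyms`, and the sample synonyms are assumptions made for this example; they are not taken from the Guidance or its sample code.

```python
import json

# Hypothetical names for illustration only; the Guidance does not define them.
REVIEW_FIELD = "review_text"
ANALYZER_NAME = "review_analyzer"
SYNONYM_FILTER = "review_synonyms"


def build_review_index(synonyms):
    """Return an index body whose text analyzer expands the given synonyms."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Each entry is a comma-separated group of equivalent terms.
                    SYNONYM_FILTER: {"type": "synonym", "synonyms": list(synonyms)}
                },
                "analyzer": {
                    ANALYZER_NAME: {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", SYNONYM_FILTER],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                REVIEW_FIELD: {"type": "text", "analyzer": ANALYZER_NAME}
            }
        },
    }


def build_review_query(text, size=10):
    """Return a full-text match query against the review field."""
    return {"size": size, "query": {"match": {REVIEW_FIELD: {"query": text}}}}


if __name__ == "__main__":
    print(json.dumps(build_review_index(["sneakers, trainers"]), indent=2))
    print(json.dumps(build_review_query("comfortable trainers"), indent=2))
```

With an analyzer like this, a search for "trainers" can also match reviews that mention "sneakers". The two dictionaries would be sent to the OpenSearch create-index and search APIs respectively; how the index is populated from Amazon DocumentDB is outside this sketch.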

How it works

The architecture diagram illustrates how to use this solution effectively. It shows the key components and their interactions, walking through the architecture's structure and functionality step by step.

Get Started

Deploy this Guidance

Sample code

Use sample code to deploy this Guidance in your AWS account

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

Amazon CloudWatch logs provide enhanced monitoring. For analysis of query performance and application fine-tuning, you can use Performance Insights for Amazon DocumentDB to identify hot queries and hosts. Additionally, you can use Amazon DocumentDB audit logging to log all queries or use its profiler to log queries exceeding a specified duration. All these services and features are natively integrated with Amazon DocumentDB and help you identify performance bottlenecks and gain visibility for troubleshooting problems. You can also configure them to alert you about specific events.
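To make the audit logging and profiler settings more concrete, here is a minimal sketch that builds the cluster parameter changes as plain data. It assumes the Amazon DocumentDB cluster parameters `audit_logs`, `profiler`, and `profiler_threshold_ms`; the parameter group name, the 200 ms threshold, and the helper names are hypothetical. The optional `apply_parameters` helper assumes boto3 is installed and AWS credentials are configured.

```python
def build_logging_parameters(threshold_ms=100, apply_method="pending-reboot"):
    """Return parameter changes that enable audit logs and the profiler.

    threshold_ms is the duration above which the profiler logs an operation.
    apply_method defaults to "pending-reboot" as a conservative choice; this
    is an assumption of the sketch, not a recommendation from the Guidance.
    """
    if not isinstance(threshold_ms, int) or threshold_ms <= 0:
        raise ValueError("threshold_ms must be a positive integer")
    changes = [
        ("audit_logs", "enabled"),
        ("profiler", "enabled"),
        ("profiler_threshold_ms", str(threshold_ms)),
    ]
    return [
        {"ParameterName": name, "ParameterValue": value, "ApplyMethod": apply_method}
        for name, value in changes
    ]


def apply_parameters(parameter_group_name, parameters):
    """Send the changes to a custom cluster parameter group (requires boto3)."""
    import boto3  # imported lazily so the builder above runs without boto3

    client = boto3.client("docdb")
    return client.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=parameter_group_name,
        Parameters=parameters,
    )


if __name__ == "__main__":
    for change in build_logging_parameters(threshold_ms=200):
        print(change)
    # apply_parameters("my-docdb-params", build_logging_parameters(200))  # hypothetical group name
```

Keeping the parameter list separate from the API call makes it easy to review or unit test the intended configuration before anything is changed in an account.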

Read the Operational Excellence whitepaper 

Security

Amazon DocumentDB and OpenSearch Service support encryption at rest and TLS for data in transit. Running these services in a VPC isolates them at the network level, and you can add fine-grained, role-based access control through AWS Identity and Access Management (IAM) roles as well as firewall options. You can scope IAM policies to the minimum required permissions to limit unauthorized access to resources. You can also selectively encrypt sensitive data in applications by using Client-Side Field Level Encryption (CS-FLE), which uses AWS Key Management Service (AWS KMS). Finally, you can store secrets in AWS Secrets Manager so that you don’t have to hardcode credentials in applications. Access to the secrets is audited, and we recommend enabling automatic rotation of secrets, which enhances security by replacing permanent credentials with temporary ones.
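The following sketch shows one way an application might avoid hardcoded credentials: it turns a secret retrieved from Secrets Manager into a TLS-enabled connection string. The secret layout (`username`, `password`, `host`, `port` keys), the CA bundle file name, the connection options, and the helper names are assumptions for illustration. `fetch_secret_string` assumes boto3 is installed and AWS credentials are configured.

```python
import json
from urllib.parse import quote_plus


def build_docdb_uri(secret_string, ca_file="global-bundle.pem"):
    """Build a TLS-enabled MongoDB-style URI from a JSON secret string."""
    secret = json.loads(secret_string)
    # Percent-encode credentials so characters like "@" or "/" stay valid in a URI.
    user = quote_plus(secret["username"])
    password = quote_plus(secret["password"])
    host = secret["host"]
    port = int(secret.get("port", 27017))
    options = "&".join(
        [
            "tls=true",
            f"tlsCAFile={quote_plus(ca_file)}",
            "replicaSet=rs0",
            "readPreference=secondaryPreferred",
            "retryWrites=false",
        ]
    )
    return f"mongodb://{user}:{password}@{host}:{port}/?{options}"


def fetch_secret_string(secret_id):
    """Read the secret value from AWS Secrets Manager (requires boto3)."""
    import boto3  # imported lazily so build_docdb_uri runs without boto3

    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


if __name__ == "__main__":
    # Placeholder values; a real secret would come from fetch_secret_string().
    sample = json.dumps(
        {
            "username": "reviews_app",
            "password": "p@ss/word",
            "host": "docdb.example.internal",
            "port": 27017,
        }
    )
    print(build_docdb_uri(sample))
```

Because the credentials are read at run time, rotating the secret does not require a code change; the application only needs to fetch the secret again when it reconnects.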

Read the Security whitepaper 

Reliability

This Guidance deploys Amazon DocumentDB across three Availability Zones (AZs) to provide reliable operations, and failovers to an existing replica complete in less than 30 seconds. The database backup capability provides point-in-time recovery for clusters, enabling restoration to any second during the retention period, up until the last five minutes. You can also use AWS Backup to centralize backup governance. Additionally, you can configure OpenSearch Service as a multi-AZ cluster for high availability and automatic failovers. The Amazon OpenSearch Serverless option lets you run petabyte-scale workloads without configuring, managing, and scaling OpenSearch clusters.
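The point-in-time recovery window described above can be sketched as a small validation step before a restore request is issued. The cluster identifiers and helper names are hypothetical, and the window check (retention period on one side, a five-minute lag on the other) is an approximation based on this text; the service's own earliest and latest restorable times are authoritative. `restore_cluster` assumes boto3 is installed and AWS credentials are configured.

```python
from datetime import datetime, timedelta, timezone

# Restores are possible up until the last five minutes, per the text above.
RESTORE_LAG = timedelta(minutes=5)


def build_restore_request(source_cluster, target_cluster, restore_to, now, retention_days):
    """Validate restore_to against the backup window and build the request."""
    earliest = now - timedelta(days=retention_days)
    latest = now - RESTORE_LAG
    if not (earliest <= restore_to <= latest):
        raise ValueError(
            f"restore_to must be between {earliest.isoformat()} and {latest.isoformat()}"
        )
    return {
        "DBClusterIdentifier": target_cluster,  # new cluster created by the restore
        "SourceDBClusterIdentifier": source_cluster,
        "RestoreToTime": restore_to,
    }


def restore_cluster(request):
    """Issue the restore call to Amazon DocumentDB (requires boto3)."""
    import boto3  # imported lazily so the validation above runs without boto3

    return boto3.client("docdb").restore_db_cluster_to_point_in_time(**request)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    request = build_restore_request(
        "reviews-cluster", "reviews-cluster-restored", now - timedelta(hours=1), now, 7
    )
    print(request)
```

A restore creates a new cluster rather than overwriting the source, so the target identifier must differ from the source identifier.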

Read the Reliability whitepaper 

Performance Efficiency

You can use Performance Insights for Amazon DocumentDB to analyze query performance, fine-tune applications, and identify hot queries and hosts. You can also configure alerts and alarms within CloudWatch and use this service to monitor resource consumption. Monitoring resource consumption helps you decide whether to scale the cluster horizontally or vertically for Amazon DocumentDB and OpenSearch Service.
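As one example of a CloudWatch alarm on resource consumption, the sketch below builds an alarm definition for sustained high CPU on an Amazon DocumentDB cluster. The 80 percent threshold, the five-minute period, the alarm naming scheme, and the helper names are assumptions chosen for illustration, not values from the Guidance. `create_alarm` assumes boto3 is installed and AWS credentials are configured.

```python
def build_cpu_alarm(cluster_id, threshold_percent=80.0, period_seconds=300, evaluation_periods=3):
    """Return alarm settings for average CPU utilization on one cluster."""
    return {
        "AlarmName": f"{cluster_id}-high-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/DocDB",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_alarm(alarm):
    """Create or update the alarm in CloudWatch (requires boto3)."""
    import boto3  # imported lazily so build_cpu_alarm runs without boto3

    return boto3.client("cloudwatch").put_metric_alarm(**alarm)


if __name__ == "__main__":
    print(build_cpu_alarm("reviews-cluster"))
```

An alarm that fires repeatedly under normal load is one signal that the cluster may need a larger instance class or additional replicas.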

Read the Performance Efficiency whitepaper 

Cost Optimization

Amazon DocumentDB offers per-second billing for compute, free or reduced monthly backup storage costs, and no-cost encryption capabilities and monitoring. Additionally, its flexibility with JSON document storage enables agility with data model changes. CloudWatch helps you monitor resource consumption and align the scaling activities of Amazon DocumentDB and OpenSearch Service, which can scale cluster capacity in and out without overprovisioning infrastructure.

Read the Cost Optimization whitepaper 

Sustainability

Amazon DocumentDB and OpenSearch Service scale horizontally, so you can avoid running unnecessary hardware and the energy it consumes. This Guidance also uses instances powered by AWS Graviton processors, which provide a 30 percent energy reduction and are more sustainable.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.