Amazon DocumentDB (with MongoDB compatibility) Documentation

Amazon DocumentDB (with MongoDB compatibility) is a document database service designed for JSON data management at scale. This scalable service offers customers durability when operating MongoDB workloads.

In Amazon DocumentDB, storage scales automatically up to 128 TiB in Instance-based Clusters and 4 PiB in Amazon DocumentDB Elastic Clusters. Amazon DocumentDB supports millions of requests per second, and customers can add up to 15 low-latency read replicas in minutes.

Amazon DocumentDB is designed for a 99.9% SLA. It is designed to make your data durable across three Availability Zones (AZs) within a Region plus an additional concurrent storage node in a different AZ. By replicating new writes six ways, Amazon DocumentDB is designed to be resilient to failures, data loss, and failovers within a Region.

Customers can use AWS Database Migration Service (DMS) to migrate self-managed MongoDB databases to Amazon DocumentDB.

Performance at scale

Amazon DocumentDB Elastic Clusters

Amazon DocumentDB Elastic Clusters enables customers to handle millions of writes and reads per second, allowing customers to scale their document databases quickly. Customers can also store petabytes of data. 

High Throughput, Low Latency for Document Queries

Amazon DocumentDB has a JSON document model, data types, and indexing. The service uses a scale-up, in-memory optimized architecture designed to allow for fast query evaluation over large document sets.

Scaling of Database Compute Resources

Through the AWS Management Console, customers can scale compute and memory resources up or down by creating new replica instances of the desired size or by removing instances. Compute scaling operations complete quickly.

Storage that Scales

Amazon DocumentDB will grow the size of your storage volume as your cluster storage needs grow. The storage volume will grow in increments of 10 GB up to a maximum of 4 PiB. This is designed so that customers don't need to provision excess storage for the document database to handle future growth.

Low Latency Read Replicas

Increase read throughput to support high-volume application requests by creating up to 15 database read replicas. Amazon DocumentDB replicas share the same underlying storage as the source instance. This feature is designed to free up more processing power to serve read requests and to reduce replica lag time. Amazon DocumentDB is also designed to provide a single endpoint for read queries, so the application can connect without having to keep track of replicas as they are added and removed.
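
As an illustration of connecting through a single read endpoint, the sketch below assembles a MongoDB connection string using only the Python standard library. The host name, user name, and CA bundle file name are hypothetical placeholders, and the options shown (tls, tlsCAFile, readPreference, retryWrites) are standard MongoDB connection string options chosen here as assumptions; check the cluster's own connection details before relying on them.

```python
from urllib.parse import quote_plus, urlencode


def build_reader_uri(user, password, reader_host, port=27017,
                     ca_file="global-bundle.pem"):
    """Build a MongoDB URI for a read endpoint (all values are placeholders)."""
    options = {
        "tls": "true",                           # encrypt the connection in transit
        "tlsCAFile": ca_file,                    # assumed path to the CA bundle
        "readPreference": "secondaryPreferred",  # prefer replicas for reads
        "retryWrites": "false",                  # assumption: retryable writes disabled
    }
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{reader_host}:{port}/?{urlencode(options)}"


# Hypothetical reader endpoint; replace with the value shown for your cluster.
uri = build_reader_uri("app_user", "p@ss", "reader.example.internal")
print(uri)
```

Because the application only holds this one URI, replicas can be added or removed behind the endpoint without a configuration change on the client side.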

MongoDB-compatible

Amazon DocumentDB is compatible with MongoDB 3.6, 4.0, and 5.0 drivers and tools. Many of the applications, drivers, and tools that customers already use today with their open source MongoDB non-relational database can be used with Amazon DocumentDB. Amazon DocumentDB emulates the responses that a client expects from a MongoDB server by implementing the Apache 2.0 open source MongoDB 3.6, 4.0, and 5.0 APIs on a purpose-built, distributed, fault-tolerant, and self-healing storage system that is designed to give customers performance, scalability, and availability when operating MongoDB workloads at scale.

Geospatial Query Capabilities

Geospatial query capabilities enable customers to use Amazon DocumentDB to store, query, and index geospatial data. Customers can create 2dsphere indexes and use popular MongoDB geospatial APIs such as $nearSphere, $geoNear, $minDistance, and $maxDistance to perform queries on data stored in Amazon DocumentDB.
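
The following sketch shows what a 2dsphere index specification and a $nearSphere filter with $minDistance and $maxDistance can look like when written as plain Python structures. The field name "location", the coordinates, and the helper name near_sphere_query are hypothetical; the documents follow the MongoDB query syntax for GeoJSON points, and passing them to a driver is shown only in comments.

```python
def near_sphere_query(field, lon, lat, min_m=None, max_m=None):
    """Return a MongoDB-style $nearSphere filter for a GeoJSON point.

    Distances are in meters; GeoJSON coordinates are [longitude, latitude].
    """
    clause = {"$geometry": {"type": "Point", "coordinates": [lon, lat]}}
    if min_m is not None:
        clause["$minDistance"] = min_m
    if max_m is not None:
        clause["$maxDistance"] = max_m
    return {field: {"$nearSphere": clause}}


# Index specification for a 2dsphere index on the hypothetical "location" field.
index_spec = [("location", "2dsphere")]

# Find documents within 5 km of a point, nearest first.
query = near_sphere_query("location", -122.33, 47.61, max_m=5000)

# With a MongoDB-compatible driver such as PyMongo, these would be used as:
#   collection.create_index(index_spec)
#   results = collection.find(query)
print(query)
```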

ACID Transactions

ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions designed to help maintain data validity despite errors, power failures, and other mishaps. With the launch of support for MongoDB 4.0 compatibility, Amazon DocumentDB supports the ability to perform ACID transactions across multiple documents, statements, collections, and databases.
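
A multi-document transaction might be sketched as follows. The function expects a PyMongo-style client object (an assumption; no driver is imported here), and the database name "bank", the collection name "accounts", and the "balance" field are hypothetical. It relies on the driver pattern in which start_session() and start_transaction() are context managers that commit on a clean exit and abort if an exception is raised.

```python
def transfer(client, from_id, to_id, amount,
             db_name="bank", coll_name="accounts"):
    """Move `amount` between two documents inside one transaction.

    `client` is assumed to behave like pymongo.MongoClient.
    """
    accounts = client[db_name][coll_name]
    with client.start_session() as session:
        # Both updates commit together, or neither is applied.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id},
                {"$inc": {"balance": -amount}},
                session=session,
            )
            accounts.update_one(
                {"_id": to_id},
                {"$inc": {"balance": amount}},
                session=session,
            )
```

Passing the session to every operation is what ties the two updates to the same transaction; an operation issued without it would run outside the transaction.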

Migration support

Customers can migrate their MongoDB databases on-premises or on Amazon EC2 to Amazon DocumentDB with minimal downtime using the AWS Database Migration Service (DMS). With DMS, customers can migrate from a MongoDB replica set or from a sharded cluster to Amazon DocumentDB.

Managed

Provisioning and setup

Get started with Amazon DocumentDB by launching a new Amazon DocumentDB cluster using the AWS Management Console. Amazon DocumentDB instances are pre-configured with parameters and settings appropriate for the instance class selected. Customers can launch a cluster and connect the application without additional configuration.

Monitoring and Metrics

Amazon DocumentDB is designed to provide Amazon CloudWatch metrics for cloud database instances. Customers can use the AWS Management Console to view over 40 key operational metrics for the cluster, including compute, memory, storage, query throughput, MongoDB opcounters, and active connections.

Software Patching

Amazon DocumentDB is designed to keep customers’ databases up to date with the latest patches. Customers can control if and when the cluster is patched via Database Engine Version Management.

Security

Network Isolation

Amazon DocumentDB runs in Amazon Virtual Private Cloud (VPC), which helps customers isolate the cluster in the virtual network and connect to on-premises IT infrastructure using encrypted IPsec virtual private networks (VPNs). In addition, using Amazon DocumentDB’s VPC configuration, customers can configure firewall settings and control network access to the cluster.

Authorization

Amazon DocumentDB supports role-based access control (RBAC) with built-in roles. RBAC helps customers to enforce least privilege by restricting the actions that users are authorized to perform. Amazon DocumentDB is integrated with AWS Identity and Access Management (IAM) and helps provide customers the ability to control the actions that AWS IAM users and groups can take on specific Amazon DocumentDB resources, including clusters, instances, snapshots, and parameter groups. In addition, customers can tag Amazon DocumentDB resources, and control the actions that IAM users and groups can take on groups of resources that have the same tag (and tag value).

Encryption

Amazon DocumentDB allows customers to encrypt databases using keys created and controlled through AWS Key Management Service (KMS). On a cluster running with Amazon DocumentDB encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster. By default, connections between a client and Amazon DocumentDB are encrypted-in-transit with TLS.

Availability

Global Clusters

Amazon DocumentDB Global Clusters are designed to provide disaster recovery from Region-wide outages and to enable low-latency global reads. Global Clusters replicate data to clusters in up to 5 AWS Regions with minimal impact on performance.

Instance Monitoring and Repair

The health of each Amazon DocumentDB cluster and its instances is continuously monitored. If the instance powering your database fails, the instance and associated processes are restarted. Amazon DocumentDB recovery does not require the potentially lengthy replay of database redo logs, so instance restart times are fast. It also isolates the database cache from database processes, allowing the cache to survive a database restart.

Multi-AZ Deployments with Read Replicas

If an instance fails, Amazon DocumentDB is designed to automate failover to one of up to 15 Amazon DocumentDB replicas created in any of three Availability Zones. If no Amazon DocumentDB replicas have been provisioned when a failure occurs, Amazon DocumentDB will attempt to create a new instance for customers.

Fault-tolerant Storage

Each 10 GB portion of the storage volume is replicated six ways, across three Availability Zones (AZs). Amazon DocumentDB uses storage that is designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Amazon DocumentDB’s storage data blocks and disks are continuously scanned for errors and replaced.
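
The tolerance figures above can be expressed as a small check. This is only a restatement of the stated design goals (six copies, up to two lost for writes, up to three lost for reads), not a description of the storage layer's internals; the function name and return shape are hypothetical.

```python
TOTAL_COPIES = 6          # each portion is replicated six ways
MAX_LOST_FOR_WRITES = 2   # write availability unaffected up to this many lost copies
MAX_LOST_FOR_READS = 3    # read availability unaffected up to this many lost copies


def unaffected_operations(copies_lost):
    """Report which operations stay unaffected for a given number of lost copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return {
        "writes": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads": copies_lost <= MAX_LOST_FOR_READS,
    }


for lost in range(5):
    print(lost, unaffected_operations(lost))
```

With two copies in each of three Availability Zones, losing a whole AZ removes two copies, which stays within the write tolerance.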

Continuous, Incremental Backups and Point-in-time Restore

Amazon DocumentDB's simple database backup capability enables point-in-time recovery for clusters. This is designed to allow customers to restore the cluster to any second during the retention period, up until the last five minutes. The backup retention period can be configured up to thirty-five days. Automatic backups are stored in Amazon Simple Storage Service (S3), which is designed for extremely high durability. Amazon DocumentDB backups are automatic, incremental, and continuous, and have virtually no impact on cluster performance.
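
To make the restore window concrete, the sketch below computes the approximate range of restorable timestamps from the two figures in the paragraph above: the configured retention period and the five-minute trailing gap. The function names are hypothetical, the one-day lower bound on retention is an assumption, and the service itself is the authority on the actual earliest and latest restorable times.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35           # upper limit stated in the text
TRAILING_GAP = timedelta(minutes=5)


def restorable_window(now, retention_days):
    """Return (earliest, latest) restorable times, approximated from the text."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:  # lower bound is an assumption
        raise ValueError("retention_days must be between 1 and 35")
    return now - timedelta(days=retention_days), now - TRAILING_GAP


def is_restorable(target, now, retention_days):
    """Check whether `target` falls inside the approximate restore window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
earliest, latest = restorable_window(now, 7)
print(earliest.isoformat(), latest.isoformat())
```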

Cluster Snapshots

Cluster snapshots are user-initiated backups of the cluster stored in Amazon S3 that will be kept until explicitly deleted. They leverage the incremental snapshots to reduce the time and storage required. Customers can create a new cluster from a Cluster Snapshot whenever desired.

Generative AI and machine learning

Amazon DocumentDB offers capabilities to enable machine learning (ML) and generative artificial intelligence (AI) models to work with data stored in Amazon DocumentDB in real time. These are designed to help reduce the time customers spend managing separate infrastructure, writing code to connect with another service, and duplicating data from their primary database.

With vector search for Amazon DocumentDB, customers can store, index, and search vectors with fast response times. A vector is a numerical representation of the semantic meaning of unstructured data such as text, images, and video. Customers can store vectors from Amazon Bedrock, Amazon SageMaker, and other third-party or proprietary models.
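
Conceptually, a vector search ranks stored vectors by their similarity to a query vector. The sketch below does this by brute force with cosine similarity in plain Python. It illustrates the idea only and is not how Amazon DocumentDB indexes or searches vectors; the document names and two-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the names of the k stored vectors most similar to `query`."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy embeddings; real embeddings typically have hundreds of dimensions.
stored = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
print(top_k([1.0, 0.0], stored, 2))
```

A brute-force scan compares the query against every stored vector, which is why a vector index is used in practice to keep response times low as the collection grows.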

No-code machine learning with Amazon DocumentDB and Amazon SageMaker Canvas

Amazon DocumentDB integrates with Amazon SageMaker Canvas, allowing customers to build generative AI applications using data stored in Amazon DocumentDB. The in-console integration is designed to help customers accelerate AI/ML development with a low-code/no-code (LCNC) experience.

Customers can build AI/ML models for classic use cases or create generative AI solutions within SageMaker Canvas.

Additional Information

For additional information about service controls, security features and functionalities, including, as applicable, information about storing, retrieving, modifying, restricting, and deleting data, please see https://docs.aws.amazon.com/index.html. This additional information does not form part of the Documentation for purposes of the AWS Customer Agreement available at http://aws.amazon.com/agreement, or other agreement between you and AWS governing your use of AWS’s services.