AWS Database Blog
Introducing Client-Side Field Level Encryption and MongoDB 5.0 API compatibility in Amazon DocumentDB
Amazon DocumentDB (with MongoDB compatibility) is a scalable, highly durable, and fully managed database service for operating mission-critical, MongoDB-compatible, JSON-based workloads. On March 2, 2023, Amazon DocumentDB launched support for Client-Side Field Level Encryption (CSFLE), MongoDB 5.0 API compatibility, new aggregation operators, and other enhancements.
In this post, we summarize what’s new in Amazon DocumentDB and show an example of how to encrypt sensitive data in your application with CSFLE.
What’s new in Amazon DocumentDB 5.0?
Amazon DocumentDB 5.0 offers the following enhancements:
- Client-side field level encryption – With the support for CSFLE, you can now selectively encrypt sensitive data in-application using AWS Key Management Service (AWS KMS) before it is sent to the database. This is in addition to the existing features available for encrypting data at rest and in transit.
- New operators – We have added support for two new aggregation operators, $dateAdd and $dateSubtract, and have updated behaviors of already supported operators to be compatible with MongoDB 5.0 API. These new operators are available in Amazon DocumentDB 5.0 and Elastic Clusters. For more information, see Supported MongoDB APIs, Operations, and Data Types.
- Index enhancements – You now have the ability to use indexes with the $elemMatch operator. As a result, queries with $elemMatch will now result in index scans, providing better performance for queries involving arrays.
- Storage limit increase – We have increased the volume storage limit to 128 TiB from the previous limit of 64 TiB. This increase is applicable to all Amazon DocumentDB instance-based clusters (including 3.6 and 4.0 clusters) and Elastic Clusters. Now, each shard in Amazon DocumentDB Elastic Clusters will have a maximum storage capacity of 128 TiB. You pay only for the storage and I/O that your Amazon DocumentDB cluster consumes, and you don’t need to provision these resources in advance. For existing clusters, the storage limit increase gets applied automatically and requires no action to be taken.
For the complete list of changes, see the release notes.
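To make the new operators more concrete, the following is a minimal sketch of how $dateAdd, $dateSubtract, and $elemMatch might be expressed from Python. The field names (invoiceDate, dueDate, items, sku, qty) and the helper names are hypothetical and are not part of this post's sample application.

```python
def invoice_dates_pipeline(payment_days: int, reminder_days: int) -> list:
    """Build an aggregation pipeline that uses $dateAdd and $dateSubtract.

    The invoiceDate and dueDate fields are assumed to hold BSON dates.
    """
    return [
        {
            "$project": {
                "invoiceDate": 1,
                # invoiceDate plus N days
                "paymentDue": {
                    "$dateAdd": {
                        "startDate": "$invoiceDate",
                        "unit": "day",
                        "amount": payment_days,
                    }
                },
                # dueDate minus N days
                "reminderDate": {
                    "$dateSubtract": {
                        "startDate": "$dueDate",
                        "unit": "day",
                        "amount": reminder_days,
                    }
                },
            }
        }
    ]


def items_filter(sku: str, min_qty: int) -> dict:
    """Build a query that matches one array element meeting both conditions."""
    return {"items": {"$elemMatch": {"sku": sku, "qty": {"$gte": min_qty}}}}
```

With a PyMongo collection object, these would be passed to collection.aggregate(...) and collection.find(...) respectively.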
Getting started with CSFLE
Applications that deal with sensitive data, such as personally identifiable information (PII), can now choose to encrypt only the sensitive fields before storing them in Amazon DocumentDB, improving their security posture by protecting data from data breaches and unauthorized access while also complying with privacy and regulatory requirements.
For the full list of operations that are and are not supported on encrypted fields, see Client-side field level encryption.
Solution overview
Configuring and using CSFLE in Amazon DocumentDB comprises four steps:
- Create a customer managed key (CMK) using AWS KMS.
- Create an AWS Identity and Access Management (IAM) policy and associate it with the user.
- Generate a data encryption key (DEK).
- Perform read and write operations.
Prerequisites
To implement this solution, you need:
- A TLS-enabled Amazon DocumentDB 5.0 cluster (instance-based or elastic). We recommend using TLS to encrypt data in transit as a security best practice. You can use an existing cluster or create a new one.
- An AWS Secrets Manager secret where Amazon DocumentDB credentials are stored. Using Secrets Manager to manage your database credentials is another recommended best practice. For more information on using Secrets Manager with Amazon DocumentDB, see How Amazon DocumentDB (with MongoDB compatibility) uses AWS Secrets Manager.
- An IAM user. You can use an existing IAM user or create a new user. For this post, we use an existing cluster and IAM user democsfle.
Create a CMK using AWS KMS
To create your encryption key, follow these steps:
- On the AWS KMS console, choose Customer-managed keys in the navigation pane.
- Choose Create key.
- For Key type, select Symmetric.
- For Key usage, select Encrypt and decrypt.
- Choose Next.
- Enter an alias, such as csflecmk, and an optional description.
- Choose Next.
- For Key administrators, select the IAM users and roles that can administer the key.
- Choose Next.
- Optionally, select the IAM users and roles that can use the key in cryptographic operations.
- Choose Next.
- On the Review and create page, review the choices you made and choose Finish.
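If you prefer to script this step, the console flow above can be approximated with the AWS SDK for Python. The following is a minimal sketch, not the method used in this post: the function name create_cmk is hypothetical, and it does not configure key administrators or key users, which the console steps cover.

```python
def create_cmk(kms_client, alias: str = "csflecmk", description: str = "") -> str:
    """Create a symmetric encrypt/decrypt KMS key and give it an alias.

    kms_client is expected to be a boto3 KMS client, for example
    boto3.client("kms", region_name=...). Returns the ARN of the new key.
    """
    response = kms_client.create_key(
        KeySpec="SYMMETRIC_DEFAULT",   # symmetric key
        KeyUsage="ENCRYPT_DECRYPT",    # encrypt and decrypt
        Description=description,
    )
    metadata = response["KeyMetadata"]
    # KMS alias names must start with the "alias/" prefix.
    kms_client.create_alias(
        AliasName=f"alias/{alias}",
        TargetKeyId=metadata["KeyId"],
    )
    return metadata["Arn"]
```

The returned ARN is the value you provide later as the policy resource and as kmsKeyArn.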
Create an IAM policy and associate it with an IAM user
You need an IAM policy that allows the IAM user democsfle to use the key created in the previous step. For this post, we name the policy csfledemopolicy, and specify it to allow encrypt and decrypt actions for the key.
- On the IAM console, choose Policies in the navigation pane.
- Choose Create policy.
- On the JSON tab, enter the following policy (provide the ARN of the CMK you created):
- Choose Next: Tags.
- Optionally, add any tags to your policy.
- Choose Next: Review.
- Enter a name and optional description.
- Review the policy and choose Create policy.
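As a reference for the JSON tab step, a policy that allows only encrypt and decrypt actions on a single key might look like the output of the following sketch. This is an assumption based on the description above, not necessarily the exact policy used for this post. The helper name build_csfle_policy is hypothetical, and you supply the ARN of your own CMK.

```python
import json


def build_csfle_policy(cmk_arn: str) -> str:
    """Render an IAM policy document that allows encrypt/decrypt on one KMS key."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Encrypt", "kms:Decrypt"],
                # Scope the permissions to the CMK created earlier.
                "Resource": cmk_arn,
            }
        ],
    }
    return json.dumps(policy, indent=4)
```

Calling print(build_csfle_policy("<your CMK ARN>")) produces JSON that you can paste into the policy editor.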
This completes the IAM policy creation. The next step is to add the IAM policy csfledemopolicy to the IAM user democsfle.
- Choose Users in the navigation pane.
- Search for and choose the user democsfle.
- In the Permissions policies section, choose Add permissions.
- For Permissions options, select Attach policies directly.
- Search for and select the policy csfledemopolicy.
- Choose Next.
- Review the permission summary and choose Add permissions.
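The same policy creation and attachment can also be scripted. The following is a rough sketch using the AWS SDK for Python. The function name is hypothetical, and iam_client is assumed to be a boto3 IAM client created with credentials that are allowed to manage IAM policies.

```python
def create_and_attach_policy(
    iam_client,
    policy_json: str,
    policy_name: str = "csfledemopolicy",
    user_name: str = "democsfle",
) -> str:
    """Create a customer managed IAM policy and attach it to an IAM user.

    policy_json is the policy document as a JSON string.
    Returns the ARN of the created policy.
    """
    response = iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=policy_json,
    )
    policy_arn = response["Policy"]["Arn"]
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    return policy_arn
```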
Note: You may also need to provide required permissions to the IAM user to read the secret containing Amazon DocumentDB credentials from Secrets Manager.
Generate a data encryption key
We use a DEK to encrypt data in-application before it is sent to the database. The DEK is stored in a key vault collection of your choice in Amazon DocumentDB. DEKs are encrypted with a CMK. You can generate multiple DEKs to encrypt multiple fields. It’s important to note that you can’t decrypt your encrypted data without the DEK that was used to encrypt it.
For this task, you create the Python script democsfle.py with functions to retrieve credentials and generate a DEK. The following table lists the attributes to use in the functions.
Type | Attribute | Description |
AWS KMS Key details | kmsKeyArn | The ARN of the CMK that we created. |
AWS KMS Key details | awsRegion | The Region in which the CMK is available. |
DEK details | key_vault_namespace | The namespace (<database>.<collection>) of the key vault in which the DEK is stored in Amazon DocumentDB. For example, we use encr.dekKeys. |
DEK details | keyAltName | The name for the DEK key that is going to be generated by the script and is stored as a document in key_vault_namespace. This value should be unique in the key_vault_namespace. For example, we use demo_encr_email as the DEK in this post. |
Amazon DocumentDB Credentials | secret_name | The name of the secret in Secrets Manager where Amazon DocumentDB credentials are stored. |
Amazon DocumentDB Credentials | tlsCAFile | The SSL certificate used to encrypt data in transit. For this post, this is rds-combined-ca-bundle.pem. You can download this certificate and place it in the script. For more information, see Connecting with TLS Enabled. |
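One way to keep these attributes together in the script is a small settings object. The following is only a sketch. The class name CsfleSettings and its snake_case field names are hypothetical, while the default values are the example values from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CsfleSettings:
    """Attributes from the table, grouped in one place."""

    kms_key_arn: str                               # kmsKeyArn
    aws_region: str                                # awsRegion
    secret_name: str                               # secret_name
    key_vault_namespace: str = "encr.dekKeys"      # <database>.<collection>
    key_alt_name: str = "demo_encr_email"          # keyAltName
    tls_ca_file: str = "rds-combined-ca-bundle.pem"

    def key_vault_parts(self) -> tuple:
        """Split the key vault namespace into (database, collection)."""
        database, collection = self.key_vault_namespace.split(".", 1)
        return database, collection
```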
Complete the following steps to create functions to retrieve database credentials from Secrets Manager and generate a DEK:
- In your preferred text editor, create the file democsfle.py and enter the following code:
Note: You may need to configure your AWS CLI for the IAM user.
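As an illustration of the credential retrieval function, the following is a minimal sketch rather than the exact code from this post. It assumes the secret is stored as a JSON string with username, password, host, and port keys. The function name get_credentials and the optional client parameter are hypothetical.

```python
import json


def get_credentials(secret_name: str, region_name: str, client=None) -> tuple:
    """Fetch Amazon DocumentDB connection details from Secrets Manager.

    Returns (username, password, host, port). Assumes the secret value is a
    JSON document containing those four keys.
    """
    if client is None:
        # Imported here so the function can also be exercised with a stub client.
        import boto3

        client = boto3.client("secretsmanager", region_name=region_name)

    response = client.get_secret_value(SecretId=secret_name)
    secret = json.loads(response["SecretString"])
    return (
        secret["username"],
        secret["password"],
        secret["host"],
        int(secret["port"]),
    )
```

Passing a client explicitly is optional. By default the function creates one from the AWS credentials configured for the IAM user.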
- Next, you need a function to generate the DEK. Create a function named generate_keys() and append it to the same file (democsfle.py):
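The following is a sketch of what generate_keys() could look like with PyMongo's explicit encryption API. It is written under assumptions and is not the exact code from this post: the parameter list and the two helper functions are hypothetical, the pymongo and pymongocrypt packages are assumed to be installed, and the AWS access key pair of the IAM user is passed in explicitly.

```python
def build_kms_providers(access_key_id: str, secret_access_key: str) -> dict:
    """KMS provider settings for the 'aws' provider."""
    return {"aws": {"accessKeyId": access_key_id, "secretAccessKey": secret_access_key}}


def build_master_key(aws_region: str, kms_key_arn: str) -> dict:
    """Identify the CMK that wraps (encrypts) the DEK."""
    return {"region": aws_region, "key": kms_key_arn}


def generate_keys(host, port, username, password, tls_ca_file,
                  kms_providers, master_key, key_vault_namespace, key_alt_name):
    """Create a DEK in the key vault collection, or return the existing one."""
    # Driver imports are local so the helpers above work without the driver.
    from bson.binary import UuidRepresentation
    from bson.codec_options import CodecOptions
    from pymongo import MongoClient
    from pymongo.encryption import ClientEncryption

    client = MongoClient(
        host, port, username=username, password=password,
        tls=True, tlsCAFile=tls_ca_file, retryWrites=False,
    )
    client_encryption = ClientEncryption(
        kms_providers, key_vault_namespace, client,
        CodecOptions(uuid_representation=UuidRepresentation.STANDARD),
    )
    try:
        # keyAltName should be unique, so reuse a DEK that already has it.
        db_name, coll_name = key_vault_namespace.split(".", 1)
        existing = client[db_name][coll_name].find_one({"keyAltNames": key_alt_name})
        if existing is not None:
            return existing["_id"]
        return client_encryption.create_data_key(
            "aws", master_key=master_key, key_alt_names=[key_alt_name]
        )
    finally:
        client_encryption.close()
        client.close()
```

Checking for an existing document before creating a key is one simple way to keep keyAltName unique in the key vault.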
Perform read and write operations
In this task, you create a function that inserts documents with encrypted and unencrypted fields into a collection and reads the data back using the DEK created by the function generate_keys(). In addition to the attributes used in the previous functions, this function takes a user namespace (<database>.<collection>) in which to store the user data. We use gamesDB.users as the namespace where the user data is stored.
Append the following function to perform read and write operations to democsfle.py:
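A read and write function along these lines might look like the following sketch. It is an illustration, not the exact code from this post. The function name, the email field, and the choice of deterministic encryption are assumptions. client_encryption is expected to be a pymongo.encryption.ClientEncryption instance and collection a PyMongo collection such as client["gamesDB"]["users"].

```python
# Same string value as pymongo.encryption.Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"


def write_and_read_user(client_encryption, collection, key_alt_name, user):
    """Insert a user document with an encrypted email field, then read it back.

    Returns (stored, decrypted): the document as stored in the database and a
    copy with the email field decrypted in the application.
    """
    document = dict(user)
    # Encrypt only the sensitive field. Other fields are stored as-is.
    document["email"] = client_encryption.encrypt(
        user["email"], DETERMINISTIC, key_alt_name=key_alt_name
    )
    inserted_id = collection.insert_one(document).inserted_id

    # The database returns the encrypted value. Decryption happens client-side.
    stored = collection.find_one({"_id": inserted_id})
    decrypted = dict(stored)
    decrypted["email"] = client_encryption.decrypt(stored["email"])
    return stored, decrypted
```

Printing both return values shows the difference between what is stored in Amazon DocumentDB and what the application sees after decryption.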
This completes the creation of the script democsfle.py. The next step is to run the script to see it in action.
- Generate the DEK with the following code:
- Perform read and write operations:
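How you invoke the two steps depends on how you structure democsfle.py. One option, shown here purely as a hypothetical sketch, is a small command-line dispatcher. The step names generate-dek and read-write, and the function run_step, are assumptions and are not taken from this post.

```python
import argparse


def run_step(argv, handlers):
    """Run one named step of the script.

    handlers maps a step name to a zero-argument callable, for example
    {"generate-dek": ..., "read-write": ...}.
    """
    parser = argparse.ArgumentParser(prog="democsfle.py")
    parser.add_argument("step", choices=sorted(handlers))
    args = parser.parse_args(argv)
    return handlers[args.step]()
```

With the handlers wired to generate_keys() and your read and write function, you would run python democsfle.py generate-dek first, followed by python democsfle.py read-write.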
Clean up
If you created a new Amazon DocumentDB cluster, you can stop the cluster or delete the cluster. If you created a new IAM user, you can deactivate or delete the user if you’re not using that user elsewhere.
Summary
In this post, we introduced the new features in Amazon DocumentDB 5.0 and walked through an example of using client-side field level encryption in your application. For more information about recent launches and blog posts, see Amazon DocumentDB (with MongoDB compatibility) resources.
About the authors
Kaarthiik Thota is a Senior DocumentDB Specialist Solutions Architect at AWS based out of London. He is passionate about database technologies and enjoys helping customers solve problems and modernize applications leveraging NoSQL databases. Before joining AWS, he worked extensively with relational databases, NoSQL databases, and Business Intelligence technologies for more than 14 years.