Performing SQL database client-side encryption for multi-Region high availability

Important Update:
On June 16, 2021, AWS Key Management Service (AWS KMS) introduced multi-Region keys, a new capability that lets you replicate keys from one AWS Region into another. With multi-Region keys, you can more easily move encrypted data between Regions without having to decrypt and re-encrypt with different keys in each Region. Multi-Region keys are supported for client-side encryption in the AWS Encryption SDK, Amazon S3 Encryption Client, and Amazon DynamoDB Encryption Client. They simplify any process that copies protected data into multiple Regions, such as disaster recovery and backup, DynamoDB global tables, or digital signature applications that require the same signing key in multiple Regions. Check the KMS User Guide for more information.

Amazon Relational Database Service (RDS) and Amazon Aurora natively provide encryption at rest to protect the underlying storage of database instances, automated backups, Read Replicas, and snapshots. However, some customers may have greater data protection requirements which require encrypting data in use.

For example, encryption is required where tokenization solutions do not fit, such as when securely storing and then reading a primary account number.

In another example, customers must prevent insiders, such as database administrators, from viewing sensitive information (for example, a Social Security number or bank account number) stored in database columns.

In these encryption scenarios, you do not need to run SQL queries with WHERE clause predicates over the encrypted column data. You can use client-side encryption before persisting to a SQL database to enable column-level encryption.

In this post, I walk through one possible approach to client-side encryption with SQL databases. Your encryption keys are protected by AWS Key Management Service (KMS), enabling you to control the keys needed for decryption. I then demonstrate an example application that performs client-side encryption before writing to an Amazon Aurora MySQL database engine.

Overview of AWS encryption concepts

Before I walk through the solution overview, I review a few features of AWS KMS and the AWS Encryption SDK.

AWS KMS enables you to use key policies to control the use of the customer managed customer master keys (CMKs) needed for encryption and decryption. CMKs never leave AWS KMS unencrypted and are not exportable. In addition, AWS KMS provides an audit trail of key usage in AWS CloudTrail.

The AWS Encryption SDK is a client-side encryption library that enables developers to focus on the core functionality of their application while adhering to security best practices. The AWS Encryption SDK also integrates with KMS.

The encryption flow is as follows:

1.) The AWS Encryption SDK generates a data key using AWS KMS.
2.) The KMS API call returns the plaintext data key and the same data key encrypted under a CMK.
3.) The plaintext data key is used to encrypt the plaintext data and the encrypted data key is stored with the encrypted data.

The CMK is protected by AWS KMS. This strategy of encrypting the data key under another key is known as envelope encryption. The AWS Encryption SDK simultaneously provides confidentiality, data integrity, and authenticity assurances on the data, a combination known as authenticated encryption.

The AWS Encryption SDK can also perform an additional integrity and authenticity check on the encrypted data by specifying non-secret data, also known as additional authenticated data. This additional authenticated data is specified as an optional set of key-value pairs called the encryption context.

Do not include any sensitive or private information in the encryption context. Be aware that the encryption context is not secret and is not an access-control mechanism. The encryption context is a means of authenticating the data, not the caller. While AWS KMS does not store the encryption context, the AWS Encryption SDK does store the encryption context in plaintext in the encrypted message format. Also, the encryption context is logged in plaintext by AWS CloudTrail.

For example, say an account number (unique value) is stored in the encryption context. Decryption is only successful when the account number is provided as an additional integrity and authentication check to ensure that the encrypted data is untampered, as shown in the following diagram.

Application architecture and data flow

The approach is as follows:

A data key is generated using AWS KMS. The data key is encrypted under multiple CMKs (one for each Region) to provide higher availability for multi-region deployments. I cover the design considerations of a multi-region deployment in the following sections.
The data key then encrypts the plaintext column value to produce the encrypted column value. The encrypted data key is stored with the encrypted column value.
The encrypted message composed of the encrypted column value and encrypted data key is stored in the SQL database. A single data structure is used and follows the AWS Encryption SDK message format.
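The approach above can be sketched without any AWS dependency. The following is a minimal illustration, not the AWS Encryption SDK: local random byte strings stand in for the per-Region CMKs (real CMKs never leave AWS KMS), a toy HMAC-based cipher stands in for AES-GCM, and a JSON document stands in for the SDK message format. The function names and Region names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from one key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR with an HMAC counter keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


def _tag(key: bytes, nonce: bytes, aad: bytes, body: bytes) -> str:
    """Authentication tag over the nonce, the non-secret context, and the ciphertext."""
    material = nonce + len(aad).to_bytes(4, "big") + aad + body
    return hmac.new(_subkey(key, b"mac"), material, hashlib.sha256).hexdigest()


def _seal(key: bytes, plaintext: bytes, aad: bytes = b"") -> dict:
    nonce = secrets.token_bytes(16)
    body = _keystream_xor(_subkey(key, b"enc"), nonce, plaintext)
    return {"nonce": nonce.hex(), "body": body.hex(), "tag": _tag(key, nonce, aad, body)}


def _open(key: bytes, sealed: dict, aad: bytes = b"") -> bytes:
    nonce, body = bytes.fromhex(sealed["nonce"]), bytes.fromhex(sealed["body"])
    if not hmac.compare_digest(_tag(key, nonce, aad, body), sealed["tag"]):
        raise ValueError("authentication failed")
    return _keystream_xor(_subkey(key, b"enc"), nonce, body)


def encrypt_column(plaintext: bytes, region_keys: dict, context: dict) -> str:
    """Encrypt one column value; wrap the data key once per Region."""
    data_key = secrets.token_bytes(32)
    aad = json.dumps(context, sort_keys=True).encode()
    return json.dumps({
        "context": context,  # stored in plaintext, so never put secrets here
        "encrypted_data_keys": {r: _seal(k, data_key) for r, k in region_keys.items()},
        "ciphertext": _seal(data_key, plaintext, aad),
    })


def decrypt_column(message: str, region: str, master_key: bytes) -> bytes:
    """Decrypt using only the local Region's wrapped copy of the data key."""
    parsed = json.loads(message)
    data_key = _open(master_key, parsed["encrypted_data_keys"][region])
    aad = json.dumps(parsed["context"], sort_keys=True).encode()
    return _open(data_key, parsed["ciphertext"], aad)


# Hypothetical Regions; each value stands in for a CMK held by AWS KMS.
region_keys = {"us-east-1": secrets.token_bytes(32), "us-west-2": secrets.token_bytes(32)}
stored = encrypt_column(b"1234", region_keys, {"account_number": "A-100"})
assert decrypt_column(stored, "us-west-2", region_keys["us-west-2"]) == b"1234"
```

Because the single stored value carries one wrapped data key per Region, a reader in either Region can decrypt with its local key alone, which is the property the rest of this post relies on.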

Multi-region deployment concepts

Some customers deploy their database in a multi-region architecture because their applications scale reads in a different Region or their applications have cross-region disaster recovery requirements.

Key management for a multi-region database deployment must be designed carefully. If data is encrypted under a master key in a single Region and then propagates to a different Region, decrypting the data key requires a cross-region AWS KMS call. The approach in this post instead encrypts the data key under a CMK in each Region.

Each Region’s encrypted data key is stored with the encrypted data in the AWS Encryption SDK encrypted message format, thus enabling you to avoid a cross-region AWS KMS call on decrypt. In addition, it is possible to avoid incurring a decryption dependency on a single Region because the data keys are encrypted for each Region.

Encrypting data under multi-region CMKs entails the following workflow. The application makes one KMS request to the local Region and then a subsequent AWS KMS request to each additional Region. Upon decryption, the application calls only to the local Region for decrypting the data key.
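To make the workflow concrete, the following back-of-the-envelope sketch estimates how many AWS KMS requests per second each Region receives. It assumes one data key per row with no data key caching; the function name, parameter names, and Region names are hypothetical.

```python
from typing import Dict, Sequence


def kms_requests_per_second(
    rows_written_per_second: float,
    rows_read_per_second: float,
    regions: Sequence[str],
    local_region: str,
) -> Dict[str, Dict[str, float]]:
    """Estimate per-Region KMS request rates for an application in local_region."""
    if local_region not in regions:
        raise ValueError("local_region must be one of the deployed regions")
    rates = {}
    for region in regions:
        rates[region] = {
            # Encrypting a row wraps the data key under every Region's CMK,
            # so each Region receives one request per written row.
            "encrypt": float(rows_written_per_second),
            # Decrypting a row calls only the Region where the reader runs.
            "decrypt": float(rows_read_per_second) if region == local_region else 0.0,
        }
    return rates


rates = kms_requests_per_second(200, 50, ["us-east-1", "us-west-2"], "us-east-1")
print(rates["us-west-2"])
```

Comparing these figures against the AWS KMS request quota in each Region gives an early indication of whether per-row encryption fits your workload.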

Design considerations

Before employing client-side encryption under multi-region CMKs, carefully evaluate the following considerations regarding space utilization, computational performance, and query restrictions.

Database schema design should treat the encrypted data as an unreadable blob that is associated with its specific column data. The AWS Encryption SDK message format adds at least 100 bytes, in addition to the size of the 256-bit AES GCM data keys, encryption context key-value tags, and ciphertext. Be sure to measure and evaluate the impact to the table schema design, CPU utilization, memory usage, disk space, and query response time.
The application must encrypt the data key under a CMK in each Region. This operation entails making a cross-region KMS request per row for each encryption operation. Measure and evaluate your application to determine if the per-row cross-region request latency is acceptable.
Depending on your use case, data key caching may be of limited use, because cached data keys are reused only when their encryption contexts match. For example, it is not possible to use data key caching if a unique value such as account number is used to authenticate encrypted account profile information. However, a unique key-value pair used in the encryption context provides integrity assurances and authenticates the encrypted data. Data key caching is always a trade-off between security and cost (for example, financial cost, latency, and integrity).
The per-row decryption operation makes a local Region API call to AWS KMS for decrypting the data key. Measure and evaluate your application to determine if the per-row data key decryption request latency is acceptable.
Also, avoid extensive reuse of data keys if data key caching is used. Any data key reuse is a compromise between security and cost (such as financial cost and latency cost). Carefully consider the maximum age, maximum number of messages, and maximum number of bytes that a cached data key can encrypt. For an in-depth look at data key caching, see AWS Encryption SDK: How to Decide if Data Key Caching is Right for Your Application.
The encrypted data stored in the table precludes queries that perform WHERE clause predicates on the encrypted column. In addition, indexes created over the ciphertext do not help, because the ciphertext preserves neither the ordering nor the equality of the plaintext values.
Also be aware of the AWS KMS API requests-per-second quota. AWS maintains service quotas for each account in each Region to help guarantee the availability of AWS resources and to minimize billing risks for new customers. You can request an increase to the AWS KMS requests-per-second quota through Service Quotas, which provides a self-service, centralized portal for managing your AWS quotas.
If the data elements placed into the encryption context are unique values used to look up the row, the encrypted data in the selected row can be authenticated with integrity assurances. Conversely, row integrity cannot be assured if only non-unique values, for example, month/date, are used to authenticate the encrypted data. In the example application, I use the account number as the unique index to look up the row and to authenticate the encrypted data and provide integrity assurances.
Using the encryption context to authenticate encrypted data provides integrity assurances only over the encrypted data. Row level integrity assurances are not provided over the unencrypted columns.
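The interaction between data key caching and the encryption context can be seen in a small model. This is an illustrative sketch, not the AWS Encryption SDK cache: the class name and its fields are hypothetical, and it only mirrors the three limits named above (maximum age, maximum number of messages, maximum number of bytes) plus the rule that a cached key is reused only for a matching encryption context.

```python
import secrets
import time


class DataKeyCacheModel:
    """Toy model of a data key cache keyed by encryption context."""

    def __init__(self, max_age_seconds, max_messages, max_bytes, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self._clock = clock
        self._entries = {}
        self.keys_generated = 0  # each one would be a KMS request per Region

    def get_data_key(self, encryption_context: dict, message_size: int) -> bytes:
        cache_id = tuple(sorted(encryption_context.items()))
        now = self._clock()
        entry = self._entries.get(cache_id)
        if entry is not None:
            expired = now - entry["created"] > self.max_age_seconds
            too_many = entry["messages"] + 1 > self.max_messages
            too_big = entry["bytes"] + message_size > self.max_bytes
            if expired or too_many or too_big:
                entry = None  # limits exceeded: stop reusing this key
        if entry is None:
            self.keys_generated += 1
            entry = {"key": secrets.token_bytes(32), "created": now,
                     "messages": 0, "bytes": 0}
            self._entries[cache_id] = entry
        entry["messages"] += 1
        entry["bytes"] += message_size
        return entry["key"]


# A unique account number per row means every row needs a fresh data key.
unique = DataKeyCacheModel(max_age_seconds=60, max_messages=100, max_bytes=10_000)
for n in range(5):
    unique.get_data_key({"account_number": str(n)}, message_size=16)
print(unique.keys_generated)  # 5: the cache never produced a hit

# A shared, non-unique context allows reuse, at the cost of weaker row binding.
shared = DataKeyCacheModel(max_age_seconds=60, max_messages=100, max_bytes=10_000)
for n in range(5):
    shared.get_data_key({"table": "account"}, message_size=16)
print(shared.keys_generated)  # 1
```

The two runs show the trade-off described in the list: a unique context gives per-row integrity but no cache benefit, while a shared context reduces KMS requests but no longer ties the ciphertext to a specific row.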

How client-side encryption works

The following steps outline how client-side encryption works.

Specify the KMS CMK ARNs from each Region in which the multi-region application is deployed. The data key is encrypted under the CMK for each Region.
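```
import aws_encryption_sdk

# KMSMasterKeyProvider is the AWS Encryption SDK 1.x class name; newer SDK
# versions name the equivalent provider StrictAwsKmsMasterKeyProvider.
master_key_encryption_provider = aws_encryption_sdk.key_providers.kms.KMSMasterKeyProvider(key_ids=[
    key_arn_region_1,
    key_arn_region_2,
])
```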
Instantiate the encryption context. The account number is used as the unique index to look up the encrypted row and to authenticate the encrypted data.
```
encryption_context={'account_number':acct.account_number}
```

Encrypt the column data. The data key is encrypted under each Region’s CMK and under the specified encryption context.

```
ciphertext, encryptor_header = aws_encryption_sdk.encrypt(
    source=acct.pin_number,
    key_provider=master_key_encryption_provider,
    encryption_context=encryption_context
)
```

Save the encrypted data key and encrypted column value together in the same column. The AWS Encryption SDK message format is a single data structure that contains both the ciphertext and all encrypted data keys. For details, see the AWS Encryption SDK message format reference.
```
# save the ciphertext using your database programming API
```
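As a minimal sketch of the save step, the following stores the encrypted message as an opaque blob keyed by the account number. It uses the standard-library sqlite3 module as a stand-in for Aurora MySQL so that it runs anywhere; the table name, column names, and helper functions are hypothetical, and a real deployment would use its own MySQL driver and schema.

```python
import sqlite3


def save_encrypted_pin(conn: sqlite3.Connection, account_number: str, ciphertext: bytes) -> None:
    """Store the whole encrypted message (ciphertext plus encrypted data keys) in one column."""
    conn.execute(
        "INSERT OR REPLACE INTO account (account_number, encrypted_pin) VALUES (?, ?)",
        (account_number, ciphertext),
    )
    conn.commit()


def load_encrypted_pin(conn: sqlite3.Connection, account_number: str) -> bytes:
    """Look the row up by the unencrypted unique index, never by the ciphertext."""
    row = conn.execute(
        "SELECT encrypted_pin FROM account WHERE account_number = ?",
        (account_number,),
    ).fetchone()
    if row is None:
        raise KeyError(account_number)
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, encrypted_pin BLOB NOT NULL)"
)
# Placeholder bytes; in the walkthrough this is the ciphertext returned by the encrypt step.
save_encrypted_pin(conn, "A-100", b"\x01\x80opaque-message-bytes")
assert load_encrypted_pin(conn, "A-100") == b"\x01\x80opaque-message-bytes"
```

The parameterized statements pass the ciphertext as binary data, so the database never needs to interpret or re-encode the message bytes.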

How client-side decryption works

The following steps outline how client-side decryption works.

Specify the KMS CMK ARN for the local Region in which the application is running. The AWS Encryption SDK matches the AWS KMS CMK ARN to the appropriate data key and decrypts using the local Region’s CMK.
```
master_key_decryption_provider = aws_encryption_sdk.key_providers.kms.KMSMasterKey(key_id=key_arn_local_region)
```

Decrypt the data.

```
decrypted_plaintext, decrypted_header = aws_encryption_sdk.decrypt(
    source=encrypted_pin_number,
    key_provider=master_key_decryption_provider
)
```

Instantiate the encryption context. Integrity assurances regarding the encrypted data are provided as the account number is used as the unique index to retrieve the encrypted row value.
```
expected_encryption_context={'account_number':acct.account_number}
```

Verify the encryption context and alert the security operations center if there is a mismatch.

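```
# every expected key-value pair must appear in the decrypted header's encryption context
encryption_context_passed = all(
    pair in decrypted_header.encryption_context.items()
    for pair in expected_encryption_context.items()
)
if not encryption_context_passed:
    cloudwatch.put_metric_alarm("....")
```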
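The check above can be wrapped in a small helper that is easy to unit test. This is a sketch with hypothetical function names. It deliberately tests that the expected pairs are a subset of the decrypted context rather than testing for equality, because the context returned on decrypt may contain extra pairs beyond the ones the application supplied (for example, a public key entry added by the SDK when a signing algorithm suite is used).

```python
def encryption_context_matches(expected: dict, actual: dict) -> bool:
    """Return True when every expected key-value pair is present in actual."""
    return all(key in actual and actual[key] == value for key, value in expected.items())


def require_encryption_context(expected: dict, actual: dict) -> None:
    """Raise instead of returning plaintext to the caller on a mismatch."""
    if not encryption_context_matches(expected, actual):
        raise ValueError("encryption context mismatch")


expected = {"account_number": "A-100"}
print(encryption_context_matches(expected, {"account_number": "A-100", "extra": "x"}))  # True
print(encryption_context_matches(expected, {"account_number": "A-999"}))  # False
```

Raising on a mismatch keeps the decrypted value from being used by accident; the alerting call can then live in the exception handler.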

Example application architecture and data flow

To further demonstrate client-side encryption in action, this post includes an example application that performs client-side encryption using AWS Encryption SDK and AWS KMS. A user fills out a form and then submits the form to a web application. The web application performs client-side encryption for the sensitive field. Another form validates and matches the user’s original input. This validation is performed by retrieving the ciphertext from the database, decrypting the value, and then comparing to the user’s input. The example application is only intended to demonstrate client-side encryption functionality and is not intended for production use.

The web application is deployed as a container to AWS Fargate and reads database credentials from AWS Secrets Manager. AWS Fargate is a managed container service that enables you to run containers without having to manage servers or clusters. AWS Secrets Manager helps you protect secrets and to easily rotate and manage secrets such as database credentials.

In this example, AWS CloudFormation takes care of creating, configuring, and provisioning AWS Fargate and Amazon Aurora resources. AWS CloudFormation enables you to describe and manage infrastructure using code in a simple text file. Nested stacks provide dedicated templates for Fargate, Aurora, and Amazon VPC. Aurora is a purpose-built MySQL and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases, with the simplicity and cost-effectiveness of open source databases.

The following diagram depicts the architecture and data flow of this sample application:

Deploy the AWS CloudFormation solution

The CloudFormation templates and sample application are available on GitHub.

Before deploying the CloudFormation solution, create an AWS CodeCommit repository in both the primary and secondary Regions so that AWS CodePipeline can build the Docker image. You can clone the GitHub repo and push it to CodeCommit in each Region.

Install the AWS Serverless Application Model (AWS SAM) CLI to deploy the serverless application dependencies included in the example application. AWS SAM templates are an extension of AWS CloudFormation templates. Instructions for installing the AWS SAM CLI are available in the AWS SAM Developer Guide.

Also, create an AWS KMS key administrator to administer the CMK. For example, create a role named ‘KeyAdministratorRole’ with the following IAM Policy.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
                "kms:Update*",
                "kms:Revoke*",
                "kms:Disable*",
                "kms:Get*",
                "kms:Delete*",
                "kms:ScheduleKeyDeletion",
                "kms:CancelKeyDeletion",
                "kms:TagResource",
                "kms:UntagResource"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
```

To deploy the CloudFormation solution, follow these steps.

1.) Select two Regions, one as primary and the other as secondary.

2.) Deploy the primary Region.

Build and package the AWS CloudFormation nested template. Specify the Amazon S3 bucket. Update the parameters for desired Regions as needed.

```
sam build --template cfn.iamauthentication.yaml --build-dir ../tmp/iamauthentication-build-dir
sam build --template cfn.fargate.yaml --build-dir ../tmp/fargate-build-dir
sam build --template cfn.client-side-encryption.yaml --build-dir ../tmp/client-side-encryption-build-dir

sam package \
    --s3-bucket "$BUCKET" \
    --output-template-file ../tmp/packaged-cfn.client-side-encryption.yaml \
    --template-file ../tmp/client-side-encryption-build-dir/template.yaml
```

| Input parameter | Input parameter description |
| --- | --- |
| SecondaryRegion | Secondary Region that contains the secondary AWS KMS CMK and the Aurora Read Replicas. |
| KeyAdministratorRole | This is the IAM Role that is managing the CMK. Be sure that the key policy that you create allows the current user to administer the KMS CMK. |

Deploy the nested CloudFormation stack. Choose an appropriate name for the stack. Creating the AWS services may take around 30 minutes.

```
sam deploy \
    --stack-name "$STACK_NAME" \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND \
    --template-file ../tmp/packaged-cfn.client-side-encryption.yaml
```

3.) Deploy the secondary Region.

Build and package the nested CloudFormation template. Specify the Amazon S3 bucket. Update the parameters for desired Regions as needed.

```
sam build --template cfn.iamauthentication.yaml --build-dir ../tmp/iamauthentication-build-dir
sam build --template cfn.client-side-encryption-replica.yaml --build-dir ../tmp/client-side-encryption-replica-build-dir

sam package \
    --s3-bucket "$BUCKET" \
    --output-template-file ../tmp/packaged-cfn.client-side-encryption-replica.yaml \
    --template-file ../tmp/client-side-encryption-replica-build-dir/template.yaml
```

| Input parameter | Input parameter description |
| --- | --- |
| PrimaryRegion | Primary Region that contains the primary KMS CMK and the primary Aurora cluster for reads and writes. |
| KeyAdministratorRole | This is the IAM Role that is managing the CMK. Be sure that the key policy you create allows the current user to administer the KMS CMK. |
| SourceDBInstanceIdentifier | Enter the primary Region Aurora cluster ARN. This value is located in the CloudFormation outputs under the database stack value AuroraClusterArn. An Aurora cross-region Read Replica is created in the secondary Region. |

Create the nested AWS CloudFormation stack. Creating the AWS services may take around 30 minutes.

```
sam deploy \
    --stack-name "$STACK_NAME" \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND \
    --template-file ../tmp/packaged-cfn.client-side-encryption-replica.yaml
```

Test the sample application

Use the following steps to test the sample application with client-side encryption.

Open the primary Region sample application in your web browser. In the CloudFormation stack, choose Outputs, CreateURL. For example, https://example.execute-api.PRIMARY_REGION.amazonaws.com/create. It may take several minutes for AWS CodePipeline to build and deploy the container image from the previous step, during which time you may not be able to access the sample application.
Fill out and submit the HTML form on the page:

Complete the two form fields: Account Number and User Id.
Choose OK.

Open the secondary Region sample application in your web browser. In the CloudFormation stack, choose Outputs, Authenticate. For example, https://example.execute-api.SECONDARY_REGION.amazonaws.com/authenticate. It may take several minutes for AWS CodePipeline to build and deploy the container image from the previous step, during which time you may not be able to access the sample application.
Fill out and submit the HTML form on the page:

Complete the two form fields: Account Number and UserId.
Choose OK.

Summary

In this post, I walked through how to enable client-side encryption with the AWS Encryption SDK backed by AWS Key Management Service (AWS KMS) for Amazon Relational Database Service (Amazon RDS) and Amazon Aurora. This client-side encryption approach can provide tighter security controls when you must prevent unauthorized access to plaintext column values. I showed how the approach supports a multi-region deployment. In addition, the encryption keys are protected by AWS KMS, so you control the keys needed for decryption.

About the Author

Josh Joy is a Security Transformation Consultant with AWS Professional Services helping to provide customers with a secure journey to AWS. Josh helps customers improve their security posture as they migrate their most sensitive workloads to AWS. Josh enjoys diving deep and working backwards in order to help customers achieve positive outcomes.