Understanding Amazon DynamoDB encryption by using AWS Key Management Service and analysis of API calls with Amazon Athena
As applications evolve to be more scalable for the web, customers are adopting flexible data structures and database engines for their use cases. NoSQL data stores have become increasingly popular because of their flexible data model for building modern applications. Amazon DynamoDB is a fast and flexible NoSQL database service that can provide consistent single-digit millisecond latency at scale. As you adopt DynamoDB for web-scale workloads, it’s important that you understand the security controls available within DynamoDB.
You can use various capabilities to run DynamoDB securely. Amazon VPC endpoints provide secure access to DynamoDB tables for applications running in a VPC. Amazon VPC endpoints also provide fine-grained access control through AWS Identity and Access Management (IAM) to regulate access to items and attributes stored in DynamoDB tables. You can also work with Transport Layer Security (TLS) endpoints for encryption of data in transit.
For encryption of data at rest, you can choose one of two customer master key (CMK) options to encrypt your tables. The AWS-owned CMK is the default encryption type, where the key belongs to a collection of CMKs that AWS owns and manages for use in multiple AWS accounts. AWS-owned CMKs are not in your AWS account. On the other hand, AWS-managed CMKs are keys stored in your account that are created, managed, and used on your behalf by an AWS service that integrates with AWS Key Management Service (AWS KMS).
Server-side encryption at rest using the AWS-owned CMK is enabled by default on all DynamoDB tables. DynamoDB encrypts all existing tables that were previously unencrypted using the AWS-owned CMK. However, you can select an option to encrypt some or all of your tables by using an AWS-managed CMK. In addition, you can use client-side encryption to protect data before sending it to DynamoDB.
In this blog post, we cover the mechanics of server-side encryption by using an AWS-managed CMK. We also discuss tracking API calls to AWS KMS by using AWS CloudTrail and Amazon Athena to understand the distribution of calls made (CreateGrant vs. Decrypt).
Create a DynamoDB table
Let’s begin by creating a DynamoDB table with the AWS-managed CMK. The attribute --sse-specification Enabled with AWS KMS as SSEType defines the method of encryption. In this case, it is an AWS-managed CMK.
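A minimal sketch of the same request, expressed as Python parameters in the shape that boto3's create_table call accepts. The key schema (a single string partition key named id) and the throughput values are assumptions for illustration only; the table name ratings and the KMS SSE setting are the parts taken from this post.

```python
def build_create_table_request(table_name):
    """Return keyword arguments for a CreateTable call with KMS-based SSE."""
    return {
        "TableName": table_name,
        # Hypothetical key schema: one string partition key named "id".
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        # Hypothetical capacity values, chosen only to complete the request.
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        # Server-side encryption enabled, with AWS KMS as the SSEType.
        "SSESpecification": {"Enabled": True, "SSEType": "KMS"},
    }


request = build_create_table_request("ratings")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**request)
print(request["SSESpecification"])
```

The request is built separately from the call so that the encryption settings can be inspected without touching an AWS account.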
Reviewing the following response output from the AWS CLI command, the SSEType in SSEDescription is set to KMS, alongside the Amazon Resource Name (ARN) of the KMS key used for server-side encryption.
Note: If you don’t see
SSEDescription in the response for a table with server-side encryption, try updating to the latest AWS CLI.
Verify encryption for the table
If you want to verify a table’s encryption method, you can use the
describe-table API call or the DynamoDB console.
You can use the --query parameter to filter the response output and print only the necessary attributes, as follows. You can see that the table is ACTIVE and the status attribute in the SSEDescription object is ENABLED with AWS KMS as the SSEType.
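As a rough Python counterpart to that filtering, the sketch below pulls the same attributes out of a DescribeTable response. The sample response is a hypothetical fragment written for illustration (the ARN is a placeholder), not output captured from a real table.

```python
def summarize_encryption(describe_table_response):
    """Keep only the table status and the SSEDescription attributes."""
    table = describe_table_response["Table"]
    # SSEDescription may be absent, for example with an outdated client.
    sse = table.get("SSEDescription", {})
    return {
        "TableStatus": table.get("TableStatus"),
        "SSEStatus": sse.get("Status"),
        "SSEType": sse.get("SSEType"),
        "KMSMasterKeyArn": sse.get("KMSMasterKeyArn"),
    }


# Hypothetical response fragment with a placeholder key ARN.
sample_response = {
    "Table": {
        "TableName": "ratings",
        "TableStatus": "ACTIVE",
        "SSEDescription": {
            "Status": "ENABLED",
            "SSEType": "KMS",
            "KMSMasterKeyArn": "arn:aws:kms:your-region:your-account-number:key/your-key-id",
        },
    }
}

summary = summarize_encryption(sample_response)
print(summary)
```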
How server-side encryption works
Now that we know the ratings table is created with AWS KMS server-side encryption, let’s look at the workflow for server-side encryption.
These are the steps in the server-side encryption process, as shown in the preceding diagram:
- The owner of the table uses the CreateTable API call with server-side encryption set to AWS KMS.
- When the CreateTable API request is received, DynamoDB authenticates the request.
- DynamoDB uses the AWS-managed CMK as the top-level key. Because DynamoDB has to use this key for server-side encryption, the first step is to make a set of CreateGrant API calls on the CMK.
- DynamoDB uses the CMK to generate a table key, which is a unique key for each table. This table key is used to generate data encryption keys that are used to encrypt underlying structures in the table.
- The plaintext key material and the encrypted key material are sent to DynamoDB.
- The plaintext table key is cached in DynamoDB.
The following diagram shows the hierarchy of server-side encryption keys used by DynamoDB. DynamoDB uses the AWS KMS-managed CMK in each AWS Region in your AWS account as the top-level key to generate and encrypt a unique table key for each table. DynamoDB uses the table key to generate data encryption keys and then uses the data encryption keys to encrypt table data and the underlying structures in a table.
Now that we have created the table, let’s look at the mechanics while using the
PutItem API call.
When using the
PutItem API call:
- The user issues a PutItem call to add data to a DynamoDB table.
- DynamoDB authenticates the user’s request.
- DynamoDB verifies that the user has the necessary permissions to write data to the DynamoDB table.
- Depending on the data being encrypted, DynamoDB identifies the right data encryption key to encrypt the data. To avoid having DynamoDB call KMS for every DynamoDB operation, the table key is cached for each principal in memory. The table key is refreshed once every five minutes per client connection with active traffic. If DynamoDB gets a request for the cached table key after five minutes of inactivity, it sends a new request to KMS to decrypt the table key.
- Encrypted data and encrypted key material are stored in DynamoDB.
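The caching behavior in the steps above can be pictured with a toy model. This is only an illustrative sketch, not DynamoDB's implementation: the class name, the decrypt callback, and the injectable clock are all hypothetical, and it models only the rule that a key idle for more than five minutes triggers a new decrypt request.

```python
import time


class TableKeyCache:
    """Toy per-principal cache of plaintext table keys (illustrative only)."""

    def __init__(self, decrypt_table_key, idle_seconds=300, clock=time.monotonic):
        self._decrypt = decrypt_table_key   # stands in for a KMS Decrypt call
        self._idle_seconds = idle_seconds   # five minutes of inactivity
        self._clock = clock
        self._entries = {}                  # (principal, table) -> (key, last_used)

    def get(self, principal, table):
        now = self._clock()
        entry = self._entries.get((principal, table))
        if entry is not None and now - entry[1] <= self._idle_seconds:
            key = entry[0]                  # recently used: serve from memory
        else:
            key = self._decrypt(table)      # missing or idle too long: decrypt again
        self._entries[(principal, table)] = (key, now)
        return key


calls = []


def fake_decrypt(table):
    """Record each simulated decrypt request and return a dummy key."""
    calls.append(table)
    return b"plaintext-key-for-" + table.encode()


now = [0.0]
cache = TableKeyCache(fake_decrypt, clock=lambda: now[0])
cache.get("alice", "ratings")   # first use: decrypt request
now[0] = 120.0
cache.get("alice", "ratings")   # two minutes later: served from the cache
now[0] = 500.0
cache.get("alice", "ratings")   # idle for more than five minutes: decrypt again
print(len(calls))
```

Injecting the clock keeps the example deterministic; a real service would rely on its own timers.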
Now that we have inserted data into the DynamoDB table, let’s look at the mechanics of retrieving the data with the
GetItem API call.
When using the
GetItem API call:
- The user issues a GetItem call to retrieve data from the DynamoDB table.
- DynamoDB authenticates the user request.
- DynamoDB verifies that the user has the necessary permissions to read data from the DynamoDB table.
- The request for retrieving the data is made.
- Encrypted data is retrieved.
- DynamoDB caches the plaintext table keys for each principal in memory. If DynamoDB gets a request for the cached table key after five minutes of inactivity, it sends a new request to AWS KMS to decrypt the table key.
- Decrypted plaintext key material is retrieved.
- Data is decrypted by using received plaintext key material.
- Plaintext data is sent to the user by using HTTPS (for the TLS endpoint only).
Note: CloudTrail logs are necessary for the next section. Ensure that CloudTrail is enabled on your account. For more information, see Getting Started with CloudTrail.
Analyze KMS key usage using CloudTrail logs and Athena
CloudTrail records API calls and publishes log files to Amazon S3. Account activity is tracked as an event in the CloudTrail log file. Each event contains information such as who performed the action, the date and time of the action, and the resources affected. Multiple events are stitched together and structured in JSON format in the CloudTrail log files. When DynamoDB makes API calls to create a grant on the CMK, they are recorded by CloudTrail. In addition, when DynamoDB makes an API call to generate a table key or API calls to decrypt, they are recorded by CloudTrail. In this post, we use Athena, an interactive SQL query service, to analyze CloudTrail logs stored on Amazon S3 to understand calls made to AWS KMS and DynamoDB.
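To make the log structure concrete, here is a small sketch that counts KMS calls by API name in a single CloudTrail log file, relying on the Records array and the eventSource and eventName fields that CloudTrail events carry. The sample events are hypothetical and heavily trimmed; real events contain many more fields.

```python
import json
from collections import Counter


def count_kms_calls(log_file_text):
    """Count events whose source is AWS KMS, grouped by API call name."""
    records = json.loads(log_file_text).get("Records", [])
    return Counter(
        record["eventName"]
        for record in records
        if record.get("eventSource") == "kms.amazonaws.com"
    )


# Hypothetical, trimmed log file content for illustration.
sample_log = json.dumps({
    "Records": [
        {"eventSource": "kms.amazonaws.com", "eventName": "CreateGrant"},
        {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt"},
        {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt"},
        {"eventSource": "dynamodb.amazonaws.com", "eventName": "CreateTable"},
    ]
})

kms_counts = count_kms_calls(sample_log)
print(kms_counts)
```

Athena performs the same kind of grouping at scale across every log file in the bucket.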
The following sample queries list calls made to DynamoDB tables for a date range, the number of calls to AWS KMS, and distribution by API call type. Before we can run queries, though, we need to create an external table in Athena that describes the structure of CloudTrail logs.
Create a table in Athena
Use the following
CREATE EXTERNAL TABLE command in the Athena console to create the table. Replace the Amazon S3 bucket name and location with your Amazon S3 bucket name and location.
As shown in the preceding code example, CloudTrail logs are delivered to Amazon S3 in this format:
s3://<your-cloudtrail-s3-bucket>/AWSLogs/<AWS account-number>/CloudTrail/region/year/month/date. The Athena external table that we have created is partitioned by year, month, and day as specified in
PARTITIONED BY syntax in the preceding code example.
The next step is to add partitions to the table with the following command. You can execute the commands directly in the Athena console. Here, I add the partition with day='11'. Partitioning your data can restrict the amount of data scanned by each query and thus improve performance and reduce cost. For more information about partitioning data in Athena, see Partitioning Data in the Athena User Guide.
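A hedged sketch of how such a partition statement could be assembled in Python. The table name cloudtrail_logs, the account number, the Region, and the year and month values are hypothetical placeholders; the S3 layout follows the path format described above.

```python
def add_partition_statement(table, bucket, account_id, region, year, month, day):
    """Build an Athena ALTER TABLE statement for one day of CloudTrail logs."""
    location = (
        f"s3://{bucket}/AWSLogs/{account_id}/CloudTrail/"
        f"{region}/{year}/{month}/{day}/"
    )
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year='{year}', month='{month}', day='{day}') "
        f"LOCATION '{location}'"
    )


# All argument values below are placeholders for illustration.
statement = add_partition_statement(
    table="cloudtrail_logs",
    bucket="your-cloudtrail-s3-bucket",
    account_id="111122223333",
    region="us-east-1",
    year="2018",
    month="07",
    day="11",
)
print(statement)
```

The resulting string can be pasted into the Athena console, one statement per day of logs you want to query.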
Now that we have created the external partitioned table in Athena and added partitioned data, let’s execute a few queries.
In the following example queries, I use the partition data of day='11', as per the preceding example ALTER command. Change these values based on your Athena partition data.
Example query 1: This query returns API calls made to DynamoDB for the specified date from the CloudTrail logs table. It does this by filtering on eventsource = 'dynamodb.amazonaws.com'. It limits the number of records returned to 1,000.
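One way this first query might look, sketched as a Python helper that returns the SQL text. The table name cloudtrail_logs and the year and month values are hypothetical; the event source filter, the partition on day 11, and the 1,000-row limit follow the description above.

```python
def dynamodb_calls_query(table, year, month, day, limit=1000):
    """Build a query for DynamoDB API calls in one daily partition."""
    return (
        f"SELECT * FROM {table} "
        "WHERE eventsource = 'dynamodb.amazonaws.com' "
        f"AND year = '{year}' AND month = '{month}' AND day = '{day}' "
        f"LIMIT {int(limit)}"
    )


# Placeholder table name and date values for illustration.
query = dynamodb_calls_query("cloudtrail_logs", "2018", "07", "11")
print(query)
```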
Example query 2: In the previous query, we retrieved all available attributes that have the eventsource = 'dynamodb.amazonaws.com'. Now, let’s further filter the output by specifying select columns and attributes with API calls made to event source kms.amazonaws.com. The output should show calls made to KMS from DynamoDB.
Example query 3: Let’s review API calls made to KMS. The following query helps identify the set of API calls made to a specific table. Replace
your-table-name with the name of the DynamoDB table you want to query. You can order the results by
eventtime to understand a timeline of API calls made to AWS KMS. You should see
decrypt eventnames in the Athena output.
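A sketch of a query along these lines, again as a Python helper that returns SQL text. It assumes the DynamoDB table name appears in the requestparameters column of the KMS events, and it matches on that text with LIKE; the table name cloudtrail_logs and the date values are hypothetical placeholders.

```python
def kms_calls_for_table_query(table, dynamodb_table_name, year, month, day):
    """Build a timeline query of KMS calls that mention a DynamoDB table."""
    return (
        "SELECT eventtime, eventname, awsregion, requestparameters "
        f"FROM {table} "
        "WHERE eventsource = 'kms.amazonaws.com' "
        # Assumption: the table name is present in the request parameters.
        f"AND requestparameters LIKE '%{dynamodb_table_name}%' "
        f"AND year = '{year}' AND month = '{month}' AND day = '{day}' "
        "ORDER BY eventtime"
    )


# Placeholder values for illustration.
timeline_query = kms_calls_for_table_query(
    "cloudtrail_logs", "your-table-name", "2018", "07", "11"
)
print(timeline_query)
```

Adding GROUP BY eventname with a COUNT(*) to a query of this shape would give the distribution of calls by API type instead of a timeline.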
Example query 4: Every AWS Region has a unique KMS CMK that is used to generate table keys. This query helps you identify tables that are using a specific key for server-side encryption at rest. Replace
arn:aws:kms:your-region:your-account-number:key/your-key-id with the ARN of the KMS CMK in the AWS Region in which you are interested. Athena output should show
eventname and DynamoDB tables that are using the KMS key.
In this blog post, we outlined encryption options with DynamoDB and walked through the process of creating DynamoDB tables with server-side encryption using the AWS-managed CMK. We reviewed DynamoDB API workflows and KMS interaction when creating a table, adding an item to a table, and retrieving an item from a DynamoDB table with encryption enabled. We also looked at the hierarchy of encryption keys used with DynamoDB. We then used Athena to analyze CloudTrail logs to retrieve relevant information. This information includes KMS API call activity with DynamoDB tables, numbers and types of API calls, and mapping of service keys to DynamoDB tables. Altogether, this should give you further insight into DynamoDB encryption and its interaction with AWS KMS.
About the Authors
Sai Sriparasa is a Sr. Big Data & Security Consultant with AWS Professional Services. He works with our customers to provide strategic and tactical big data solutions with an emphasis on automation, operations, governance & security on AWS. In his spare time, he follows sports and current affairs.
Prahlad Rao is a Solutions Architect with AWS, focused on databases and big data. He works with enterprise customers to help them navigate their cloud journey and optimize applications for the cloud.