Use Amazon DynamoDB Accelerator (DAX) from AWS Lambda to increase performance while reducing costs
April 01, 2020 update: Changed the security setup to attach a least-privilege IAM policy to the role instead of a wide-open managed policy, switched to HttpApi in API Gateway for auto-deployment and lower cost, and added Node.js code that detects whether a requesting client base64-encoded the body of the request and decodes it if so.
Using Amazon DynamoDB Accelerator (DAX) from AWS Lambda has several benefits for serverless applications that also use Amazon DynamoDB. DAX can improve the response time of your application by dramatically reducing read latency, as compared to using DynamoDB. Using DAX can also lower the cost of DynamoDB by reducing the amount of provisioned read throughput needed for read-heavy applications. For serverless applications, DAX provides an additional benefit: Lower latency results in shorter Lambda execution times, which means lower costs.
Connecting to a DAX cluster from Lambda functions requires some special configuration. In this post, I show an example URL-shortening application based on the AWS Serverless Application Model (AWS SAM). The application uses Amazon API Gateway, Lambda, DynamoDB, DAX, and AWS CloudFormation to demonstrate how to access DAX from Lambda.
A simple serverless URL shortener
The example application in this post is a simple URL shortener. I use AWS SAM templates to simplify the setup for API Gateway, Lambda, and DynamoDB. The entire configuration is presented in an AWS CloudFormation template for repeatable deployments. The sections that create the DAX cluster, roles, security groups, and subnet groups do not depend on the SAM templates, and you can use them with regular AWS CloudFormation templates.
Like all AWS services, DAX was designed with security as a primary consideration. As a result, it requires clients to connect to DAX clusters as part of a virtual private cloud (VPC), which means that you can’t access a DAX cluster directly over the internet. Therefore, you must attach any Lambda function that needs to access a DAX cluster to a VPC that can access the cluster. The AWS CloudFormation template in the following section contains all the necessary pieces and configuration to make DAX and Lambda work together. You can customize the template to fit the needs of your application.
The following diagram illustrates this solution.
As illustrated in the diagram:
- The client sends an HTTP request to API Gateway.
- API Gateway forwards the request to the appropriate Lambda functions.
- The Lambda functions are run inside your VPC, which allows them to access VPC resources such as your DAX cluster.
- The DAX cluster is also inside your VPC, which means it can be reached by the Lambda functions.
The AWS CloudFormation template
Let’s start with the AWS CloudFormation template (template.yaml). The first section of the code contains the AWS CloudFormation template, AWS SAM prologue, and AWS SAM function definition.
This section of the template specifies the following:
- Location of the code package
- Environment variables used by the function
- URL formats
- Security policies
- Language runtime
- VPC configuration (in the VpcConfig stanza), which allows the Lambda function to reach a DAX cluster
- Role, which is defined later in the template and referenced here. It grants only the access this project needs to run
This example creates its own VPC and subnets, so the function definition points to them using references to later sections of the file. If the VPC already exists, you should use the existing identifiers instead.
AWS::Serverless::Function takes care of creating the Lambda function definition with the appropriate permissions in addition to creating an API Gateway endpoint that calls the Lambda function on each HTTP request. Users access the URL shortener through this endpoint.
The next section of this code example creates a DynamoDB table.
This table has only a single hash key (KeySchema has only the id column). The ProvisionedThroughput ReadCapacityUnits are kept low because DAX serves most of the read traffic. DynamoDB is called only if DAX has not cached the item.
Now the template specifies the DAX cluster.
The cluster is created using a single dax.t2.small node for demonstration purposes. Production workloads should use a cluster size (ReplicationFactor) of at least 3 for redundancy and consider using an appropriately sized dax.r4.* instance. The getUrlRole stanza defines an AWS Identity and Access Management (IAM) role and policy that grants the DAX cluster permission to access your DynamoDB data and that the Lambda function can also use. (Don’t edit or remove this role after creating it, or the cluster won’t be able to access DynamoDB.)
Next, the template sets up a security group with a rule to allow Lambda to send traffic to DAX on TCP port 8111. If you look earlier in this post at the serverless function definition, the VpcConfig stanza refers to this security group. Security groups control how network traffic is allowed to flow in a VPC.
This part of the template creates a new VPC and adds a subnet to it in the first available Availability Zone of the current AWS Region, and then it creates a DAX subnet group for that subnet. DAX uses the subnets in a subnet group to determine how to distribute the cluster nodes. For production use, it is highly recommended that you use multiple nodes in multiple Availability Zones for redundancy. Each Availability Zone requires its own subnet to be created and added to the subnet group.
I present the URL-shortening code in a single file (lambda/index.js) for simplicity. How the code works: A POST request takes the URL, creates a hash of it, stores the URL in DynamoDB under that hash, and returns the hash. A GET request to that hash looks up the URL in DynamoDB and redirects to the actual URL. The full code example is available on GitHub.
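The POST and GET flow described above can be sketched in a few lines. This is a minimal illustration, not the code from the GitHub repository: the helper names (slugFor, shorten, resolve), the SHA-256 hash, the 8-character slug length, and the 301 status code are all assumptions, and an in-memory Map stands in for the DynamoDB table so that the sketch runs on its own.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a short slug from a URL.
// SHA-256 and the 8-character length are illustrative choices.
function slugFor(url) {
  return crypto.createHash('sha256').update(url).digest('hex').slice(0, 8);
}

// In-memory stand-in for the DynamoDB table, keyed by id.
const table = new Map();

// POST: store the URL under its slug and return the slug.
function shorten(url) {
  const id = slugFor(url);
  table.set(id, url);
  return { statusCode: 200, body: id };
}

// GET: look up the slug and redirect, or report that it is unknown.
function resolve(id) {
  const url = table.get(id);
  if (url === undefined) {
    return { statusCode: 404, body: 'Not found' };
  }
  return { statusCode: 301, headers: { Location: url } };
}
```

Because the slug is derived from the URL itself, shortening the same URL twice yields the same slug, so repeated POST requests overwrite the same item rather than creating duplicates.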
The Lambda handler uses environment variables for configuration: DDB_TABLE is the name of the table containing the URL information, and DAX_ENDPOINT is the cluster endpoint. In this example, these variables are configured automatically in the AWS CloudFormation template.
The dynamodb instance is at global scope so that it persists between function executions. It is initialized on the first run and continues to exist as long as the underlying Lambda instance exists. As a result, you don’t have to reconnect on every execution, which can be an expensive operation when using DAX. By reusing the dynamodb instance for both direct DynamoDB access and DAX access, the code also shows that the DynamoDB and DAX clients are source-compatible, except for the initialization code.
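The reuse pattern can be sketched as follows. This is an illustration under stated assumptions rather than the handler's actual code: getClient and the factories argument are hypothetical names, and the factories are injected so that the sketch runs without the AWS packages installed. Falling back to direct DynamoDB access when DAX_ENDPOINT is unset is also an assumption.

```javascript
// Module scope: survives across invocations while the Lambda instance stays warm.
let dynamodb = null;

// Hypothetical helper. In a real handler the two factories would typically be:
//   dax:      (endpoint) => new AWS.DynamoDB.DocumentClient({
//               service: new AmazonDaxClient({ endpoints: [endpoint] }) })
//   dynamodb: () => new AWS.DynamoDB.DocumentClient()
// using the aws-sdk and amazon-dax-client packages.
function getClient(env, factories) {
  if (dynamodb === null) {
    // Only the initialization differs; callers use the same client API either way.
    dynamodb = env.DAX_ENDPOINT
      ? factories.dax(env.DAX_ENDPOINT) // reads served through the DAX cluster
      : factories.dynamodb();           // direct DynamoDB access
  }
  return dynamodb;
}
```

Only the first invocation on a given Lambda instance pays the connection cost. Later invocations get the cached client back immediately.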
Some clients, such as curl, may send the body of the request base64-encoded. If that happens, the code detects it and decodes the body into plain text before it is written to DynamoDB.
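A minimal sketch of that check, assuming the standard API Gateway proxy event shape, in which the isBase64Encoded flag accompanies the body. The function name decodeBody is hypothetical.

```javascript
// Return the request body as plain text, decoding it when API Gateway
// marks it as base64-encoded.
function decodeBody(event) {
  const body = event.body || '';
  return event.isBase64Encoded
    ? Buffer.from(body, 'base64').toString('utf8')
    : body;
}
```

Relying on the flag instead of guessing from the body's contents avoids mangling a plain-text URL that happens to look like valid base64.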
The last piece that is needed is a
You package Lambda functions as .zip files for deployment. For this example, the .zip archive must contain the lambda directory (for the example code) and the node_modules directory (for the dependencies) so that Lambda has everything it needs to run the function. Run all the following commands from a Bash shell.
This code creates geturl.zip, which is the Lambda package. Now you need an Amazon S3 bucket to put the package in so that AWS CloudFormation can find it.
Then, create an AWS CloudFormation package of the code in that bucket.
Finally, deploy the AWS CloudFormation stack to create all the resources.
Using the URL shortener
You can now access the URL shortener by using the API Gateway endpoint that was created by the AWS CloudFormation template. The URLs created by API Gateway contain a REST ID that is specific to each endpoint. You can find the ID for the example endpoint using the AWS CLI.
To shorten a URL, use the following command.
This command returns a “slug” that you can use to go to the URL.
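For readers who prefer to call the endpoint from code, here is a rough Node.js sketch. It assumes the API uses the default stage (so no stage name appears in the path), that the POST route lives at the root of the endpoint, and that the slug comes back as the plain-text response body. None of these details are confirmed by this post, and the function names are hypothetical.

```javascript
// Build the default endpoint URL of an API Gateway HTTP API.
// Assumes the default stage, so no stage name appears in the path.
function endpointFor(apiId, region) {
  return `https://${apiId}.execute-api.${region}.amazonaws.com`;
}

// Hypothetical client call: POST the long URL as the request body and
// read the slug from the response. Needs a runtime with a global fetch
// (Node.js 18 or later).
async function requestSlug(endpoint, longUrl) {
  const response = await fetch(endpoint, { method: 'POST', body: longUrl });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}
```

Pass the ID returned by the AWS CLI and your AWS Region to endpointFor, then hand the resulting URL to requestSlug together with the address you want to shorten.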
You also can create a custom URL by using Amazon Route 53.
In this post, we showed how to use AWS CloudFormation to create a Lambda function that uses DAX and DynamoDB to implement a simple URL shortener. The AWS CloudFormation template includes all the configuration necessary to ensure that the Lambda function can reach the DAX cluster and use it to access the data in DynamoDB.
By combining the high performance of DAX with your serverless Lambda applications, you can increase performance while reducing costs, which is a win for you and your customers.
About the Authors
Kirk Kirkconnell is a Senior Technologist on Amazon DynamoDB and Amazon Managed Apache Cassandra Service with Amazon Web Services.
Jeff Hardy was a Software Development Engineer at Amazon Web Services.