Caching for performance with Amazon DocumentDB and Amazon ElastiCache
In tech, caching is ubiquitous. CPUs use L1, L2, and L3 caches, and mobile phones cache app data locally. Streaming services cache content on the edge, browsers cache images, and more.
The same is true for databases.
Imagine if, on a gaming site, every time a leaderboard was displayed, a query had to do a summation and sort all of the players of a game. Or if, every time you went to an ecommerce site, the price for a particular product had to be read from disk for each customer. The performance would be unacceptable and the amount of compute would be cost-prohibitive.
In databases, two of the main motivations for caching are performance and cost savings. Either you want microsecond performance where milliseconds do not suffice, or you want to offload expensive operations from the database by caching commonly used data.
In this post, I show you how to integrate Amazon DocumentDB (with MongoDB compatibility) and Amazon ElastiCache to achieve microsecond response times and reduce your overall cost. The following diagram shows the architecture for the solutions in this post.
The operational database in this example is Amazon DocumentDB—a fast, reliable, and fully managed database service that makes it easy to set up, operate, and scale MongoDB-compatible databases in the cloud. With Amazon DocumentDB, you can run the same application code and use the same drivers and tools that you use with MongoDB.
Using Amazon DocumentDB’s flexible document model, data types, and indexing, you can store and query content quickly and intuitively. Examples include user reviews and demo videos for shopping sites and catalogs, inventory lists for point-of-sale terminals, and financial trades for trading platforms.
For the caching layer, use Amazon ElastiCache, which makes it easy to set up, manage, and scale distributed in-memory cache environments in the AWS Cloud. ElastiCache provides a high-performance, resizable, and cost-effective in-memory cache, while removing the complexity associated with deploying and managing a distributed cache environment. ElastiCache is compatible with both the Redis and Memcached engines.
I demonstrate how to integrate these two services by building an application that allows users to find their favorite song. They submit the song title using a REST API client to the application engine.
The application engine processes the API request by first looking in the ElastiCache layer for the document containing the singer’s name and lyrics of the requested song. If there has been a prior request for that song, the read is served by ElastiCache. If not, the application engine queries Amazon DocumentDB and returns the requested document to the client as a JSON document.
The application then caches a copy in ElastiCache to speed up the response time of any subsequent request for the same song. For this example, I use Amazon ElastiCache for Redis as the caching layer and Postman as the REST API client. Postman is a client application for testing REST APIs.
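The read path described above is commonly called the cache-aside pattern. The following is a minimal sketch of that flow, assuming in-memory Map objects as stand-ins for ElastiCache and Amazon DocumentDB. The helper name findSong and the lowercase field names are hypothetical and chosen only for illustration.

```javascript
// Cache-aside read path with in-memory stand-ins for the two services.
// `cache` plays the role of ElastiCache for Redis; `database` plays Amazon DocumentDB.
const cache = new Map();
const database = new Map([
  ['Everything is everything', {
    title: 'Everything is everything',
    singer: 'Lauryn Hill',
    text: 'After winter must come spring',
  }],
]);

// Hypothetical helper: returns the song plus where it was served from.
function findSong(title) {
  // 1. Try the cache first.
  if (cache.has(title)) {
    return { source: 'cache', song: JSON.parse(cache.get(title)) };
  }
  // 2. On a miss, fall back to the database.
  const song = database.get(title);
  if (song === undefined) {
    return { source: 'none', song: null };
  }
  // 3. Keep a serialized copy in the cache for subsequent requests.
  cache.set(title, JSON.stringify(song));
  return { source: 'database', song };
}

const first = findSong('Everything is everything');
const second = findSong('Everything is everything');
console.log(first.source);  // database
console.log(second.source); // cache
```

The first lookup misses the cache and is served from the database stand-in. The second lookup for the same title is served from the cache.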
Create an Amazon DocumentDB cluster
For more information about creating a cluster, see Getting Started:
- Open the Amazon DocumentDB console.
- Create a new cluster.
- Enter the cluster identifier.
- Choose the appropriate instance.
- Leave the default value for the number of instances.
- Define the master user name and password.
- Select Create cluster, as shown in the following screenshots.
Create the ElastiCache for Redis cluster
Create the Amazon ElastiCache for Redis cluster using the following steps, as shown in the following screenshot.
- From the AWS Management Console, search for ElastiCache under Services.
- From the ElastiCache dashboard, select Redis and choose Create.
- Fill out the Redis settings. For this example, use the default port 6379.
Create an EC2 instance
I host the song application on an Amazon EC2 instance.
- Create an EC2 Linux instance. Ensure that it has a public IP address.
- Launch the instance with a key pair.
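Download your key pair file (.pem), which stores the private key associated with your newly created instance, and connect to the instance over SSH.
For example, if you named your key pair file my-key-pair.pem, your EC2 instance DNS is ec2-198-51-100-1.compute-1.amazonaws.com, and the instance uses the default Amazon Linux user name ec2-user (other distributions use a different user name), the command would be:
ssh -i my-key-pair.pem ec2-user@ec2-198-51-100-1.compute-1.amazonaws.com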
Connect to the clusters
To connect to your Amazon DocumentDB and ElastiCache clusters, update the security groups for the two clusters to allow inbound traffic for TCP ports 27017 and 6379, respectively.
Also allow inbound connections on TCP port 8082, on which the demo application listens, in the security group for the EC2 instance, as shown in the following screenshot.
Install the MongoDB shell
Verify the connection
Verify that you’re able to connect to the Amazon DocumentDB cluster from the EC2 instance, using the following command:
Next, connect to the ElastiCache for Redis cluster and run the keys * command to see what is currently in your cache.
The output confirms that the cache is empty.
Build the app engine
Now that you can successfully connect to both the Amazon DocumentDB database and ElastiCache, you can start building the Node.js app engine on a separate EC2 instance.
Use the same Node.js application running on the EC2 instance to populate the Amazon DocumentDB cluster with data containing singer, title, and text lyrics details.
First, ensure that Node.js is installed on the EC2 instance.
After Node.js is installed on the EC2 instance, check that Node Package Manager (npm) is installed by running the following commands:
Create an application directory
Next, create a directory for the application and change to that directory using the commands below:
The following command generates the package.json file:
You may accept the default values, but for main, enter cdstore.js instead of index.js.
Next, install all the dependencies needed for the application to work. These include the MongoDB driver that allows the application to connect to Amazon DocumentDB, a Node.js web application framework, a Node.js Redis client, and body-parser, a Node.js body-parsing middleware. Install them by running the following commands:
The contents of the package.json file are as follows:
Create the functions
Create two functions:
- SaveSong() handles POST requests and inserts data into the Amazon DocumentDB instance. You could also write a script to insert data in bulk using the command db.
- SearchSongByTitle() is used by the GET method to perform the actual search.
Store the two functions in a file called cache.js. Create that file:
Use your favorite editor (vim, vi, etc.) and paste the following code into the cache.js file:
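As a rough sketch of what these two functions might look like (an illustration under assumptions, not the post's original listing): the code below assumes that collection behaves like a MongoDB driver collection (insertOne, findOne) and that redisClient is a promise-based Redis client exposing get and set. Passing both objects in as parameters, and the lowercase field name title, are choices made for this example only.

```javascript
// Hypothetical sketch of cache.js. `collection` is expected to behave like a
// MongoDB driver collection (insertOne, findOne) and `redisClient` like a
// promise-based Redis client (get, set). Both are supplied by the caller.

// Insert one song document into the database.
async function SaveSong(collection, song) {
  await collection.insertOne(song);
  return 'Saved';
}

// Look up a song by title: cache first, then database, then populate the cache.
async function SearchSongByTitle(collection, redisClient, title) {
  const cached = await redisClient.get(title);
  if (cached !== null) {
    // Cache hit: the value was stored as a JSON string.
    return JSON.parse(cached);
  }
  // Cache miss: read from the database.
  const song = await collection.findOne({ title });
  if (song !== null) {
    // Redis stores strings, so the document is serialized as JSON.
    await redisClient.set(title, JSON.stringify(song));
  }
  return song;
}

module.exports = { SaveSong, SearchSongByTitle };
```

Because the clients are injected, the same functions can be exercised with simple in-memory fakes before they are pointed at the real clusters.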
Create the endpoint
Now, create the endpoint cdstore.js as follows:
Copy the code below and paste it into the cdstore.js file:
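To illustrate the routing that such an endpoint performs, here is a minimal sketch written as a plain function rather than with a web framework. The function name handleRequest, the songs object with save and findByTitle methods, the route shape (the title as the only path segment), and the 404 and 400 responses are all assumptions for this example. The 201 status and the Saved body match the response described later in this post.

```javascript
// Hypothetical request handler for the two routes, written as a plain function
// so that it can be attached to any Node.js HTTP server or framework.
// `songs` is assumed to expose save(song) and findByTitle(title).
async function handleRequest(method, path, formBody, songs) {
  if (method === 'POST' && path === '/') {
    // The client sends the body as x-www-form-urlencoded key-value pairs.
    const fields = new URLSearchParams(formBody);
    await songs.save({
      title: fields.get('Title'),
      singer: fields.get('Singer'),
      text: fields.get('Text'),
    });
    return { status: 201, body: 'Saved' };
  }
  if (method === 'GET' && path.length > 1) {
    // Everything after the leading slash is treated as the song title.
    const title = decodeURIComponent(path.slice(1));
    const song = await songs.findByTitle(title);
    return song
      ? { status: 200, body: JSON.stringify(song) }
      : { status: 404, body: 'Not found' };
  }
  return { status: 400, body: 'Bad request' };
}
```

Keeping the routing logic separate from the server makes it straightforward to test without opening a network port.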
Start the application
Next, start the application, which, in this example, is set to run on port 8082 on the EC2 instance. To do that, use the following command:
If everything is running correctly, you should see the following message on the console:
Test the application
To test the application, use Postman (a REST API client) to make POST and GET requests. For GET requests, compare response times between the first request (when data is fetched from Amazon DocumentDB) and subsequent requests (when it is served from ElastiCache for Redis).
- Populate Amazon DocumentDB with the song dataset using the POST method.
- Search for a song with a GET request.
For simplicity, this example shows only the process for entering a single song, though it has been repeated several times for subsequent steps.
Open Postman, select the POST method, and enter the application URL (in this example, http://<ec2-dns-or-IP>:8082/).
Ensure x-www-form-urlencoded is selected under Body.
- Enter the song details as key-value pairs:
Title: Everything is everything
Singer: Lauryn Hill
Text: After winter must come spring
- Choose Send.
You should see Saved and Status 201 created on the Postman screen, as shown in the following screenshot:
Check the details of the POST request from the Postman console and ensure that you have the same result, as shown in the following screenshot:
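For reference, the key-value pairs entered in Postman travel as a form-encoded request body. The following sketch shows that encoding with the built-in URLSearchParams class. The variable names are illustrative only.

```javascript
// Encode the three song fields the way an x-www-form-urlencoded body is encoded.
const body = new URLSearchParams({
  Title: 'Everything is everything',
  Singer: 'Lauryn Hill',
  Text: 'After winter must come spring',
}).toString();

console.log(body);
// Title=Everything+is+everything&Singer=Lauryn+Hill&Text=After+winter+must+come+spring

// Decoding restores the original values.
const decoded = Object.fromEntries(new URLSearchParams(body));
console.log(decoded.Title); // Everything is everything
```

Spaces are encoded as plus signs in this format, which is why the body looks different from the values typed into Postman.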
Checking the Amazon DocumentDB cluster
Use the same command from the previous step to connect to Amazon DocumentDB and check that the data has been saved into the database:
- show dbs: Lists the available databases in Amazon DocumentDB.
- use cd: Switches the shell to the database called cd.
- show collections: Displays the available collections (roughly equivalent to tables in a relational database).
- db.text.find(): Lists all the documents in the text collection.
Bulk insert data
The output from the command db.text.find() also shows that there were a few song entries in the Amazon DocumentDB instance already. They were added by sending POST requests from Postman while testing the application.
It’s also possible to bulk insert songs to Amazon DocumentDB using the following command:
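A bulk load can also be done from Node.js rather than from the shell. The following is a minimal sketch, assuming collection exposes insertMany(docs) as a MongoDB driver collection does. The helper name bulkInsertSongs and the field check are hypothetical and not taken from the original post.

```javascript
// Hypothetical bulk loader. `collection` is expected to expose insertMany(docs),
// as a MongoDB driver collection does.
async function bulkInsertSongs(collection, songs) {
  // Keep only entries that carry all three fields used in this post.
  const docs = songs.filter((s) => s.title && s.singer && s.text);
  if (docs.length === 0) {
    return 0;
  }
  const result = await collection.insertMany(docs);
  // insertedCount reports how many documents were written.
  return result.insertedCount;
}
```

Filtering incomplete entries before the insert keeps documents without a title out of the collection, since the title is what the search relies on.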
Retrieve an entry using only the title
Next, send a GET request from Postman.
Go back to the Postman screen and select the GET method with the following URL:
Enter the song title (Everything is everything) as the path variable value to retrieve the song previously saved in Amazon DocumentDB using only its title.
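Because the title contains spaces, it has to be percent-encoded when it is used as part of a URL path (Postman handles this for you). A small sketch of that encoding follows. The helper name buildSongUrl, the localhost base URL, and the route shape with the title as the only path segment are assumptions for illustration.

```javascript
// Hypothetical helper: the base URL and the "title as path segment" route
// shape are assumptions for illustration.
function buildSongUrl(baseUrl, title) {
  // encodeURIComponent turns spaces into %20 so the title is a valid path segment.
  return `${baseUrl}${encodeURIComponent(title)}`;
}

console.log(buildSongUrl('http://localhost:8082/', 'Everything is everything'));
// http://localhost:8082/Everything%20is%20everything
```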
You should receive details of the Lauryn Hill song in JSON format, as shown in the following screenshot:
From the Postman console, you can see the actual GET request details, as shown in the following screenshot:
Overall, the request was served in 13 milliseconds.
Check that the data is saved in ElastiCache
The Node.js application is designed (as in the following code extract) to send each client request to Redis to check if the requested document has already been cached. If not, the request is then sent to Amazon DocumentDB to retrieve the requested document, which is sent back to the client with a copy saved in Amazon ElastiCache.
Check that the data is saved in Amazon ElastiCache.
Run the keys * command to list all the keys and see the key for the song’s title, Everything is everything, from the singer Lauryn Hill. You sent it as a POST request in the previous step. It’s now cached in the Amazon ElastiCache instance as the ninth entry in the list.
You can run the get <key> command to see details of that particular key. For instance, to see details for the song Everything is everything, run the command get "Everything is everything" (the key is quoted because it contains spaces), as shown in the following code example:
Having confirmed that the data has been saved in Amazon ElastiCache, run another GET request to fetch the cached data, which should be delivered much faster.
As you can see from the following screenshot, the response time for the GET request was reduced to 6 milliseconds, down from 13 milliseconds for the first request, which illustrates the performance improvement from serving the read from ElastiCache for Redis.
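The same comparison can be made in code by timing each read. The sketch below assumes two stand-in functions with artificial delays in place of the real database and cache reads. The names timeIt, readFromDatabase, and readFromCache and the delay values are hypothetical, and real latencies depend on the deployment.

```javascript
// Measure how long an async operation takes, in milliseconds.
async function timeIt(operation) {
  const start = process.hrtime.bigint();
  const value = await operation();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { value, elapsedMs };
}

// Stand-ins with artificial delays; they do not contact any real service.
const delay = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value));
const readFromDatabase = () => delay(20, 'song from database');
const readFromCache = () => delay(2, 'song from cache');

async function compare() {
  const miss = await timeIt(readFromDatabase);
  const hit = await timeIt(readFromCache);
  console.log(`first request: ${miss.elapsedMs.toFixed(1)} ms`);
  console.log(`repeat request: ${hit.elapsedMs.toFixed(1)} ms`);
  return { miss, hit };
}

compare().catch(console.error);
```

Replacing the two stand-ins with real reads would give a measurement comparable to the Postman timings, although a single pair of requests is only a rough indication.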
In this post, I demonstrated the integration between Amazon DocumentDB and Amazon ElastiCache using an application that enables users to find their favorite song based on the title provided. I also demonstrated the use of a caching layer like Amazon ElastiCache for Redis in front of Amazon DocumentDB to improve request response times for data stored in Amazon DocumentDB. This approach can also reduce costs, because serving frequently requested data from a cache can avoid the need to run a larger database cluster.
About the Author
Georges Leschener is a Sr. Partner Solutions Architect in the Global System Integrator (GSI) team at Amazon Web Services. He works with our GSI partners to help migrate customers’ workloads to the AWS Cloud and to design and architect innovative solutions on AWS by applying our best practices.