AWS Startups Blog

How to Build Dynamic Dashboards Using AWS Lambda and Amazon DynamoDB Streams: Part II

By Jeff Nunn, Solutions Architect, AWS


In part one of this series, I showed you the results of combining AWS Lambda, a new service from AWS, and Amazon DynamoDB Streams, a new feature of DynamoDB. If you recall, DynamoDB Streams allows you to track changes made to DynamoDB tables, and receive a stream of update records with a single API call. This allows your applications to respond to high-velocity data changes without having to track the changes yourself.

When associated with an AWS Lambda function, DynamoDB Streams provide a way to programmatically drive real-time events, like the analytics seen in the fictitious Vote App chart that we built in part one. In this follow-up, I’ll go in depth to show you how to recreate the process so you can adapt a similar solution to your own needs.

Note
In part one, AWS Lambda and DynamoDB Streams were in preview mode. Since their recent launch, some portions of the API have changed. This post reflects the latest updated coding conventions of each.

Overview

By the end of this post, you will be able to do the following:

  • Set up a free phone number at Twilio to receive votes via text message
  • Create DynamoDB tables to hold votes
  • Write a web server in Node.js to process incoming votes
  • Use DynamoDB Streams to notify a Lambda function and aggregate cumulative votes
  • Use JavaScript and HTML to create a single page app that uses the AWS SDK for JavaScript to query DynamoDB and display the data in a dynamic chart

The following diagram shows the architecture for this process.

AWS Lambda DynamoDB architecture

To recap, the diagram shows this process:

  1. A user sends a text message.
  2. The message is received by an auto-scaled EC2 instance running Node.js, sitting behind an Elastic Load Balancing (ELB) load balancer.
  3. Node.js makes updates to DynamoDB.
  4. AWS Lambda, listening to a DynamoDB Stream, tallies the votes and allows a dynamic dashboard to retrieve the results.

Before you get started, you will need at a minimum an EC2 instance to hold the Node.js code, or the address of an elastic load balancer sitting in front of auto-scaling EC2 instances. Auto Scaling allows you to automatically increase the number of EC2 instances as user demand goes up, and decrease the number of EC2 instances as demand goes down.

Whether you use a single instance or multiple instances behind an ELB load balancer depends on your requirements. In either case, you should open a port for web traffic (port 80) and give yourself the ability to SSH into your instance to add the Node.js code.

To set up a Twilio account

In our voting application, a user votes by texting to a number that is set up at Twilio. To set up your Twilio account, perform the following steps.

1. Go to Twilio, create an account, and then go to the Manage Numbers dashboard.
2. Choose your assigned number.
3. As shown in the following screenshot, open the Messaging section.

twilio account settings

4. For Configure with, select the URL option.
5. For Request URL, type the public address of your ELB load balancer or instance, and choose HTTP POST for the method by which Twilio will send your traffic.
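Twilio delivers each incoming text to your Request URL as a form-encoded HTTP POST, with the message text in a field named Body. The following is a minimal sketch of how that payload could be parsed and checked against the three colors this demo accepts. The parseVote helper and the sample phone number are hypothetical names used for illustration; they are not part of Twilio's library.

```javascript
var qs = require('querystring');

// The three colors this demo accepts as votes.
var VALID_VOTES = ['RED', 'GREEN', 'BLUE'];

// parseVote is a hypothetical helper. It takes the raw form-encoded request
// body and returns the normalized vote, or null when the Body field is
// missing or does not name a valid color.
function parseVote(rawBody) {
  var fields = qs.parse(rawBody);
  if (typeof fields.Body !== 'string') {
    return null;
  }
  var vote = fields.Body.trim().toUpperCase();
  return VALID_VOTES.indexOf(vote) >= 0 ? vote : null;
}

console.log(parseVote('From=%2B15555550100&Body=%20red%20')); // RED
console.log(parseVote('Body=purple'));                        // null
```

Returning null for a missing Body field keeps a malformed request from throwing inside the request handler.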

DynamoDB Tables

To tally your responses, you need to create two DynamoDB tables: one to track the incoming votes (we’ll call this the VoteApp table), and another to aggregate the vote totals into a final count (we’ll call this the VoteAppAggregates table). As you’ll see, you create these tables to make efficient use of throughput and partitioning within your tables.

To begin creating a VoteApp table
Perform the following steps to get started with creating a DynamoDB table. The table will track incoming votes:

  1. Open the DynamoDB console, and choose Create Table.
  2. For Table Name, type VoteApp, which is the table name the Node.js code later in this post writes to.
  3. For Primary Key Type, select the Hash option.
  4. For Hash Attribute Name, select the String option, and type VotedFor in the box.
  5. Choose Continue.

creating dynamoDB table

6. On the optional Add Indexes page, choose Continue. You don’t need an index for this demo.

In the next section, you provision your throughput capacity.

Provisioning Throughput Capacity

DynamoDB allows you to store and access data at any scale by provisioning throughput capacity for reads and writes. It is important to remember that when designing your DynamoDB tables, you should design them for uniform data access across the items in your tables. When it stores data, DynamoDB divides a table’s items into multiple partitions, and distributes the data primarily based upon the hash key element. As such, to achieve the full amount of throughput, you need to keep your workload spread evenly across the hash key values.

Because we have only three possible vote values, we run the risk of having a heavily used hash key element, where the bulk of our request traffic might be heavily concentrated on one partition. To get the most out of our throughput, we need to create tables where the hash key element has a large number of distinct values. The more possible values we have, the more likely requests will be spread across the partitioned space in a way that best utilizes our allocated throughput level.

To accomplish this, we randomize the writes across multiple hash key values. When we record the user vote in our Node.js code, we choose a random number from 1 to 10 and append it to the end of the vote. For example, a vote for “RED” might be recorded as “RED.3” or “RED.8”.

Having 30 possible hash key values instead of three helps protect our application from unexpected throttling caused by a heavily used partition, and lets us make fuller use of the read and write throughput we provision.
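The suffixing scheme described above can be sketched in a few lines. This is an illustration rather than the exact code the server runs; the function names and the optional random argument (a hook for supplying a deterministic generator) are assumptions made for this example.

```javascript
var COLORS = ['RED', 'GREEN', 'BLUE'];
var SUFFIXES = 10; // suffixes run from 1 to 10

// Pick one of the suffixed hash key values for a vote. The optional `random`
// argument is a hypothetical hook; it defaults to Math.random.
function shardedKey(color, random) {
  var rand = random || Math.random;
  return color + '.' + (Math.floor(rand() * SUFFIXES) + 1);
}

// List every hash key value a vote could be written under.
function allShardedKeys() {
  var keys = [];
  COLORS.forEach(function (color) {
    for (var i = 1; i <= SUFFIXES; i++) {
      keys.push(color + '.' + i);
    }
  });
  return keys;
}

console.log(shardedKey('RED'));        // for example RED.3 or RED.8
console.log(allShardedKeys().length);  // 30
```

Reading a color's total back means summing across all ten of its suffixed keys, which is the job the aggregation step takes on later in this post.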

Read Capacity

A unit of read capacity in DynamoDB represents one strongly consistent read per second for items as large as 4 KB. Because each item in this table is far smaller than 4 KB, every read consumes a single unit, and because this demo rarely reads the VoteApp table directly, you can leave the value at its default of “1”.

Write Capacity

For writes, there are two main considerations:

1. How large is the item you are writing?

Write capacity is based on the size of the item, multiplied by the number of writes per second. An item of up to 1 KB consumes 1 write capacity unit per write. If the item exceeds 1 KB, its size is rounded up to the next whole kilobyte, and each kilobyte consumes 1 unit. In this application, an item keyed by a color and suffix (up to 8 characters for “GREEN.10”) is well under 1 KB, so each write consumes only 1 write capacity unit.

2. How many writes per second do you need?

If you anticipate your application will receive 10 texts per second, you would take your write capacity unit size (“1” in this example) and multiply it by the number of writes per second that you anticipate. For 10 writes per second, you would need to configure your write capacity units at “10”. Similarly, if you anticipate your application will receive 100 texts per second, you would need to configure your write capacity units at “100” (1 write capacity unit x 100 writes per second).
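The read and write capacity arithmetic can be expressed as a small calculation. The sketch below assumes you already know the item size in bytes; the 100-byte and 1,500-byte figures are illustrative guesses, not measurements from this app, and the function names are made up for this example.

```javascript
var WRITE_UNIT_BYTES = 1024; // one write unit covers an item up to 1 KB
var READ_UNIT_BYTES = 4096;  // one strongly consistent read unit covers up to 4 KB

// Units consumed by a single request, rounded up to a whole unit.
function unitsPerRequest(itemSizeBytes, unitBytes) {
  return Math.max(1, Math.ceil(itemSizeBytes / unitBytes));
}

function writeCapacityUnits(itemSizeBytes, writesPerSecond) {
  return unitsPerRequest(itemSizeBytes, WRITE_UNIT_BYTES) * writesPerSecond;
}

function readCapacityUnits(itemSizeBytes, readsPerSecond) {
  return unitsPerRequest(itemSizeBytes, READ_UNIT_BYTES) * readsPerSecond;
}

console.log(writeCapacityUnits(100, 10));   // 10
console.log(writeCapacityUnits(100, 100));  // 100
console.log(writeCapacityUnits(1500, 10));  // 20
console.log(readCapacityUnits(100, 1));     // 1
```

The third line shows why item size matters: an item just over 1 KB doubles the write capacity needed for the same request rate.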

To provision throughput capacity

At this point in the demo, you should see the Provisioned Throughput Capacity page of the DynamoDB console. The page includes a handy calculator to help you estimate your read and write capacity unit settings.

  1. Select the Help me calculate how much throughput capacity I need to provision check box.
  2. Set values appropriate to your application (see examples in the following screenshot). DynamoDB makes it easy to update your throughput capacity as your needs change.

changing throughput capacity in DynamoDB

3. Choose Continue.

To finish creating a VoteApp table
Perform the following steps to finish creating your table:

  1. On the Additional Options page, select the Enable Streams check box.
  2. For View Type, choose New Image. The view type specifies what will be written to the update stream when an item in a table is created or modified. You want the new image so that you can see both the key (the color voted for and the random suffix, like “GREEN.10”) and the total number of votes for that particular key.
  3. Select the Use Basic Alarms check box, and then set a threshold at which you should be notified if you exceed the capacity you have set for your table.
  4. For send notification to, type the email account that will receive your notifications.
  5. Review your alarm settings, and then choose Continue.
  6. On the Review page, review your specifications for the table, and then choose Create to create your DynamoDB table.
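If you prefer to script table creation instead of using the console, the same settings map onto the parameters of the DynamoDB CreateTable action. This is a sketch of an alternative, not the approach this post follows: the voteAppTableParams helper is hypothetical, the capacity values are the examples discussed earlier, and the basic alarms from the console steps are not covered.

```javascript
// Build CreateTable parameters for a table with a string hash key named
// VotedFor and a stream that carries the new image of each changed item.
// voteAppTableParams is a hypothetical helper name.
function voteAppTableParams(tableName, readUnits, writeUnits) {
  return {
    TableName: tableName,
    AttributeDefinitions: [{ AttributeName: 'VotedFor', AttributeType: 'S' }],
    KeySchema: [{ AttributeName: 'VotedFor', KeyType: 'HASH' }],
    ProvisionedThroughput: {
      ReadCapacityUnits: readUnits,
      WriteCapacityUnits: writeUnits
    },
    StreamSpecification: { StreamEnabled: true, StreamViewType: 'NEW_IMAGE' }
  };
}

var params = voteAppTableParams('VoteApp', 1, 10);
console.log(JSON.stringify(params, null, 2));

// With the AWS SDK for JavaScript installed and credentials configured,
// the object could be passed to createTable, for example:
//   new AWS.DynamoDB({ region: 'us-east-1' }).createTable(params, callback);
```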

Aggregates Table

Now it’s time to create your second table. Similar to the VoteApp table, you create a VoteAppAggregates table with a hash primary key, give it a hash attribute name of VotedFor, and then choose Continue.

You won’t need a secondary index, so choose Continue and set your throughput capacity. For this demo, you will read from this table once every few seconds, so a read capacity of “1” is appropriate. The VoteApp table will use DynamoDB Streams to aggregate the votes and write to this aggregation table, so your write capacity can be set lower than that of the VoteApp table. For now, set it to “2,” and then choose Continue. Streams are not required on your aggregation table, so uncheck the Enable Streams check box on the next page, and then choose Continue to create your table.

Response Server

Now that our DynamoDB tables are set up to store votes, we need a web server to handle incoming texts containing those votes. The Node.js code for your instances should handle POST requests from Twilio, and optionally GET requests for other traffic that might come in. I use the Express framework to handle POSTs and GETs, and to ensure that the incoming request contains a valid vote. If the vote is valid, use the UpdateItem action to update the DynamoDB table, as shown in the following code example:

var AWS = require('aws-sdk');
var express = require('express');
var twilio = require('twilio');
var qs = require('querystring');
var app = express();
/* GET requests */
app.get('/', function (req, res) {
  res.send('VoteApp demo');
});
/* POST requests to handle Twilio traffic */
app.post('/', function (req, res) {
  var body = '';
  /* Collect the raw form-encoded body as it arrives */
  req.on('data', function (data) {
    body += data;
  });
  req.on('end', function () {
    var POST = qs.parse(body);
    var dynamodb = new AWS.DynamoDB({apiVersion: '2012-08-10', region: 'us-east-1'});
    /* Make sure we have a valid vote (one of [RED, GREEN, BLUE]) */
    var votedFor = (POST['Body'] || '').toUpperCase().trim();
    if (['RED', 'GREEN', 'BLUE'].indexOf(votedFor) >= 0) {
      /* Add randomness to our value to help spread across partitions */
      var votedForHash = votedFor + '.' + Math.floor((Math.random() * 10) + 1).toString();
      /* ...updateItem into our DynamoDB database */
      var tableName = 'VoteApp';
      dynamodb.updateItem({
        'TableName': tableName,
        'Key': { 'VotedFor' : { 'S': votedForHash }},
        'UpdateExpression': 'add #vote :x',
        'ExpressionAttributeNames': {'#vote' : 'Votes'},
        'ExpressionAttributeValues': { ':x' : { "N" : "1" } }
      }, function (err, data) {
        if (err) {
          console.log(err);
          /* End the request so that Twilio is not left waiting */
          res.writeHead(500);
          res.end();
        } else {
          var resp = new twilio.TwimlResponse();
          res.writeHead(200, { 'Content-Type':'text/xml' });
          resp.message("Thank you for casting a vote for " + votedFor);
          res.end(resp.toString());
          console.log("Vote received for %s", votedFor);
        }
      });
    } else {
      console.log("Invalid vote received (%s)", votedFor);
      /* Reject anything that is not one of the three colors */
      res.writeHead(400);
      res.end();
    }
  });
});
var server = app.listen(8080, function () {
  var host = server.address().address;
  var port = server.address().port;
  console.log('VoteApp listening on port %s', port);
});

In the preceding code, we take advantage of a very useful feature of DynamoDB, the update expression. An update expression specifies the attributes you want to modify, along with new values for those attributes. An update expression also specifies how to modify the attributes — for example, setting a scalar value or deleting elements in a list or a map. In the case of your app, you are adding the number “1” to the existing value of the Votes column of your DynamoDB table for the appropriate VotedFor value.

For example, if the Votes value for the item with the key { “VotedFor” : “RED.6” } is 100, the update expression increases the Votes attribute (named through the ExpressionAttributeNames parameter) by 1 (supplied through the ExpressionAttributeValues parameter), giving “RED.6” a new total of 101.
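To make the effect of the update expression concrete, the sketch below builds the same request parameters and applies them to a plain in-memory item. The applyAdd function is a local stand-in written for this example and is not part of the AWS SDK; it models the one behavior that matters here, which is that ADD treats a missing numeric attribute as 0.

```javascript
// Build UpdateItem parameters that add `count` to the Votes attribute.
function addVotesParams(tableName, key, count) {
  return {
    TableName: tableName,
    Key: { VotedFor: { S: key } },
    UpdateExpression: 'add #vote :x',
    ExpressionAttributeNames: { '#vote': 'Votes' },
    ExpressionAttributeValues: { ':x': { N: String(count) } }
  };
}

// Hypothetical local stand-in for the ADD action on a numeric attribute.
// Returns a copy of the item with the attribute increased.
function applyAdd(item, params) {
  var attr = params.ExpressionAttributeNames['#vote'];
  var delta = Number(params.ExpressionAttributeValues[':x'].N);
  var current = item[attr] ? Number(item[attr].N) : 0;
  var updated = JSON.parse(JSON.stringify(item));
  updated[attr] = { N: String(current + delta) };
  return updated;
}

var params = addVotesParams('VoteApp', 'RED.6', 1);
var before = { VotedFor: { S: 'RED.6' }, Votes: { N: '100' } };
console.log(applyAdd(before, params).Votes.N);                        // 101
console.log(applyAdd({ VotedFor: { S: 'RED.6' } }, params).Votes.N);  // 1
```

The second call shows why the server never has to create items up front: the first vote for a key starts its count at 1.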

In a moment, you’ll see how to use Lambda to scan these values and update the aggregations table with a similar update expression.

Server Setup

To run the web server, you may need to install Node.js on your EC2 instance. For Amazon Linux installs, use the following:

sudo yum install nodejs npm --enablerepo=epel

Then, from the application’s root, run npm install to install the necessary dependencies.

You’ll notice the app is listening on port 8080. Set up your EC2 instance to allow IP forwarding of port 80 traffic to port 8080. For Amazon Linux instances, edit /etc/sysctl.conf and change the line containing “net.ipv4.ip_forward” from a “0” to a “1”. Next, run the following code to enable your change:

sudo sysctl -p /etc/sysctl.conf

IP forwarding is now enabled. Now you need to add the following three iptables rules to make the routing changes you need:

sudo iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 8080
sudo iptables -A INPUT -p tcp -m tcp --sport 80 -j ACCEPT 
sudo iptables -A OUTPUT -p tcp -m tcp --dport 80 -j ACCEPT

Finally, start (or restart) your Node.js app, which should output “VoteApp listening on port 8080”. Visit the public IP address of your EC2 instance or your load balancer, and you now should see a simple “VoteApp demo” text message, letting you know your server is working and ready.

If you intend to keep the Node.js app running indefinitely, consider running it with Forever, a simple CLI tool for ensuring that a given script runs continuously.

Lambda Configuration

You now have a Node.js web server waiting for votes from text messages. We need a Lambda function to tally those votes.

To better spread our writes across multiple partitions, we have added votes appended with a random number to our VoteApp table.

vote partitions for Lambda

We need to associate our VoteApp table with a Lambda function that will take a batch of vote records, sum up the votes, and update the aggregation table.
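The summing step can be sketched as a plain function and run locally, without Lambda. The sample event below is hand-written in the shape of a DynamoDB Streams event with the New Image view type; its values are made up, and the tallyBatch name is an assumption for this example.

```javascript
var COLORS = ['RED', 'GREEN', 'BLUE'];

// Sum the Votes value carried in each record's NewImage, grouped by the
// color that precedes the random suffix (for example "RED" in "RED.3").
function tallyBatch(records) {
  var totals = { RED: 0, GREEN: 0, BLUE: 0 };
  records.forEach(function (record) {
    var image = record.dynamodb && record.dynamodb.NewImage;
    if (!image) {
      return; // for example, a REMOVE record carries no NewImage
    }
    var color = image.VotedFor.S.split('.')[0];
    if (COLORS.indexOf(color) >= 0) {
      totals[color] += parseInt(image.Votes.N, 10);
    }
  });
  return totals;
}

// Hand-written sample event; the values are illustrative only.
var sampleEvent = {
  Records: [
    { eventName: 'MODIFY', dynamodb: { NewImage: { VotedFor: { S: 'RED.3' }, Votes: { N: '2' } } } },
    { eventName: 'INSERT', dynamodb: { NewImage: { VotedFor: { S: 'BLUE.7' }, Votes: { N: '1' } } } },
    { eventName: 'MODIFY', dynamodb: { NewImage: { VotedFor: { S: 'RED.9' }, Votes: { N: '4' } } } }
  ]
};

console.log(tallyBatch(sampleEvent.Records)); // { RED: 6, GREEN: 0, BLUE: 1 }
```

Keeping the tally in a pure function like this makes it easy to check against sample batches before attaching anything to a live stream.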

To associate the VoteApp table with a Lambda function

  1. Open the Lambda console, and choose Create a Lambda function.
  2. On the Select blueprint page, you have an opportunity to select a blueprint from which to start. Here you will see commonly used starter templates for creating Lambda functions, but for this demo you will create your own. Choose Skip to configure your function.
  3. On the Configure function page, type a name and description for your function.
  4. For Runtime, choose Node.js.
  5. In the Lambda Function code section, for Code entry type, select Edit code inline.
  6. In the code editor box, copy and paste the following code:
console.log('Loading event');
var AWS = require('aws-sdk');
var dynamodb = new AWS.DynamoDB();

exports.handler = function(event, context) {
    var totalRed = 0;
    var totalGreen = 0;
    var totalBlue = 0;
    // Number of aggregate updates still in flight
    var pending = 0;

    event.Records.forEach(function(record) {
        // Skip records without a new image (for example, deletions)
        if (!record.dynamodb || !record.dynamodb['NewImage']) return;
        var votedForHash = record.dynamodb['NewImage']['VotedFor']['S'];
        var numVotes = record.dynamodb['NewImage']['Votes']['N'];
        // Determine the color on which to add the vote
        if (votedForHash.indexOf("RED") > -1) {
            totalRed += parseInt(numVotes, 10);
        } else if (votedForHash.indexOf("GREEN") > -1) {
            totalGreen += parseInt(numVotes, 10);
        } else if (votedForHash.indexOf("BLUE") > -1) {
            totalBlue += parseInt(numVotes, 10);
        } else {
            console.log("Invalid vote: ", votedForHash);
        }
    });

    // Update the aggregation table with the total of RED, GREEN, and BLUE
    // votes received from this series of updates
    var aggregatesTable = 'VoteAppAggregates';
    if (totalRed > 0) updateAggregateForColor("RED", totalRed);
    if (totalBlue > 0) updateAggregateForColor("BLUE", totalBlue);
    if (totalGreen > 0) updateAggregateForColor("GREEN", totalGreen);
    // Nothing to write, so finish right away
    if (pending === 0) context.succeed("No votes to aggregate.");

    function updateAggregateForColor(votedFor, numVotes) {
        pending++;
        dynamodb.updateItem({
            'TableName': aggregatesTable,
            'Key': { 'VotedFor' : { 'S': votedFor }},
            'UpdateExpression': 'add #vote :x',
            'ExpressionAttributeNames': {'#vote' : 'Vote'},
            'ExpressionAttributeValues': { ':x' : { "N" : numVotes.toString() } }
        }, function(err, data) {
            if (err) {
                console.log(err);
                context.fail("Error updating Aggregates table: " + err);
                return;
            }
            console.log("Vote received for %s", votedFor);
            // Report success only after every color has been written
            pending--;
            if (pending === 0) {
                context.succeed("Successfully processed " + event.Records.length + " records.");
            }
        });
    }
};

The code accepts batches of records from a DynamoDB stream, determines which are votes for each of our colors, keeps a running tally of votes from that particular batch, and updates the aggregates table.

7. For Handler, Lambda sets your handler name to index.handler. Keep this setting.
8. For Role, choose DynamoDB event stream role. A new tab or window will open, allowing you to create the role’s permissions.
9. For IAM role, choose Create a new IAM Role.
10. For Role Name, give your role a name.
11. Open View Policy Document, and then choose Edit.

This policy document is what allows Lambda to scan and update your database. The default policy does most of the work for you. However, because you are scanning and updating your table, you must add the Scan and Update actions to the list of allowable methods. Scroll through the policy to find the default dynamodb actions, and then add dynamodb:Scan and dynamodb:UpdateItem to the list:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:InvokeFunction"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetRecords",
        "dynamodb:GetShardIterator",
        "dynamodb:DescribeStream",
        "dynamodb:ListStreams",
        "dynamodb:Scan",
        "dynamodb:UpdateItem",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}

12. Choose Allow to save the modified policy. You will be returned to the Lambda Configure function page.
13. You should see your new role in the Role drop-down list. Choose your new role.

Note
Consider locking your “Resource” fields to the appropriate ARNs of your resources. For this demo, a wildcard (*) will allow the access you need.

14. In the Advanced settings section, leave the settings at their defaults, and then choose Next.
15. On the Review page, review your function details, and then choose Create function.

Your function is now created, and is ready to be associated with the stream for your table.
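If you later want to follow the note about locking the “Resource” fields to specific ARNs, the DynamoDB statements could be narrowed along the following lines. This is a sketch under several assumptions: the region and account ID are placeholders, the table names are the ones used in this post, Scan is omitted because the function shown here only calls UpdateItem, and you should confirm in the IAM documentation that each action accepts the resource it is paired with (ListStreams in particular may need a wildcard). The CloudWatch Logs actions would stay in their own statement.

```javascript
// Build DynamoDB policy statements scoped to this demo's resources.
// scopedStatements is a hypothetical helper; region and accountId are
// placeholders that you would supply yourself.
function scopedStatements(region, accountId) {
  var base = 'arn:aws:dynamodb:' + region + ':' + accountId + ':table/';
  return [
    {
      Effect: 'Allow',
      Action: [
        'dynamodb:GetRecords',
        'dynamodb:GetShardIterator',
        'dynamodb:DescribeStream',
        'dynamodb:ListStreams'
      ],
      // Stream of the table that receives the raw votes
      Resource: base + 'VoteApp/stream/*'
    },
    {
      Effect: 'Allow',
      Action: ['dynamodb:UpdateItem'],
      // Table that the Lambda function writes the totals to
      Resource: base + 'VoteAppAggregates'
    }
  ];
}

var statements = scopedStatements('us-east-1', '123456789012');
console.log(JSON.stringify(statements, null, 2));
```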

Stream Configuration

Now that you’ve created your Lambda function, you need to associate it with a DynamoDB stream.

To associate a Lambda function with a stream

  1. On your DynamoDB dashboard, choose your table, and then choose the Streams tab within your table properties.
  2. Select the stream for the table you’ve created, and then choose Associate Lambda Function.
  3. Choose the function you created from the Select a Lambda Function drop-down list, select Trim Horizon as the starting point, and set your batch size. Changes to your table can be sent in batches to your Lambda function. Your desired batch size can be anywhere from 1 to 10,000 events. For this voting app, I have set the batch size to “50”, meaning the Lambda function will retrieve a maximum of 50 events at a time from DynamoDB. You can adjust your batch size according to your needs.
  4. Choose Associate to associate the Lambda function with the stream. Now, whenever updates are made to your table, DynamoDB Streams will notify your Lambda function.

Creating Your Dashboard

You now have a web server running to receive votes from text messages, a DynamoDB table to store them in, and a DynamoDB stream to notify a Lambda function, which sums the votes for each color and writes the totals to an aggregates table.

Our application uses the AWS SDK for JavaScript to retrieve the results from the aggregates table and display them as a graph on the dashboard. As discussed in part one of this blog series, you could also write the vote results to a file, place it in an Amazon S3 bucket, and read the data from there. For apps with low traffic, this approach is acceptable, but for apps where you expect a high volume of traffic, and subsequently many rapid writes to your data file, you will want to take advantage of the aggregates table shown here, in part two. For more information about PUT-intensive S3 performance, see Request Rates and Performance Considerations.
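The dashboard's data step can be sketched as a function that turns a Scan response from the aggregates table into per-color totals for a chart. This is an illustration and not the demo's actual source: the toChartData name is hypothetical, the sample response is hand-written, and the Vote attribute name follows the Lambda function shown earlier.

```javascript
// Convert a Scan response from the aggregates table into chart-friendly
// totals. Colors that have no item yet are reported as 0.
function toChartData(scanResult) {
  var totals = { RED: 0, GREEN: 0, BLUE: 0 };
  (scanResult.Items || []).forEach(function (item) {
    var color = item.VotedFor.S;
    if (Object.prototype.hasOwnProperty.call(totals, color) && item.Vote) {
      totals[color] = parseInt(item.Vote.N, 10);
    }
  });
  return totals;
}

// Hand-written sample in the shape of a DynamoDB Scan response.
var sample = {
  Items: [
    { VotedFor: { S: 'RED' }, Vote: { N: '12' } },
    { VotedFor: { S: 'BLUE' }, Vote: { N: '7' } }
  ]
};

console.log(toChartData(sample)); // { RED: 12, GREEN: 0, BLUE: 7 }

// In the browser, with the AWS SDK for JavaScript loaded and credentials
// configured, the response could come from a call such as:
//   dynamodb.scan({ TableName: 'VoteAppAggregates' }, function (err, data) {
//     if (!err) { drawChart(toChartData(data)); } // drawChart is hypothetical
//   });
```

Because the aggregates table holds at most three items, a Scan every few seconds stays within the read capacity of “1” chosen for it.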

Now, you need to add in the HTML, JavaScript, and CSS to create your dashboard. Feel free to use the source from our demo, http://voteapp.s3-website-us-east-1.amazonaws.com, and place the content in your S3 bucket. Your entire application should have a structure similar to the following screenshot.

HTML, JavaScript, and CSS for dashboard

In part one of this series, we enabled static website hosting on our demo bucket. You can do the same for yours by following these simple instructions. Then, test your app by either texting a vote to your Twilio number, or manually adding or changing an entry for a color inside your DynamoDB aggregates table.

Congratulations! You now have a responsive dashboard that updates itself as changes are made to DynamoDB.

Conclusion

In this series, we saw the power of DynamoDB Streams when used in conjunction with AWS Lambda. While adding data from text messages might seem like a trivial example, you can see how using DynamoDB Streams and Lambda to act on data from multiple sources greatly reduces the need to write separate code for each source of input. You learned how to provision your DynamoDB tables and scale throughput as your traffic demands. With the knowledge you’ve gained here, you can apply similar techniques to more interesting projects. We can’t wait to see what you come up with!
