
Guidance for Processing Real-Time Data Using Amazon DynamoDB

Overview

This Guidance demonstrates how to use Amazon DynamoDB Streams to build near real-time data aggregations for DynamoDB tables. It outlines the configuration of DynamoDB Streams on a source table and provides sample code to implement an aggregation function. AWS Lambda polls the stream and invokes the function with batches of change records; the function performs calculations or transformations on the data and writes the aggregated results to a target DynamoDB table. Pairing DynamoDB Streams with your tables gives you near real-time insight into critical data, such as sales figures, which supports prompt inventory management and optimized operations.
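The aggregation step described above can be sketched as follows. This is a minimal illustration, not the Guidance's sample code: the attribute names `product_id`, `quantity`, and `total_quantity`, and the target table name `SalesAggregates`, are hypothetical, and the sketch assumes the stream is configured with the `NEW_AND_OLD_IMAGES` view type so that both images are present on updates.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_deltas(event, key_attr="product_id", value_attr="quantity"):
    """Collapse a batch of stream records into one net change per key.

    New images add to the total and old images subtract from it, so
    INSERT, MODIFY, and REMOVE events are all handled the same way.
    """
    totals = defaultdict(Decimal)
    for record in event.get("Records", []):
        images = record.get("dynamodb", {})
        new, old = images.get("NewImage"), images.get("OldImage")
        if new:
            totals[new[key_attr]["S"]] += Decimal(new[value_attr]["N"])
        if old:
            totals[old[key_attr]["S"]] -= Decimal(old[value_attr]["N"])
    return dict(totals)


def build_update(key, delta, key_attr="product_id", total_attr="total_quantity"):
    """Build UpdateItem arguments that atomically add delta to the running total."""
    return {
        "Key": {key_attr: key},
        "UpdateExpression": "ADD #total :delta",
        "ExpressionAttributeNames": {"#total": total_attr},
        "ExpressionAttributeValues": {":delta": delta},
    }


def handler(event, context, table=None):
    """Lambda entry point; `table` can be injected for local testing."""
    if table is None:
        import boto3  # imported lazily so the pure functions run without AWS access

        table = boto3.resource("dynamodb").Table("SalesAggregates")  # hypothetical name
    deltas = {k: d for k, d in aggregate_deltas(event).items() if d != 0}
    for key, delta in deltas.items():
        table.update_item(**build_update(key, delta))
    return {"updated": len(deltas)}
```

Collapsing the batch before writing means each key costs one write per invocation rather than one per change. Note that `ADD` is not idempotent, so a retried batch would be counted twice unless the function also tracks which records it has already applied.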

How it works

The architecture diagram illustrates near real-time data aggregation in Amazon DynamoDB using DynamoDB Streams and AWS Lambda. Changes to the source table flow through the stream to a Lambda function, which maintains aggregated summaries in a target table so that applications can read them without recomputing them.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

This Guidance uses DynamoDB Streams to automatically stream changes from your DynamoDB table to Lambda. This eliminates the need for you to build and maintain custom data streaming pipelines. In addition, with logging and monitoring in Amazon CloudWatch, you can quickly identify and troubleshoot issues as they arise.
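As a rough sketch of the wiring this paragraph describes, the snippet below enables a stream on a source table and connects it to a Lambda function with boto3. The table and function names passed in are placeholders, the batch size is an arbitrary choice, and the `NEW_AND_OLD_IMAGES` view type is an assumption that suits aggregations needing both the previous and the current values.

```python
def stream_spec(table_name):
    """Arguments for DynamoDB UpdateTable that turn on the table's stream."""
    return {
        "TableName": table_name,
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",  # assumed view type
        },
    }


def event_source_mapping(stream_arn, function_name, batch_size=100):
    """Arguments for Lambda CreateEventSourceMapping on that stream."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",  # only process changes made from now on
        "BatchSize": batch_size,
    }


def configure(table_name, function_name, dynamodb=None, lambda_client=None):
    """Enable the stream and attach the function; clients can be injected for tests."""
    if dynamodb is None or lambda_client is None:
        import boto3  # imported lazily so the builders above run without AWS access

        dynamodb = dynamodb or boto3.client("dynamodb")
        lambda_client = lambda_client or boto3.client("lambda")
    response = dynamodb.update_table(**stream_spec(table_name))
    stream_arn = response["TableDescription"]["LatestStreamArn"]
    return lambda_client.create_event_source_mapping(
        **event_source_mapping(stream_arn, function_name)
    )
```

In practice this configuration usually lives in an infrastructure template rather than a script; the calls are shown here only to make the moving parts explicit.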

Read the Operational Excellence whitepaper

Security

By using DynamoDB and Lambda, you benefit from robust security features built into these AWS services. Specifically, DynamoDB offers encryption for data at rest, while Lambda provides a secure, isolated execution environment. Together, these services help protect your sensitive data from unauthorized access or tampering. Additionally, this Guidance follows the principle of least privilege, granting the Lambda function only the permissions it needs to access and process the DynamoDB data, which further strengthens the overall security posture.
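To make the least-privilege idea concrete, here is one possible shape for the function's IAM policy, built as a Python dictionary. It is a sketch under assumptions, not the policy shipped with the sample code: it supposes the function only reads the source stream and only calls `UpdateItem` on the target table, and it omits the CloudWatch Logs permissions a real function also needs. Verify the resource scoping of each action against the IAM documentation before relying on it.

```python
import json

# Actions Lambda needs in order to poll a DynamoDB stream.
STREAM_READ_ACTIONS = [
    "dynamodb:DescribeStream",
    "dynamodb:GetRecords",
    "dynamodb:GetShardIterator",
    "dynamodb:ListStreams",
]


def least_privilege_policy(stream_arn, target_table_arn):
    """Return a policy document scoped to one stream and one target table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSourceStream",
                "Effect": "Allow",
                "Action": STREAM_READ_ACTIONS,
                "Resource": stream_arn,
            },
            {
                "Sid": "WriteAggregates",
                "Effect": "Allow",
                "Action": ["dynamodb:UpdateItem"],  # assumes updates are the only writes
                "Resource": target_table_arn,
            },
        ],
    }


def to_json(policy):
    """Serialize the policy for use with IAM tooling."""
    return json.dumps(policy, indent=2)
```

The function gets no read access to the source table itself and no write access beyond the aggregate table, so a bug or compromise in the function has a limited blast radius.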

Read the Security whitepaper

Reliability

DynamoDB, DynamoDB Streams, and Lambda are all fully managed services provided by AWS. Lambda includes features such as automatic retries and error handling mechanisms to manage issues that arise while processing incoming data from the streams. All the services used throughout this Guidance scale automatically under high loads, don't require downtime for patching, and are fault-tolerant by design, with retry mechanisms built in.
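One way a function can cooperate with Lambda's retry behavior is to report partial batch failures, so that only the unprocessed tail of a batch is retried. The sketch below assumes the event source mapping has `ReportBatchItemFailures` enabled in its function response types; `process_record` stands in for whatever aggregation logic the function applies and is not part of any AWS API.

```python
def process_batch(records, process_record):
    """Apply process_record to each stream record in order.

    On the first failure, return that record's sequence number so Lambda
    retries from that point instead of replaying the whole batch. Later
    records are left untouched to preserve the stream's ordering.
    """
    for record in records:
        try:
            process_record(record)
        except Exception:
            sequence_number = record["dynamodb"]["SequenceNumber"]
            return {"batchItemFailures": [{"itemIdentifier": sequence_number}]}
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": []}
```

A handler would return `process_batch(event["Records"], ...)` directly. Records before the failed one are treated as done, which limits how much work is repeated on retry.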

Read the Reliability whitepaper

Performance Efficiency

This Guidance uses DynamoDB Streams to invoke Lambda functions that keep aggregates up to date as the data changes. Applications can then read the precomputed aggregates directly instead of running costly table scans, which are time-consuming and add latency, particularly on large datasets.
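For illustration, reading a precomputed aggregate is a single key lookup rather than a scan. The sketch below assumes a target table keyed by a hypothetical `product_id` attribute with the running total stored in a hypothetical `total_quantity` attribute; `table` is expected to behave like a boto3 DynamoDB `Table` resource.

```python
def read_total(table, product_id, key_attr="product_id", total_attr="total_quantity"):
    """Fetch one aggregate row by key; return 0 when no aggregate exists yet."""
    response = table.get_item(Key={key_attr: product_id})
    item = response.get("Item")
    return item[total_attr] if item else 0
```

The cost of this read depends on the size of one item, not on how many rows in the source table contributed to the total.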

Read the Performance Efficiency whitepaper

Cost Optimization

With DynamoDB and Lambda, you pay only for the resources you use, eliminating the need to manage hardware. DynamoDB Streams is a built-in feature of DynamoDB, and stream reads made by Lambda triggers are not billed as DynamoDB Streams read requests, so streaming changes to the function incurs no extra DynamoDB charge. Streaming data from DynamoDB Streams, aggregating it through Lambda functions, and writing the aggregated data back to DynamoDB is more cost-effective than performing full table scans for aggregations. It is also more economical than streaming data to a separate database for such calculations.

Read the Cost Optimization whitepaper

Sustainability

This Guidance supports sustainable workloads through the serverless architecture of Lambda, which optimizes resource allocation and removes the need to maintain physical hardware. The Lambda functions are invoked only when data in the base DynamoDB table changes, which reduces compute run time and the number of executions.

Read the Sustainability whitepaper

Get Started

Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Open sample code on GitHub

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.