This Guidance demonstrates how to use Amazon DynamoDB Streams to build near real-time data aggregations for DynamoDB tables. It outlines the configuration of DynamoDB Streams on a source table and provides sample code to implement an aggregation function. AWS Lambda polls the stream and invokes this function with batches of change records; the function performs calculations or transformations on the data and writes the aggregated results to a target DynamoDB table. By enabling DynamoDB Streams on your tables, you can gain near real-time insight into critical data, such as sales figures, enabling prompt inventory management and optimized operations.
Architecture Diagram
Step 1
Amazon API Gateway inserts a new item into the Amazon DynamoDB source table.
Step 2
DynamoDB automatically records the item mutation in the stream that Amazon DynamoDB Streams maintains for the source table.
Step 3
AWS Lambda polls the configured DynamoDB Streams four times per second and runs the Aggregation function.
Step 4
The Lambda function performs the required aggregation logic and inserts the data into the target DynamoDB table (which could also be the same table).
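The steps above can be sketched as a small Python handler. This is a minimal illustration, not the sample code that ships with this Guidance: the table name `SalesAggregates`, the key attribute `pk`, and the item attributes `product_id`, `quantity`, and `total_quantity` are all hypothetical, and the sketch handles only `INSERT` events.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical names: this Guidance does not prescribe table or attribute names.
TARGET_TABLE = "SalesAggregates"  # assumed target table
GROUP_ATTR = "product_id"         # assumed string attribute to group by
VALUE_ATTR = "quantity"           # assumed numeric attribute to sum


def compute_deltas(records):
    """Sum VALUE_ATTR per GROUP_ATTR across the INSERT records of one batch."""
    deltas = defaultdict(Decimal)
    for record in records:
        if record.get("eventName") != "INSERT":
            continue  # this sketch ignores MODIFY and REMOVE events
        image = record["dynamodb"].get("NewImage", {})
        if GROUP_ATTR not in image or VALUE_ATTR not in image:
            continue
        # Stream records carry DynamoDB-typed values, e.g. {"N": "3"}.
        deltas[image[GROUP_ATTR]["S"]] += Decimal(image[VALUE_ATTR]["N"])
    return dict(deltas)


def write_deltas(deltas, client):
    """Apply each delta with an atomic ADD update on the target table."""
    for key, delta in deltas.items():
        client.update_item(
            TableName=TARGET_TABLE,
            Key={"pk": {"S": key}},  # "pk" is an assumed partition key name
            UpdateExpression="ADD total_quantity :d",
            ExpressionAttributeValues={":d": {"N": format(delta, "f")}},
        )


def lambda_handler(event, context):
    import boto3  # imported lazily so compute_deltas can run without the SDK

    deltas = compute_deltas(event.get("Records", []))
    write_deltas(deltas, boto3.client("dynamodb"))
    return {"groups_updated": len(deltas)}
```

Summing per batch before writing keeps the number of writes to the target table proportional to the number of distinct groups rather than the number of records. Note that a batch may be redelivered after a failure, so an `ADD` update like this one can double count unless you add your own deduplication.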
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance uses DynamoDB Streams to automatically stream changes from your DynamoDB table to Lambda. This eliminates the need for you to build and maintain custom data streaming pipelines. Plus, with logging and monitoring in Amazon CloudWatch, you can quickly identify and troubleshoot any issues that may arise.
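As a rough sketch of what connecting a table to Lambda involves, the following Python builds the request arguments for the boto3 `update_table` and `create_event_source_mapping` calls. The helper names, the `NEW_IMAGE` view type, and the batch size are assumptions made for illustration; the sample code for this Guidance may configure these differently, for example through infrastructure as code.

```python
def stream_spec_request(table_name, view_type="NEW_IMAGE"):
    """Arguments for dynamodb.update_table that enable a stream on an existing table."""
    return {
        "TableName": table_name,
        "StreamSpecification": {
            "StreamEnabled": True,
            # NEW_IMAGE is an assumption; use NEW_AND_OLD_IMAGES if the
            # aggregation also needs the previous version of each item.
            "StreamViewType": view_type,
        },
    }


def event_source_mapping_request(stream_arn, function_name, batch_size=100):
    """Arguments for lambda.create_event_source_mapping that attach the function to the stream."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",  # only process changes made after the mapping exists
        "BatchSize": batch_size,       # assumed value, tune for your workload
    }


def connect_table_to_function(table_name, function_name):
    """Enable the stream, then map it to the Lambda function (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builders run without the SDK

    dynamodb = boto3.client("dynamodb")
    lambda_client = boto3.client("lambda")
    response = dynamodb.update_table(**stream_spec_request(table_name))
    stream_arn = response["TableDescription"]["LatestStreamArn"]
    return lambda_client.create_event_source_mapping(
        **event_source_mapping_request(stream_arn, function_name)
    )
```

Once the mapping exists, Lambda does the polling and batching on your behalf. In practice you may need to wait for the table update to complete before creating the mapping.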
Security
By using DynamoDB and Lambda, you benefit from robust security features built into these AWS services. Specifically, DynamoDB offers encryption for data at rest, while Lambda provides a secure, isolated execution environment. Together, these services help ensure your sensitive data is protected from unauthorized access or tampering. Additionally, this Guidance follows the principle of least privilege, granting only the necessary permissions to the Lambda function to access and process the DynamoDB data, further strengthening the overall security posture.
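To make the least-privilege idea concrete, here is one way such a policy could look, expressed as a Python dictionary. It is a sketch under assumptions: the function name `least_privilege_policy` and the statement IDs are made up for illustration, the write action assumes the function only calls `UpdateItem` on the target table, and the Amazon CloudWatch Logs permissions that the function also needs are omitted.

```python
import json


def least_privilege_policy(stream_arn, target_table_arn):
    """Build an IAM policy document scoped to one source stream and one target table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSourceStream",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeStream",
                    "dynamodb:GetRecords",
                    "dynamodb:GetShardIterator",
                ],
                "Resource": stream_arn,
            },
            {
                # ListStreams is granted on "*" here for simplicity; this is an
                # assumption you may be able to tighten in your account.
                "Sid": "ListStreams",
                "Effect": "Allow",
                "Action": "dynamodb:ListStreams",
                "Resource": "*",
            },
            {
                # Assumes the function only issues UpdateItem on the target table.
                "Sid": "WriteAggregates",
                "Effect": "Allow",
                "Action": ["dynamodb:UpdateItem"],
                "Resource": target_table_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    policy = least_privilege_policy(
        "arn:aws:dynamodb:us-east-1:123456789012:table/Source/stream/EXAMPLE",
        "arn:aws:dynamodb:us-east-1:123456789012:table/Target",
    )
    print(json.dumps(policy, indent=2))
```

Scoping the read actions to a single stream ARN and the write action to a single table ARN means the function cannot read or modify any other table, even if its code is compromised.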
Reliability
DynamoDB, DynamoDB Streams, and Lambda are all fully managed services provided by AWS. Lambda includes features such as automatic retries and error handling mechanisms to manage issues that arise while processing incoming data from the streams. All the services used throughout this Guidance can automatically scale under high loads, don't require downtime for patching, and are fault-tolerant by design with retry mechanisms built in.
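One of the error handling mechanisms Lambda offers for stream sources is the partial batch response, where the function reports which record failed so that Lambda retries from that point instead of replaying the whole batch. The sketch below assumes the event source mapping has `ReportBatchItemFailures` enabled in its function response types; `process_batch` and the `process_record` callback are hypothetical names, not part of the sample code.

```python
def process_batch(records, process_record):
    """Process stream records in order and report the first failure to Lambda.

    Returns the partial batch response shape that Lambda expects when
    ReportBatchItemFailures is enabled on the event source mapping.
    """
    failures = []
    for record in records:
        try:
            process_record(record)
        except Exception:
            # For DynamoDB Streams the item identifier is the record's sequence number.
            failures.append({"itemIdentifier": record["dynamodb"]["SequenceNumber"]})
            # Stop at the first failure so later records are retried in order.
            break
    return {"batchItemFailures": failures}


def lambda_handler(event, context):
    def process_record(record):
        # Placeholder for the aggregation logic; raises on malformed records.
        if "dynamodb" not in record:
            raise ValueError("not a DynamoDB stream record")

    return process_batch(event.get("Records", []), process_record)
```

Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded. Records processed before the failing one are not redelivered, which limits duplicate work on retry.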
Performance Efficiency
This Guidance uses DynamoDB Streams to invoke Lambda functions that keep aggregates up to date as items change. You can then read the aggregated data directly from the target table instead of running costly table scans, which are time-consuming on large datasets and increase system latency.
Cost Optimization
With DynamoDB and Lambda, you pay only for the resources you use, eliminating the need to manage hardware. DynamoDB Streams is a built-in feature of DynamoDB, and the stream reads that Lambda makes through a DynamoDB trigger incur no DynamoDB Streams read charge. Streaming changes through DynamoDB Streams, aggregating them in Lambda functions, and writing the results back to DynamoDB is more cost-effective than performing full table scans for aggregations. It is also more economical than streaming data to a separate database for such calculations.
Sustainability
This Guidance supports sustainable workloads through the serverless architecture of Lambda, which optimizes resource allocation and removes the need to maintain physical infrastructure. The Lambda functions are invoked only when data changes in the source DynamoDB table, which reduces both compute run time and the number of executions.
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
Build aggregations for Amazon DynamoDB tables using Amazon DynamoDB Streams
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.