AWS Database Blog
Implement event-driven architectures with Amazon DynamoDB
Event-driven architectures are a popular choice among customers for integrating different AWS services and heterogeneous systems. These architectures can help reduce costs, scale and fail components independently, and support parallel processing. Some applications using DynamoDB need enhanced Time-to-Live (TTL) functionality, while others require the ability to trigger downstream actions, such as sending reminder emails for events managed by the application. Whether you need immediate data eviction or precise control over scheduling future events, event-driven architectures can help you achieve these goals efficiently.
In this three-part series, we explore approaches to implement enhanced event-driven patterns for DynamoDB-backed applications. Here’s a sneak peek at the topics we’ll cover:
- Part 1: Leveraging Amazon EventBridge Scheduler for precise data eviction – Discover how to use EventBridge Scheduler to efficiently manage and evict data from DynamoDB with near real-time precision.
- Part 2: Utilizing a purpose-built Global Secondary Index for strict data management – Learn about creating a specialized Global Secondary Index (GSI) to precisely control data eviction and management.
- Part 3: Implementing Amazon EventBridge Scheduler for fine-grained event scheduling – Explore how EventBridge Scheduler can enable fine-grained scheduling of downstream events, allowing for precise future event management.
In this post (Part 1), we focus on improving DynamoDB’s native TTL functionality by implementing near real-time data eviction using EventBridge Scheduler, reducing the typical time to delete expired items from within a few days to less than one minute.
DynamoDB Time-to-Live: Native Functionality and Limitations
Time to Live (TTL), a native functionality in DynamoDB, allows you to define a specific expiration time for items in a table. When TTL is enabled and an expiration attribute is set for an item, DynamoDB automatically removes the item from the table when its expiration time is reached. This feature is commonly used to manage data with a limited lifespan, such as temporary session records, cached information, or other time-sensitive data. With TTL, you can automate the process of data cleanup, optimizing data management, cost, and storage efficiency.
To use native TTL, you must enable it on the table and specify an attribute that will contain the expiration time for each item. This attribute must be in Unix epoch time format. Native TTL does not consume write throughput on your table and incurs no additional cost, except in the case of global tables: TTL deletes in the source Region incur no cost, but the replicated deletes to other replica tables consume write throughput.
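For example, the following boto3 sketch enables TTL on a table and writes an item that expires in one hour. The table name and the `expireAt` attribute name are illustrative; substitute your own.

```python
import time

import boto3

dynamodb = boto3.client("dynamodb")

# Enable TTL, naming the attribute that holds each item's expiration time
dynamodb.update_time_to_live(
    TableName="TTL-table",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expireAt"},
)

# Write an item whose TTL attribute is a Unix epoch timestamp one hour out
dynamodb.put_item(
    TableName="TTL-table",
    Item={
        "PK": {"S": "session#123"},
        "SK": {"S": "meta"},
        "expireAt": {"N": str(int(time.time()) + 3600)},
    },
)
```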
Although the native TTL feature is effective at automatically expiring items, its inherent delay of up to a few days before deletion may not align with every application's data management or compliance requirements. This discrepancy becomes particularly pertinent for systems demanding rapid and precise data removal, such as e-commerce platforms needing swift inventory updates or dynamic user sessions that demand accurate real-time tracking. An expired item continues to appear in query results until the deletion is performed, and a record appears in any configured change stream when the deletion occurs.
Solution Overview: Near Real-Time TTL with EventBridge Scheduler
To address these time-sensitive use cases, we can integrate EventBridge Scheduler to enforce timely eviction of items within 1-2 minutes, providing a more immediate and predictable solution for applications requiring rapid data expiration.
Benefits of Near Real-Time TTL
- Near real-time accuracy: Ensures data that is no longer needed is promptly removed, which is essential for applications like e-commerce, where accurate inventory levels are critical to prevent overselling or underselling products.
- Enhanced user experience: Immediate removal of expired session data ensures users interact with the most current information, providing a better and immediate experience.
- Timely notifications: Ensures that dependent processes relying on data expiration, such as session timeouts or temporary access permissions, run exactly when needed.
- Optimized resource utilization: Promptly frees up storage space by evicting expired data, which reduces costs associated with storing outdated information.
- Improved data management: Only relevant and current data is retained, simplifying data management and making it easier to maintain data integrity.
Solution Architecture
The following diagram illustrates our solution architecture:
The solution contains the following key components:
- Amazon DynamoDB – This fully managed, serverless, distributed NoSQL database is designed to run high-performance applications at any scale. The table must contain an attribute indicating the item expiration time.
- DynamoDB Streams – This native functionality captures a time-ordered sequence of item-level modifications in your DynamoDB table, including insert, update, and delete operations.
- AWS Lambda – This serverless compute service lets you run code without provisioning or managing servers.
- Amazon EventBridge Scheduler – This serverless scheduler allows you to create, run, and manage tasks from one central, managed service.
How It Works
For each DynamoDB item with a TTL value, we associate a one-time invocation schedule in EventBridge Scheduler. The one-time invocation is scheduled to trigger at the same time as the item's TTL value.
Each schedule must include a target to be used when the schedule is invoked. We will use a universal target to directly call the DynamoDB DeleteItem API, removing the item from the table. Directly integrating with DynamoDB is more cost-effective than invoking a Lambda function to perform the delete.
The flow works as follows:
- When a record is added, modified, or removed from the DynamoDB table, a stream record is generated in the associated DynamoDB Stream.
- The DynamoDB Stream invokes a Lambda function that processes the stream event.
- The Lambda function extracts the item’s primary key and TTL value, then creates, updates, or deletes an EventBridge schedule accordingly.
- The EventBridge Schedule is configured to invoke the DynamoDB DeleteItem API at the specified TTL time, removing the item from the table.
- The schedule automatically deletes itself after successful completion.
The EventBridge schedule associated with an item is set to trigger at the item's specified TTL time. Future-dated schedules typically fire within one minute of the scheduled time when not using the flexible time window feature.
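As a concrete illustration, the following boto3 sketch creates the kind of one-time schedule this solution relies on. The schedule name, timestamp, key values, and role ARN are placeholders.

```python
import json

import boto3

scheduler = boto3.client("scheduler")

scheduler.create_schedule(
    Name="ttl-delete-example",                     # one schedule per item
    ScheduleExpression="at(2025-06-01T12:00:00)",  # the item's TTL, in UTC
    FlexibleTimeWindow={"Mode": "OFF"},            # fire as close to the TTL as possible
    ActionAfterCompletion="DELETE",                # the schedule removes itself after firing
    Target={
        # Universal target: call the DynamoDB DeleteItem API directly
        "Arn": "arn:aws:scheduler:::aws-sdk:dynamodb:deleteItem",
        "RoleArn": "arn:aws:iam::111122223333:role/eventbridge_scheduler_role",
        "Input": json.dumps({
            "TableName": "TTL-table",
            "Key": {"PK": {"S": "item#1"}, "SK": {"S": "meta"}},
        }),
    },
)
```

With ActionAfterCompletion set to DELETE, completed schedules clean themselves up, so the number of schedules tracks only items that have not yet expired.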
Prerequisites
Before implementing this solution, you should have:
- AWS account – Access to an active AWS account to test the solution
- DynamoDB basics – A foundational understanding of DynamoDB concepts
- Lambda functions – Familiarity with Lambda for processing DynamoDB Streams events
- EventBridge basics – Basic knowledge of EventBridge for setting up scheduler rules
- AWS CLI or console proficiency – For configuring services and monitoring logs. We use the AWS Management Console throughout this post.
Implementation Steps
First, create a DynamoDB table with DynamoDB Streams enabled:
- On the DynamoDB console, choose Tables in the navigation pane.
- Choose Create table.
- For Table name, enter a name for your new table (such as: TTL-table).
- For Partition key, enter PK as the name and choose String as the type.
- For Sort key, enter SK as the name and choose String as the type.
- Leave all other configurations as default and choose Create table.
- Choose Tables in the navigation pane and open your table details.
- On the Exports and streams tab, under DynamoDB stream details, choose Turn on.
- In the Turn on DynamoDB stream wizard, select New and old images, then choose Turn on stream.
- You should now see the DynamoDB stream details, with the Stream status set to On. Make sure to note the Latest stream ARN; you will need it later.
This will enable DynamoDB Streams on the table to expose both the old and new state of items in the stream records, so you can manage updates on items’ TTL values. Next, we create the Lambda function that is invoked by the DynamoDB Stream.
Configure IAM Permissions
The EventBridge Scheduler needs appropriate permissions to invoke the DynamoDB DeleteItem API. Create an IAM role with the following trust policy:
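A minimal example:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "scheduler.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```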
Then attach a policy with the following permissions:
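For example, scoped to the table created in this post (the Region and account ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "dynamodb:DeleteItem",
      "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/TTL-table"
    }
  ]
}
```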
Take note of the ARN for this role, as we will need it later.
Create a Lambda function
The Lambda function is responsible for configuring EventBridge Scheduler to perform selective deletions of specific items from the DynamoDB table:
- On the Lambda console, choose Functions in the navigation pane.
- Choose Create function.
- Select Author from scratch.
- For Function name, enter a name (for example,
DDBStreamTriggerEventScheduler
). - Select a
Runtime
. While this post uses Python 3.11, feel free to choose any runtime you’re familiar with. - To the Lambda execution role, add the IAM managed policy
AWSLambdaDynamoDBExecutionRole
and aninline policy
withscheduler:CreateSchedule
permission. - Choose Create function.
- Choose the Configuration tab of your Lambda function, and select the Permissions pane.
- Under Execution role, choose the linked Role name. This should look similar to `YourLambdaFunctionName-Role-abc`.
- Choose Add permissions, then Create inline policy.
- Switch from Visual to JSON and add the following policy, giving your Lambda function permission to read the DynamoDB stream (be sure to enter the DynamoDB stream ARN you noted earlier) and to create, update, and delete EventBridge schedules:
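A minimal example follows; the stream ARN, account ID, and role name are placeholders, and `iam:PassRole` is included because the function passes the scheduler role when creating schedules:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeStream",
        "dynamodb:GetRecords",
        "dynamodb:GetShardIterator",
        "dynamodb:ListStreams"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/TTL-table/stream/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "scheduler:CreateSchedule",
        "scheduler:UpdateSchedule",
        "scheduler:DeleteSchedule"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::111122223333:role/eventbridge_scheduler_role"
    }
  ]
}
```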
- Choose Next.
- Enter a Policy name, such as `AWSLambdaToEventBridgeScheduler`.
- Choose Create policy.
- Navigate back to the DynamoDB console, choose Tables in the navigation pane, and select the table you created earlier (TTL-table).
- Navigate to the Exports and streams tab.
- In the Trigger pane, select Create trigger.
- Under the AWS Lambda function details section, select the Lambda function you created earlier (`DDBStreamTriggerEventScheduler`).
- Choose Create trigger.
- On the Lambda console, choose Functions in the navigation pane, and select the Lambda function you created earlier.
- On the Code tab of the Lambda function, add the code shown in the following example.
Lambda Function Code Example (Python)
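The following is a sketch of the handler, not a hardened implementation. It assumes the TTL attribute is named `expireAt` (match this to your table's schema) and derives a deterministic schedule name from the item's primary key so that later modifications and removals can find the schedule.

```python
import hashlib
import json
import os
from datetime import datetime, timezone

import boto3

scheduler = boto3.client("scheduler")

TABLE_NAME = os.environ["DYNAMODB_TABLE_NAME"]
SCHEDULER_ROLE_ARN = os.environ["SCHEDULER_ROLE_ARN"]
TTL_ATTRIBUTE = "expireAt"  # assumed attribute name; match your table's schema


def schedule_name(keys: dict) -> str:
    """Derive a deterministic, valid schedule name from the item's primary key."""
    digest = hashlib.sha256(json.dumps(keys, sort_keys=True).encode()).hexdigest()
    return f"ttl-{digest[:48]}"


def schedule_expression(epoch_seconds: str) -> str:
    """Convert a Unix epoch TTL value into a one-time at() expression (UTC)."""
    dt = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return dt.strftime("at(%Y-%m-%dT%H:%M:%S)")


def upsert_schedule(name: str, expression: str, keys: dict) -> None:
    """Create the one-time delete schedule, or update it if it already exists."""
    params = {
        "Name": name,
        "ScheduleExpression": expression,
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "ActionAfterCompletion": "DELETE",  # schedule removes itself after firing
        "Target": {
            # Universal target: invoke the DynamoDB DeleteItem API directly
            "Arn": "arn:aws:scheduler:::aws-sdk:dynamodb:deleteItem",
            "RoleArn": SCHEDULER_ROLE_ARN,
            "Input": json.dumps({"TableName": TABLE_NAME, "Key": keys}),
        },
    }
    try:
        scheduler.create_schedule(**params)
    except scheduler.exceptions.ConflictException:
        scheduler.update_schedule(**params)


def remove_schedule(name: str) -> None:
    """Delete the schedule, ignoring the case where it is already gone
    (for example, because ActionAfterCompletion already removed it)."""
    try:
        scheduler.delete_schedule(Name=name)
    except scheduler.exceptions.ResourceNotFoundException:
        pass


def lambda_handler(event, context):
    for record in event["Records"]:
        keys = record["dynamodb"]["Keys"]
        name = schedule_name(keys)
        if record["eventName"] == "REMOVE":
            # Item deleted (by a schedule or otherwise): clean up its schedule
            remove_schedule(name)
            continue
        # INSERT or MODIFY: inspect the item's current TTL attribute
        ttl = record["dynamodb"].get("NewImage", {}).get(TTL_ATTRIBUTE, {}).get("N")
        if ttl:
            upsert_schedule(name, schedule_expression(ttl), keys)
        else:
            # TTL attribute absent or removed: the item should not expire
            remove_schedule(name)
```

Note that if the TTL timestamp is already in the past when the schedule is created, CreateSchedule rejects the at() expression; a production implementation should clamp such values to the near future or delete the item directly.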
- Choose Deploy to deploy the latest function code.
- Lastly, we need to add our DynamoDB table name and EventBridge Scheduler role ARN to our Lambda function's environment variables. Navigate to the Configuration tab and the Environment variables pane.
- Choose Edit.
- Select Add environment variable and enter:
  - Key: `DYNAMODB_TABLE_NAME`
  - Value: Your table name (such as TTL-table)
- Select Add environment variable again, and enter:
  - Key: `SCHEDULER_ROLE_ARN`
  - Value: The scheduler role ARN you noted earlier (similar to `arn:aws:iam::************:role/eventbridge_scheduler_role`)
- Select Save.
Testing the Solution
You can test the solution by adding items to your DynamoDB table with TTL values. Here’s an example that creates 10 sample items with a TTL value using the AWS CLI:
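The following is a bash sketch; `expireAt` is the assumed TTL attribute name from earlier, and each item is set to expire 5 minutes from now:

```bash
TABLE_NAME="TTL-table"
EXPIRE_AT=$(( $(date +%s) + 300 ))  # Unix epoch seconds, 5 minutes from now

for i in $(seq 1 10); do
  aws dynamodb put-item \
    --table-name "$TABLE_NAME" \
    --item "{
      \"PK\":       {\"S\": \"item#$i\"},
      \"SK\":       {\"S\": \"meta\"},
      \"expireAt\": {\"N\": \"$EXPIRE_AT\"}
    }"
done
```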
You can navigate to the Monitoring tab of the EventBridge schedule group to see the deletes being run. The schedules invoke operations at a specific time; you can observe these invocations by looking at the `InvocationAttemptCount` metric. In our case, the invocations are deletes issued to the DynamoDB table. For a list of all metrics available for a schedule group, refer to Monitoring Amazon EventBridge Scheduler with Amazon CloudWatch.
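If you prefer to pull the numbers programmatically, the following boto3 sketch sums the metric over the last hour; it assumes your schedules live in the default schedule group:

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# Sum InvocationAttemptCount for the default schedule group over the last hour
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Scheduler",
    MetricName="InvocationAttemptCount",
    Dimensions=[{"Name": "ScheduleGroup", "Value": "default"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Sum"],
)
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"].isoformat(), int(point["Sum"]))
```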
Cost Considerations
The cost of using this approach for 1,000,000 TTL items is estimated in the following table with a comparison to using the native DynamoDB functionality. Each DynamoDB item is less than 1KB in size and stored in a table using on-demand mode in the us-east-1 Region. Free tiers are not considered for this analysis.
| Cost component | Near real-time TTL | Native DynamoDB TTL |
| --- | --- | --- |
| DynamoDB Streams | $0 (DynamoDB Streams is free of charge when consumed by Lambda) | – |
| Lambda | | |
| EventBridge Scheduler | $1.00 | – |
| DynamoDB delete | $0.63 | – |
| Total cost | $1.63 | $0 |
Limitations
An upper limit for this solution relates to the EventBridge Scheduler request rate quotas. The CreateSchedule, UpdateSchedule, and DeleteSchedule requests each default to a maximum of 1,000 requests per second in most Regions. These limits are adjustable. If a limit is exceeded, EventBridge Scheduler rejects further requests for that operation for the remainder of the interval. A dead-letter queue (DLQ) can be used to capture and re-drive failed executions.
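As an illustration, a DLQ is attached per schedule through the target's DeadLetterConfig. In the sketch below, all ARNs are placeholders, and the scheduler role additionally needs `sqs:SendMessage` on the queue:

```python
import json

# Extends the Target from the earlier Lambda sketch; all ARNs are placeholders
target = {
    "Arn": "arn:aws:scheduler:::aws-sdk:dynamodb:deleteItem",
    "RoleArn": "arn:aws:iam::111122223333:role/eventbridge_scheduler_role",
    "Input": json.dumps({
        "TableName": "TTL-table",
        "Key": {"PK": {"S": "item#1"}, "SK": {"S": "meta"}},
    }),
    # Invocations that still fail after retries are delivered to this queue
    "DeadLetterConfig": {"Arn": "arn:aws:sqs:us-east-1:111122223333:ttl-delete-dlq"},
}
```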
EventBridge Scheduler also enforces a throttle limit on concurrent target invocations, which defaults to 1,000 invocations per second in most regions. This refers to the rate at which schedule payloads are delivered to their targets. If this limit is exceeded, invocations are not dropped but throttled, meaning they are delayed and processed later. This quota is adjustable and can scale to tens of thousands of invocations per second.
Lastly, another upper limit for this solution is the concurrent executions quota for your Lambda functions, which defaults to 1,000 concurrent executions per account. This is an important limit to consider, especially if you have other Lambda functions running in the same account. If you reach the concurrency limit, your function is throttled. This limit can be increased.
Clean Up
If you created a test environment to follow along with this post, make sure to:
- Delete the DynamoDB table
- Delete the Lambda function
- Delete any remaining EventBridge schedules
- Delete any remaining IAM roles created during this process
- Delete any other resources you created for testing the solution.
Conclusion
In this post, we explored how you can use Amazon EventBridge Scheduler to implement near real-time TTL for DynamoDB, reducing the time to delete an item after TTL expiry from up to a few days to typically one to two minutes.
This serverless solution bridges the gap between the built-in capabilities of DynamoDB and the need for immediate, controlled item evictions. While it does incur some additional costs compared to the native TTL functionality, it effectively addresses the need for rapid and reliable data eviction.
The approach can scale to billions of active TTL delete records, with the time to delete increasing once concurrent deletes scale past the EventBridge Scheduler invocation limit.
In Part 2, we’ll dive deeper into implementing strict data eviction within DynamoDB using a Global Secondary Index. The series wraps up with Part 3, where we’ll use Amazon EventBridge Scheduler for fine-grained event scheduling. These event-driven patterns will help you automate key processes, maintain accurate data, and meet precise application requirements with minimal manual effort.