AWS Database Blog
Rate-limiting calls to Amazon DynamoDB using Python Boto3, Part 1
In this post, I present a technique where a Python script making calls to Amazon DynamoDB can rate limit its consumption of read and write capacity units. The technique uses Boto3 event hooks to apply the rate limiting without having to modify the client code performing the read and write calls. It’s a proven solution used in the Bulk Executor for Amazon DynamoDB, and its logic is encapsulated in the DynamoDBMonitor class, available on GitHub.
In Part 2 I show you how to coordinate rate limits across separate processes using an Amazon Simple Storage Service (Amazon S3) folder for shared state.
Reasons to rate limit
I’ve previously published several blog posts that illustrate scenarios where rate limiting can be useful:
- To avoid table-level throttling, especially when in provisioned mode – For example, when performing a bulk task, you might want to run at a controlled rate to avoid spikes that generate throttles before automatic scaling can adjust. See Handle traffic spikes with Amazon DynamoDB provisioned capacity, options 4 and 5.
- To avoid partition-level throttling – If your reads and writes might be hitting the same partition, you can rate limit to avoid creating hot partitions that generate throttles and impact other activity to the same partition. See Scaling DynamoDB: How partitions, hot keys, and split for heat impact performance.
- To control costs – You might want to bound consumption to limit spend, or choose precise consumption to better utilize reserved capacity. See Cost-effective bulk processing with Amazon DynamoDB.
Overview of solution
The general approach to DynamoDB rate limiting with any SDK language is this:
- Send a ReturnConsumedCapacity parameter with each request, indicating that you want to know how much capacity was consumed. You can’t know the consumption in advance because, for example, a DeleteItem will consume write capacity based on the size of the item being deleted, and a TransactWriteItems might consume either read or write capacity, because of how the ClientRequestToken gets handled.
- Pull the consumption from the ConsumedCapacity data structure added to each response.
- Track consumption over time and add short delays to rate limit, if needed. A minimal sketch of the first two steps follows this list.
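To make the pattern concrete before introducing the hooks, here’s a minimal sketch of steps 1 and 2 using a plain Boto3 client. The table name and key are placeholders:

```python
import boto3

client = boto3.client("dynamodb")

# Step 1: ask DynamoDB to report how much capacity this call consumes
response = client.get_item(
    TableName="MyTable",  # placeholder table
    Key={"pk": {"S": "item-1"}},
    ReturnConsumedCapacity="TOTAL",
)

# Step 2: pull the consumption out of the response, for example
# {'TableName': 'MyTable', 'CapacityUnits': 0.5}
consumed = response["ConsumedCapacity"]["CapacityUnits"]
print(f"GetItem consumed {consumed} read capacity units")
```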
For rate limiting with the Python SDK, as explained in Programming Amazon DynamoDB with Python and Boto3, Boto3 provides an event system that enables runtime extensibility. The library exposes named event hooks where you can register a function to be called at certain points of the request/response process:
- provide-client-params – Called early in the request processing. This gives us the opportunity to add the ReturnConsumedCapacity parameter if it’s not already present.
- before-send – Called right before the request gets sent. This gives us the opportunity to add delays if required.
- after-call – Called during the response handling. This gives us the opportunity to track the ConsumedCapacity.
When you register these hooks on a Session object, they automatically run during DynamoDB calls made by any client built from that same Session. This logic is encapsulated in a Python class called DynamoDBMonitor that’s about 100 lines of code. The following is a sample usage:
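The snippet below sketches how usage might look. The import path and constructor parameter names are illustrative assumptions, so check the class in the GitHub repo for the exact signature:

```python
import boto3

from dynamodb_monitor import DynamoDBMonitor  # hypothetical import path

session = boto3.Session()

# Cap this process at 1,000 read and 500 write capacity units per second
# (parameter names are illustrative)
monitor = DynamoDBMonitor(session, max_read_rate=1000, max_write_rate=500)

# Any client built from this session is rate limited transparently.
# The calling code needs no changes.
client = session.client("dynamodb")
client.put_item(
    TableName="MyTable",
    Item={"pk": {"S": "item-1"}},
)
```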
Code walkthrough
This section reviews the essential parts of the DynamoDBMonitor code.
The constructor takes the session (on which the event hooks will be applied) and the maximum read and write rates. It creates a read bucket and write bucket to track read and write consumption. Each bucket constructor accepts a per-second accumulation rate, an initial quantity of tokens to start with, and a maximum overall bucket capacity (after which any further accumulation spills out). Here we choose to start with an initial quantity equal to one second of accumulation, with a maximum capacity equal to two seconds of accumulation. This configuration aims to let processing get started quickly and allow some bursty usage.
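The following is a simplified sketch of how such a bucket might work, consistent with the behavior described above; the real TokenBucket implementation is in the GitHub repo:

```python
import threading
import time

class TokenBucket:
    """Simplified sketch of a token bucket. Tokens accumulate at a fixed
    per-second rate up to a maximum capacity; consumers deduct tokens and
    wait whenever the balance drops to zero or below."""

    def __init__(self, rate_per_second, initial_tokens, max_capacity):
        self.rate = rate_per_second
        self.tokens = initial_tokens
        self.max_capacity = max_capacity
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Accumulate at the configured rate; anything beyond capacity spills out
        self.tokens = min(self.tokens + elapsed * self.rate, self.max_capacity)

    def consume(self, amount):
        # Deduct after the fact; the balance may go negative, which makes
        # a later caller wait
        with self.lock:
            self._refill()
            self.tokens -= amount

    def wait_until_positive(self):
        # Block until the balance is above zero
        while True:
            with self.lock:
                self._refill()
                if self.tokens > 0:
                    return
                deficit = -self.tokens
            time.sleep(max(deficit / self.rate, 0.01))

# Matching the configuration described above: initial tokens equal to one
# second of accumulation, capacity equal to two seconds' worth
read_bucket = TokenBucket(rate_per_second=1000, initial_tokens=1000, max_capacity=2000)
write_bucket = TokenBucket(rate_per_second=500, initial_tokens=500, max_capacity=1000)
```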
The class then registers the event hooks to run our callbacks during request and response processing.
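Registration might look like the following; the handler names are illustrative, and the handlers themselves appear in the next snippets. The .dynamodb suffix scopes each hook to DynamoDB operations on clients created from this session:

```python
import boto3

session = boto3.Session()

# Handlers (defined in the snippets that follow) fire for every DynamoDB
# operation on clients created from this session
session.events.register("provide-client-params.dynamodb", add_return_consumed_capacity)
session.events.register("before-send.dynamodb", maybe_sleep)
session.events.register("after-call.dynamodb", track_consumption)
```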
The provide-client-params event hook adds the ReturnConsumedCapacity parameter if not already specified:
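A handler for this hook might look like the following sketch (the function name is illustrative). Boto3 passes the call parameters and the operation model, so the handler can check whether the operation supports ReturnConsumedCapacity before adding it:

```python
def add_return_consumed_capacity(params, model, **kwargs):
    # Only add the parameter when the operation accepts it and the caller
    # hasn't already set a value
    accepts_it = (
        model.input_shape is not None
        and "ReturnConsumedCapacity" in model.input_shape.members
    )
    if accepts_it and "ReturnConsumedCapacity" not in params:
        params["ReturnConsumedCapacity"] = "TOTAL"
```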
The before-send event hook has to decide whether a sleep is required. It calls the appropriate bucket, asking it to wait until its token count is above zero.
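A sketch of that handler follows. Classifying an operation as a read or a write by its name is a simplification (as noted earlier, TransactWriteItems can consume either); the event_name keyword that Boto3 passes to every handler ends with the operation name:

```python
READ_OPERATIONS = {"GetItem", "BatchGetItem", "Query", "Scan", "TransactGetItems"}

def maybe_sleep(request, **kwargs):
    # event_name looks like 'before-send.dynamodb.PutItem'
    operation = kwargs["event_name"].split(".")[-1]
    bucket = read_bucket if operation in READ_OPERATIONS else write_bucket
    bucket.wait_until_positive()  # sleeps only if the bucket is depleted
    return None  # returning None lets the request proceed normally
```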
The after-call event hook reads the ConsumedCapacity data structure out of the response and updates metrics. Tracking covers only base table activity, not global secondary indexes (GSIs), and read/write metrics aren’t isolated per table.
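A sketch of that handler, consistent with the limitations just noted: it sums base-table CapacityUnits (ConsumedCapacity is a single dict for most calls and a list for batch and transact calls) and deducts the total from the matching bucket:

```python
def track_consumption(http_response, parsed, model, **kwargs):
    consumed = parsed.get("ConsumedCapacity")
    if consumed is None:
        return
    # Batch and transact responses report a list, one entry per table
    entries = consumed if isinstance(consumed, list) else [consumed]
    # Base-table CapacityUnits only; GSI consumption is ignored here
    total = sum(entry.get("CapacityUnits", 0) for entry in entries)
    bucket = read_bucket if model.name in READ_OPERATIONS else write_bucket
    bucket.consume(total)
```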
You can find the code for DynamoDBMonitor in the GitHub repo as part of the Bulk Executor for DynamoDB project. The code for TokenBucket is also there.
Other languages
This code relies on Python’s Boto3 event hook system. JavaScript and Java have similar systems for extensibility:
- JavaScript has a middlewareStack (see Programming Amazon DynamoDB with JavaScript)
- Java v2 has an ExecutionInterceptor (see Programming DynamoDB with the AWS SDK for Java 2.x)
Conclusion
Using the Boto3 event hook system makes it possible to track read and write capacity and add sleeps to impose rate limits without modifying the code performing the reads and writes. The DynamoDBMonitor class encapsulates this logic in reusable form. In Part 2, I show how to coordinate rate limits across separate processes using an S3 folder for shared state.
Try out this solution for your own use case, and share your feedback in the comments.