Why is my Lambda IteratorAge metric increasing?
Last updated: 2019-08-05
I'm seeing an increase or spikes in my AWS Lambda function's IteratorAge metric. Why is this happening, and how should I handle it?
For stream-based invocations, Lambda emits the IteratorAge metric. Iterator age is the time between when the last record in a batch was written to the stream and when Lambda reads that record. Iterator age depends on other parameters, such as Lambda function execution duration, shard count, and batch size. For more information, see AWS Lambda CloudWatch Metrics.
In general, iterator age increases when a function can't keep up with the volume of data being written to the stream.
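As a rough illustration of why falling behind makes the metric climb, here is a minimal sketch of an idealized model. It assumes a constant write rate, a constant processing rate, and a stream that starts empty; the function name and parameters are hypothetical and are not part of any AWS API.

```python
def estimated_iterator_age(write_rate, process_rate, elapsed_seconds):
    """Approximate iterator age (seconds) after a period of steady load.

    Idealized model: records arrive at write_rate per second and the function
    drains them at process_rate per second, starting from an empty stream.
    """
    if write_rate <= 0:
        raise ValueError("write_rate must be positive")
    if process_rate >= write_rate:
        # The function keeps up, so no backlog builds.
        return 0.0
    # After t seconds the function has read process_rate * t records, and the
    # record it reads now was written at time process_rate * t / write_rate.
    return elapsed_seconds * (1 - process_rate / write_rate)


print(estimated_iterator_age(100, 50, 60))   # 30.0
print(estimated_iterator_age(100, 120, 60))  # 0.0
```

In this model, a function that processes half of what is written falls 30 seconds behind after one minute, and the gap keeps growing for as long as the imbalance lasts.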
Review how each of the following Lambda function parameters and use cases affects iterator age. Then, reconfigure your function to decrease the iterator age.
A high execution duration for your Lambda function can result in a high iterator age. The duration is the amount of time that it takes to process a batch of records. Decreasing the duration increases your throughput, which decreases the iterator age.
To decrease your function's duration, try:
- Increasing the amount of memory allocated to the function. This can allow your function to process records faster, which decreases the iterator age.
- Optimizing your function code so that it takes less time to process records (for example, by making calls in parallel or by using asynchronous methods).
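One way to apply the second suggestion is to run per-record work concurrently. The following is a minimal sketch, not a definitive implementation. It assumes a Kinesis-style event in which each record carries base64-encoded data under `kinesis.data`. `process_record` is a hypothetical placeholder for your own downstream call, and the worker count of 8 is an arbitrary choice.

```python
import base64
from concurrent.futures import ThreadPoolExecutor


def process_record(payload):
    """Hypothetical per-record work; stands in for a downstream call."""
    return payload.decode("utf-8").upper()


def handler(event, context):
    # Assumed event shape: Kinesis delivers each record's data base64-encoded.
    payloads = [base64.b64decode(r["kinesis"]["data"]) for r in event["Records"]]
    # Run the per-record work concurrently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(process_record, payloads))
    return {"processed": len(results)}
```

Threads help most when the per-record work waits on the network or on disk. For CPU-bound work, allocating more memory to the function is likely to have more effect.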
Increasing the number of shards in a stream decreases the iterator age, assuming that records are evenly distributed across the shards (a best practice). This is because the number of shards in a stream corresponds to your Lambda function's maximum concurrency: more shards mean higher concurrency and therefore more throughput. For more information, see Stream Event Invokes.
Note: Shard splitting doesn't have an immediate effect on the iterator age. Existing records remain in the shards they were written to, and those shards need to catch up on their backlog before the iterator age for those shards decreases.
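To get a rough sense of how many shards a workload needs, the relationship between shards and concurrency can be turned into a back-of-the-envelope estimate. This sketch assumes one concurrent invocation per shard (as described above), evenly distributed records, and a steady batch duration. The function name and parameters are hypothetical, and the estimate ignores any per-shard limits imposed by the stream service itself.

```python
import math


def shards_needed(write_rate, batch_size, duration_seconds):
    """Estimate the shard count needed to keep up with write_rate (records/s)."""
    # One invocation per shard processes batch_size records every duration_seconds.
    per_shard_rate = batch_size / duration_seconds
    return max(1, math.ceil(write_rate / per_shard_rate))


print(shards_needed(write_rate=1000, batch_size=100, duration_seconds=0.5))  # 5
```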
Depending on how your Lambda function works, changing the batch size might decrease the iterator age.
For example, suppose that your function's duration is mostly independent of the number of records in an event, such as when downstream calls are made once per batch. In this case, increasing the batch size increases throughput, which decreases the iterator age.
However, your function's duration might be heavily dependent on the number of records in an event, such as when each record triggers a synchronous downstream call. In this case, tuning the batch size might not effectively decrease the iterator age.
For more information, see Stream Event Invokes.
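The two batch-size cases above can be compared with a simple cost model. This is a sketch under stated assumptions: it treats duration as a fixed per-invocation cost plus a per-record cost, and the function name and the timing figures are invented for illustration.

```python
def records_per_second(batch_size, fixed_seconds, per_record_seconds):
    """Throughput of one invocation under a fixed-plus-per-record cost model."""
    duration = fixed_seconds + per_record_seconds * batch_size
    return batch_size / duration


for size in (100, 200, 400):
    # Case 1: one batched downstream call, so duration barely depends on size.
    batched = records_per_second(size, fixed_seconds=1.0, per_record_seconds=0.0)
    # Case 2: one synchronous call per record, so duration scales with size.
    per_record = records_per_second(size, fixed_seconds=0.0, per_record_seconds=0.01)
    print(size, round(batched), round(per_record))
```

In the first case, throughput grows in step with the batch size. In the second, it stays at about 100 records per second regardless of the batch size, which is why tuning the batch size alone does little there.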
Invocation errors can cause your Lambda function to take longer to process an event or process the same event repeatedly. Because event records are read sequentially, your function can't progress to later records if a batch of records causes an error each time it's retried. In these cases, the iterator age increases linearly as those records age.
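One way to keep a single bad record from blocking a shard is to catch errors per record rather than letting the whole batch fail. The following is a minimal sketch that assumes a Kinesis-style event shape. `process_record` is a hypothetical placeholder for your own logic. Records that fail are only logged here, so in practice you would likely store them somewhere for later inspection instead of dropping them.

```python
import base64
import json
import logging

logger = logging.getLogger(__name__)


def process_record(payload):
    """Hypothetical business logic; raises on malformed input."""
    return json.loads(payload)["id"]


def handler(event, context):
    failed = []
    for record in event["Records"]:
        # Assumed event shape: Kinesis records carry a sequence number and
        # base64-encoded data under the "kinesis" key.
        sequence = record["kinesis"]["sequenceNumber"]
        try:
            process_record(base64.b64decode(record["kinesis"]["data"]))
        except Exception:
            # Log the failure and move on instead of failing the whole batch,
            # so that later records in the shard can still be processed.
            logger.exception("Skipping record %s", sequence)
            failed.append(sequence)
    return {"failed": failed}
```

The trade-off is that the function, not Lambda's retry behavior, becomes responsible for what happens to the records that fail.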
It's important that your function gracefully handles any records written to the stream. As you develop your function, logging and instrumenting your code can help you diagnose errors. For more information, see Monitoring and Troubleshooting Lambda Applications and Instrumenting AWS Lambda Functions.
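As a starting point for instrumentation, a small decorator can log how many records each invocation received and how long it took. This is a sketch, not an AWS-provided utility. `log_batch_metrics` and the log format are hypothetical, and the wrapped handler is a stand-in for your own.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def log_batch_metrics(func):
    """Log the record count and wall-clock duration of each invocation."""
    @functools.wraps(func)
    def wrapper(event, context):
        started = time.monotonic()
        try:
            return func(event, context)
        finally:
            # Runs whether the handler succeeds or raises.
            elapsed_ms = (time.monotonic() - started) * 1000
            logger.info("records=%d duration_ms=%.1f",
                        len(event.get("Records", [])), elapsed_ms)
    return wrapper


@log_batch_metrics
def handler(event, context):
    return {"processed": len(event["Records"])}
```

Comparing these log lines with the IteratorAge metric can show whether spikes line up with larger batches, slower invocations, or errors.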