Why am I experiencing high latency issues with Kinesis Data Streams?

Last updated: 2020-06-04

My Amazon Kinesis data stream is experiencing high latency while getting data records. Why is this happening and how do I troubleshoot this?

Short Description

GetRecords.Latency can increase when the record count or record size returned by each GetRecords request increases. If you restart your application while the producer is ingesting data into the stream, records can accumulate without being consumed. The larger number of records, or the larger amount of data to fetch, then increases the value of GetRecords.Latency. Additionally, if an application can't keep up with the ingestion rate, IteratorAge increases.

Note: Enabling server-side encryption on your Kinesis data stream can also increase your latency.

Resolution

Monitor the Amazon Kinesis Data Streams service with Amazon CloudWatch. Check CloudWatch metrics such as GetRecords.Latency to verify whether the latency increase is continuous. If it is, then check whether the IncomingRecords, IncomingBytes, GetRecords.Records, and GetRecords.Bytes metrics in CloudWatch also increased. An increase in these metrics indicates a higher data volume. Higher data volume raises latency because GetRecords fetches more records when more records are available in the Kinesis data stream.
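The check described above can be sketched in Python. This is a minimal sketch, not an official procedure: the helper names, the 1.5x threshold, and the "compare the later half with the earlier half" heuristic are all assumptions made for illustration. The query uses the AWS/Kinesis namespace and the StreamName dimension, and its keys match the parameters that the CloudWatch GetMetricStatistics API expects.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

# Metrics named in this article; all live in the AWS/Kinesis namespace.
METRICS = [
    "GetRecords.Latency",
    "IncomingRecords",
    "IncomingBytes",
    "GetRecords.Records",
    "GetRecords.Bytes",
]


def build_metric_query(stream_name, metric_name, hours=3, period=300, stat="Average"):
    """Return keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,
        "Statistics": [stat],
    }


def values_from_datapoints(datapoints, stat="Average"):
    """Datapoints are not guaranteed to be ordered, so sort by timestamp first."""
    ordered = sorted(datapoints, key=lambda point: point["Timestamp"])
    return [point[stat] for point in ordered]


def is_sustained_increase(values, ratio=1.5):
    """Hypothetical heuristic: the later half averages at least `ratio` times the earlier half."""
    if len(values) < 4:
        return False  # too few datapoints to judge a trend
    middle = len(values) // 2
    earlier, later = mean(values[:middle]), mean(values[middle:])
    return earlier > 0 and later / earlier >= ratio
```

With boto3 installed and credentials configured, the query can be passed as cloudwatch.get_metric_statistics(**build_metric_query("my-stream", name)) for each name in METRICS, and the "Datapoints" list in the response fed through values_from_datapoints. The stream name "my-stream" is a placeholder. A single short spike does not trip the heuristic, which is the distinction between a continuous increase and a transient one.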

If IteratorAge also increased, then more data is likely being put into the stream. Check the IncomingBytes metric in CloudWatch to verify whether the number of bytes increased. You can also check whether fewer GetRecords calls were made to the stream. More incoming bytes indicate that each GetRecords call is retrieving more data, which increases the value of GetRecords.Latency.

If you still observe high latency even though IncomingBytes and IncomingRecords did not increase, the stream might already hold a backlog of unconsumed data. If the consumer application can't keep up with the incoming data, the data continues to accumulate in the Kinesis data stream. Even if you restart the application, each GetRecords call then fetches more records. The increase in records or data fetched by each GetRecords call increases the value of GetRecords.Latency.
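The backlog reasoning above comes down to simple arithmetic: the backlog shrinks only at the rate by which consumption exceeds ingestion. The sketch below illustrates that relationship. The function name and the sample figures are hypothetical, and real rates would come from your own IncomingBytes and GetRecords.Bytes measurements.

```python
import math


def catch_up_seconds(backlog_bytes, ingest_bytes_per_sec, consume_bytes_per_sec):
    """Estimate how long a consumer needs to drain a backlog.

    Returns math.inf when the consumer is not faster than the producer,
    because the backlog then never shrinks.
    """
    if backlog_bytes <= 0:
        return 0.0
    drain_rate = consume_bytes_per_sec - ingest_bytes_per_sec
    if drain_rate <= 0:
        return math.inf
    return backlog_bytes / drain_rate


# Illustrative numbers only: a 3.6 GB backlog, 1 MB/s ingested, 2 MB/s consumed.
estimate = catch_up_seconds(3_600_000_000, 1_000_000, 2_000_000)
print(estimate)  # 3600.0 seconds, that is, one hour
```

If the estimate is infinite or longer than the retention period of the stream, the consumer cannot recover by itself, which is the case that the troubleshooting tips below address.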

To resolve this issue, try the following troubleshooting tips:

  • Check your application to see if enough GetRecords calls are being made to process the volume of incoming data. If you are using an Amazon Kinesis Client Library (KCL) application or AWS Lambda as a consumer, increase the number of shards in your stream. A higher shard count increases the consumption rate from the data stream, which decreases the values of IteratorAge and GetRecords.Latency.
  • Increase the retention period of the Kinesis data stream to avoid data loss. A longer retention period gives your application more time to catch up with the data backlog before records expire.
  • If you have your own consumer application, check the processing logic and record processing time.
  • Check the central processing unit (CPU) and memory utilization of your system to see whether the consumer is resource constrained, for example whether you need to free up memory.
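The first two tips above can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended sizing method. The per-shard figures are the documented limits of 1 MB/s for writes and 2 MB/s for reads by shared-throughput consumers, treated here as decimal megabytes (an assumption). The function names are hypothetical. The two client methods, update_shard_count and increase_stream_retention_period, are existing boto3 Kinesis operations; the client is passed in so that the code can be exercised without AWS access.

```python
import math

WRITE_BYTES_PER_SEC_PER_SHARD = 1_000_000  # assumed decimal MB: 1 MB/s write limit
READ_BYTES_PER_SEC_PER_SHARD = 2_000_000   # assumed decimal MB: 2 MB/s shared read limit


def shards_needed(ingest_bytes_per_sec, consume_bytes_per_sec):
    """Smallest shard count that covers both the write rate and the desired read rate."""
    for_writes = math.ceil(ingest_bytes_per_sec / WRITE_BYTES_PER_SEC_PER_SHARD)
    for_reads = math.ceil(consume_bytes_per_sec / READ_BYTES_PER_SEC_PER_SHARD)
    return max(1, for_writes, for_reads)


def scale_and_extend(kinesis, stream_name, target_shards, retention_hours):
    """Raise the shard count, then lengthen retention so the backlog is not lost."""
    kinesis.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shards,
        ScalingType="UNIFORM_SCALING",
    )
    # The service rejects this call unless the new value exceeds the current retention.
    kinesis.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=retention_hours,
    )


# Illustrative rates only: 3.5 MB/s ingested, 5 MB/s desired consumption.
print(shards_needed(3_500_000, 5_000_000))  # 4
```

With boto3 installed, the client would be boto3.client("kinesis"). Shard count updates are subject to service limits on how far and how often a stream can be resized, so a large change might need several steps.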
