AWS Official Blog

Amazon SQS – New Dead Letter Queue

by Jeff Barr | in Amazon SQS

The Amazon Simple Queue Service (SQS) makes it easy for you to decouple the components of your application from each other. Proper use of SQS can make your applications easier to build, scale, and run. For example, you could have one task produce work items, post them to a queue, and have another task pull them from the queue, process them, and write the results to a database. By inserting a queue between the two primary tasks, each can run independently of the other.

Dead Letter Queue
In order to give you more control over message handling in SQS queues, we are introducing the concept of a Dead Letter Queue (DLQ). You exercise control over the Dead Letter Queue using a Redrive Policy, which contains two values:

  • Maximum Receives – The maximum number of times that a message can be received by consumers. When a message's receive count exceeds this value, SQS automatically moves the message to the Dead Letter Queue.
  • Dead Letter Queue – The ARN (Amazon Resource Name) of the SQS queue that will receive messages that were not successfully processed after the maximum number of receives.

You can set up these values in the AWS Management Console or through the SQS APIs. You can also provision the queue using AWS CloudFormation (see this sample template for details). The designated dead letter queue must be in the same account and AWS Region as the original queue.
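To make the two Redrive Policy values concrete, here is a minimal sketch of how the policy could be assembled as a queue attribute. It assumes the API form in which the attribute is named RedrivePolicy and holds a JSON string with the keys deadLetterTargetArn and maxReceiveCount. The helper names parse_sqs_arn and build_redrive_attributes, the queue names, and the account number are hypothetical and exist only for illustration. The sketch also checks the same-account, same-Region rule mentioned above before building the policy.

```python
import json


def parse_sqs_arn(arn):
    """Split an ARN of the form arn:<partition>:sqs:<region>:<account>:<name>."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sqs":
        raise ValueError(f"not an SQS ARN: {arn!r}")
    return {"region": parts[3], "account": parts[4], "name": parts[5]}


def build_redrive_attributes(source_arn, dlq_arn, max_receives):
    """Return a queue-attributes dict carrying the Redrive Policy (hypothetical helper)."""
    if max_receives < 1:
        raise ValueError("max_receives must be at least 1")
    src, dlq = parse_sqs_arn(source_arn), parse_sqs_arn(dlq_arn)
    # The dead letter queue must live in the same account and Region.
    if (src["region"], src["account"]) != (dlq["region"], dlq["account"]):
        raise ValueError("dead letter queue must share account and Region")
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receives}
    # The policy is passed as a JSON string, not as a nested structure.
    return {"RedrivePolicy": json.dumps(policy)}


# Example with made-up ARNs:
attributes = build_redrive_attributes(
    "arn:aws:sqs:us-east-1:123456789012:work-items",
    "arn:aws:sqs:us-east-1:123456789012:work-items-dlq",
    max_receives=5,
)
print(attributes["RedrivePolicy"])
```

With an SDK such as boto3, a dict like this would typically be handed to the client's set_queue_attributes call together with the source queue's URL. That step is left out here so the sketch runs without AWS credentials.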

Why Is The Dead Letter Queue Helpful?
If you are new to Amazon SQS, you might be wondering how you would make use of the Dead Letter Queue in your applications. Let’s start with the basics and take it from there.

Each SQS queue is named, and can contain any number of messages. Your application must create any necessary queues and then use three primary SQS operations – SendMessage, ReceiveMessage, and DeleteMessage.

The SendMessage operation is used to add a message to a message queue. The producer task would make use of this operation.

The ReceiveMessage operation is used to retrieve a message from a message queue. The consumer task retrieves a message, processes it, and then calls DeleteMessage to remove it from the queue. The message remains in the queue, safely hidden away, between the calls to ReceiveMessage and DeleteMessage.

There is a visibility timeout associated with each queue. If a message is received but not deleted in the time interval specified by the timeout, the message becomes visible once again and a future call to ReceiveMessage will return it. The consumer must call DeleteMessage to tell SQS that the message has been processed successfully.
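The interplay of the three operations and the visibility timeout can be sketched with a toy in-memory model. This is not the SQS API: the ToyQueue class and its methods are hypothetical, time is passed in explicitly so the behavior is deterministic, and messages are deleted by ID rather than by the receipt handle that the real service uses.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (illustration only)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible again]
        self._next_id = 0

    def send_message(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0.0]  # visible immediately
        return self._next_id

    def receive_message(self, now):
        """Return (id, body) of the first visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete_message(self, msg_id):
        """Tell the queue the message was processed successfully."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send_message("resize image 17")

first = queue.receive_message(now=0)     # delivered, then hidden until t=30
hidden = queue.receive_message(now=10)   # None: message is still hidden
again = queue.receive_message(now=30)    # redelivered because it was never deleted
queue.delete_message(again[0])
gone = queue.receive_message(now=100)    # None: message was deleted
```

The point of the model is the middle two calls: a received message stays in the queue but is invisible, and only DeleteMessage prevents it from coming back.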

Suppose that the consumer receives a particular message and can’t process it for some reason. Because the message is never deleted, it becomes visible again after the visibility timeout expires, and another call to ReceiveMessage returns it for a second attempt.

However, what if the message itself causes the consumer to fail? Perhaps the message refers to a resource that cannot be accessed within the visibility timeout, or it invokes a computation that takes too long to complete. This could lead to trouble, as there is no way for the message to be taken out of the system and the message would recirculate endlessly!

With the Dead Letter Queue, stuck messages will become a thing of the past. When the maximum receive count is exceeded for a message, it will be moved to the Dead Letter Queue associated with the original queue. You will probably want to set up a separate consumer process on the Dead Letter Queue so that you can log, analyze, and hopefully understand the source of the problem.
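The effect of the maximum receive count can be sketched with another small simulation. It is a simplified model under stated assumptions, not SQS itself: the drain function is hypothetical, a handler that raises an exception stands in for a consumer that never calls DeleteMessage, and the visibility timeout is collapsed into simply putting the message back on the queue.

```python
from collections import deque


def drain(messages, handler, max_receives):
    """Deliver messages until each is processed or exceeds max_receives (toy model)."""
    queue = deque((body, 0) for body in messages)
    processed, dead_letters = [], []
    while queue:
        body, receive_count = queue.popleft()
        receive_count += 1
        if receive_count > max_receives:
            # Receive count exceeded: move to the dead letter queue instead of delivering.
            dead_letters.append(body)
            continue
        try:
            handler(body)
        except Exception:
            # Not deleted, so the message becomes visible again later.
            queue.append((body, receive_count))
        else:
            # Successful processing corresponds to calling DeleteMessage.
            processed.append(body)
    return processed, dead_letters


def handler(body):
    if body == "poison":
        raise RuntimeError("cannot process this message")


processed, dead_letters = drain(["a", "poison", "b"], handler, max_receives=3)
print(processed)     # ['a', 'b']
print(dead_letters)  # ['poison']
```

Without the receive-count check, the loop would never finish for the failing message, which is the endless recirculation described above. A separate consumer could then read the dead letters to log and analyze them.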

You can read the Setting up Dead Letter Queue documentation to learn more.

This new feature is available today and you can start using it now.

– Jeff;