Understanding asynchronous messaging for microservices

This post is courtesy of Dirk Fröhner, Sr. Solutions Architect

One of the implications of applying the microservices architectural style is that much communication between components happens over the network. After all, your microservices landscape is a distributed system. To achieve the promises of microservices, such as being able to individually scale, operate, and evolve each service, this communication must happen in a loosely coupled and reliable manner.

A common way to loosely couple services is to expose an API following the REST architectural style. REST APIs are based on the architecture of the web and provide loose coupling between communicating parties. REST APIs offer a great way to decouple interfaces from concrete implementations, and to advise clients about what they can do next, by the use of links and link relations.

While REST APIs are common and useful in microservices design, REST APIs tend to be designed with synchronous communications, where a response is required. A request coming from an end-user client can trigger a complex communications path within your services landscape, which can effectively add coupling between the services at runtime. After all, this is why there are mitigation patterns like circuit-breaker in the first place. REST APIs can also add some heavy lifting to your infrastructure that we will discuss further below.

Asynchronous messaging

If loose-coupling is important, especially in a system that requires high resilience and has unpredictable scale, another option is asynchronous messaging.

Asynchronous messaging is a fundamental approach for integrating independent systems, or building up a set of loosely coupled systems that can operate, scale, and evolve independently and flexibly. As our colleague Tim Bray said, “If your application is cloud-native, or large-scale, or distributed, and doesn’t include a messaging component, that’s probably a bug.” In this blog post, we will outline some fundamental benefits of asynchronous messaging for the communications between microservices.

For a refresher on the fundamental messaging patterns and their implementations with Amazon SQS, Amazon SNS, and Amazon MQ, please read our previous blog posts

For a summary of the semantics of queues and topics:

A queue is like a buffer. You can put messages into a queue, and you can retrieve messages from a queue. Message queues operate so that any given message is only consumed by one receiver, although multiple receivers can be connected to the queue.
A topic is like a broadcasting station. You can publish messages to a topic, and anyone interested in these messages can subscribe to the topic. In this model, any message published to a topic is immediately received by all of the subscribers of the topic (unless you have applied the message filter pattern).

Use-case

Consider a typical scenario illustrated in the diagram below. An end-user client (EUC) addresses an API resource of one of our services, through Amazon API Gateway in this example. From there, the request can potentially follow a path across the microservices landscape to get completely processed.

To provide the final result, there will be potentially cascading subsequent requests sent between other microservices. This example illustrates the complexity involved in processing a single end user request.

Diagram 1: End-user client accessing a service using an API

End-user clients (EUCs) often communicate with services via REST APIs in a synchronous manner. However, the communication can also be designed using an asynchronous approach. For instance, if an EUC submits a request that takes some time to process, the respective API resource can respond with HTTP status 202 Accepted, and a link to a resource that provides the current processing status. Downstream, the communication between the service that receives that request, and other services that are involved in processing the request, can happen asynchronously using messaging services.

There are situations where a communications model using asynchronous messaging can make your life easier than using REST APIs.

Infrastructure complexity

Start with looking at the infrastructure complexity for the backends of your services. Depending on your implementation paradigm, you have to include different components in your infrastructure that you don’t have to deal with when using messaging.

Imagine your services each expose a REST API. Typically, this means you add a load balancer in front of your compute layer, and your backend implementation includes an HTTP server. It is usually a good idea to decouple your services APIs from their concrete implementations, so you could also consider adding Amazon API Gateway in front of your load balancer.

For a serverless approach, you don’t need to worry about load balancing and scaling out infrastructure. Amazon API Gateway with AWS Lambda integration provides a fully managed solution for removing complexities around infrastructure management.

Using Amazon SQS as a cloud-native messaging service for queues, you don’t employ any of the above mentioned components. As described in a prior post, an SQS queue can act as a load balancer in itself. The consumers, or target services, don’t need an HTTP server, but simply ask a queue for available messages. If you use AWS Lambda for your consumers, this process is even simpler, as the Lambda functions are automatically invoked when messages appear in an SQS queue. See Using AWS Lambda with Amazon SQS to learn more.

The same applies to Serverless architectures implementing a publish/subscribe pattern. Lambda function executions can be directly triggered by SNS messages. Without AWS Lambda, you need load balancers and web servers in your backend implementations to receive SNS notifications, as those are injected via web hooks into your services. SNS also provides the fan-out functionality that you would otherwise have to build using an intermediary component to implement a recipient list of subscribers.

Reliability, resilience

For synchronous systems, if a service crashes while it processes the payload of an API request, the information is lost. A good way to prevent this on a microservice is to explicitly persist an incoming request immediately after receiving it. Then process and reprocess, until the request is finally marked as resolved.

This approach requires additional work, and it requires the microservice to not crash while persisting an incoming API request. The microservice sending a request must also resend if the target service doesn’t acknowledge receipt. For example, it doesn’t respond with a successful HTTP status code, or the connection drops.

When sending messages to a queue, this additional work is addressed by the messaging infrastructure. A message will remain in a queue unless a consumer explicitly states that processing is finished by acknowledging the message reception. As long as message reception is not acknowledged by a consumer, it will stay in the queue. Messages can be retained in an SQS queue for a maximum of 14 days.

Scale out latency

Under increased load, your services must scale out to process the requests. You must then consider scale-out latency, which may be managed for you with serverless implementations. It takes a few moments from when an Auto Scaling group triggers the launch of additional instances until these are ready to operate. Also launching new container tasks takes time. When your scaling threshold is not optimal and the scaling event occurs late, your available resources may be unable to serve all incoming requests. These requests may be lost or answered with HTTP status code 5xx.

Using message queues that buffer messages during a scaling event help prevent this. Even in use cases where the EUC is waiting for an immediate response, this is the more reliable architecture. If your infrastructure needs time to scale out and you are not able to process all requests in time, the requests are persisted.

When messaging is your only choice

What happens when your services must respond to peak loads at scale?

For many applications, the scale-out latency, including load balancer pre-warming, will eventually become too large to handle steeply ascending loads fast enough. With a serverless architecture, exposing your Lambda functions with API Gateway can handle steeply ascending loads. But you must still consider downstream systems, which may be easily overwhelmed.

In these scenarios, where rapid scaling without overwhelming downstream systems is important, messaging may be your best choice. Message queues help protect your downstream services by buffering incoming payloads for consumption at the pace of the consuming service. This helps not only for the communications between microservices, but also when peak loads flood your client-facing API. Often, the most important goal is to accept an incoming request, while the actual processing of that request can happen later. You decouple these steps from each other by using queues.

Serverless messaging systems like Amazon SQS and Amazon SNS can respond quickly to support high scale. These are often the best solution when scale is unpredictable. While the instance-based messaging system, Amazon MQ, provides compatibility with open standards, it requires manual scaling for large workloads, unlike serverless messaging services.

Conclusion

We hope you got some inspiration to also employ asynchronous messaging for your microservices communications architecture. In our next blog on fan-out strategies, we provide concrete examples of these patterns. For more information, feel free to consume the following resources:

Read the next blog in the series, Application Integration Patterns for Microservices: Fan-out Strategies.

AWS Compute Blog