AWS Compute Blog

Service Discovery: An Amazon ECS Reference Architecture

My colleagues Pierre Steckmeyer, Chad Schmutzer, and Nicolas Vautier sent a nice guest post that describes a fast and easy way to set up service discovery for Amazon ECS.

Microservices are capturing a lot of mindshare nowadays, through the promises of agility, scale, resiliency, and more. The design approach is to build a single application as a set of small services. Each service runs in its own process and communicates with other services via a well-defined interface using a lightweight mechanism, typically HTTP-based application programming interface (API).

Microservices are built around business capabilities, and each service performs a single function. Microservices can be written using different frameworks or programming languages, and you can deploy them independently, as a single service or a group of services.

Containers are a natural fit for microservices. They make it simple to model, they allow any application or language to be used, and you can test and deploy the same artifact. Containers bring an elegant solution to the challenge of running distributed applications on an increasingly heterogeneous infrastructure – materializing the idea of immutable servers. You can now run the same multi-tiered application on a developer’s laptop, a QA server, or a production cluster of EC2 instances, and it behaves exactly the same way. Containers can be credited for solidifying the adoption of microservices.

Because containers are so easy to ship from one platform to another and scale from one to hundreds, they have unearthed a new set of challenges. One of these is service discovery. When running containers at scale on an infrastructure made of immutable servers, how does an application identify where to connect to in order to find the service it requires? For example, if your authentication layer is dynamically created, your other services need to be able to find it.

Static configuration works for a while but gets quickly challenged by the proliferation and mobility of containers. For example, services (and containers) scale in or out; they are associated to different environments like staging or prod. You do not want to keep this in code or have lots of configuration files around.

What is needed is a mechanism for registering services immediately as they are launched and a query protocol that returns the IP address of a service, without having this logic built into each component. Solutions exist with trade-offs in consistency, ability to scale, failure resilience, resource utilization, performance, and management complexity. In the absence of service discovery, a modern distributed architecture is not able to scale and achieve resilience. Hence, it is important to think about this challenge when adopting a microservices architecture style.

Amazon ECS Reference Architecture: Service Discovery

We’ve created a reference architecture to demonstrate a DNS- and load balancer-based solution to service discovery on Amazon EC2 Container Service (Amazon ECS) that relies on some of our higher level services without the need to provision extra resources. There is no need to stand up new instances or add more load to the current working resource pool.

Alternatives to our approach include directly passing Elastic Load Balancing names as environment variables – a more manual configuration – or setting up a vendor solution. In this case, you would have to take on the additional responsibilities to install, configure, and scale the solution as well as keeping it up-to-date and highly available.

The technical details are as follows: we define an Amazon CloudWatch Events filter which listens to all ECS service creation messages from AWS CloudTrail and triggers an Amazon Lambda function. This function identifies which Elastic Load Balancing load balancer is used by the new service and inserts a DNS resource record (CNAME) pointing to it, using Amazon Route 53 – a highly available and scalable cloud Domain Name System (DNS) web service. The Lambda function also handles service deletion to make sure that the DNS records reflect the current state of applications running in your cluster.

There are many benefits to this approach:

  • Because DNS is such a common system, we guarantee a higher level of backward compatibility without the need for “sidecar” containers or expensive code change.
  • By using event-based, infrastructure-less compute (AWS Lambda), service registration is extremely affordable, instantaneous, reliable, and maintenance-free.
  • Because Route 53 allows hosted zones per VPC and ECS lets you segment clusters per VPC, you can isolate different environments (dev, test, prod) while sharing the same service names.
  • Finally, making use of the service’s load balancer allows for health checks, container mobility, and even a zero-downtime application version update. You end up with a solution which is scalable, reliable, very cost-effective, and easily adoptable.

We are excited to share this solution with our customers. You can find it at the AWS Labs Amazon EC2 Container Service – Reference Architecture: Service Discovery GitHub repository. We look forward to seeing how our customers will use it and help shape the state of service discovery in the coming months.