Migrate existing Amazon ECS services from service discovery to Amazon ECS Service Connect
At re:Invent in November 2022, we announced a new Amazon Elastic Container Service (Amazon ECS) solution for service-to-service communication called Amazon ECS Service Connect. Amazon ECS Service Connect enables easy communication between microservices and across Amazon Virtual Private Clouds (Amazon VPCs) by leveraging AWS Cloud Map namespaces and logical service names. This allows you to seamlessly distribute traffic between your Amazon ECS tasks without having to deploy, configure, and maintain load balancers.
Today’s post focuses on how to migrate your existing Amazon ECS tasks from using service discovery and internal load balancers to Amazon ECS Service Connect.
To demonstrate how easy it is to migrate your existing Amazon ECS services, we’ll use a sample Yelb application (app) hosted on GitHub here. This application currently uses an internal load balancer and an alias record in a private hosted zone for yelb-appserver service discovery and AWS Cloud Map for yelb-redis and yelb-db service discovery. There’s also an external load balancer used to communicate from the end user to the yelb-ui service. The following is an architectural diagram of the sample application:
For this sample migration to function properly, the following resources are created for you in Step 1:
- An Amazon VPC
- A pair of public and private subnets spread across two Availability Zones (AZs)
- An internet gateway, with a default route on the public subnets
- A pair of Network Address Translation (NAT) gateways (one in each AZ)
- Default routes for the NAT gateways in the private subnets
- AWS Identity and Access Management (AWS IAM) roles for the sample Yelb application tasks and task execution roles
- Security groups for the Yelb app service components
- Service discovery namespaces for the Yelb app components
- An external load balancer and target groups to expose the Yelb user interface (UI) app
- An internal load balancer and target groups to expose the Yelb app server
- An Amazon ECS cluster
- Amazon ECS service and task definitions deployed
For this walkthrough, you’ll need the following prerequisites:
- An AWS Account
- Access to a shell environment. This can be a shell running in an AWS Cloud9 Instance, AWS CloudShell, or locally on your system
- Your shell environment needs to have git installed and the AWS Command Line Interface (AWS CLI) configured with version 2.9.2 or higher.
- Your AWS CLI needs to have a profile configured with access to the AWS account you wish to use for this walkthrough.
- Enable the new Amazon ECS Console, if not already enabled. You may do so by toggling the radio button in the top left corner of the Amazon ECS Console.
To learn about the differences between the classic and new console experience, check out our documentation.
Step 1: Set up infrastructure and deploy the sample app
Download the sample code to the computer or shell environment you’ll be using for this walkthrough. If you have not yet done so, run the following command to clone a copy of the provided GitHub Repo to your system from your terminal.
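Assuming the sample repository linked above, the clone step looks like the following (substitute the actual repository URL from the link if it differs):

```bash
# Clone the sample application repo and change into it
# (repo URL shown here is assumed from the sample app linked above)
git clone https://github.com/aws-samples/ecs-service-connect-yelb-sample-app.git
cd ecs-service-connect-yelb-sample-app
```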
To simplify the setup experience, you’ll use an AWS CloudFormation template to provision the necessary infrastructure, service, and task definitions needed for this walkthrough.
Run the simple setup script from the shell environment of your choice to deploy the provided AWS CloudFormation template. The script accepts four optional arguments:
- AWS_PROFILE: Name of the AWS CLI profile you wish to use. If you don’t provide a value, then default will be used.
- AWS_DEFAULT_REGION: Default Region where AWS CloudFormation resources will be deployed. If you do not provide a value, then us-west-2 will be used.
- ENVIRONMENT_NAME: Environment Name for the Amazon ECS cluster. If you don’t provide a value, then ecs will be used.
- CLUSTER_NAME: Desired Amazon ECS Cluster Name. If you don’t provide a value, then yelb-cluster will be used.
To use the setup script with all arguments in the us-east-2 Region, you would run the following command:
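Assuming the setup script lives alongside the other helper scripts at ./scripts/setup.sh and accepts the four arguments positionally in the order listed above (both assumptions, not confirmed by the text), the invocation would look like:

```bash
# Arguments: AWS_PROFILE AWS_DEFAULT_REGION ENVIRONMENT_NAME CLUSTER_NAME
./scripts/setup.sh default us-east-2 ecs yelb-cluster
```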
If you prefer to follow the AWS CloudFormation deployment's progress through the Console, you can do so in the AWS CloudFormation Console.
Note: Be sure to select the correct Region in the Console for where you deployed to.
The setup script takes around 5 minutes to complete.
Note: Even after the setup script completes, it may take some additional time for every service and task to come to a RUNNING state.
Once the deployment has completed successfully, you’ll see output similar to the following:
View the sample application through the deployed elastic load balancer using the provided URL. You’ll also find this URL in the AWS CloudFormation outputs. The following is an example:
The following is an example of the sample application you just deployed:
Navigate to the Amazon ECS Console and visually verify all services and tasks are in the RUNNING state.
Note: You’ll want to ensure you are viewing the Amazon ECS Console for the Region you chose to deploy the AWS CloudFormation template.
When all tasks and services are in the RUNNING state, your Amazon ECS cluster will look like the following examples:
Step 2: Generate traffic for the Internal Load Balancer
Now that you have your sample application and all required infrastructure deployed, you are ready to generate some traffic using the application endpoint. To do this, use the ./scripts/generate-traffic.sh script by running the following command:
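A sketch of the invocation, assuming you run it from the root of the cloned repo and that it picks up the same AWS profile and Region you used in Step 1:

```bash
# Generate load against the deployed application endpoint
./scripts/generate-traffic.sh
```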
While the script runs, watch your Amazon ECS Cluster’s services, specifically, the yelb-appserver. You may notice the tasks begin to fail due to the intense load. Below is an example of a service with failing tasks that are still in the process of self-healing:
When this happens, if you try to access the Yelb appserver API in your browser using the application URL and the path /api/getvotes (i.e., http://yelb-service-connect.us-east-2.elb.amazonaws.com/api/getvotes), you may also see a 500-series error similar to the following:
These dropped requests are due to the high load from the load test, which caused the yelb-appserver tasks to begin to fail. As Amazon ECS self-heals and spins up new tasks, Amazon Route 53 takes additional time to create the necessary Domain Name System (DNS) records, which can result in DNS propagation delays. Keep this in mind, because you'll revisit this topic in Step 7 after you upgrade to Amazon ECS Service Connect.
Once the script completes, you’ll see a message similar to the following:
Step 3: View monitoring metrics for service discovery and Internal Load Balancer
To view the traffic you just generated, use the link provided from the script in Step 2 or use the monitoring metrics tab in the Amazon EC2 Load Balancer dashboard. Be sure to select the appropriate Region for your deployment.
Once you are in the Load Balancers console, select the serviceconnect-appserver load balancer, which has a DNS name similar to internal-serviceconnect-appserver-xxxx. The following is an example:
From within the serviceconnect-appserver page, navigate to the Monitoring tab.
From the Monitoring tab, if you adjust the time window to a 1-hour period, you should see spikes similar to the following example:
Step 4: AWS Cloud Map namespaces
You are almost ready to upgrade to Amazon ECS Service Connect, but before you do, I want to point out the AWS Cloud Map namespaces that were created during the AWS CloudFormation template deployment. Navigate to the AWS Cloud Map Console, where you'll see the two namespaces that were created.
Note: If you don’t see any namespaces in the AWS Cloud Map Console, be sure to select the correct Region for your deployment.
The following is an example of what you’ll see:
One namespace is for Service Discovery and the other is for Amazon ECS Service Connect. Click on the Amazon ECS Service Connect namespace (i.e., yelb.sc.internal), and scroll down. You’ll notice there aren’t currently any services attached to it.
You’ll also find access to AWS Cloud Map Namespaces in the new Amazon ECS Console under Namespaces on the left-hand side. The following is an example:
Select the namespace for yelb.sc.internal and you’ll see there aren’t any services attached to it. Keep an eye on this namespace after you move your services to Amazon ECS Service Connect and notice the changes.
Step 5: Migrate to Amazon ECS Service Connect
Now you are ready to migrate from service discovery to Amazon ECS Service Connect. After the migration is complete, the sample application architecture will look like this:
For this migration example, you’ll use the AWS CLI to update the four services that make up this sample application.
To simplify the commands needed, use the ./scripts/use-service-connect.sh script by running the following command in the shell environment of your choice:
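Assuming you are still in the root of the cloned repo, the invocation is:

```bash
# Update all four yelb services to use Amazon ECS Service Connect
./scripts/use-service-connect.sh
```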
Once the script completes, you’ll see output similar to the following example:
Great! Now that the migration is complete, let's head back over to the Amazon ECS Console and check on the Amazon ECS Service Connect namespace. You'll now see the four yelb services attached. The following is an example:
Note: While the migration from Service Discovery to Amazon ECS Service Connect is complete, it may take some time for the Amazon ECS Services and Tasks to be in an Active or RUNNING state again.
Step 6: What changed?
Let’s break down what changed when we ran the ./scripts/use-service-connect.sh script.
In a code editor of your choice, open the ./scripts/use-service-connect.sh file. Take note of the aws ecs update-service command used at the end of the script, specifically the --service-connect-configuration flag. This flag updates the Amazon ECS services to use the new Amazon ECS Service Connect configuration.
Looking at the AWS CLI documentation for the ecs update-service command, you can see the --service-connect-configuration flag expects a JSON structure.
Note: You can only use a JSON service-connect-configuration file at this time.
Cross-referencing that guidance with the script, you'll notice each command from line 37 onward uses that flag with a JSON file. The following is an example of the update-service command for the yelb-db service.
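A sketch of what that command looks like, assuming the default cluster name from Step 1 and that the script forces a new deployment so the running tasks are replaced:

```bash
aws ecs update-service \
    --cluster yelb-cluster \
    --service yelb-db \
    --service-connect-configuration file://sc-update/svc-db.json \
    --force-new-deployment
```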
The --service-connect-configuration flag references a svc-db.json file located in the sc-update/ directory of the provided GitHub repo. Open the sc-update/svc-db.json file to see how line 2 has the key enabled set to a value of true. The following is an example of the same svc-db.json file:
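A representative reconstruction of that file follows; the namespace and dnsName values come from this walkthrough, while the port and portName shown are assumptions based on yelb-db being a standard PostgreSQL service:

```json
{
    "enabled": true,
    "namespace": "yelb.sc.internal",
    "services": [
        {
            "portName": "yelb-db",
            "clientAliases": [
                {
                    "port": 5432,
                    "dnsName": "yelb-db.yelb.cloudmap.internal"
                }
            ]
        }
    ]
}
```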
From the sample code snippet above, pay attention to the dnsName key on line 10. Notice it's still pointing to the existing service discovery name. To avoid changing your applications in client Amazon ECS services, set this to the same name the client application uses by default. In this case, yelb-db.yelb.cloudmap.internal is used.
For more examples, click through the svc JSON files in the sc-update directory to see the Amazon ECS Service Connect configuration for each service.
Step 7: View monitoring metrics for Internal Load Balancer for Amazon ECS Service Connect
Once the migration is complete, navigate to the Amazon ECS Console and verify all the services and tasks are in the RUNNING state. This may take some time as the existing tasks have to be stopped and replaced with new tasks. The new tasks should appear like those in the following example:
Once all services and tasks are in the RUNNING state, go ahead and generate traffic for the application endpoint again using the ./scripts/generate-traffic.sh script and the following command:
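As in Step 2, run the traffic generator from the root of the cloned repo:

```bash
# Re-run the load test against the application endpoint
./scripts/generate-traffic.sh
```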
While the load test is running, keep an eye on the services in your Amazon ECS Cluster the same as you did when you ran the load test earlier.
You should see tasks fail and try to self-heal just as they did before. The following is another example:
However, if you try to access the Yelb App Server API using the application URL + the path /api/getvotes (i.e., http://yelb-service-connect.us-east-2.elb.amazonaws.com/api/getvotes), you won’t see any 500-series errors like in Step 2. This is because of the way Amazon ECS Service Connect handles requests, which no longer relies on DNS hosted zones in Amazon Route 53; there may be increased latency, but you won’t see any dropped requests.
Navigate to the Amazon EC2 Load Balancer Console, like you did in Step 3, and choose the app server's Internal Load Balancer again. Under the Monitoring tab, notice the app server traffic is no longer served by the Internal Load Balancer after the service migration from service discovery to Amazon ECS Service Connect. This is evident from the requests graph, which shows no new traffic. The following is an example:
Clean up
To avoid future charges, clean up the resources created in this blog post. To make it easier, we created a ./scripts/cleanup.sh script for you to use.
Run the following command:
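Assuming you are still in the root of the cloned repo:

```bash
# Tear down the AWS CloudFormation stack and its associated resources
./scripts/cleanup.sh
```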
Note: The cleanup script takes around 20–25 minutes to complete.
Congratulations! You just learned how to migrate from service discovery to the new Amazon ECS Service Connect! To learn more about Amazon ECS Service Connect, check out the Amazon ECS Service Connect: Simplified interservice communication session from re:Invent 2022 and the Amazon ECS Service Connect documentation.