How can I get my Amazon ECS tasks running using the Amazon EC2 launch type to pass the Application Load Balancer health check in Amazon ECS?


An Application Load Balancer health check for an Amazon Elastic Compute Cloud (Amazon EC2) instance in Amazon Elastic Container Service (Amazon ECS) returns an unhealthy status. I want my EC2 instance to pass the health check.

Short description

When your Amazon ECS task fails the load balancer health check, you receive one of the following errors from your Amazon ECS service event message:

"(service AWS-service) (port 8080) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed with these codes: [502 or 504]) or (request timeout)"
"(service AWS-Service) (port 8080) is unhealthy in target-group tf-20190411170 due to (reason Health checks failed)"
"(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

You might also receive the following error from your Amazon ECS task console:

"Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789)"

If you get the error "(service AWS-Service) (task c13b4cb40f1f4fe4a2971f76ae5a47ad) failed container health checks," then see How do I troubleshoot the container health check failures for Amazon ECS tasks?

Note: An Amazon ECS task can return the unhealthy status for many reasons. If the following steps don't resolve your issue, then see Troubleshooting service load balancers. To find out why your ECS task was stopped, see Checking stopped tasks for errors.
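The event messages and stopped-task reasons described above can also be read with the AWS CLI. The following is a minimal sketch, not a definitive procedure: the function names are hypothetical helpers, and the cluster and service names that you pass in are placeholders for your own resources.

```shell
# Hypothetical helper: print the 10 most recent event messages of a service.
# $1 = cluster name, $2 = service name (both are your own values)
list_service_events() {
  aws ecs describe-services \
    --cluster "$1" \
    --services "$2" \
    --query 'services[0].events[:10].message' \
    --output text
}

# Hypothetical helper: print the ARN and stopped reason of stopped tasks.
# $1 = cluster name
list_stopped_reasons() {
  task_arns=$(aws ecs list-tasks --cluster "$1" --desired-status STOPPED \
    --query 'taskArns' --output text)
  [ -n "$task_arns" ] || { echo "No stopped tasks found"; return 0; }
  # Word splitting of the ARN list is intended here.
  aws ecs describe-tasks --cluster "$1" --tasks $task_arns \
    --query 'tasks[].[taskArn,stoppedReason]' --output text
}
```

For example, `list_service_events my-cluster my-service` prints the recent events, where `my-cluster` and `my-service` are assumed names.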

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

To troubleshoot your load balancer health check issues on your Amazon ECS task and pass the Application Load Balancer health check, check the following:

Connectivity between your load balancer and Amazon ECS task
Health check settings of your target group
Status and configuration of the application in your ECS container
Status of the container instance

Check the connectivity between your load balancer and Amazon ECS task

To make sure that your load balancer is allowed to perform health checks on your Amazon ECS tasks, review the following information.

The security groups attached to your load balancer and container instance or the ECS task elastic network interface for awsvpc network mode are configured correctly

It's a best practice to configure different security groups for your load balancer and your container instance or task elastic network interface. With this approach, you can allow all traffic between your load balancer and your container instances or task elastic network interfaces. You can also enable your container instances to accept traffic on the port that's specified for the task.

Confirm that the security group associated with your load balancer allows egress traffic to your container instances or task elastic network interface on the registered port. Confirm that the same is true for the health check port associated with your container instance, if applicable.
Confirm that the security group associated with your container instance or task elastic network interface allows all ingress traffic on the task host port range from the security group associated with your load balancer. To check the security group associated with your load balancer, see Security groups for your Application Load Balancer.

Important: When you use dynamic port mapping, the service is exposed on the dynamic port (typically ports 32768 - 65535) rather than on the host port. In this case, confirm that your container instance security group reflects the ephemeral port range in the ingress rules for the load balancer as a source.
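The security group checks above can be sketched with the AWS CLI as follows. The function names are hypothetical helpers, and the security group IDs that you pass in are placeholders. The port range in the usage example assumes dynamic port mapping with the range mentioned above.

```shell
# Hypothetical helper: list ingress rules on the container instance (or task
# ENI) security group that use the load balancer security group as a source.
# $1 = instance or task security group ID, $2 = load balancer security group ID
show_ingress_from_lb() {
  aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query "SecurityGroups[0].IpPermissions[?UserIdGroupPairs[?GroupId=='$2']].[IpProtocol,FromPort,ToPort]" \
    --output table
}

# Hypothetical helper: allow TCP traffic from the load balancer security group.
# $1 = instance or task security group ID, $2 = load balancer security group ID
# $3 = a single port (for example 8080) or a range (for example 32768-65535)
allow_lb_to_ports() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "$3" \
    --source-group "$2"
}
```

For example, `allow_lb_to_ports sg-0bbbbbbbbbbbbbbbb sg-0aaaaaaaaaaaaaaaa 32768-65535` would open the ephemeral range to the load balancer only. Both IDs are made-up values.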

Your load balancer is configured in the same Availability Zone as your container instance or ECS task elastic network interface for awsvpc network mode

When you enable an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. If you register targets in an Availability Zone, but don't turn on the Availability Zone, then these registered targets don't receive traffic. For more information, see Availability Zones and load balancer nodes.

To find out the Availability Zones that your load balancer is configured for, complete the following steps:

Open the Amazon EC2 console.
In the navigation pane, under Load Balancing, choose Load Balancers.
Select the load balancer that you're using for your Amazon ECS service.
On the Description tab, you can view the Availability Zones under the Availability Zones field.

Note: For an Application Load Balancer, you can enable or disable Availability Zones at any time. For a Network Load Balancer, you can't disable Availability Zones after you enable them, but you can enable additional Availability Zones.

If you use Application Load Balancers, then cross-zone load balancing is always turned on. If you use Network Load Balancers, then cross-zone load balancing is turned off, by default. After you create the Network Load Balancer, you can turn cross-zone load balancing on or off at any time. For more information, see How Elastic Load Balancing works.
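As an alternative to the console, the enabled Availability Zones of a load balancer can be listed with the AWS CLI. This is a minimal sketch: `lb_zones` is a hypothetical helper name, and the load balancer name is a placeholder.

```shell
# Hypothetical helper: print the Availability Zones enabled for a load balancer.
# $1 = load balancer name (your own value)
lb_zones() {
  aws elbv2 describe-load-balancers \
    --names "$1" \
    --query 'LoadBalancers[0].AvailabilityZones[].ZoneName' \
    --output text
}
```

For example, `lb_zones my-alb` prints the zone names on one line, where `my-alb` is an assumed name.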

To find out the Availability Zones that your container instances are configured for, complete the following steps:

Open the Amazon EC2 console.
In the navigation pane, under Auto Scaling, choose Auto Scaling Groups.
Select the container instance Auto Scaling group that's associated with your cluster.
On the Details tab, under Network, verify that the Availability Zones listed match the Availability Zones listed for your load balancer.
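The comparison in the last step can also be scripted. The following sketch lists the Availability Zones of an Auto Scaling group and reports any zone that isn't in a given list of load balancer zones. Both function names are hypothetical helpers, and the group name and zone names are placeholders.

```shell
# Hypothetical helper: print the Availability Zones of an Auto Scaling group.
# $1 = Auto Scaling group name (your own value)
asg_zones() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].AvailabilityZones' \
    --output text
}

# Hypothetical helper: print each zone in the second list that is missing
# from the first list.
# $1 = load balancer zones, $2 = container instance zones (whitespace-separated)
zones_missing_from_lb() {
  # Collapse tabs and newlines to single spaces (CLI text output uses tabs).
  lb_list=" $(echo $1) "
  for zone in $2; do
    case "$lb_list" in
      *" $zone "*) ;;          # zone is enabled on the load balancer
      *) echo "$zone" ;;       # targets in this zone receive no traffic
    esac
  done
}
```

For example, `zones_missing_from_lb "us-east-1a us-east-1b" "$(asg_zones my-asg)"` prints nothing when every container instance zone is enabled on the load balancer. The zone names and `my-asg` are assumed values.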

To modify the Availability Zones of your cluster, open the AWS CloudFormation console, choose the CloudFormation stack for your cluster, and update the subnets configuration.

To find out the Availability Zones that your task elastic network interface for awsvpc network mode is configured for, complete the following steps:

Open the Amazon ECS console.
In the navigation pane, choose Clusters, and then select the cluster that contains your service.
On the Services tab of your cluster's page, in the Service Name column, select the service that you want to check.
Choose Details, and then choose Allowed subnets to view the subnets that are enabled for the service.
You can view the subnets in the Amazon VPC console.
Verify that the Availability Zones of the subnets match the Availability Zones listed for your load balancer.

Note: You can't change the subnet configuration of an Amazon ECS service from the Amazon ECS console. You can use the AWS CLI update-service command to change the subnet configuration.
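The following sketch shows one way to look up the subnets of a service and to change them with the update-service command. The function names are hypothetical helpers, and all cluster, service, subnet, and security group values are placeholders. Changing the network configuration is expected to start a new deployment of the service, so treat this as an illustration to adapt.

```shell
# Hypothetical helper: print each subnet of an awsvpc service with its
# Availability Zone.
# $1 = cluster name, $2 = service name
service_subnet_zones() {
  subnets=$(aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].networkConfiguration.awsvpcConfiguration.subnets' \
    --output text)
  # Word splitting of the subnet list is intended here.
  aws ec2 describe-subnets --subnet-ids $subnets \
    --query 'Subnets[].[SubnetId,AvailabilityZone]' --output text
}

# Hypothetical helper: replace the subnets and security groups of a service.
# $1 = cluster name, $2 = service name
# $3 = comma-separated subnet IDs, $4 = comma-separated security group IDs
set_service_subnets() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --network-configuration \
    "awsvpcConfiguration={subnets=[$3],securityGroups=[$4]}"
}
```

For example, `set_service_subnets my-cluster my-service subnet-0aaa,subnet-0bbb sg-0ccc` uses made-up identifiers.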

The network access control lists (network ACLs) associated with the subnets of your load balancer and ECS container instances or ECS task elastic network interface for awsvpc network mode are configured correctly

The subnets for your load balancer and your container instance or task elastic network interface might be different. To make sure that traffic is allowed between these subnets, check the following:

Be sure that the network ACL associated with the subnets for your load balancer allows ingress traffic on the ephemeral ports (1024 - 65535) and listener port. Verify that the network ACL also allows egress traffic on the health check and ephemeral ports.
Be sure that the network ACL associated with the subnets for your container instance or task elastic network interface for awsvpc mode allows ingress traffic on the health check port. Verify that the network ACL allows egress traffic on the ephemeral ports.

For more information about network ACLs, see Work with network ACLs.
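To review the rules described above, the network ACL entries for a subnet can be listed with the AWS CLI. This is a minimal sketch: `show_subnet_nacl_entries` is a hypothetical helper name, and the subnet ID is a placeholder.

```shell
# Hypothetical helper: list the entries of the network ACL that is associated
# with a subnet. The Egress column is True for outbound rules.
# $1 = subnet ID (your own value)
show_subnet_nacl_entries() {
  aws ec2 describe-network-acls \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'NetworkAcls[].Entries[].[RuleNumber,Egress,Protocol,RuleAction,CidrBlock,PortRange.From,PortRange.To]' \
    --output table
}
```

Run the helper once for a load balancer subnet and once for a container instance or task subnet, and then compare the output against the ingress and egress requirements listed above.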

Check the health check settings of your target group

To be sure that the health check settings for your target group are configured correctly, complete the following steps:

Open the Amazon EC2 console.
In the navigation pane, under Load Balancing, choose Target Groups.
Select your target group.
Important: Use a new target group. Avoid adding targets to the target group manually, because Amazon ECS automatically registers and de-registers containers with the target group.
On the Health checks tab, review the following settings:
Check that the Port and Path fields are configured correctly. If the Port field isn't configured correctly, then your load balancer might de-register the container.
For Port, choose traffic port.
Note: If you choose Override, then confirm that the port specified matches the task host port.
For Timeout, be sure that the response timeout value is set correctly.
Note: The response timeout is the amount of time that your container has to return a response to the health check ping. If this value is lower than the amount of time required for a response, then the health check fails.
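The same settings can be inspected from the AWS CLI. The following sketch prints the health check configuration and the current target health, and changes the timeout. The function names are hypothetical helpers, and the target group ARN that you pass in is your own.

```shell
# Hypothetical helper: print the health check settings of a target group.
# $1 = target group ARN
show_health_check_settings() {
  aws elbv2 describe-target-groups --target-group-arns "$1" \
    --query 'TargetGroups[0].{Protocol:HealthCheckProtocol,Port:HealthCheckPort,Path:HealthCheckPath,IntervalSeconds:HealthCheckIntervalSeconds,TimeoutSeconds:HealthCheckTimeoutSeconds,HealthyThreshold:HealthyThresholdCount,UnhealthyThreshold:UnhealthyThresholdCount,Matcher:Matcher.HttpCode}' \
    --output table
}

# Hypothetical helper: print the state and failure reason of each target.
# $1 = target group ARN
show_target_health() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,Target.Port,TargetHealth.State,TargetHealth.Reason,TargetHealth.Description]' \
    --output table
}

# Hypothetical helper: set the health check response timeout.
# $1 = target group ARN, $2 = timeout in seconds
set_health_check_timeout() {
  aws elbv2 modify-target-group --target-group-arn "$1" \
    --health-check-timeout-seconds "$2"
}
```

A Port value of traffic-port in the output corresponds to the traffic port option in the console.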

Check the status and configuration of the application in your ECS container

Confirm that the application in your ECS container responds to your load balancer health check

To make sure that the application in your ECS container responds to your load balancer health check correctly, complete the following tasks:

Check that the ping port and the health check path for your target group are configured correctly.
Monitor the CPU and memory utilization metrics for the ECS service. For example, high CPU can make your application unresponsive and cause a 502 error or timeout.
Define a minimum health check grace period. This setting instructs the service scheduler to ignore the Elastic Load Balancing health checks for a predefined time period after a task has been instantiated. Your Amazon ECS task might require a longer health check grace period to register with a Network Load Balancer.
Check your application logs for application errors. For more information, see Viewing awslogs container logs in CloudWatch Logs.
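Two of the tasks above, setting the grace period and checking CPU utilization, can be sketched with the AWS CLI. The function names are hypothetical helpers, and the cluster name, service name, grace period, and time window are placeholders to replace with your own values.

```shell
# Hypothetical helper: set the health check grace period of a service.
# $1 = cluster name, $2 = service name, $3 = grace period in seconds
set_health_check_grace_period() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --health-check-grace-period-seconds "$3"
}

# Hypothetical helper: print 5-minute average and maximum CPU utilization.
# $1 = cluster name, $2 = service name
# $3 = start time, $4 = end time (ISO 8601, for example 2024-01-01T00:00:00Z)
service_cpu_stats() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ECS --metric-name CPUUtilization \
    --dimensions "Name=ClusterName,Value=$1" "Name=ServiceName,Value=$2" \
    --start-time "$3" --end-time "$4" \
    --period 300 --statistics Average Maximum \
    --query 'Datapoints[].[Timestamp,Average,Maximum]' --output text
}
```

For memory, the same call with `--metric-name MemoryUtilization` applies. A grace period such as 120 seconds is only an illustrative value; choose one that covers your application's startup time.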

Confirm that the application in your ECS container returns the correct response code

When the load balancer sends an HTTP GET request to the health check path, the application in your ECS container is expected to return the default 200 OK response code.

Note: If you use an Application Load Balancer, you can update the Matcher setting to a response code other than 200. For more information, see Health checks for your target groups.

Use SSH to connect to your container instance.
(Optional) Install curl with the command appropriate for your system.
For Amazon Linux and other RPM-based distributions, run the following command:
```
sudo yum -y install curl
```
For Debian-based systems (such as Ubuntu), run the following command:
```
sudo apt-get install -y curl
```
To get the container ID, run the following command:
```
docker ps
```
Note: The port for the local listener is displayed in the command output under PORTS, at the end of the port mapping. For example, in a mapping such as 0.0.0.0:32768->8080/tcp, the local listener port is 8080.
To get the IP address of the container, run the docker inspect command:
```
IPADDR=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' 112233445566)
```
Note: The IP address of the container is saved in IPADDR. Use this command only if you use the BRIDGE network mode. Replace 112233445566 with the ID number of the container.
If you use awsvpc network mode, then use the task IP address assigned to the task elastic network interface. If you use the HOST network mode, then use the IP address of the host that the task is exposed through.
To get the status code, run a curl command that includes IPADDR and the port of the local listener. For example, if you run the curl command on a container listening on port 8080 with the health check path of /health, the command must return the response code 200 OK:
```
curl -I http://${IPADDR}:8080/health
```
If you receive a non-HTTP error message, then your application isn't listening for HTTP traffic. If you receive an HTTP status code that's different from what you specified in the Matcher setting, then your application is listening for HTTP traffic, but isn't returning a status code for a healthy target.
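The comparison between the returned status code and the Matcher setting can be sketched as a small script. Both function names are hypothetical helpers. The matcher formats handled here (a single code, a comma-separated list such as 200,202, or a range such as 200-299) are assumed to mirror the values that an Application Load Balancer accepts.

```shell
# Hypothetical helper: succeed if an HTTP status code satisfies a matcher.
# $1 = status code, $2 = matcher such as "200", "200,202", or "200-299"
code_matches() {
  code="$1"
  # Split the matcher on commas, then test each value or range.
  for part in $(echo "$2" | tr ',' ' '); do
    case "$part" in
      *-*)
        low="${part%-*}"; high="${part#*-}"
        if [ "$code" -ge "$low" ] && [ "$code" -le "$high" ]; then return 0; fi
        ;;
      *)
        if [ "$code" -eq "$part" ]; then return 0; fi
        ;;
    esac
  done
  return 1
}

# Hypothetical helper: request a health check URL and compare the result.
# $1 = URL, $2 = matcher (defaults to 200)
check_health_endpoint() {
  matcher="${2:-200}"
  # curl prints 000 as the status code when no HTTP response is received.
  status=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$1")
  if code_matches "$status" "$matcher"; then
    echo "healthy: HTTP $status"
  else
    echo "unhealthy: HTTP $status (expected $matcher)"
    return 1
  fi
}
```

For example, `check_health_endpoint "http://${IPADDR}:8080/health" 200` uses the port and path from the example above. The 5-second limit passed to curl is an arbitrary choice and isn't tied to the target group timeout.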

Check the status of your container instance

Suppose that you get the following event message from your Amazon ECS service events:

"(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

Check the status of your container instance by viewing the status check on the Amazon EC2 console. If your instance fails the system status checks, then try stopping and starting your instance.
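The status checks can also be read from the AWS CLI, and the stop and start can be scripted. This is a sketch under assumptions: the function names are hypothetical helpers and the instance ID is a placeholder. If the instance belongs to an Auto Scaling group, the group might replace a stopped instance, so check your group's settings before you use the second helper.

```shell
# Hypothetical helper: print the instance state and both status checks.
# $1 = instance ID (your own value)
show_instance_status() {
  aws ec2 describe-instance-status --instance-ids "$1" --include-all-instances \
    --query 'InstanceStatuses[0].{State:InstanceState.Name,SystemStatus:SystemStatus.Status,InstanceStatus:InstanceStatus.Status}' \
    --output table
}

# Hypothetical helper: stop the instance, wait until it is stopped, and then
# start it again. Each step runs only if the previous step succeeded.
# $1 = instance ID
stop_and_start_instance() {
  aws ec2 stop-instances --instance-ids "$1" &&
    aws ec2 wait instance-stopped --instance-ids "$1" &&
    aws ec2 start-instances --instance-ids "$1"
}
```

A SystemStatus value of impaired corresponds to a failed system status check in the console.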