One or more instances behind my Classic Load Balancer are failing health checks. What are some potential causes for this, and how can I fix it?

Instances behind a Classic Load Balancer usually fail health checks for one or more of the following reasons:

  • The security groups attached to the load balancer or the instances, or the network ACLs for their subnets, do not allow inbound or outbound traffic on the health check port (usually port 80), or an OS-level firewall on the instance blocks traffic on that port (for example, iptables on instances running Linux).
  • The health check target page is not configured on the instance.
  • The health check requests from your load balancer to your EC2 instance are timing out or failing intermittently due to causes such as misconfigured timeout settings, application or database level bottlenecks, or significant load on the backend instances.

To troubleshoot and fix issues with instances behind a load balancer, check the following:

Make sure security groups allow traffic on the health check port

Make sure that traffic on the health check port (usually port 80) is allowed by the security groups associated with both the load balancer and the backend instance, and by the network ACLs that apply to them. Also make sure that the port is allowed by any OS-level firewall on the instance (for example, iptables on EC2 instances running Linux).

Finally, make sure that the application running on the EC2 instance is listening on the health check port. You can use the netstat command (for example, netstat -tln on Linux) to verify that your application is listening.
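As a complement to netstat, the following is a minimal Python sketch that tests whether a TCP connection to the health check port can be opened at all. The function name port_is_listening and its defaults are illustrative, not part of any AWS tool. Run it from a host that is allowed to reach the instance, such as another instance in the same VPC.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, port_is_listening("10.0.0.12", 80) with the private IP address of your instance (10.0.0.12 is a placeholder). A False result does not distinguish a blocked port from an application that is not listening, so check both the firewall rules and the application.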

Verify that the target page on the instances is configured correctly

If you have configured your health check to use HTTP or HTTPS as the ping protocol, use a command similar to the following to test the connection between the load balancer and registered instances:

curl -s -k -o /dev/null -v http[s]://private-IP-address-of-the-instance:port/health-check-target-page

Note: If you are running this command on a Windows instance, use NUL instead of /dev/null.

If the response status is not 200, check that the target page is configured on the instance. If it is not, create a target page (for example, index.html) on each registered instance and specify its path as the ping path in the load balancer's health check configuration.

Verify that the target page exists on the registered instance by listing the web server's document root. The following command assumes /var/www/html; adjust the path for your web server:

ls -alh /var/www/html

Also, make sure that any redirection rules allow health check traffic.
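The checks in this section can be sketched in Python as a probe that requests the target page once and reports the status code without following redirects, so a 301 or 302 from a redirection rule shows up as a non-200 result. This is an illustration under assumptions, not the load balancer's implementation: the names probe_health_check and is_healthy are hypothetical, and the HTTPS branch skips certificate verification in the same spirit as curl -k.

```python
import http.client
import ssl


def probe_health_check(host, port, path, use_https=False, timeout=5.0):
    """Send one GET to the health check target and return (status, location).

    Redirects are not followed, so a 301/302 is reported as-is together
    with its Location header (None when the header is absent).
    """
    if use_https:
        # Skip certificate verification, similar to curl -k.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        conn = http.client.HTTPSConnection(
            host, port, timeout=timeout, context=context
        )
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body before closing
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_healthy(status):
    """Only a 200 status counts as a passing HTTP/HTTPS check here."""
    return status == 200
```

For example, probe_health_check("10.0.0.12", 80, "/index.html") might return (302, "/login"), which would indicate that a redirection rule is intercepting the health check request. The address and paths are placeholders.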

Make sure that your instances can respond to health check traffic within your set timeouts

If curl returns a 200 OK response, but your instance still fails health checks, make sure that the response timeout settings are acceptable for your application.

If you use an HTTP or HTTPS-based health check, overutilized resources at the application or database level might cause the target page to respond after the timeout expires. Consider using a simpler health check target page, or adjusting the health check response timeout and interval settings, until your database or application can be scaled correctly.
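To see whether slow responses are the problem, one option is to time several requests to the target page and compare each one against your configured response timeout. The sketch below assumes a plain HTTP target. The function names and the sample count are illustrative. The timeout argument should be the response timeout, in seconds, from your own health check configuration.

```python
import http.client
import time


def time_health_check(host, port, path, timeout):
    """Return (status, seconds) for one GET; status is None on error or timeout."""
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status, time.monotonic() - start
    except (OSError, http.client.HTTPException):
        # Socket timeouts, refused connections, and malformed responses.
        return None, time.monotonic() - start
    finally:
        conn.close()


def failing_samples(host, port, path, timeout, samples=5):
    """Count samples that are non-200, errored, or slower than timeout."""
    failures = 0
    for _ in range(samples):
        status, elapsed = time_health_check(host, port, path, timeout)
        if status != 200 or elapsed > timeout:
            failures += 1
    return failures
```

If failing_samples reports failures only during busy periods, the target page is probably waiting on an overloaded application or database rather than being misconfigured.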

To determine whether an instance is under significant load and is taking longer than your configured response timeout to respond, check the CloudWatch monitoring graphs for your backend instances. Look for high CPU utilization, and also check the utilization of other application resources, such as memory, disks, and connection limits.

If your backend instances are overutilized, consider scaling up by resizing the instances, or scaling out by adding more instances or enabling Auto Scaling.

Troubleshoot intermittent health check failures related to excessive load

Ensure that your infrastructure and application are configured to handle excessive load, and that socket exhaustion isn't preventing a new TCP handshake. Also, verify the integrity of the disks and file systems of your backend instances.
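One rough way to look for socket exhaustion on a Linux instance is to count sockets by TCP state. The sketch below parses the text of /proc/net/tcp, where the fourth column holds the state as a hexadecimal code. It assumes a Linux instance and covers IPv4 sockets only (IPv6 sockets are listed in /proc/net/tcp6). The function name count_tcp_states is illustrative.

```python
from collections import Counter

# TCP state codes from the "st" column of /proc/net/tcp (Linux).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            code = fields[3].upper()
            # Unknown codes are counted under their raw hex value.
            counts[TCP_STATES.get(code, code)] += 1
    return counts
```

On the instance, you might call count_tcp_states(open("/proc/net/tcp").read()). A very large TIME_WAIT or CLOSE_WAIT count during the failure window can suggest that connections are not being released fast enough, though the threshold that matters depends on your kernel settings and workload.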



Published: 2017-01-10