How do I troubleshoot and fix failing health checks for Application Load Balancers?

Last updated: 2019-06-19

The targets registered to my Application Load Balancer aren't healthy. How do I find out why my targets are failing health checks?

Resolution

To troubleshoot and fix failing health checks for your Application Load Balancer:

1.    Check the health of your target to find the reason code and description of your issue.

2.    Follow the resolution steps below for the error you received.
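
As a rough illustration of step 1, the sketch below pulls the reason code and description out of a target health response. It assumes the response has the shape returned by the Elastic Load Balancing `describe_target_health` API (a `TargetHealthDescriptions` list with `Target` and `TargetHealth` entries). The helper name `summarize_unhealthy` is hypothetical, and the boto3 call is shown only as a comment because it needs credentials and a real target group ARN.

```python
def summarize_unhealthy(response):
    """Return (id, port, state, reason, description) for every target that is not healthy.

    `response` is assumed to look like the output of describe_target_health.
    """
    rows = []
    for desc in response.get("TargetHealthDescriptions", []):
        target = desc.get("Target", {})
        health = desc.get("TargetHealth", {})
        if health.get("State") == "healthy":
            continue  # healthy targets carry no reason code
        rows.append((
            target.get("Id"),
            target.get("Port"),
            health.get("State"),
            health.get("Reason", ""),
            health.get("Description", ""),
        ))
    return rows


# Possible usage with boto3 (assumed to be installed and configured):
#   import boto3
#   client = boto3.client("elbv2")
#   response = client.describe_target_health(TargetGroupArn="<your target group ARN>")
#   for row in summarize_unhealthy(response):
#       print(row)
```

The reason value in each row (for example, Target.Timeout) corresponds to one of the headings below.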

Elb.InitialHealthChecking

Description: Initial health checks in progress.

Resolution: Before a target can receive requests from the load balancer, that target must pass initial health checks. Wait for your target to pass the initial health checks, and then recheck its health status.

Elb.RegistrationInProgress

Description: Target registration is in progress.

Resolution: The load balancer starts routing requests to the target as soon as the registration process completes and the target passes the initial health checks. 

Target.DeregistrationInProgress

Description: Target deregistration is in progress.

Resolution: When you deregister a target, the load balancer waits until in-flight requests are complete. This is known as the deregistration delay. By default, Elastic Load Balancing waits 300 seconds before completing the deregistration process. However, you can customize this value.

If a deregistering target has no in-flight requests and no active connections, then Elastic Load Balancing deregisters the target immediately, without waiting for the deregistration delay to elapse. The initial state of a deregistering target is draining. After the deregistration delay elapses, the deregistration process completes and the state of the target is unused. If the target is part of an Auto Scaling group, then it can be terminated and replaced.
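
One way to customize the deregistration delay is through the target group attribute `deregistration_delay.timeout_seconds`. The sketch below only builds the request arguments. The helper name `deregistration_delay_request` is hypothetical, and the 0 to 3600 second range check reflects the documented limits for this attribute as far as we know, so verify it against the current documentation.

```python
DEREGISTRATION_DELAY_KEY = "deregistration_delay.timeout_seconds"


def deregistration_delay_request(target_group_arn, seconds):
    """Build keyword arguments for a modify_target_group_attributes call."""
    is_int = isinstance(seconds, int) and not isinstance(seconds, bool)
    if not is_int or not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be an integer from 0 to 3600 seconds")
    return {
        "TargetGroupArn": target_group_arn,
        # Attribute values are passed as strings.
        "Attributes": [{"Key": DEREGISTRATION_DELAY_KEY, "Value": str(seconds)}],
    }


# Possible usage with boto3 (assumed to be installed and configured):
#   import boto3
#   kwargs = deregistration_delay_request("<your target group ARN>", 60)
#   boto3.client("elbv2").modify_target_group_attributes(**kwargs)
```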

Target.FailedHealthChecks

Description: The load balancer received an error while establishing a connection to the target, or the target response was malformed.

Resolution:

  • Verify that your application is running. Use the service command to check the status of services on Linux targets. For Windows targets, check the Services tab of the Windows Task Manager. If the service is stopped, start the service. If the service isn't recognized, verify that the required service is installed.
  • Verify that the target is listening for traffic on the health check port. You can use the ss command on Linux targets to verify which ports your server is listening on. For Windows targets, you can use the netstat command.
  • Verify that your application responds correctly to the load balancer's health check requests. The following example shows a typical health check request from the Application Load Balancer, to which your targets must return a valid HTTP response. The Host header value contains the private IP address of the load balancer node, followed by the health check port. The User-Agent header is set to ELB-HealthChecker/2.0. The line terminator for message-header fields is the sequence CRLF, and the header section ends at the first empty line followed by a CRLF. If necessary, add a default virtual host to your web server configuration to receive the health check requests.
GET / HTTP/1.1
Host: 10.0.0.1:80
Connection: close
User-Agent: ELB-HealthChecker/2.0
Accept-Encoding: gzip, compressed
  • The Target Type of your target group determines which network interface on the targets the load balancer sends health checks to. For example, you can register instance IDs, IP addresses, and Lambda functions. If the target type is instance ID, then the load balancer sends health check requests to the primary network interface of the targets. If the target type is IP address, then the load balancer sends health check requests to the network interface associated with the corresponding IP address. If your targets have multiple interfaces attached, then verify that your application is listening on the correct network interface.
  • The ELBSecurityPolicy-2016-08 security policy is used for target connections and HTTPS health checks. Verify that the target provides a server certificate and a key in the format specified in the security policy. Also verify that the target supports one or more matching ciphers and a protocol provided by the load balancer to establish the TLS handshake.
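
To reproduce the checks above from another host in the same network, you can send a request that resembles the load balancer's health check and see what comes back. The following is a minimal sketch using only the Python standard library. The function name `probe_health_check` is hypothetical, it covers plain HTTP only, and the request is not byte-for-byte identical to what the load balancer sends (for example, the Host header is generated by the client library).

```python
import http.client


def probe_health_check(host, port, path="/", timeout=5.0):
    """Send a health-check-like GET request and return the HTTP status code.

    Raises OSError if the connection fails or times out, and
    http.client.HTTPException if the response is malformed.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={
            "Connection": "close",
            "User-Agent": "ELB-HealthChecker/2.0",
            "Accept-Encoding": "gzip, compressed",
        })
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()


# Example (replace with the private IP address and health check port of a target):
#   print(probe_health_check("10.0.0.1", 80))
```

A connection error suggests that the application isn't running or isn't listening on the health check port or interface. A malformed-response error suggests that the application isn't returning a valid HTTP response.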

Target.InvalidState

Description: The target is in the stopped or terminated state.

Resolution: If the target is an EC2 instance, open the Amazon EC2 console and verify that the instance is running. Start the instance if necessary.

Target.IpUnusable

Description: The IP address can't be used as a target because it's in use by a load balancer.

Resolution: When you create a target group, you specify its Target Type. When the target type is IP, don't choose an IP address that's already in use by a load balancer.

Target.NotInUse

Description: The target group isn't used by any load balancer or the target is in an Availability Zone that isn't enabled for its load balancer.

Resolution:

  • Check the target group and verify that it's configured to receive traffic from the load balancer.
  • Verify that the Availability Zone of the target is enabled for the load balancer.

Target.NotRegistered

Description: The target isn't registered to the target group.

Resolution: Verify that the target is registered to the target group.

Target.ResponseCodeMismatch

Description: The health checks didn't return an expected HTTP code.

Resolution:

  • Success codes are the HTTP codes to use when checking for a successful response from a target. You can specify values or ranges of values between 200 and 499. The default value is 200. Check your load balancer health check configuration to verify which success codes it expects to receive. Then, inspect your web server access logs to see whether the expected success codes are being returned. Modify the success code value if necessary.
  • Verify that the ping path is valid. The ping path is the destination on the targets for health checks. Be sure to specify a valid URI (/path?query). The default is /. Modify the ping path value if necessary.
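
When comparing access log entries against the configured success codes, it can help to make the matching rule explicit. The sketch below assumes the success codes are written as a single value ("200"), a comma-separated list ("200,202"), or a range ("200-299"). The function name `matches_success_codes` is hypothetical, and it illustrates the rule rather than the load balancer's actual implementation.

```python
def matches_success_codes(matcher, status):
    """Return True if `status` is covered by a success code string such as "200,202" or "200-299"."""
    for part in matcher.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            if int(low) <= status <= int(high):
                return True
        elif int(part) == status:
            return True
    return False


# Example: a redirect returned by the ping path fails the default matcher.
#   matches_success_codes("200", 301)      # False
#   matches_success_codes("200-399", 301)  # True
```

A common cause of this error is a ping path that redirects (for example, to a login page or an HTTPS URL) while the target group still expects only 200.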

Target.Timeout

Description: Request timed out.

Resolution: If you can connect, then the target page might not respond before the health check timeout period. Most web servers, such as nginx and IIS, let you log how long the server takes to respond. If your health check requests take longer than the configured timeout, you can:

If you can't connect:

