How do I troubleshoot 503 errors returned while using Classic Load Balancer?
Last updated: 2022-08-25
I'm seeing HTTP 503 errors in Classic Load Balancer access logs, CloudWatch metrics, or when hitting the load balancer's DNS name in the browser or from my clients. How do I fix this?
Make sure that you registered backend instances in every Availability Zone that your Classic Load Balancer is configured to respond in. Make sure that the registered backend instances aren't failing health checks, and that they’re sized appropriately to handle the load your application requires.
To see the number of healthy backend instances behind your load balancer, check the HealthyHostCount and UnHealthyHostCount metrics in CloudWatch. If the CloudWatch metrics indicate that you have no healthy hosts, then you can troubleshoot the issue by checking the following:
Make sure that your backend instances can respond to health checks
If the backend instances are running, but the UnHealthyHostCount metric indicates that the instances aren't healthy, verify that the application can respond to health check requests. For HTTP/HTTPS health checks, make sure that your load balancer receives a 200 response code from the back end. For layer 4 (TCP) health checks, the load balancer marks the instance as healthy if the instance successfully completes a TCP handshake. For instructions, see Troubleshoot a Classic Load Balancer: Health checks.
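The two health check behaviors described above can be approximated with a short probe. This is only a sketch that uses the Python standard library: the function names, the default path, and the timeout are illustrative assumptions, and a probe run from your own host only approximates what the load balancer itself sees (network rules for the load balancer may differ).

```python
import http.client
import socket


def http_health_check(host, port, path="/", timeout=5.0):
    """Return True only if GET `path` answers with HTTP 200 within `timeout` seconds."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and malformed responses all count as failures.
        return False
    finally:
        conn.close()


def tcp_health_check(host, port, timeout=5.0):
    """Return True if a TCP handshake with host:port completes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, http_health_check("10.0.1.25", 80, "/health") would test a hypothetical instance address and health check path. Substitute the ping port and ping path configured on your load balancer.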
Make sure that your load balancer and backend instances can handle the load
Check that your load balancer and backend instances can handle the CPU usage, memory, disk, and number of connections that your application requires.
For example, check the SpilloverCount and SurgeQueueLength CloudWatch metrics. If SurgeQueueLength is at or near the maximum of 1,024 queued requests, or SpilloverCount is non-zero, then the back end can’t serve requests as fast as they arrive, or can’t serve requests at all.
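As a rough sketch of this check, the Python below pulls a Classic Load Balancer metric from CloudWatch and interprets the surge queue and spillover values. It assumes the boto3 SDK with configured credentials. The helper names and the near_ratio cutoff for "near the maximum" are illustrative assumptions, not AWS-defined values.

```python
from datetime import datetime, timedelta, timezone

SURGE_QUEUE_MAX = 1024  # maximum queued requests, as stated above


def fetch_datapoints(load_balancer_name, metric_name, statistic, minutes=60):
    """Fetch one AWS/ELB metric at 1-minute granularity (needs boto3 and credentials)."""
    import boto3  # imported lazily so the pure helper below works without the SDK

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/ELB",
        MetricName=metric_name,
        Dimensions=[{"Name": "LoadBalancerName", "Value": load_balancer_name}],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=60,
        Statistics=[statistic],
    )
    return response["Datapoints"]


def assess_backend_capacity(surge_queue_max, spillover_sum, near_ratio=0.9):
    """Interpret SurgeQueueLength (Maximum) and SpilloverCount (Sum) for one period."""
    findings = []
    if spillover_sum > 0:
        findings.append("requests rejected: SpilloverCount is non-zero")
    if surge_queue_max >= SURGE_QUEUE_MAX * near_ratio:
        findings.append("surge queue at or near its 1,024-request limit")
    return findings
```

A possible use is to call fetch_datapoints("my-classic-lb", "SurgeQueueLength", "Maximum") and fetch_datapoints("my-classic-lb", "SpilloverCount", "Sum"), where my-classic-lb is a placeholder name, and pass the largest Maximum and the total Sum to assess_backend_capacity. The same fetch helper also works for HealthyHostCount and UnHealthyHostCount.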
Also check the CPUUtilization CloudWatch metric for your backend instances. If CPU utilization spikes to 100% or stays consistently high over long periods, then consider adding more backend instances or resizing the current instances to a larger size. For instructions on checking other values, such as memory and disk usage, check the instance vendor’s documentation.
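The CPU guidance above can be expressed as a small classifier over CPUUtilization samples (for EC2 instances, this metric is published in the AWS/EC2 namespace with the InstanceId dimension). This is a sketch only: the function name and the high_at and high_fraction thresholds for "consistently high" are assumptions to tune for your workload.

```python
def assess_cpu(samples, spike_at=100.0, high_at=80.0, high_fraction=0.8):
    """Classify a series of CPUUtilization percentages (for example, per-period maximums).

    `high_at` and `high_fraction` are assumed thresholds, not AWS-defined values.
    """
    if not samples:
        return "no data"
    high_share = sum(1 for s in samples if s >= high_at) / len(samples)
    if high_share >= high_fraction:
        return "consistently high"  # most samples sit at or above `high_at`
    if max(samples) >= spike_at:
        return "spiking"  # at least one sample reached `spike_at`
    return "ok"
```

Either a "spiking" or a "consistently high" result suggests adding or resizing instances, as described above.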