How do I resolve 504 HTTP errors in Amazon EKS?

Last updated: 2021-12-03

I get HTTP 504 (Gateway timeout) errors when I connect to a Kubernetes Service that runs in my Amazon Elastic Kubernetes Service (Amazon EKS) cluster.

Short description

You get HTTP 504 errors when you connect to a Kubernetes Service that runs behind a load balancer in an Amazon EKS cluster.

To resolve HTTP 503 errors, see How do I resolve HTTP 503 (Service unavailable) errors when I access a Kubernetes Service in an Amazon EKS cluster?

To resolve HTTP 504 errors, complete the following troubleshooting steps.


Verify that your load balancer’s idle timeout is set correctly

An HTTP 504 error indicates that the load balancer established a connection to the target, but the target didn't respond before the idle timeout period elapsed. By default, the idle timeout for both the Classic Load Balancer and the Application Load Balancer is 60 seconds.

1.    Review the Amazon CloudWatch metrics for your Classic Load Balancer or Application Load Balancer.

Note: If the latency data points are equal to your currently configured load balancer timeout value and there are data points in the HTTPCode_ELB_5XX metric, then at least one request has timed out.

2.    Modify the idle timeout for your load balancer so that the HTTP request can complete within the idle timeout period, or configure your application to respond quicker.

To modify the idle timeout for your Classic Load Balancer, update the service definition to include the service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout annotation.
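For example, a Service definition with the idle timeout raised to 120 seconds might look like the following. The Service name, selector, ports, and timeout value are placeholders; choose a timeout longer than your application's slowest expected response:

```yaml
# Example Service definition; the name, selector, ports, and timeout
# value are placeholders for your own configuration.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Idle timeout for the Classic Load Balancer, in seconds
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "120"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```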

To modify the idle timeout for your Application Load Balancer, update the Ingress definition to include the alb.ingress.kubernetes.io/load-balancer-attributes annotation with the idle_timeout.timeout_seconds attribute.
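For example, an Ingress definition for the AWS Load Balancer Controller with the idle timeout raised to 120 seconds might look like the following. The Ingress name, backend Service, and timeout value are placeholders:

```yaml
# Example Ingress for the AWS Load Balancer Controller; the names and
# timeout value are placeholders for your own configuration.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    # Idle timeout for the Application Load Balancer, in seconds
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=120
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80
```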

Verify that your backend instances have no backend connection errors

If a backend instance closes a TCP connection to the load balancer before the load balancer has reached its idle timeout value, then the load balancer fails to fulfill the request.

1.    Review the CloudWatch BackendConnectionErrors metrics for your Classic Load Balancer and the target group's TargetConnectionErrorCount for your Application Load Balancer.

2.    Turn on keep-alive settings on your backend worker nodes or pods, and set the keep-alive timeout to a value greater than the load balancer's idle timeout.

To check whether the keep-alive timeout is less than the idle timeout, verify the current keep-alive values in your pods or worker nodes. See the following examples for pods and nodes.

For pods:

$ kubectl exec your-pod-name -- sysctl \
    net.ipv4.tcp_keepalive_time \
    net.ipv4.tcp_keepalive_intvl \
    net.ipv4.tcp_keepalive_probes

For nodes:

$ sysctl \
    net.ipv4.tcp_keepalive_time \
    net.ipv4.tcp_keepalive_intvl \
    net.ipv4.tcp_keepalive_probes
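To raise the keep-alive values inside a pod, one option is to set the kernel parameters through the pod's securityContext, as in the following sketch. Note that these sysctls are treated as unsafe by Kubernetes, so the kubelet must explicitly allow them (for example, with --allowed-unsafe-sysctls); the pod name, image, and values are placeholders:

```yaml
# Sketch only: sets TCP keep-alive parameters for this pod's network
# namespace. These are unsafe sysctls, so the kubelet must allowlist
# them (e.g. --allowed-unsafe-sysctls='net.ipv4.tcp_keepalive*').
apiVersion: v1
kind: Pod
metadata:
  name: keepalive-example     # placeholder name
spec:
  securityContext:
    sysctls:
      - name: net.ipv4.tcp_keepalive_time
        value: "120"          # greater than the load balancer's idle timeout
      - name: net.ipv4.tcp_keepalive_intvl
        value: "30"
  containers:
    - name: app
      image: nginx            # placeholder image
```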

Verify that your backend targets can receive traffic from the load balancer over the ephemeral port range

An HTTP 504 error can also occur when the network access control list (ACL) for the subnet doesn't allow traffic between the targets and the load balancer nodes on the ephemeral ports (1024-65535).

You must configure security groups and network ACLs to allow data to move between the load balancer and the backend targets. For example, depending on the load balancer type, these targets can be IP addresses or instances.

To configure security group access over the ephemeral port range, add a rule to the security group of your nodes and pods that allows traffic from the security group of your load balancer. For more information, see Security groups for your VPC and Add and delete rules.
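As an illustration, a CloudFormation fragment for such an ingress rule might look like the following. The resource name and the security group logical names are assumptions; substitute references to your own security groups:

```yaml
# Hypothetical CloudFormation fragment: allows the load balancer's
# security group to reach the node/pod security group over the
# ephemeral port range.
NodeIngressFromLoadBalancer:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref NodeSecurityGroup                        # your nodes'/pods' security group (placeholder)
    IpProtocol: tcp
    FromPort: 1024
    ToPort: 65535
    SourceSecurityGroupId: !Ref LoadBalancerSecurityGroup  # the load balancer's security group (placeholder)
```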