How can I prevent HTTP 504 gateway timeout errors in Amazon Elasticsearch Service?

Last updated: 2019-03-08

How can I resolve 504 gateway timeout errors in Amazon Elasticsearch Service (Amazon ES)?

Short Description

A load balancer sits in front of each Amazon ES domain and distributes incoming traffic to the data nodes. If a request doesn't complete and return a response, successful or not, within the idle timeout period, the load balancer closes the TCP connection to the cluster. This usually results in a 504 gateway timeout error. A 504 error does not necessarily indicate a problem with the cluster. It means only that the request couldn't be completed within the idle timeout period.

Gateway timeout errors usually occur when you send too many requests at the same time, or when you send complex requests. In both cases, the result is the same: Amazon ES can't complete the request within the idle timeout period.

Resolution

Use one or more of the following methods to resolve HTTP 504 gateway timeout errors:

  • Enable slow logs for your Elasticsearch index, and then specify logging thresholds. Slow logs can help you determine if a particular query is taking a long time to complete. If so, tune the query to resolve the 504 error. For more information, see Viewing Amazon Elasticsearch Service Slow Logs.
  • Reduce the amount of data that Amazon ES must search to answer each request, for example by narrowing the time range or targeting fewer indices. This reduces the time that each request takes to complete.
  • Switch to a larger instance type. For more information, see Choosing Instance Types and Testing.
  • Configure exponential backoff and retry mechanisms in your application so that requests that time out are sent again after progressively longer waits, rather than immediately.
  • Use bulk requests instead of individual requests. This reduces the per-request overhead for the cluster.
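
The exponential backoff and retry method in the list above can be sketched roughly as follows. This is a minimal illustration, not an official AWS or Elasticsearch client feature: the names send_with_backoff, GatewayTimeout, and the send callable are hypothetical, and the default delays are arbitrary starting points rather than recommended values. The sketch assumes that send performs one HTTP request against your domain and returns a (status_code, body) pair.

```python
import random
import time


class GatewayTimeout(Exception):
    """Raised when every attempt ended in an HTTP 504 response."""


def send_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it returns a status other than 504.

    send is a hypothetical zero-argument callable that performs a single
    HTTP request and returns (status_code, body). sleep and rng are
    injectable so that the retry logic can be tested without waiting.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 504:
            # Any other status (success or a different error) is returned
            # to the caller unchanged; only gateway timeouts are retried.
            return status, body
        if attempt == max_attempts - 1:
            break  # no attempts left, so don't wait again
        # Exponential backoff: the ceiling doubles on each retry, up to max_delay.
        ceiling = min(max_delay, base_delay * (2 ** attempt))
        # Jitter: wait a random fraction of the ceiling so that many clients
        # that timed out together don't all retry at the same moment.
        sleep(rng() * ceiling)
    raise GatewayTimeout(f"still timing out after {max_attempts} attempts")
```

Because a 504 error means only that the load balancer stopped waiting, the cluster might still complete the original request. Retries are therefore safest for requests that can be repeated without side effects, such as searches or index operations that specify a document ID.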