How do I resolve an HTTP 503 Service Unavailable error in Amazon Elasticsearch Service?
Last updated: 2020-11-10
When I query my Amazon Elasticsearch Service (Amazon ES) domain, I get an HTTP 503 Service Unavailable error. How do I resolve this error?
A load balancer sits in front of each Amazon ES domain. The load balancer distributes incoming traffic to the data nodes. An HTTP 503 error indicates that one or more data nodes in the cluster are overloaded. When a node is overloaded by expensive queries or incoming traffic, it doesn't have enough capacity to handle any other incoming requests.
Note: You can use the RequestCount metric in Amazon CloudWatch to track HTTP response codes.
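As a rough illustration of tracking error responses in CloudWatch, the following Python sketch builds the arguments for the CloudWatch get_metric_statistics call and, optionally, sends it with boto3. The metric name "5xx", the "AWS/ES" namespace, and the DomainName and ClientId dimensions are assumptions about how the domain publishes its metrics; pass a different metric_name (for example, the RequestCount metric mentioned in the note) if that is what your account exposes. The function names and the sample domain name and account ID are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(domain_name, account_id, hours=1, metric_name="5xx"):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Assumption: the domain publishes metrics in the AWS/ES namespace with
    DomainName and ClientId (the AWS account ID) as dimensions.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric_name,  # assumed metric name; adjust as needed
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": 300,  # five-minute buckets
        "Statistics": ["Sum"],
    }


def fetch_error_counts(domain_name, account_id, hours=1):
    """Return (timestamp, count) pairs, oldest first. Requires boto3 and AWS credentials."""
    import boto3  # imported here so that building the query needs no AWS SDK

    cloudwatch = boto3.client("cloudwatch")
    query = build_metric_query(domain_name, account_id, hours)
    response = cloudwatch.get_metric_statistics(**query)
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Sum"]) for p in points]
```

A call such as fetch_error_counts("my-domain", "123456789012", hours=6) would return one data point per five-minute period. A sustained run of nonzero values suggests overload rather than a one-off failure.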
Use one of the following methods to resolve HTTP 503 errors:
Provision more compute resources
- Scale up your domain by switching to larger instances, or scale out by adding more nodes to the cluster. For more information, see Creating and managing Amazon Elasticsearch Service domains.
- Confirm that you are using an instance type that is appropriate for your use case. For more information, see Choosing instance types and testing.
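Scaling up or out can also be scripted. The following Python sketch assumes that you use boto3 and its "es" client's update_elasticsearch_domain_config call. The function names, the sample domain name, and the instance type in the usage note are hypothetical, so choose values that fit your workload.

```python
def build_scaling_request(domain_name, instance_type=None, instance_count=None):
    """Build arguments for update_elasticsearch_domain_config.

    Pass instance_type to scale up (larger instances), instance_count to
    scale out (more nodes), or both.
    """
    cluster_config = {}
    if instance_type is not None:
        cluster_config["InstanceType"] = instance_type
    if instance_count is not None:
        if instance_count < 1:
            raise ValueError("instance_count must be at least 1")
        cluster_config["InstanceCount"] = instance_count
    if not cluster_config:
        raise ValueError("specify instance_type, instance_count, or both")
    return {
        "DomainName": domain_name,
        "ElasticsearchClusterConfig": cluster_config,
    }


def apply_scaling(request):
    """Send the configuration change. Requires boto3 and AWS credentials."""
    import boto3  # imported here so that building the request needs no AWS SDK

    client = boto3.client("es")
    return client.update_elasticsearch_domain_config(**request)
```

For example, apply_scaling(build_scaling_request("my-domain", instance_type="r5.large.elasticsearch", instance_count=4)) would request both a larger instance type and four data nodes. A configuration change like this can take some time to complete.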
Reduce the resource utilization for your queries
- Confirm that you are following best practices for shard and cluster architecture. A poorly designed cluster can't use all available resources. Some nodes might be overloaded while other nodes sit idle. Elasticsearch can't fetch documents from overloaded nodes. For more information about shard and cluster best practices, see Get started with Amazon Elasticsearch Service: How many shards do I need?
- Reduce the number of concurrent requests to the domain.
- Reduce the scope of your query. For example, if you run a query for a specific time frame, reduce the date range. You can also filter the results by configuring the index pattern in Kibana.
- Avoid running select * queries on large indices. Instead, use filters to query a part of the index and search as few fields as possible.
- Re-index and reduce the number of shards. The more shards you have in your Elasticsearch cluster, the more likely you are to get a courier fetch error. Because each shard has its own resource allocation and overhead, a large number of shards can strain your Elasticsearch cluster. To lower your shard count, see Why is my Amazon Elasticsearch Service (Amazon ES) domain stuck in the "Processing" state?
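Two of the suggestions above, narrowing the query and limiting concurrent requests, can be sketched in Python as follows. The sketch builds an Elasticsearch query DSL body that filters on a time range, returns only the named fields, and caps the result size. It then sends a batch of request bodies through a fixed-size thread pool. The function names, the "@timestamp" field, and the send callable (whatever function you use to send one search request) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_narrow_query(start, end, fields, term_filters=None,
                       size=100, time_field="@timestamp"):
    """Build a search body limited to a time range and a few fields.

    Filter clauses are used instead of a match-all query so that only
    part of the index is searched.
    """
    filters = [{"range": {time_field: {"gte": start, "lte": end}}}]
    for field, value in (term_filters or {}).items():
        filters.append({"term": {field: value}})
    return {
        "size": size,              # cap the number of returned hits
        "_source": list(fields),   # return only the fields you need
        "query": {"bool": {"filter": filters}},
    }


def run_with_limit(send, bodies, max_concurrent=4):
    """Send each body with at most max_concurrent requests in flight.

    Results are returned in the same order as the input bodies.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(send, bodies))
```

For example, build_narrow_query("now-1h", "now", ["status", "message"], {"status": 503}) searches only the last hour and returns two fields per hit. Lowering max_concurrent reduces load on the data nodes, but a batch then takes longer to complete.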