How do I resolve the "Network Error communicating with endpoint" error in API Gateway?
Last updated: 2022-08-17
I want to resolve the "Network Error communicating with endpoint" error in Amazon API Gateway.
If the number of API requests is significantly greater than the number of errors that you receive, then you're likely experiencing transient network issues. To resolve these issues, follow the steps in the Resolve low-frequency network errors section.
If you're frequently or continuously experiencing errors, then follow the steps in the Resolve high-frequency network errors section.
Resolve low-frequency network errors
- Use retries with exponential backoff for failed requests.
- Activate access logging. For instructions, see Setting up Amazon CloudWatch API logging. Then, view the API Gateway log events in the CloudWatch console. For debugging, analyze the context variables $context.integration.error, $context.integration.latency, and $context.integrationStatus.
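The retry recommendation above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function name `call_with_retries`, its parameters, and the default delays are assumptions chosen for the example, and the random jitter is a common addition to exponential backoff rather than something this article prescribes. The `send` argument stands for any callable that performs one API request and raises an exception on a network failure.

```python
import random
import time


def call_with_retries(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call send() and retry transient failures with exponential backoff.

    All names and defaults here are hypothetical, for illustration only.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles on each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from many clients.
            sleep(random.uniform(0, ceiling))
```

Only exception types listed in `retry_on` are retried, so errors that a retry can't fix (for example, a malformed request) still fail immediately. Adjust `retry_on` to match the exceptions that your HTTP client raises for network failures.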
Resolve high-frequency network errors
Set up Amazon CloudWatch logging. Be sure to choose the Log full requests/responses data option. This option allows you to log full API requests and responses so that you can troubleshoot errors.
Consider the following resolutions:
- If your load balancer has multiple target groups, then use cross-zone load balancing to reduce latency. You can reduce latency by distributing incoming traffic evenly across all activated Availability Zones and preventing requests from being routed to Availability Zones without targets.
- Confirm that every activated Availability Zone for your Network Load Balancer or Application Load Balancer has registered healthy instances.
Note: Your load balancer is most effective when each activated Availability Zone has at least one registered target. Each activated Availability Zone must have at least one healthy instance per target group, as reported by the health checks of the Network Load Balancer or Application Load Balancer.
- To avoid exceeding the integration timeout quota of API Gateway, confirm that your target group instances serve a response to the API within 29 seconds.
- Activate access logging on your load balancer. A Network Load Balancer creates access logs only if it has a TLS listener.
- If you're using a Network Load Balancer, confirm the IP addresses that can reach the instance in your Amazon Elastic Compute Cloud (Amazon EC2) security groups. The security group rules must allow traffic either from all sources or from the private IP address of the Network Load Balancer.
- If you're using an Application Load Balancer, confirm that the security group for your Application Load Balancer allows traffic from all sources.
Note: Target instances can restrict access to only the Application Load Balancer. For stricter security, you can limit access from API Gateway IP addresses reserved for the AWS Region where the API is located. To receive a notification whenever the IP range list changes, subscribe to AWS IP address range notifications.
- Activate Amazon Virtual Private Cloud (Amazon VPC) Flow Logs. Then, capture the traffic information going to and from network interfaces for the Network Load Balancer and Application Load Balancer.
- If a Network Load Balancer is attached to the Amazon VPC link, check the TCP_Target_Reset_Count metric. A spike in this metric indicates that the target instances might not be closing connections to the Network Load Balancer.
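For the stricter security option mentioned in the note about API Gateway IP addresses, the following Python sketch shows one way to pick out the relevant ranges. It assumes that you already downloaded and parsed the JSON file of AWS IP address ranges (published at https://ip-ranges.amazonaws.com/ip-ranges.json) into a dictionary, and that API Gateway entries in that file use the service name `API_GATEWAY`. The function names and the sample addresses are hypothetical; the sample uses documentation-only address ranges, not real AWS ranges.

```python
import ipaddress


def api_gateway_prefixes(ip_ranges, region):
    """Return the IPv4 networks listed for API Gateway in one Region."""
    return [
        ipaddress.ip_network(entry["ip_prefix"])
        for entry in ip_ranges.get("prefixes", [])
        if entry.get("service") == "API_GATEWAY"
        and entry.get("region") == region
    ]


def is_in_networks(source_ip, networks):
    """Check whether source_ip falls inside any of the given networks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)


# Made-up sample in the shape of the published file (not real AWS data).
sample = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1",
         "service": "API_GATEWAY"},
        {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1",
         "service": "API_GATEWAY"},
        {"ip_prefix": "192.0.2.0/24", "region": "us-east-1",
         "service": "EC2"},
    ]
}

allowed = api_gateway_prefixes(sample, "us-east-1")
print(is_in_networks("203.0.113.10", allowed))
print(is_in_networks("192.0.2.10", allowed))
```

You can use the resulting networks as the source ranges in your security group rules. Because the published list changes over time, regenerate the rules whenever you receive an IP address range notification.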