I'm using a NAT instance so that instances in a private VPC subnet can connect to the internet, but the instances have intermittent connection issues. How can I fix this?
The connection problems might be related to these issues:
- Instance operating system-level connection limits
- Network access control list (network ACL) rules
- Network issues
Instance operating system-level connection limits
Check if the NAT instance and the instances in the private subnet have reached their operating system-level connection limits. To get the number of active connections, run the netstat command:
Linux:
netstat -ano | grep ESTABLISHED | wc -l
netstat -ano | grep TIME_WAIT | wc -l
Windows:
netstat -ano | find /i "estab" /c
netstat -ano | find /i "TIME_WAIT" /c
If the count approaches the number of ports in the allowed local port range (the source ports available for client connections), then you might be running into port exhaustion. To reduce port exhaustion, try one of these solutions:
- Increase the operating system local (ephemeral) port range. On Linux, for example, set this kernel parameter in /etc/sysctl.conf, and then run sysctl -p to apply it:
net.ipv4.ip_local_port_range = 1025 61000
- Add ephemeral ports for new connections by allocating more elastic IPs to the NAT instance, or by increasing the number of NAT instances for internet-bound traffic.
- Resolve any application-level issues that drain the available connections.
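As a rough illustration of the port exhaustion check described above, the following shell sketch for Linux compares the number of connections that hold a local port with the size of the ephemeral port range. The function names port_range_size and port_usage_percent are hypothetical helpers introduced for this example, and the live check assumes that netstat is installed.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any AWS tooling.

# Print the number of ports in the inclusive range LOW..HIGH.
port_range_size() {
  echo $(( $2 - $1 + 1 ))
}

# Print the whole-number percentage of the range LOW..HIGH used by IN_USE connections.
port_usage_percent() {
  echo $(( $3 * 100 / $(port_range_size "$1" "$2") ))
}

# Live check (Linux only): skipped if the kernel setting or netstat is unavailable.
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ] && command -v netstat >/dev/null 2>&1; then
  read -r low high < "$range_file"
  # Count TCP connections that still occupy a local port.
  in_use=$(netstat -ant | grep -c -E 'ESTABLISHED|TIME_WAIT' || true)
  echo "Ephemeral range: $low-$high"
  echo "Connections in ESTABLISHED or TIME_WAIT: $in_use"
  echo "Approximate usage: $(port_usage_percent "$low" "$high" "$in_use")%"
fi
```

The percentage is only an approximation, because the count also includes inbound connections that do not consume an ephemeral source port. A value that stays close to 100 suggests that one of the solutions above is worth trying.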
Network ACL rules
Confirm that the network ACL allows inbound traffic from the ephemeral port range (1024-65535). If the network ACL allows only a subset of the ephemeral port range, and the instances in the private subnet use a source port outside of that range, then traffic is dropped.
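To illustrate this check, the following sketch lists the inbound entries of the network ACL that is associated with a subnet, and tests whether an allowed port range covers the full ephemeral range. The function names and the subnet ID argument are placeholders for this example. The sketch assumes that the AWS CLI is installed and configured with permission to describe network ACLs.

```shell
#!/bin/sh
# Hypothetical helpers for illustration.

# Succeed if the allowed range FROM..TO covers the whole ephemeral range 1024-65535.
covers_ephemeral_range() {
  [ "$1" -le 1024 ] && [ "$2" -ge 65535 ]
}

# List inbound (non-egress) entries of the network ACL associated with SUBNET_ID.
# Requires the AWS CLI; the caller supplies the subnet ID.
list_inbound_acl_entries() {
  aws ec2 describe-network-acls \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'NetworkAcls[].Entries[?Egress==`false`].[RuleNumber,RuleAction,Protocol,PortRange.From,PortRange.To,CidrBlock]' \
    --output text
}

# Example: a rule that allows only ports 32768-61000 does not cover the full range.
if covers_ephemeral_range 32768 61000; then
  echo "covered"
else
  echo "not covered"
fi
```

Entries that apply to all protocols have no port range, so the From and To columns are empty for those rules.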
Note: If you're using a NAT gateway instead of a NAT instance, use the CloudWatch ErrorPortAllocation metric to verify if source ports are exhausted. For more information on this metric, see Amazon VPC NAT Gateway Metrics and Dimensions.
Network issues
Connectivity issues can be related to network problems, such as packet loss or destination host issues. For troubleshooting steps, see How do I troubleshoot network performance issues between Amazon EC2 Linux instances in a VPC and an on-premises host over the internet gateway?
Note: Packet captures on the source, destination, and NAT instance can provide more information on network issues.
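As one way to collect such a capture on a Linux instance, the following sketch wraps tcpdump in a small function. The function name capture_host_traffic, the interface name, the address, and the output file are placeholders for this example. The sketch assumes that tcpdump is installed and that you run it with root privileges.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Usage: capture_host_traffic INTERFACE HOST OUTPUT_FILE [PACKET_COUNT]
capture_host_traffic() {
  # -nn: skip name and port resolution
  # -c:  stop after PACKET_COUNT packets (defaults to 1000 here)
  # -w:  write raw packets to OUTPUT_FILE for later analysis
  tcpdump -i "$1" -nn -c "${4:-1000}" -w "$3" host "$2"
}

# Example call (not run here): capture traffic to or from 203.0.113.10 on eth0.
# capture_host_traffic eth0 203.0.113.10 /tmp/nat-capture.pcap
```

Running the same capture on the source instance, the NAT instance, and the destination, and then comparing the files, can help show where packets stop arriving.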