How do I troubleshoot network performance issues between Amazon EC2 Windows instances in a VPC and an on-premises host over an internet gateway?
Last updated: 2019-10-07
I'm experiencing network performance issues between my Amazon Elastic Compute Cloud (Amazon EC2) Windows instances and my on-premises host over an internet gateway. How can I troubleshoot these packet loss or latency issues?
Note: Before you begin troubleshooting, identify the source and destination IP addresses. If the destination is a URL, use the nslookup command to determine the IP address. Be aware that some URLs use dynamic IP addresses, so the IP address might change. Run the command multiple times to see if the IP address is constant.
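To repeat the lookup without retyping nslookup, you can script it. The following Python sketch is one way to check whether a hostname keeps resolving to the same IPv4 addresses. The function names and the injectable resolver parameter are illustrative assumptions, not part of any AWS tooling. It uses the operating system's resolver, so locally cached answers can make a dynamic address look constant.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def addresses_are_constant(
    hostname: str,
    attempts: int = 5,
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> bool:
    """Resolve hostname several times and report whether every answer matched."""
    answers = {frozenset(resolver(hostname)) for _ in range(attempts)}
    return len(answers) == 1


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute the destination you are testing.
    print(addresses_are_constant("example.com"))
```

If the result is False, expect the destination IP address to change between tests and resolve it again before each run.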
Check for ECN capability
1. Run the following command to determine if Explicit Congestion Notification (ECN) capability is enabled.
netsh interface tcp show global
2. If ECN capability is enabled, run the following command to disable it.
netsh interface tcp set global ecncapability=disabled
3. If you don't see an improvement in performance, you can re-enable ECN capability using the following command.
netsh interface tcp set global ecncapability=enabled
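If you need to run this check on several instances, you can read the ECN setting from the command output programmatically. The following Python sketch assumes an English-language Windows installation where the output of netsh interface tcp show global contains a line labeled "ECN Capability". The helper names are hypothetical.

```python
import subprocess
from typing import Optional


def parse_ecn_state(netsh_output: str) -> Optional[str]:
    """Return the ECN Capability value (for example 'enabled'), or None if absent."""
    for line in netsh_output.splitlines():
        name, separator, value = line.partition(":")
        # Assumption: the setting appears as "ECN Capability : <value>".
        if separator and name.strip().lower() == "ecn capability":
            return value.strip().lower()
    return None


def current_ecn_state() -> Optional[str]:
    """Run netsh (Windows only) and return the parsed ECN state."""
    completed = subprocess.run(
        ["netsh", "interface", "tcp", "show", "global"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ecn_state(completed.stdout)


if __name__ == "__main__":
    print(current_ecn_state())
```

The sketch only reads the setting. Changing it still requires the netsh interface tcp set global commands shown above, run from an elevated command prompt.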
Review hops and troubleshoot TCP port connectivity
First, use MTR or tracert to review hops:
1. Download and install WinMTR.
2. Enter the destination IP in the Host section, and then choose Start.
3. Let the test run for a minute, and then choose Stop.
4. Choose Copy text to clipboard and paste the output in a text file.
5. Look for any losses in the % column that are propagated to the destination.
Note: Ignore any hops with the No response from host message. This message indicates that those particular hops aren't responding to the ICMP probes.
6. Review hops on the MTR reports using a bottom-up approach. For example, check for loss on the last hop or destination, and then review the preceding hops.
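The bottom-up review in steps 5 and 6 can be expressed as a short routine. The following Python sketch assumes that you have already copied each hop's host name and loss percentage out of the report into a list, with None standing in for hops that show No response from host. The function name and the threshold parameter are illustrative.

```python
from typing import List, Optional, Tuple

# (host, loss percentage); loss is None for "No response from host" hops.
Hop = Tuple[str, Optional[float]]


def first_hop_of_propagated_loss(
    hops: List[Hop], threshold: float = 0.0
) -> Optional[int]:
    """Walk from the destination toward the source.

    Returns the 0-based index of the earliest hop in the unbroken run of
    lossy hops that ends at the destination, or None when the destination
    shows no loss (or did not respond).
    """
    if not hops:
        return None
    destination_loss = hops[-1][1]
    if destination_loss is None or destination_loss <= threshold:
        return None
    start = len(hops) - 1
    for index in range(len(hops) - 2, -1, -1):
        loss = hops[index][1]
        if loss is None:
            continue  # non-responding hop: ignore it, as the note advises
        if loss <= threshold:
            break  # a clean hop ends the run of propagated loss
        start = index
    return start


if __name__ == "__main__":
    report = [("hop1", 0.0), ("hop2", 40.0), ("hop3", 0.0),
              ("hop4", None), ("hop5", 12.0), ("destination", 10.0)]
    print(first_hop_of_propagated_loss(report))
```

In this sample the loss at hop2 does not carry through to the destination, so it is not reported. The routine points at hop5 (index 4), where loss begins and continues to the final hop.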
If you don't want to install WinMTR, you can use the tracert command-line utility.
1. Perform a tracert to the destination URL or IP address.
2. Look for any hop that shows an abrupt spike in round-trip time (RTT). An abrupt spike in RTT might indicate that there's a node under high load, which in turn induces latency or packet drops in your traffic.
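If the tracert output is long, a small script can flag candidate spikes. The following Python sketch assumes that you have already collected one RTT value in milliseconds per hop, with None for hops that timed out. The 100 ms jump threshold is an arbitrary illustrative value, not an AWS recommendation.

```python
from typing import List, Optional


def find_rtt_spikes(
    rtts_ms: List[Optional[float]], jump_ms: float = 100.0
) -> List[int]:
    """Return 0-based hop indexes whose RTT exceeds the previous
    responding hop's RTT by more than jump_ms."""
    spikes = []
    previous = None
    for index, rtt in enumerate(rtts_ms):
        if rtt is None:
            continue  # timed-out hop: nothing to compare
        if previous is not None and rtt - previous > jump_ms:
            spikes.append(index)
        previous = rtt
    return spikes


if __name__ == "__main__":
    print(find_rtt_spikes([1.0, 2.0, 3.0, 180.0, 185.0, None, 190.0]))
```

A flagged hop is only a starting point. A jump that persists through all later hops is more likely to reflect real latency on the path than a jump at a single hop that then disappears.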
Then, check TCP port connectivity:
Note: Because WinMTR and tracert are both ICMP-based, you can use tracetcp to troubleshoot TCP port connectivity.
1. Download tracetcp.
2. Extract the Tracetcp ZIP file.
3. Copy tracetcp.exe to your C drive.
4. Install WinPcap.
5. Open the command prompt and change to the root of your C drive by running the cd \ command (for example, C:\Users\username>cd \).
6. Run tracetcp using the following commands: tracetcp.exe hostname:port or tracetcp.exe ip:port.
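If you can't install tracetcp or WinPcap, a plain TCP connection attempt still shows whether the destination port is reachable end to end. The following Python sketch is not a replacement for tracetcp because it doesn't report individual hops. The function name and the default timeout are assumptions made for illustration.

```python
import socket
import time
from typing import Optional


def tcp_connect_time(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the TCP connection setup time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection performs the TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return None


if __name__ == "__main__":
    # 192.0.2.10 and 443 are placeholders; use your destination IP and port.
    print(tcp_connect_time("192.0.2.10", 443))
```

A result of None means that the connection was refused, filtered, or timed out. In that case, check the security groups, network ACLs, and host firewalls on both sides before looking further into the path.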
Check the Windows Task Manager
If you have access to the source instance or destination instance, check the Windows Task Manager. Look for issues with CPU and memory utilization, or load average.
Take a packet capture
Note: It's a best practice to first start the packet capture and then initiate the traffic. This approach helps you capture all packets for the flow.
1. Install Wireshark and take a packet capture.
2. Use the following display filter to isolate the traffic for a particular source in the packet capture: (ip.addr eq source_IP) && (tcp.flags.syn == 1). The output shows all the TCP streams initiated by that source IP.
3. Select the row with the relevant source IP and destination IP.
4. Choose the context (right-click) menu, and then choose Follow, TCP Stream. This results in a TCP flow between the source IP and destination IP that you want to investigate.
5. Look for retransmissions, duplicate packets, or TCP window size notifications like TCP window full or Window size zero. These notifications might indicate that the TCP buffers are running out of space.
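To avoid retyping display filters for each capture, you can generate them. The following Python sketch builds the SYN filter from step 2 and pairs the symptoms in step 5 with Wireshark's tcp.analysis display filter fields. The helper names are hypothetical, the sketch handles IPv4 addresses only, and mapping "duplicate packets" to duplicate ACKs is an assumption.

```python
import ipaddress

# Wireshark display filter fields for the symptoms listed in step 5.
TCP_ANALYSIS_FILTERS = {
    "retransmissions": "tcp.analysis.retransmission",
    "duplicate ACKs": "tcp.analysis.duplicate_ack",
    "window full": "tcp.analysis.window_full",
    "zero window": "tcp.analysis.zero_window",
}


def syn_filter(source_ip: str) -> str:
    """Build the step 2 filter; raises ValueError for an invalid IPv4 address."""
    address = ipaddress.IPv4Address(source_ip)
    return f"(ip.addr eq {address}) && (tcp.flags.syn == 1)"


def stream_symptom_filter(stream_index: int, symptom: str) -> str:
    """Limit one symptom filter to a single TCP stream number."""
    return f"(tcp.stream eq {stream_index}) && ({TCP_ANALYSIS_FILTERS[symptom]})"


if __name__ == "__main__":
    print(syn_filter("10.0.0.5"))
    for name in TCP_ANALYSIS_FILTERS:
        print(stream_symptom_filter(0, name))
```

Paste a generated string into the Wireshark display filter bar. The stream index is the number that Wireshark shows in the filter bar after you choose Follow, TCP Stream.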