How can I troubleshoot packet loss for my Direct Connect connection?
Last updated: 2021-12-23
I'm using AWS Direct Connect to transfer data. I'm experiencing packet loss when transferring data to my Amazon Elastic Compute Cloud (Amazon EC2) instance. How can I isolate the packet loss?
Packet loss occurs when transmitted data packets fail to arrive at their destination, resulting in network performance issues. Common causes include low signal strength at the destination, excessive system utilization, network congestion, and network route misconfigurations.
Run the following checks for your network devices and Direct Connect connection.
Check the AWS Personal Health Dashboard for scheduled maintenance or events
Check metrics for the Direct Connect endpoint, customer gateway (CGW), and intermediate device (layer 1)
- The CGW logs for interface flaps.
- CPU utilization on the CGW at the time the issue occurred.
- The light signal reading on the device where the Direct Connect connection terminates.
- The device where the Direct Connect connection terminates for input errors, incrementing framing errors, cyclic redundancy check (CRC) errors, runts, giants, and throttles.
Check Direct Connect connection metrics (layer 1)
- ConnectionErrorCount: Apply the Sum statistic. Non-zero values indicate MAC-level errors on the AWS device.
- ConnectionLightLevelTX and ConnectionLightLevelRX: Check the light signal recorded on the Direct Connect connection when the issue occurred. The acceptable range is between -14.4 and 2.50 dBm.
- ConnectionBpsEgress and ConnectionBpsIngress: Check the amount of traffic on the Direct Connect connection when the packet loss occurred to determine whether the link was congested.
For more information, see Direct Connect Connection metrics.
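The metric checks above can be sketched as a few shell helpers. This is a minimal sketch, not an official tool: the function names (light_level_status, link_utilization_pct, dx_error_count) are hypothetical, the range is the -14.4 to 2.50 dBm range quoted above, and the last helper assumes the AWS CLI is installed and configured with permission to read CloudWatch metrics.

```shell
#!/usr/bin/env bash

# Classify a light-level reading (dBm) against the -14.4 to 2.50 dBm range.
light_level_status() {
  awk -v v="$1" 'BEGIN {
    v = v + 0                      # force numeric comparison
    if (v < -14.4)     print "too-low"
    else if (v > 2.50) print "too-high"
    else               print "ok"
  }'
}

# Express a ConnectionBpsEgress/ConnectionBpsIngress value as a percentage
# of the port speed (both in bits per second) to spot congestion.
link_utilization_pct() {
  awk -v bps="$1" -v cap="$2" 'BEGIN { printf "%.1f\n", 100 * bps / cap }'
}

# Hypothetical wrapper: Sum of ConnectionErrorCount in 5-minute periods.
# $1 = connection ID, $2 = start time, $3 = end time (ISO 8601).
dx_error_count() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/DX \
    --metric-name ConnectionErrorCount \
    --dimensions Name=ConnectionId,Value="$1" \
    --start-time "$2" --end-time "$3" \
    --period 300 --statistics Sum
}
```

For example, light_level_status -3.2 prints ok, and link_utilization_pct 800000000 1000000000 prints 80.0 for 800 Mbps of traffic on a 1 Gbps port.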
Check for asymmetric suboptimal routing (layer 3)
End-to-end bidirectional traceroute between the on-premises host and the AWS host (layer 3)
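1. On Linux, run one of the following commands to install traceroute (yum for Amazon Linux and RHEL-based distributions, apt-get for Debian and Ubuntu):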
sudo yum install traceroute
sudo apt-get install traceroute
2. Then, run a command similar to the following for a TCP traceroute (the -T option sends TCP SYN probes instead of ICMP):
sudo traceroute -T -p <destination Port> <IP of destination host>
On Windows, use tracetcp:
1. Download WinPcap and tracetcp.
2. Extract the tracetcp ZIP file.
3. Copy tracetcp.exe to your C drive.
4. Install WinPcap.
5. Open the command prompt and change to the root of your C drive using the cd \ command.
6. Run tracetcp using the following commands: tracetcp.exe hostname:port or tracetcp.exe ip:port.
End-to-end bidirectional MTR test between the on-premises host and the AWS host (layer 3)
Check the MTR results for packet loss and network latency. A nonzero loss percentage at a hop can indicate an issue with that router. However, some service providers rate limit the ICMP traffic that MTR uses. To determine if the packet loss is due to rate limits, review the subsequent hops. If the subsequent hop shows a loss of 0.0%, this can indicate ICMP rate limiting rather than actual packet loss.
1. Install MTR.
Amazon Linux or RHEL-based distributions:
$ sudo yum install mtr -y
Debian or Ubuntu:
$ sudo apt install mtr -y
Windows: Download and install WinMTR.
Note: WinMTR doesn't support TCP-based MTR.
2. For the on-premises --> AWS direction, run MTR on the on-premises host (ICMP and TCP based):
$ mtr -n -c 100 <private IP of EC2> --report
$ mtr -n -T -P <EC2 instance open TCP port> -c 100 <private IP of EC2> --report
3. For the AWS --> on-premises direction, run MTR on the EC2 instance (ICMP and TCP based):
$ mtr -n -c 100 <private IP of the local host> --report
$ mtr -n -T -P <local host open TCP port> -c 100 <private IP of the local host> --report
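The rule for reading MTR results (loss at a hop followed by 0.0% loss at the next hop can indicate ICMP rate limiting) can be applied to a saved report with a small filter. This is a sketch under assumptions: mtr_loss_hints is a hypothetical helper name, and it assumes the usual mtr --report layout in which each hop line starts with a field such as "2.|--" followed by the host and the Loss% column. It only flags hops for review and doesn't replace inspecting the report.

```shell
#!/usr/bin/env bash

# Read an "mtr --report" from stdin and print one line per lossy hop:
#   <hop> <host> <loss%> <hint>
mtr_loss_hints() {
  awk '
    # Hop lines look like: "  2.|-- 10.0.0.2   20.0%   100 ..."
    $1 ~ /^[0-9]+\.\|--$/ {
      n++
      host[n] = $2
      loss[n] = $3 + 0          # "20.0%" becomes 20
    }
    END {
      for (i = 1; i <= n; i++) {
        if (loss[i] == 0) continue
        # Loss that disappears at the next hop suggests ICMP rate limiting.
        if (i < n && loss[i + 1] == 0) hint = "possible-icmp-rate-limit"
        else                           hint = "loss-persists"
        printf "%d %s %.1f %s\n", i, host[i], loss[i], hint
      }
    }
  '
}
```

A typical use is to pipe a report into the filter, for example: mtr -n -c 100 <private IP of EC2> --report | mtr_loss_hints. Hops marked loss-persists, especially the final hop, are the ones to investigate first.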
Review the path MTU between the on-premises host and the AWS host (layer 3)
1. For the on-premises --> AWS direction, run tracepath on port 80 from the local host:
$ tracepath -n -p 80 <EC2 private instance IP>
2. For the AWS --> on-premises direction, run tracepath on port 80 from the EC2 instance:
$ tracepath -n -p 80 <private IP of local host>
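To compare the path MTU in each direction, the value can be pulled out of saved tracepath output. The following is a minimal sketch: path_mtu_from_tracepath is a hypothetical helper name, and it assumes tracepath prints the discovered MTU as "pmtu <value>" on hop lines and on the final "Resume:" line, which can vary between versions.

```shell
#!/usr/bin/env bash

# Read tracepath output from stdin and print the last path MTU it reported.
path_mtu_from_tracepath() {
  awk '
    {
      # Remember the value that follows every "pmtu" token.
      for (i = 1; i < NF; i++)
        if ($i == "pmtu") last = $(i + 1)
    }
    END { if (last != "") print last }
  '
}
```

For example: tracepath -n -p 80 <EC2 private instance IP> | path_mtu_from_tracepath. If the two directions report different values, or a value lower than expected, review the MTU settings on the devices along the path.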