How do I troubleshoot packet loss on my AWS VPN connection?

Last updated: 2021-05-19

I'm having constant or intermittent packet loss and high latency issues with my AWS Virtual Private Network (AWS VPN) connection. What tests can I run to be sure that the issue isn't occurring inside my Amazon Virtual Private Cloud (Amazon VPC)?

Short description

Packet loss can be introduced at any of the internet hops that AWS VPN traffic crosses between the on-premises network and the Amazon VPC. It's a best practice to isolate and confirm where the packet loss is coming from.

Resolution

Check the source and destination hosts for resource utilization issues. For EC2 instances, review Amazon CloudWatch metrics such as CPUUtilization, NetworkIn/NetworkOut, and NetworkPacketsIn/NetworkPacketsOut to verify that you aren't hitting network limits.

Use MTR to check for ICMP or TCP packet loss and latency

MTR provides continuously updated output that allows you to analyze network performance over time. It combines the functionality of traceroute and ping in a single network diagnostic tool.

Install the MTR network tool on your EC2 instance in the VPC to check for ICMP or TCP packet loss and latency.

Amazon Linux:

sudo yum install mtr

Ubuntu:

sudo apt-get install mtr

Windows:

Download and install WinMTR.

Note: For Windows OS, WinMTR doesn't support TCP-based MTR.

Run the following tests bi-directionally between your EC2 instances and on-premises host, using both the private and public IP addresses. The path between nodes on a TCP/IP network can change when the direction is reversed, so it's a best practice to get MTR results in both directions.

Note:

  • Make sure that the security group and NACL rules allow ICMP traffic from the source instance.
  • Make sure that the test port is open on the destination instance, and the security group and NACL rules allow traffic from the source on the protocol and port.

The TCP-based results determine whether there is application-based packet loss or latency on the connection. MTR versions 0.85 and later include the TCP option.

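Private IP EC2 instance or on-premises host report (ICMP):

mtr -n -c 200 <Private IP of EC2 instance or on-premises host>

Private IP EC2 instance or on-premises host report (TCP):

mtr -n -T -c 200 -P 443 -m 60 <Private IP of EC2 instance or on-premises host>

Public IP EC2 instance or on-premises host report (ICMP):

mtr -n -c 200 <Public IP of EC2 instance or on-premises host>

Public IP EC2 instance or on-premises host report (TCP):

mtr -n -T -c 200 -P 443 -m 60 <Public IP of EC2 instance or on-premises host>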
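When a run covers many hops, it can help to save the results with MTR's report mode and filter them for hops that show loss. The following is a minimal sketch and not part of the original procedure. The function name mtr_lossy_hops and the default threshold of 1 percent are illustrative assumptions, and the parsing assumes the default text layout that mtr --report prints (hop number, host, then Loss% as the third column), which might differ between MTR versions.

```shell
# Hypothetical helper: list hops whose Loss% exceeds a threshold.
# Reads `mtr --report` text on stdin. Assumes the default column layout
# "N.|-- host Loss% Snt Last Avg Best Wrst StDev".
# Optional argument: loss threshold in percent (assumed default: 1).
mtr_lossy_hops() {
  awk -v thr="${1:-1}" '
    $1 ~ /^[0-9]+\.\|--$/ {
      hop = $1;  sub(/\..*/, "", hop)   # "3.|--" -> "3"
      loss = $3; sub(/%/, "", loss)     # "12.5%" -> "12.5"
      if (loss + 0 > thr + 0) print hop, $2, loss
    }'
}

# Example usage (the target is a placeholder):
#   mtr -n --report -c 200 <Private IP of EC2 instance or on-premises host> | mtr_lossy_hops 1
```

Each output line contains the hop number, the host, and the loss percentage. Loss that appears at one intermediate hop but not at later hops is often rate limiting on that device rather than loss on the path.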

Use traceroute to determine latency or routing issues

The Linux traceroute utility identifies the path taken from a client node to the destination node. For each hop along the path, the utility records the time in milliseconds that the router takes to respond to the request.

To install traceroute, run the following commands:

Amazon Linux:

sudo yum install traceroute

Ubuntu:

sudo apt-get install traceroute

Private IP address of EC2 instance and on-premises host test:

Amazon Linux:

sudo traceroute <Private IP of EC2 instance or on-premises host>
sudo traceroute -T -p 80 <Private IP of EC2 instance or on-premises host>

Windows:

tracert <Private IP of EC2 instance or on-premises host>
tracetcp <Private IP of EC2 instance or on-premises host>

Note: The arguments -T -p 80 perform a TCP-based trace on port 80, and adding -n skips DNS name resolution. Be sure that port 80, or the port that you are testing, is open bi-directionally.

The Linux traceroute option to specify a TCP-based trace instead of ICMP is useful because most internet devices deprioritize ICMP-based trace requests. A few timed-out requests are common, so watch for packet loss to the destination or in the last hop of the route. Packet loss over several hops might indicate an issue.

Note: It's a best practice to run the traceroute command bi-directionally from the client to the server and then from the server back to the client.
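To apply the guidance about the last hop to a saved trace, a small filter can count timed-out probes on the final hop only. This is a sketch under assumptions: the function name last_hop_timeouts is hypothetical, and the parsing assumes the usual Linux traceroute text output, where each hop line starts with the hop number, each answered probe ends in "ms", and each timed-out probe is printed as an asterisk.

```shell
# Hypothetical helper: report timed-out probes ("*") on the final hop.
# Reads Linux traceroute text on stdin and prints
# "<hop number> <timed-out probes> <total probes>" for the last hop line.
last_hop_timeouts() {
  awk '
    $1 ~ /^[0-9]+$/ {
      hop = $1; stars = 0; probes = 0
      for (i = 2; i <= NF; i++) {
        if ($i == "*") { stars++; probes++ }   # probe with no reply
        else if ($i == "ms") probes++          # probe with a reply time
      }
    }
    END { if (hop != "") print hop, stars, probes }'
}

# Example usage (the target is a placeholder):
#   sudo traceroute -T -p 80 -n <Private IP of EC2 instance or on-premises host> | last_hop_timeouts
```

Timeouts on the final hop suggest loss to the destination itself, while asterisks on intermediate hops alone are commonly caused by devices that deprioritize trace requests.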

Use hping3 to determine end-to-end TCP packet loss and latency problems

Hping3 is a command-line oriented TCP/IP packet assembler and analyzer that measures end-to-end packet loss and latency over a TCP connection. In addition to ICMP echo requests, hping3 supports TCP, UDP, and RAW-IP protocols. Hping3 also includes a traceroute mode and can send files over a covert channel. Hping3 is designed to scan hosts, assist with penetration testing, and test intrusion detection systems.

MTR and traceroute capture per-hop latency. However, hping3 yields results that show end-to-end min/avg/max latency over TCP in addition to packet loss. To install hping3, run the following commands:

Amazon Linux:

sudo yum --enablerepo=epel install hping3

Ubuntu:

sudo apt-get install hping3

Run the following commands:

hping3 -S -c 50 -V <Public IP of EC2 instance or on-premises host>
hping3 -S -c 50 -V <Private IP of EC2 instance or on-premises host>

Note: By default, hping3 sends TCP headers to the target host's port 0 with a winsize of 64 and no TCP flag set. The -S option in the preceding commands sets the SYN flag. To test a specific port, add the -p option.
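To compare runs from both directions, it can be convenient to reduce each hping3 run to its loss percentage and round-trip times. The following is a minimal sketch, not an official tool. The function name hping3_summary is hypothetical, and the parsing assumes that the summary contains a line with "N% packet loss" and a line of the form "round-trip min/avg/max = a/b/c ms". If your hping3 build prints a different layout, the function prints nothing.

```shell
# Hypothetical helper: extract loss and round-trip times from an hping3 summary.
# Reads hping3 text output on stdin. Assumed summary lines:
#   "... N% packet loss"
#   "round-trip min/avg/max = a/b/c ms"
hping3_summary() {
  awk '
    match($0, /[0-9]+% packet loss/) {
      loss = substr($0, RSTART, RLENGTH)
      sub(/%.*/, "", loss)                 # "4% packet loss" -> "4"
    }
    /^round-trip min\/avg\/max/ {
      split($(NF - 1), rtt, "/")           # "a/b/c" -> rtt[1..3]
    }
    END {
      if (loss != "") print "loss_pct=" loss
      if (rtt[2] != "") print "min_ms=" rtt[1] " avg_ms=" rtt[2] " max_ms=" rtt[3]
    }'
}

# Example usage (the target is a placeholder; 2>&1 is included in case the
# summary is written to standard error):
#   sudo hping3 -S -c 50 -p 443 <Private IP of EC2 instance or on-premises host> 2>&1 | hping3_summary
```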

Packet capture samples using tcpdump or Wireshark

Take packet captures at the same time on your test EC2 instance in the VPC and on your on-premises host while you reproduce the issue. Comparing the two captures helps to determine whether there are application or network layer issues on the VPN connection. You can install tcpdump on your Linux instance or Wireshark on a Windows instance to perform packet captures.

Install tcpdump on Amazon Linux:

sudo yum install tcpdump

Install tcpdump on Ubuntu:

sudo apt-get install tcpdump

Install Wireshark on Windows OS:

Install Wireshark and take a packet capture.
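On Linux, a capture limited to the peer host keeps the file small and makes the two sides easier to compare. The following sketch only prints a tcpdump command so that you can review it before you run it. The function name capture_cmd, the interface name eth0, and the file name in the usage example are illustrative assumptions. The options used are -n (no name resolution), -i (interface), -s 0 (capture full packets), and -w (write raw packets to a file), followed by a host filter.

```shell
# Hypothetical helper: print (not run) a tcpdump command for one peer.
# Arguments: <interface> <peer IP address> <output file>
capture_cmd() {
  printf 'sudo tcpdump -n -i %s -s 0 -w %s host %s\n' "$1" "$3" "$2"
}

# Example usage (interface, address, and file name are placeholders):
#   capture_cmd eth0 10.0.0.5 vpn-test.pcap
```

Run the printed command on both hosts at the same time, reproduce the issue, and then stop the captures with Ctrl+C. You can open the resulting files in Wireshark for analysis.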

Explicit Congestion Notification (ECN)

When you connect to Windows instances, having ECN enabled might cause packet loss or performance issues. Disable ECN to improve performance.

Run the following command to determine if ECN capability is enabled:

netsh interface tcp show global

If ECN capability is enabled, run the following command to disable it:

netsh interface tcp set global ecncapability=disabled