Attach multiple IPs to a NAT Gateway to scale your egress traffic pattern
AWS NAT Gateway is a highly available and horizontally scalable Network Address Translation (NAT) service. It allows resources in a private subnet to connect to target resources outside the subnet using the NAT Gateway’s IP address. These target resources can be in the same VPC, in a different VPC, on the internet, or within your on-premises network. This means you can use a NAT Gateway to turn on egress-only connectivity for your workloads, and you only need to allow-list the NAT Gateway’s IP address. See the AWS documentation for more NAT Gateway use cases.
Last year, we increased the auto scaling capacity of NAT Gateways to 100 Gbps and 10 million packets per second (pps). We have also just increased NAT Gateway’s capacity to support up to 440,000 concurrent connections to a unique destination, which is eight times the previous 55,000 limit. A unique destination is identified by a unique combination of destination IP address, destination port, and protocol. If any of these parameters changes, it’s counted as a new unique destination. You can take advantage of this feature by either creating a NAT Gateway with multiple IP addresses, or by associating secondary IP addresses with an existing NAT Gateway. In this post, we explain how you can use this new feature on your NAT Gateways.
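To make the definition of a unique destination concrete, here is a minimal Python sketch that groups connections by destination IP address, destination port, and protocol. The `Connection` type and the sample addresses are hypothetical and exist only for illustration. This is not how the NAT Gateway tracks connections internally.

```python
from collections import Counter
from typing import NamedTuple


class Connection(NamedTuple):
    """Hypothetical record of one connection passing through a NAT Gateway."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def connections_per_destination(connections):
    """Count connections per unique destination (IP, port, protocol)."""
    return Counter((c.dst_ip, c.dst_port, c.protocol) for c in connections)


sample = [
    Connection("10.0.1.10", 40001, "192.0.2.29", 389, "tcp"),
    Connection("10.0.1.11", 40002, "192.0.2.29", 389, "tcp"),
    # Same server, different port: counted as a separate unique destination.
    Connection("10.0.1.10", 40003, "192.0.2.29", 636, "tcp"),
]
counts = connections_per_destination(sample)
```

In this sample, the two connections to port 389 share one unique destination, while the connection to port 636 counts against a different one.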
Traffic flows and AWS NAT Gateways
First, let’s define what a traffic flow is and why it matters for the NAT Gateway. We assume you’re familiar with the TCP/IP protocol stack and header encapsulation mechanisms. A network flow, or packet flow, is generally defined by five fields: source IP address, source port, destination IP address, destination port, and protocol (for example TCP, UDP, or ICMP). You have two options when performing NAT: 1:1 NAT and Port Address Translation (PAT). AWS NAT Gateway performs Port Address Translation. It performs source NAT, replacing the source IP address of the packet with its own private IP address. Then, the Internet Gateway (IGW) translates the private IP address of the NAT Gateway to the Elastic IP address associated with the NAT Gateway. In Figure 1 below, we show the packet walk for two flows from an Amazon Elastic Compute Cloud (Amazon EC2) instance deployed in a private subnet communicating with a public server (192.0.2.29) on TCP port 389.
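The source NAT step can be sketched as a small translation table. The following toy Python class is a simplified illustration, not the NAT Gateway’s implementation. The class name, the private IP address 10.0.0.5, and the sequential port allocation are assumptions made for the example, and the sketch neither releases ports nor models the IGW’s translation to the Elastic IP address.

```python
from collections import defaultdict


class PortAddressTranslator:
    """Toy source NAT with port address translation (illustrative only)."""

    def __init__(self, nat_private_ip, first_port=1024, last_port=65535):
        self.nat_private_ip = nat_private_ip
        self._last_port = last_port
        # Next free source port, tracked separately per unique destination.
        self._next_port = defaultdict(lambda: first_port)
        # Maps the original flow 5-tuple to its translated source port.
        self._table = {}

    def translate(self, src_ip, src_port, dst_ip, dst_port, protocol):
        """Return the flow as it looks after source NAT."""
        flow = (src_ip, src_port, dst_ip, dst_port, protocol)
        if flow not in self._table:
            destination = (dst_ip, dst_port, protocol)
            port = self._next_port[destination]
            if port > self._last_port:
                raise RuntimeError("no free source port for this destination")
            self._table[flow] = port
            self._next_port[destination] = port + 1
        return (self.nat_private_ip, self._table[flow], dst_ip, dst_port, protocol)


nat = PortAddressTranslator("10.0.0.5")  # hypothetical NAT private IP
flow_1 = nat.translate("10.0.1.10", 50001, "192.0.2.29", 389, "tcp")
flow_2 = nat.translate("10.0.1.10", 50002, "192.0.2.29", 389, "tcp")
```

Both flows leave with the same translated source IP address but different source ports, which is what lets the server’s replies be mapped back to the right instance. Because the pool of source ports per translated IP address is finite, the number of concurrent flows to one unique destination is bounded.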
AWS NAT Gateways can be deployed in a distributed model, where each VPC has its own NAT Gateways, or centralized, to direct all egress traffic flows through a central egress VPC. In some of these cases, both for centralized and distributed architectures, we see deployments where traffic from internal workloads is concentrated on a fixed set of destination servers. This often results in the NAT Gateway translating multiple flows to a unique destination (IP, Port, Protocol). For high-scale use cases (for example, authentication/identity servers) this can lead to over 55,000 simultaneous single-destination connections.
Figure 2 below shows a sample deployment, with multiple workloads communicating with a public server on TCP port 389. The flows shown in Figure 2 have the same destination IP, destination port, and protocol. The source IP address of the flows, as seen at the public server, differs depending on the source instance’s Availability Zone (AZ). All flows originating in AZ-a have NAT-GW-AZ-a’s elastic IP as the source, and all flows originating in AZ-b have NAT-GW-AZ-b’s elastic IP as the source.
Before this launch, if your deployment required over 55,000 simultaneous single-destination connections, then you would’ve needed multiple NAT Gateways and corresponding routing rules to achieve the desired scale. This meant separating your workloads into different subnets, and distributing traffic across multiple NAT Gateways.
To simplify your deployment, you can now assign up to eight Elastic IP addresses per NAT Gateway, an eight-fold increase in NAT Gateway’s scaling capabilities. As a result, a NAT Gateway now allows up to 440,000 simultaneous single-destination connections, because each associated IP address raises the limit linearly by 55,000 connections.
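The linear scaling is simple arithmetic, sketched below in Python. The function names are ours and purely illustrative. The constants come from the limits stated in this post.

```python
import math

CONNECTIONS_PER_IP = 55_000   # per associated IP address, per unique destination
MAX_IPS_PER_NAT_GATEWAY = 8   # maximum number of associated IP addresses


def max_connections(ip_count):
    """Single-destination connection limit for a given number of associated IPs."""
    if not 1 <= ip_count <= MAX_IPS_PER_NAT_GATEWAY:
        raise ValueError("ip_count must be between 1 and 8")
    return ip_count * CONNECTIONS_PER_IP


def ips_needed(target_connections):
    """Smallest number of associated IPs that covers the target connection count."""
    needed = max(1, math.ceil(target_connections / CONNECTIONS_PER_IP))
    if needed > MAX_IPS_PER_NAT_GATEWAY:
        raise ValueError("target exceeds what a single NAT Gateway supports")
    return needed
```

For example, `max_connections(8)` gives 440,000, and a workload that needs 120,000 simultaneous connections to one destination requires `ips_needed(120_000)`, which is three IP addresses.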
Using NAT Gateways with multiple Elastic IPs
Throughout this blog post, we use the sample network topology shown in Figure 3 below to highlight the new functionality of AWS NAT Gateway. We’re using the distributed deployment for NAT Gateways, so our workloads and the NAT Gateway are in the same VPC. We show the steps to add multiple AWS-assigned Elastic IP addresses to an existing NAT Gateway, but you can also use this feature with BYOIP addresses, with private IP addresses on the NAT Gateway, or in a centralized NAT Gateway architecture. In this topology, the EC2 instances deployed in an Auto Scaling group constantly communicate with a test public server on TCP port 389. Traffic flows through the NAT Gateway and the IGW as shown in Figure 3 below.
Initially, the NAT Gateway we deployed has a single Elastic IP address associated with it. This means the NAT Gateway can support up to 55,000 simultaneous connections to the public server on TCP port 389. If the EC2 instances try to open more than 55,000 connections to this unique destination (the public server on TCP port 389), you will see port allocation errors on the NAT Gateway. This is more likely to occur as the Auto Scaling group scales out and the number of instances increases. You can monitor the port allocation errors by using the ErrorPortAllocation Amazon CloudWatch metric. See the Monitor NAT gateways with Amazon CloudWatch entry in our documentation for more details.
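As a rough sketch of how you might check this metric programmatically, the following Python function sums ErrorPortAllocation over a recent time window. It assumes you pass in a boto3 CloudWatch client. The function name, the 60-minute window, and the 5-minute period are illustrative choices of ours, not recommendations from the documentation.

```python
from datetime import datetime, timedelta, timezone


def port_allocation_errors(cloudwatch_client, nat_gateway_id, minutes=60):
    """Sum the ErrorPortAllocation metric for one NAT Gateway over a recent window."""
    end = datetime.now(timezone.utc)
    response = cloudwatch_client.get_metric_statistics(
        Namespace="AWS/NATGateway",
        MetricName="ErrorPortAllocation",
        Dimensions=[{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=300,            # 5-minute buckets (illustrative)
        Statistics=["Sum"],
    )
    return sum(point["Sum"] for point in response["Datapoints"])


# Example usage (requires boto3 and AWS credentials; the ID is a placeholder):
# import boto3
# errors = port_allocation_errors(boto3.client("cloudwatch"), "nat-0123456789abcdef0")
```

A non-zero result suggests that the NAT Gateway could not allocate a source port for some connections, which is the signal to consider associating additional IP addresses.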
Let’s now associate multiple Elastic IPs with an existing NAT Gateway and review some important considerations:
Step-1 (optional): Allocate an Elastic IPv4 address. You can skip this step and use an existing Elastic IP address, as long as it isn’t associated with another resource. See the Allocate an Elastic IP address page in our documentation for instructions on how to allocate a new Elastic IP address.
Step-2: Navigate to your NAT Gateway and select the “Secondary IP addresses” tab. This tab lists all secondary Elastic IPs, or BYOIPs, that you associate with the NAT Gateway, as shown in Figure 4 below. Select “Edit secondary IPv4 address associations.”
Step-3: Select the additional Elastic IP address that you’d like to associate with the NAT Gateway, and then select “Save changes.”
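If you prefer to script these console steps, the sketch below shows one way to do it with the AWS SDK for Python. It assumes you pass in a boto3 EC2 client. The function name and the placeholder NAT Gateway ID are ours, and the sketch omits error handling and waiting for the association to complete.

```python
def associate_secondary_eip(ec2_client, nat_gateway_id, allocation_id=None):
    """Associate a secondary Elastic IP with an existing public NAT Gateway.

    If no allocation ID is given, a new Elastic IP is allocated first (Step-1).
    """
    if allocation_id is None:
        # Step-1 (optional): allocate a new Elastic IPv4 address for use in a VPC.
        allocation_id = ec2_client.allocate_address(Domain="vpc")["AllocationId"]

    # Step-2 and Step-3: associate the address as a secondary IP.
    response = ec2_client.associate_nat_gateway_address(
        NatGatewayId=nat_gateway_id,
        AllocationIds=[allocation_id],
    )
    return allocation_id, response


# Example usage (requires boto3 and AWS credentials; the ID is a placeholder):
# import boto3
# associate_secondary_eip(boto3.client("ec2"), "nat-0123456789abcdef0")
```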
The new Elastic IP address is now ready, and the NAT Gateway can support an additional 55,000 simultaneous single-destination connections. When you associate a secondary Elastic IP with a NAT Gateway, a private IPv4 address is automatically selected from the NAT Gateway’s subnet. This is shown in Figure 6 below:
The NAT Gateway uses a flow hash mechanism to select one associated IP address for a given flow. The fields included in this hash are the Elastic Network Interface ID of the source of traffic, the source and destination IP addresses, the source and destination ports, and the protocol. Since the source and destination port numbers are included in the hashing algorithm, you may see different source IPs for applications like passive FTP that change the port numbers. For workloads that need outbound connections to these applications from a single source IP address, we recommend using NAT Gateways with a single associated Elastic IP address.
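To illustrate the idea, here is a minimal Python sketch that hashes the ENI ID together with the source and destination IP addresses, ports, and protocol to pick one of the associated IP addresses. The actual hash function used by the NAT Gateway is not public. SHA-256, the function name, the ENI ID, and the sample addresses are stand-ins chosen for the example.

```python
import hashlib


def select_nat_ip(associated_ips, eni_id, src_ip, src_port, dst_ip, dst_port, protocol):
    """Pick one associated IP for a flow using a stand-in hash (not the real one)."""
    key = "|".join(str(f) for f in (eni_id, src_ip, src_port, dst_ip, dst_port, protocol))
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(associated_ips)
    return associated_ips[index]


# Hypothetical primary and secondary Elastic IPs (documentation address ranges).
nat_ips = ["198.51.100.10", "203.0.113.20"]

# The same flow always maps to the same IP address.
first = select_nat_ip(nat_ips, "eni-0abc", "10.0.1.10", 50001, "192.0.2.29", 389, "tcp")
again = select_nat_ip(nat_ips, "eni-0abc", "10.0.1.10", 50001, "192.0.2.29", 389, "tcp")

# A flow that differs only in source port may map to a different IP address.
other = select_nat_ip(nat_ips, "eni-0abc", "10.0.1.10", 50002, "192.0.2.29", 389, "tcp")
```

The sketch shows why a protocol that opens a second connection on different ports can appear to the server as coming from a different source IP address: a change in any hashed field can change the selected address.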
To test NAT Gateway’s flow hashing across multiple Elastic IPs, we used iPerf. Figure 7 below shows a snippet taken from the test public server’s console. You can see the NAT Gateway’s primary Elastic IP address (54.17x.yy.zz), and the secondary Elastic IP address (52.9.aa.bb).
Consider the following when you associate multiple IP addresses with a NAT Gateway:
- Each secondary Elastic IP or BYOIP that you associate with a NAT Gateway causes an additional private IPv4 address to be assigned to the NAT Gateway automatically. Make sure that the NAT Gateway’s subnet has sufficient IP addresses available.
- If you’re using VPC Flow Logs for the NAT Gateway’s Elastic Network Interface (ENI), then you’ll continue to see the NAT Gateway’s primary private IPv4 address. This is because a separate ENI isn’t created when you associate a secondary Elastic IP address with a NAT Gateway. You can use the pkt-srcaddr and pkt-dstaddr fields within VPC Flow Logs to identify the clients’ and server’s original IP addresses in the packet flow. See our documentation on Logging IP traffic using VPC Flow Logs for details.
- You can disassociate any secondary Elastic IPs associated with a NAT Gateway. When you disassociate a secondary IP from the NAT Gateway, you can provide the Connection Drain duration. This is the time the NAT Gateway waits before releasing the secondary IP. The default Connection Drain value is 350 seconds. See the NAT Gateway documentation for details.
- The NAT Gateway uses a flow hash to select one of the associated IP addresses for a given flow. The source and destination port numbers of the flow are included in the hash. For applications, like passive FTP, that change port numbers during the lifetime of the flow, we recommend using NAT Gateways with a single Elastic IP address association.
- By default, you can associate two public IP addresses with a NAT Gateway. This is a soft limit, and you can get this quota increased by reaching out to AWS Support.
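The disassociation described in the considerations above can also be scripted. The sketch below assumes you pass in a boto3 EC2 client and already know the association ID of the secondary address. The function name is ours, and the default drain value mirrors the 350 seconds mentioned above.

```python
def disassociate_secondary_eip(ec2_client, nat_gateway_id, association_id,
                               drain_seconds=350):
    """Disassociate a secondary IP, letting connections drain for drain_seconds."""
    return ec2_client.disassociate_nat_gateway_address(
        NatGatewayId=nat_gateway_id,
        AssociationIds=[association_id],
        MaxDrainDurationSeconds=drain_seconds,
    )


# Example usage (requires boto3 and AWS credentials; the IDs are placeholders):
# import boto3
# disassociate_secondary_eip(
#     boto3.client("ec2"), "nat-0123456789abcdef0", "eipassoc-0123456789abcdef0"
# )
```

A shorter drain duration releases the address sooner, but connections still in progress on that address may be cut off when the timer expires.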
In this post, you learned how to associate additional Elastic IP addresses with a NAT Gateway and how to disassociate them. This feature is available today, so try it out and let us know if you have questions about the feature or this post by contacting AWS Support.