Networking & Content Delivery

Best practices and considerations to migrate from VPC Peering to AWS Transit Gateway

This post presents recommendations and best practices for migrating your existing VPCs from Amazon Virtual Private Cloud (Amazon VPC) peering to AWS Transit Gateway. It includes a migration walkthrough and considerations you can address to improve your odds of a seamless migration. This post also details common network testing and benchmarking tools, such as iPerf, to highlight possible network effects and the metrics you should monitor when performing a migration.

Background

VPC Peering was launched in 2014 and lets you route traffic between disparate networks on AWS, using private IPv4 or IPv6 addresses, and scale traffic in a multi-VPC environment. Peered VPCs can be in the same account, in different accounts, and even in different AWS Regions (known as an inter-Region VPC peering connection). As you continue to add VPCs and expand your AWS footprint, managing point-to-point connectivity across many VPCs, without centrally managed connectivity and routing policies, can be operationally challenging. This operational challenge becomes more complex as the number of VPCs scales to hundreds, and as your peering mesh grows you can also run into the quota on active peering connections per VPC (a maximum of 125).

We launched Transit Gateway in 2018 to help customers overcome this challenge. It can connect on-premises networks and thousands of VPCs through a single gateway. This hub-and-spoke model significantly simplifies management and reduces operational complexity.

Architecture overview

Figure 1 shows an example architecture comprising three VPC workloads in two AWS Regions, us-west-2 and us-east-1, representing your current AWS environment. Intra-Region VPC peering connects VPCs A and B in us-west-2. VPC A must also communicate with VPC C in us-east-1, which is set up through inter-Region VPC peering.

To help you create a scalable network architecture for long-term growth, we outline how you can migrate from VPC peering to Transit Gateway. Transit Gateway is a regional construct and is therefore set up in both AWS Regions to facilitate the migration. For simplicity, we represent the workloads in this architecture as EC2 instances; in reality, they can comprise containers and other AWS services.

Figure 1: Multi-VPC AWS environment


Figure 2 shows the existing VPC peering connection in the Region that facilitates VPC-to-VPC communication between the instances in VPC A and VPC B. Each VPC has a route table that directs traffic to the peered VPC when the destination IP address matches the route table prefix.

Figure 2: Routing table setup for VPC Peering


When you migrate traffic from VPC peering to Transit Gateway, the following three common scenarios can come into play:

  1. Migration of live traffic from Subnet 1-VPC A to Subnet 3-VPC B in the same Region, seamlessly.
  2. Migration of live traffic from Subnet 2-VPC A to Subnet 4-VPC B in the same Region, which can cause packet drops and requires re-establishing connections.
  3. Migration of traffic from Subnet 1-VPC A to Subnet 5-VPC C in two different Regions.

Approach to migration

Before proceeding to the steps necessary to migrate the VPCs from a peered connection to Transit Gateway, first install and configure iPerf3 on your EC2 instances. iPerf3 is an open-source tool for network performance measurement and tuning, and we use it here to simulate an ongoing connection between instances during the migration process. Although iPerf3 is used in this experiment for the simple traffic flow tests, you can also use iPerf2 to run the same tests; iPerf2 provides multi-threading support and is better suited for high-throughput performance testing, as mentioned in this AWS Knowledge Center article. You can use several flags to customize the iPerf3 tests in these scenarios; more details can be found in the iPerf documentation. To gain visibility into which path is taken by forward and return traffic, we use VPC Reachability Analyzer for intra-Region VPC-to-VPC traffic, and AWS Network Manager Route Analyzer to validate the setup for inter-Region Transit Gateway-to-Transit Gateway traffic.

There are a few Maximum Transmission Unit (MTU) considerations that you should familiarize yourself with before migrating to Transit Gateway. These come up several times as we walk through the migration scenarios. Refer to the Transit Gateway MTU section of the Quotas documentation for details:

  • When migrating from VPC peering to Transit Gateway, the difference in enforced maximum MTU size between VPC peering (9001 bytes) and Transit Gateway (8500 bytes) may result in some asymmetric traffic being dropped. Therefore, make sure that the applications in both VPCs communicate with packets no larger than 8500 bytes before updating your VPCs, to avoid downtime.
  • Transit Gateway doesn’t generate the ICMPv4 FRAG_NEEDED (Fragmentation Needed) message or the ICMPv6 Packet Too Big (PTB) message. Therefore, Path MTU Discovery (PMTUD) isn’t supported.
  • Transit Gateway enforces Maximum Segment Size (MSS) clamping for all packets. For more information, see RFC 879.
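These MTU limits explain the MSS values passed to iPerf3's -M flag in the scenarios that follow. A minimal sketch of the arithmetic, assuming an IPv4 header without options plus the 12-byte TCP timestamps option that Linux typically negotiates:

```python
# Hedged sketch: how an MTU maps to the TCP MSS used with iperf3 -M.
IP_HEADER = 20       # IPv4 header, no options
TCP_HEADER = 20      # base TCP header
TCP_TIMESTAMPS = 12  # TCP timestamps option, commonly negotiated on Linux

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload segment that fits a packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER - TCP_TIMESTAMPS

print(mss_for_mtu(8500))  # 8448: keeps packets within the Transit Gateway MTU
print(mss_for_mtu(9001))  # 8949: forces full jumbo frames over VPC peering
```

This is why -M 8448 is safe across Transit Gateway, while -M 8949 produces the 9001-byte packets that trigger drops in Scenario 2.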

Migration scenarios

Scenario 1: Migration of live traffic from Subnet 1-VPC A to Subnet 3-VPC B in the same Region (us-west-2)

In the first scenario, we explore migrating traffic to a Transit Gateway using route table updates. You can use more specific prefixes to direct traffic from Subnet 1-VPC A to Subnet 3-VPC B over the Transit Gateway while keeping other traffic on the VPC peering connection. This lets you shift only a subset of your inter-VPC traffic for validation before shifting the rest.

Migration Steps:

    1. To validate that your initial architecture works as expected, run a simple test over the VPC peering connection, as shown in Figure 1. Begin the iPerf3 test with instance-1 in Subnet 1-VPC A (10.1.11.0/24) acting as the server (receiver) for the connection, and instance-3 in Subnet 3-VPC B (10.2.11.0/24) acting as the client (sender). On the server side, run the following command (on Linux):
      > sudo iperf3 -s
    2. On the client side, run the command:
      > sudo iperf3 -c <server ip> -V
    3. As you can see circled in red in Figure 3, iPerf3 automatically discovers that the peering connection supports an MTU of 9001 bytes. Run continuous iPerf3 traffic for the duration of this scenario to simulate live traffic during the migration.

      Figure 3: iPerf3 test over VPC Peering connection


    4. Set the maximum segment size below the 8500-byte MTU supported by Transit Gateway by using the -M 8448 flag. Run the iperf3 command on client instance-3 as follows:
      > sudo iperf3 -c <server ip> -V -M 8448 -t 180
    5. Begin the initial migration of these subnets to the Transit Gateway by adding more specific route prefixes to the route tables. These route table updates are shown in the following figure. Because the /24 is a more specific prefix than the /16 entry for the peering connection, traffic is directed over the Transit Gateway.

      Figure 4: Routing Table updates for moving inter-subnet traffic to Transit Gateway


    6. You can now observe that the iPerf3 execution concludes, and confirm that traffic wasn’t interrupted during the route table update.

      Figure 5: iPerf traffic uninterrupted by the routing change


    7. You can also validate that, due to the specific routes placed in the VPC route tables, traffic between instance-1 in Subnet 1-VPC A (10.1.11.0/24) and instance-3 in Subnet 3-VPC B (10.2.11.0/24) is directed over the Transit Gateway, while traffic between instance-2 in Subnet 2-VPC A (10.1.12.0/24) and instance-4 in Subnet 4-VPC B (10.2.12.0/24) still moves across the peering connection. You can do this with Amazon VPC Reachability Analyzer, a configuration analysis tool that lets you perform connectivity testing between a source and a destination in your VPCs. When the destination is reachable, Reachability Analyzer produces hop-by-hop details of the virtual network path between the source and the destination. When the destination isn’t reachable, it identifies the blocking component; for example, paths can be blocked by configuration issues in a security group, network ACL, route table, or load balancer. You can create a path analysis between source and destination as described in the getting started documentation. The two analyses for the two pairs of instances are shown in Figure 6.

      Figure 6: VPC reachability analysis for Transit Gateway path and VPC Peering path


    8. This scenario concludes with a split network: traffic between Subnet 2-VPC A and Subnet 4-VPC B is still directed over the peering connection, while traffic between Subnet 1-VPC A and Subnet 3-VPC B has now been moved over to the Transit Gateway successfully.

      Figure 7: Split traffic between Transit Gateway and VPC Peering

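The "most specific prefix wins" evaluation that produces this traffic split can be sketched with Python's standard ipaddress module. This is an illustrative model only, assuming the /16 VPC CIDRs from the walkthrough; the route target names are placeholders, not real resource IDs:

```python
import ipaddress

# Hedged sketch of VPC A's route table after step 5: the coarse /16 still
# points at the peering connection, while the new /24 points at the
# Transit Gateway attachment.
routes = {
    "10.2.0.0/16": "vpc-peering",        # original peering route to VPC B
    "10.2.11.0/24": "transit-gateway",   # more specific migration route
}

def next_hop(dst: str) -> str:
    """Return the target of the longest (most specific) matching prefix."""
    matches = [c for c in routes if ipaddress.ip_address(dst) in ipaddress.ip_network(c)]
    return routes[max(matches, key=lambda c: ipaddress.ip_network(c).prefixlen)]

print(next_hop("10.2.11.10"))  # Subnet 3 destination -> transit-gateway
print(next_hop("10.2.12.10"))  # Subnet 4 destination -> vpc-peering
```

Because only the 10.2.11.0/24 destinations match the more specific route, Subnet 3 traffic shifts to the Transit Gateway while Subnet 4 traffic stays on the peering connection.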

Scenario 2: Migration of live traffic from Subnet 2-VPC A to Subnet 4-VPC B in the same Region showcasing asymmetric routing effects

In this scenario, we look at migrating workload traffic that uses packet MTUs of 9001 bytes, and how that can cause issues for existing traffic flows when they’re moved from VPC peering to Transit Gateway. This can lead to abnormal application behavior and potential outages.

This scenario comprises the following steps:

    1. Start an iPerf3 test using the -M flag to force the TCP MSS to 8949 (equivalent to an MTU of 9001), and attempt the same migration using the remaining subnets: Subnet 2-VPC A (10.1.12.0/24) and Subnet 4-VPC B (10.2.12.0/24). For this test, EC2 instance-4 in Subnet 4-VPC B is the client and EC2 instance-2 in Subnet 2-VPC A is the server. Open the route table of VPC B and add a route entry for traffic destined for Subnet 2-VPC A (10.1.12.0/24) to be sent via the Transit Gateway attachment, as shown in Figure 8.

      Figure 8: Updated VPC-B Route Table with Entry for Subnet 2-VPC A over Transit Gateway


    2. Because the VPC route table prefers the most specific prefix match during route evaluation, the outgoing traffic routes over the Transit Gateway connection. However, until both route tables are updated, routing is asymmetric: the return traffic from instance-2 to instance-4 is directed over the VPC peering connection. Since the outgoing traffic uses an MSS of 8949 (i.e., an MTU of 9001), the Transit Gateway drops those packets after the VPC B route table change is implemented, as shown in Figures 9 and 10.
      Figure 9: iPerf3 flows interrupted by the Transit Gateway attachment dropping packets


      Figure 10: iPerf3 flows interrupted by the Transit Gateway attachment dropping packets


    3. In Steps 1 and 2 of this scenario, the client instance’s VPC route table was updated to prefer the Transit Gateway when routing to the 10.1.12.0/24 subnet, so the first network hop is the Transit Gateway attachment. Now let’s show what happens when the opposite occurs: client instance-4 sends traffic over the peering connection to instance-2, and server instance-2 responds over the Transit Gateway. Before proceeding to the next step, revert the route table for VPC B by removing the 10.1.12.0/24 entry directed at the Transit Gateway.
    4. As before, begin an iPerf3 test using the -M flag to force the TCP MSS to 8949 (an MTU of 9001). Open the route table for Subnet 2-VPC A, and add a route entry for traffic destined for Subnet 4-VPC B (10.2.12.0/24) to be sent via the Transit Gateway attachment, as shown in the following figure.

      Figure 11: Updated Route Tables in VPC-A and VPC-B


    5. However, in this case, we don’t see the same traffic interruption as before. The onward traffic from client instance-4 to server instance-2 traverses the peering connection, so none of the Transit Gateway limitations are enforced. The return traffic, from instance-2 to instance-4, flows over the Transit Gateway but doesn’t get dropped, because iPerf3 sends the 9001-byte MTU packets over the peering connection while sending smaller packets, or perhaps none, in the reverse direction. You can see this behavior in Figures 12 and 13.
       Figure 12: Uninterrupted Traffic Flows over VPC Peering Connection


      Figure 13: Uninterrupted Traffic Flows over VPC Peering Connection


    6. If we run the iPerf3 test in reverse mode using the -R flag (i.e., the server sends and the client receives), then we can again confirm the Transit Gateway MTU enforcement behavior: the 9001-byte MTU packets sent from server instance-2 in Subnet 2-VPC A (10.1.12.0/24) now hit the Transit Gateway attachment and are dropped.

      Figure 14: Reverse iPerf flow is dropped by Transit Gateway attachment in Subnet 2-VPC A

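The direction-dependent drops observed in this scenario can be modeled in a few lines. This is a hedged sketch, not real packet handling; the path MTU values come from the walkthrough (9001 bytes for intra-Region peering, 8500 bytes for Transit Gateway):

```python
# Hedged sketch of the asymmetric-routing outcomes: Transit Gateway
# silently drops packets above 8500 bytes and, because PMTUD isn't
# supported, the sender receives no ICMP notification.
PATH_MTU = {"vpc-peering": 9001, "transit-gateway": 8500}

def delivered(path: str, packet_bytes: int) -> bool:
    """True if a packet of this size fits the MTU of the chosen path."""
    return packet_bytes <= PATH_MTU[path]

# 9001-byte data packets on the peering leg arrive ...
print(delivered("vpc-peering", 9001))      # True
# ... but the same packets on a Transit Gateway leg are dropped.
print(delivered("transit-gateway", 9001))  # False
# Packets clamped to 8500 bytes survive either path.
print(delivered("transit-gateway", 8500))  # True
```

Whether a flow breaks therefore depends entirely on which leg carries the large packets, which is why the unidirectional and reverse-mode tests behave differently.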

As you saw in this scenario, asymmetric routing can introduce strange behavior depending on whether the traffic is unidirectional or bidirectional. To resolve this issue, update your applications to use packets with an MTU no larger than 8500 bytes before you complete the migration, or re-establish the existing connections. Also make sure that route tables are updated simultaneously to avoid asymmetric routing. Both should be handled as a service-impacting event and must be carefully planned.

Finally, to clean up and finish your migration, change the /16 route to target the Transit Gateway attachment and remove the /24 routes, since you now want all traffic to move over the Transit Gateway and no longer need the subnet-specific routes. You can confirm the traffic path by using VPC Reachability Analyzer on the pair of EC2 instances (instance-2 and instance-4). You can then delete the peering connection, as it’s no longer needed.

Figure 15: Updated Route Tables with /16 Entries for the Transit Gateway

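The cleanup steps above can also be scripted. A minimal boto3 sketch, assuming appropriate credentials; the route table, Transit Gateway, and peering connection IDs below are hypothetical placeholders, not values from this walkthrough:

```python
import boto3

# Hedged sketch of the cleanup: point the coarse /16 at the Transit
# Gateway, drop the temporary /24 override, then remove the peering.
ec2 = boto3.client("ec2", region_name="us-west-2")

# Replace the /16 route in VPC A's route table to target the Transit Gateway.
ec2.replace_route(
    RouteTableId="rtb-0123456789abcdef0",          # hypothetical ID
    DestinationCidrBlock="10.2.0.0/16",
    TransitGatewayId="tgw-0123456789abcdef0",      # hypothetical ID
)

# The subnet-specific /24 route is no longer needed.
ec2.delete_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationCidrBlock="10.2.11.0/24",
)

# With all traffic on the Transit Gateway, the peering connection can go.
ec2.delete_vpc_peering_connection(
    VpcPeeringConnectionId="pcx-0123456789abcdef0" # hypothetical ID
)
```

Repeat the same route changes for VPC B's route table before deleting the peering connection.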

Scenario 3: Migration of traffic from Subnet 1-VPC A to Subnet 5-VPC C in two different Regions

In this scenario, you seamlessly migrate traffic from inter-Region VPC peering to a Transit Gateway peering connection. The migration involves similar steps to those above, as well as similar considerations regarding the MTU size. The key difference is that traffic over inter-Region VPC peering is limited to an MTU of 1500 bytes, as compared to the 9001 bytes supported by intra-Region VPC peering. Therefore, when you migrate traffic from inter-Region VPC peering to a Transit Gateway peering connection, packet drops caused by exceeding the MTU limit on Transit Gateway are unlikely.
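The reason this scenario is low risk can be stated as a one-line check, reusing the MTU values from the walkthrough (a hedged illustration, not real packet handling):

```python
# Hedged sketch: inter-Region peering already caps packets at 1500 bytes,
# comfortably under the 8500-byte Transit Gateway MTU.
INTER_REGION_PEERING_MTU = 1500
TGW_MTU = 8500

def fits_transit_gateway(packet_bytes: int) -> bool:
    """Any packet that fit the old inter-Region path also fits the TGW."""
    return packet_bytes <= TGW_MTU

print(fits_transit_gateway(INTER_REGION_PEERING_MTU))  # True
```

In other words, existing flows were already constrained to sizes the Transit Gateway accepts, so no MSS adjustment is needed during this migration.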

Considerations

  • Although the iPerf connection in this example was maintained during the migration, there’s no shared session state between VPC peering and Transit Gateway connections. Therefore, any long-standing connections (such as database connections) must be re-established. AWS recommends planning this as a service-impacting event, and you should schedule a maintenance window during which these connections can be dropped and re-established.
  • In most cases, your applications use the MTU setting of the EC2 instances on which they are hosted, and the MTU negotiated between your EC2 instances is used for sending packets. To check and set the MTU for the instances running your applications, refer to the EC2 Network MTU documentation.
  • Consider Transit Gateway design best practices. Because intra-Region traffic crosses an additional hop when traversing the AWS Hyperplane network via Transit Gateway, this can add latency compared to VPC peering. Understand your network and application latency requirements, and make sure that you test thoroughly before introducing network topology changes.
  • Network Address Usage (NAU) is a metric applied to resources within a VPC to help you plan for and monitor the size of your VPC. Each NAU unit contributes to a total that represents your VPC. These NAU units are calculated differently depending on whether the VPCs are peered, in the same or separate AWS Regions, or connected via a Transit Gateway. Therefore, as you grow and scale your VPCs, you must understand the total number of units that make up your VPC’s NAU to size it properly. Refer to the NAU documentation for details and examples, and to the Designing hyperscale Amazon VPC networks post.

Summary

In this post, we presented various scenarios for migrating from a VPC peering-based to a Transit Gateway-based network architecture. As your AWS environments and VPCs continue to scale with your business growth, you must plan your network architecture to support flexibility, ease of use, and scalability. This reduces the manual effort involved in connecting disparate networks, managing route tables, and securing networks. Through careful planning and an understanding of the technical differences between VPC peering and Transit Gateway, you can migrate to scalable AWS deployments using Transit Gateway.

Anandprasanna Gaitonde

Anandprasanna Gaitonde is an AWS Solutions Architect, responsible for helping customers design and operate Well-Architected solutions to help them adopt the AWS cloud successfully. He focuses on AWS Networking & Serverless technologies to design and develop solutions in the cloud across industry verticals. He has Solutions Architect Professional and Advanced Networking certifications and holds a Master of Engineering in Computer Science and a postgraduate degree in Software Enterprise Management.

Jacob Walker

Jacob Walker is a Solutions Architect who is passionate about exploring the art of the possible on AWS, aiding customer innovation while deploying highly scalable, secure, available, and cost-effective workloads. Jacob holds a BS in Electrical Engineering and an MS in Information Technology Management. Outside of AWS, Jacob enjoys weightlifting, reading, and spending time with family.

Varun Mehta

Varun Mehta is a Solutions Architect at AWS. He is passionate about helping customers build Enterprise-Scale Well-Architected solutions on the AWS Cloud and specializes in the Networking domain. He has 14 years of experience in designing and building various complex networking solutions for Enterprise and DataCenter customers.