AWS Cloud WAN and AWS Transit Gateway migration and interoperability patterns
At AWS re:Invent 2021, we launched a public preview of AWS Cloud WAN, a managed service for creating a global network using AWS global network infrastructure. Cloud WAN makes it easy to build and operate global wide area networks (WANs) that connect your data centers, branch offices, and Amazon Virtual Private Clouds (VPCs). Cloud WAN provides you with a single console and set of APIs to manage and monitor your network across AWS Regions. Cloud WAN also works with existing networking constructs like AWS Transit Gateway (TGW).
Now that Cloud WAN is generally available, you can start using it in production. If you plan a greenfield deployment, where you are building an entirely new network from the ground up, consider following the process outlined in the Getting started with AWS Cloud WAN section of our documentation. If you have an existing global network that uses AWS Transit Gateway, this post is for you. Here we discuss interoperability design patterns and how to migrate from Transit Gateway to Cloud WAN. We include custom design patterns (such as integrating with AWS Direct Connect), Software Defined Wide Area Networks (SD-WAN) using Transit Gateway Connect, and centralizing firewalls.
We will explore three scenarios in this blog post using a simple architecture with three AWS Regions to illustrate the process:
- Before Migration – The starting point is a Transit Gateway-based network mesh.
- Federating Transit Gateways with Cloud WAN – In this model you replace statically created Transit Gateway peering connections with Cloud WAN. This results in simple connectivity and global dynamic routing. If you like, you can stop your migration here and use Cloud WAN as a dynamic routing hub for your Transit Gateways.
- Cloud WAN only – This is the final stage of the migration, where Cloud WAN is used for all connectivity and Transit Gateways are removed.
Finally, we cover additional Cloud WAN architectures, focusing on hybrid connectivity (e.g., AWS Direct Connect) as well as integrating with centrally deployed ingress or egress firewalls (we’re showing AWS Network Firewall as an example, but the same architecture pattern applies to deployments using third-party firewall vendors).
(Of course, you don’t have to migrate from Transit Gateway to Cloud WAN unless you want to. If you are happy with your current architecture, great!)
Cloud WAN general availability partners:
At general availability, Cloud WAN integrates with leading SD-WAN, network appliance, and ISV partners, and is supported by systems integrators. These partners have provided us with tons of helpful feedback and are ready to help customers get started with Cloud WAN. Here are some blog posts and other resources that share their experiences:
- Aruba (HPE) – Aruba EdgeConnect Enterprise and AWS Cloud WAN
- Aviatrix – Unlock Advanced Networking & Security Capabilities by Integrating Aviatrix with AWS Cloud WAN
- Check Point – Check Point Software Technologies announces the integration of CloudGuard Network Security with AWS Cloud WAN
- Cisco Meraki – Streamlining Connectivity for a Multi-Region Hybrid World
- Cisco Systems – Cisco SD-WAN with AWS Cloud WAN for an on-demand global cloud network
- Prosimo – AWS Cloud WAN: A cloud-native attach paradigm to simplify global connectivity and segmentation
- VMware – Simplify Automated Reference Deployments in the Cloud with VMware SD-WAN and AWS Quick Start
- Accenture – AWS Partner Accenture
- Deloitte – What is AWS’s new Cloud WAN, and how will it impact Cloud networking?
- DXC Technologies – AWS Cloud WAN software-defined networking
- Kyndryl – Creating an optimized global enterprise network with AWS Cloud WAN
- Slalom – Network Segmentation using AWS Cloud WAN
Before migration – Transit Gateway Mesh
We start off with three AWS Regions with a Transit Gateway in each. We have two environments, production (prod) and development (dev). Traffic on each should never mix. Because Transit Gateway is a Regional construct, its route tables are local to the Region where it was deployed. Even though we have the same environment in each AWS Region, we must create unique Transit Gateway route tables for each one and populate them with static routes for cross-Region communication.
Note that you can only associate an inter-Region peering attachment with a single Transit Gateway route table. Therefore, that route table must know about both prod and dev routes.
In figure 1, we show only the route tables that apply to communication from VPC 1 and VPC 3. Both are part of the prod environment.
Traffic flowing from VPC 1 to VPC 3 follows these steps:
- VPC 1 sends all traffic to TGW A based on its default route. VPC 1 attachment to TGW A is associated with route table Production A. That’s the route table that TGW A uses to decide how to forward traffic.
- TGW A does a lookup in its route table – Production A – and uses a route to VPC 3 using TGW inter-Region peering to TGW B. The peering attachment on TGW B is associated with route table Peering B, which has routes to both VPC 3 and VPC 4. This table must have all routes because it might receive traffic for both environments.
- TGW B does a route lookup in the Peering B route table and forwards the traffic to the appropriate attachment.
Return traffic (not shown) follows the same path in the opposite direction.
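The two lookups above can be sketched as a small Python model. This is only an illustration: the CIDR blocks and attachment names are hypothetical, because the post does not give addresses for its VPCs, and a real Transit Gateway does much more than a dictionary lookup.

```python
import ipaddress

# Hypothetical CIDRs; the post does not state the VPC address ranges.
VPC3 = "10.3.0.0/16"
VPC4 = "10.4.0.0/16"

# Each route table maps a destination prefix to a next-hop attachment.
ROUTE_TABLES = {
    # Static route on TGW A pointing at the inter-Region peering.
    "Production A": {VPC3: "peering-to-TGW-B"},
    # The peering route table on TGW B must hold prod and dev routes.
    "Peering B": {VPC3: "attachment-VPC-3", VPC4: "attachment-VPC-4"},
}


def lookup(table_name, dst_ip):
    """Return the next hop for dst_ip using longest-prefix match."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in ROUTE_TABLES[table_name].items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def trace_vpc1_to(dst_ip):
    """Follow a packet from VPC 1 through TGW A and then TGW B."""
    first_hop = lookup("Production A", dst_ip)
    second_hop = None
    if first_hop == "peering-to-TGW-B":
        second_hop = lookup("Peering B", dst_ip)
    return [first_hop, second_hop]
```

Calling `trace_vpc1_to("10.3.0.10")` walks the same path as the steps above: the Production A table selects the peering, and the Peering B table selects the VPC 3 attachment.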
Federating Transit Gateway and Cloud WAN
To begin your migration, you must first create your Cloud WAN global network and set up appropriate network segments to decide how your traffic is separated into logical groups. You can follow the same segmentation strategy that you have on your Transit Gateways. Like the preceding scenario, we only have prod and dev environments in this case. We’ll map those environments to two Cloud WAN core network segments.
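As a rough sketch of what the two segments could look like in a core network policy, the following Python builds a minimal policy document as a dictionary. The Region names, the ASN range, and the `segment` tag key are assumptions chosen for illustration, so validate the result against the core network policy reference before using it.

```python
import json


def build_policy(regions, segments, asn_range="64512-64555"):
    """Build a minimal core network policy document as a dict.

    The ASN range and the 'segment' tag key are illustrative assumptions.
    """
    return {
        "version": "2021.12",
        "core-network-configuration": {
            "asn-ranges": [asn_range],
            # One core network edge is created per listed Region.
            "edge-locations": [{"location": region} for region in regions],
        },
        "segments": [{"name": name} for name in segments],
        "attachment-policies": [
            {
                # Map an attachment to the segment named in its 'segment' tag.
                "rule-number": 100,
                "conditions": [{"type": "tag-exists", "key": "segment"}],
                "action": {
                    "association-method": "tag",
                    "tag-value-of-key": "segment",
                },
            }
        ],
    }


# Hypothetical Regions standing in for Regions A, B, and C in the post.
policy = build_policy(["us-east-1", "us-west-2", "eu-west-1"], ["prod", "dev"])
policy_json = json.dumps(policy, indent=2)
```

The `policy_json` string is the kind of document you would submit as a new policy version. Keeping it in code makes it easier to review segment changes during the migration.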
Once your Cloud WAN core network is ready, create peerings with your existing Transit Gateways and configure attachments between the Transit Gateway route tables and Cloud WAN segments. Cloud WAN uses dynamic routing with Border Gateway Protocol (BGP) over peering connections. Therefore, the BGP autonomous system numbers (ASNs) of your Transit Gateways and Cloud WAN must be unique. Also, peering between a Transit Gateway and Cloud WAN is supported only within the same Region, not across Regions.
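Before creating the peerings, it can help to check the ASN plan for duplicates. The helper below is a minimal sketch. The device names and ASN values in the usage example are hypothetical.

```python
def find_asn_conflicts(asn_by_device):
    """Return (first_device, second_device, asn) for every reused ASN."""
    seen = {}
    conflicts = []
    for device, asn in asn_by_device.items():
        if asn in seen:
            conflicts.append((seen[asn], device, asn))
        else:
            seen[asn] = device
    return conflicts


# Hypothetical ASN plan: TGW B accidentally reuses the Region A edge ASN.
plan = {
    "TGW A": 64600,
    "TGW B": 64512,
    "Cloud WAN edge Region A": 64512,
    "Cloud WAN edge Region B": 64513,
}
conflicts = find_asn_conflicts(plan)
```

Here `conflicts` reports that `TGW B` and `Cloud WAN edge Region A` share an ASN, which you would need to fix before peering them.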
Routing on Cloud WAN is controlled through core network policies. Once you configure appropriate policies, traffic can flow accordingly. Any routes that appear in the Transit Gateway route table are dynamically advertised, through a Cloud WAN segment, to other Transit Gateways that map to the same segment. Note that the entire Transit Gateway does not map to a Cloud WAN segment: only a single Transit Gateway Route Table (TGW-RT) does. Therefore, only routes from that particular TGW-RT are dynamically propagated to other TGW-RTs that are attached to the same Cloud WAN segment.
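The propagation rule in the previous paragraph can be modeled in a few lines of Python. This is a simplified sketch, not how Cloud WAN is implemented. The CIDR blocks and the Development route table names are hypothetical.

```python
def propagate(tgw_route_tables, segment_of):
    """Return the routes each TGW-RT learns from others on the same segment."""
    learned = {name: set() for name in tgw_route_tables}
    for source, routes in tgw_route_tables.items():
        for target in tgw_route_tables:
            same_segment = segment_of[target] == segment_of[source]
            if target != source and same_segment:
                learned[target] |= set(routes)
    return learned


# Hypothetical local routes held by each Transit Gateway route table.
tables = {
    "Production A": {"10.1.0.0/16"},
    "Development A": {"10.2.0.0/16"},
    "Production B": {"10.3.0.0/16"},
    "Development B": {"10.4.0.0/16"},
}
# Each TGW-RT, not each Transit Gateway, maps to one segment.
segments = {
    "Production A": "prod",
    "Production B": "prod",
    "Development A": "dev",
    "Development B": "dev",
}
learned = propagate(tables, segments)
```

In this model `Production A` learns only the route held by `Production B`, and never sees dev prefixes, which is the isolation that the segment mapping is meant to preserve.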
The diagram that follows (figure 2) shows Cloud WAN and Transit Gateway peering running in parallel and interconnecting all AWS Regions. At this stage, traffic still prefers Transit Gateway peering over the newly added Cloud WAN routes. That’s because Transit Gateway peering uses static routes, which are preferred over dynamically learned Cloud WAN routes. Refer to our How transit gateways work documentation for the Transit Gateway route evaluation order.
Once you connect all Transit Gateways to Cloud WAN, and validate that segments are populated with the expected routes, you can start removing the peering static routes on the Transit Gateway. When the static routes are gone, the dynamic routes that are shared using the Cloud WAN segment take over.
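The cutover described above relies on static routes winning over dynamically learned routes for the same prefix. The sketch below models only that one tie-break. The prefix and next-hop names are hypothetical, and the full Transit Gateway evaluation order has more steps than shown here.

```python
def best_route(routes, prefix):
    """Pick the preferred route for a prefix: static before propagated."""
    candidates = [r for r in routes if r["prefix"] == prefix]
    if not candidates:
        return None
    # Lower rank wins; static routes rank ahead of propagated ones.
    return min(candidates, key=lambda r: 0 if r["type"] == "static" else 1)


# Hypothetical contents of the Production A route table during federation.
production_a = [
    {"prefix": "10.3.0.0/16", "type": "static", "next_hop": "tgw-peering"},
    {"prefix": "10.3.0.0/16", "type": "propagated", "next_hop": "cloud-wan"},
]

before = best_route(production_a, "10.3.0.0/16")["next_hop"]

# Removing the static peering route lets the Cloud WAN route take over.
without_static = [r for r in production_a if r["type"] != "static"]
after = best_route(without_static, "10.3.0.0/16")["next_hop"]
```

In this model `before` is the Transit Gateway peering and `after` is Cloud WAN, mirroring the switch that happens when you delete the static routes.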
The diagram that follows (figure 3) shows the ultimate state of the network. The Transit Gateway cross-Region peering attachments, and the route tables used for handling traffic arriving over them, have been removed. All traffic across AWS Regions flows over Cloud WAN.
Traffic from VPC 1 (prod in Region A) to VPC 3 (prod in Region B) follows these steps:
- VPC 1 continues using its default route to TGW A in the same Region. The VPC 1 attachment on TGW A is associated with the Production A route table, which is used to look up how to send traffic destined for VPC 3.
- TGW A forwards traffic destined to VPC 3 over its attachment to Cloud WAN core network prod segment.
- Cloud WAN prod segment knows about all the routes for prod VPCs and forwards traffic to VPC 3 using its attachment to TGW B.
- TGW B does a lookup for VPC 3 in the Production B route table and forwards traffic to the VPC 3 attachment.
You might choose to finish your migration here and use Cloud WAN to simplify connectivity between your Transit Gateways. This is attractive when you have use cases that rely on Transit Gateways such as integrating with AWS Direct Connect. Or, you might choose to take the next step.
Cloud WAN only
To migrate completely off Transit Gateway, you must connect your VPCs directly to Cloud WAN. The first stage is to create VPC attachments to Cloud WAN and map them to the correct segments.
The diagram that follows (figure 4) shows a zoomed-in view of a single AWS Region where VPCs are connected to both local Transit Gateways and Cloud WAN. Cloud WAN is represented in each Region by its core network edge (not shown in previous diagrams for simplicity). Note that, similar to Transit Gateways, each core network edge must have a unique ASN.
The static routes inside the VPC still send all traffic through TGW A. Traffic inside the Region uses only the local Transit Gateway, bypassing Cloud WAN.
However, traffic crossing the Region boundary flows asymmetrically, leaving the VPC using Transit Gateway and coming back to the VPC over the Cloud WAN attachment. This asymmetric flow works as long as there are no stateful network services in the path (stateful firewalls, NAT). Stateless firewalls (e.g., Network ACLs) are not affected.
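One way to reason about this stage is to write down the forward and return hops for a flow and check whether any stateful device sees only one direction. The following sketch does that. The hop lists and the `firewall-A` hop are hypothetical examples, not part of the architecture in the figures.

```python
def is_asymmetric(forward_path, return_path):
    """True when the return path is not the forward path reversed."""
    return list(reversed(return_path)) != list(forward_path)


def stateful_hops_at_risk(forward_path, return_path, stateful_hops):
    """Return stateful hops that see only one direction of the flow."""
    fwd, ret = set(forward_path), set(return_path)
    return sorted(h for h in stateful_hops if (h in fwd) != (h in ret))


# Illustrative hops: VPC 1 leaves through TGW A, replies arrive via Cloud WAN.
forward = ["VPC 1", "TGW A", "Cloud WAN", "VPC 3"]
reply = ["VPC 3", "TGW B", "Cloud WAN", "VPC 1"]
asymmetric = is_asymmetric(forward, reply)

# Hypothetical stateful firewall placed only on the forward path.
forward_with_fw = ["VPC 1", "TGW A", "firewall-A", "Cloud WAN", "VPC 3"]
at_risk = stateful_hops_at_risk(forward_with_fw, reply, {"firewall-A"})
```

An empty result from `stateful_hops_at_risk` suggests the asymmetric flow is safe in this simplified model. Any hop it returns is a device that would likely drop the traffic.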
Once you have attached all of your VPCs to Cloud WAN, it’s time to update the static routes in your VPC route tables. You set them up to route all traffic over the Cloud WAN core network instead of over Transit Gateway. Changing a route on a VPC results in asymmetric traffic flow between that VPC and others that are still using Transit Gateway, even in the same AWS Region. Just as mentioned earlier, as long as there are no stateful components in the path, this should not affect the traffic flow.
Note that changing the VPC route table using the AWS Management Console results in downtime, because it requires a delete and re-add operation. However, if you use the Amazon EC2 ReplaceRoute API instead, you can make the change in one step. Refer to the ReplaceRoute API documentation for details.
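A minimal sketch of the one-step change follows. It assumes an EC2 client object that exposes `replace_route`, as the boto3 EC2 client does, and it uses a small stand-in client so the example runs without AWS credentials. The route table ID and core network ARN are hypothetical placeholders.

```python
def move_route_to_cloud_wan(ec2_client, route_table_id, core_network_arn,
                            destination_cidr="0.0.0.0/0"):
    """Point an existing route at the core network in one ReplaceRoute call."""
    return ec2_client.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock=destination_cidr,
        CoreNetworkArn=core_network_arn,
    )


class DryRunEC2:
    """Stand-in for a real EC2 client that only records the calls it gets."""

    def __init__(self):
        self.calls = []

    def replace_route(self, **kwargs):
        self.calls.append(kwargs)
        return {}


# Hypothetical identifiers, for illustration only.
ROUTE_TABLE_ID = "rtb-0123456789abcdef0"
CORE_NETWORK_ARN = (
    "arn:aws:networkmanager::111122223333:"
    "core-network/core-network-0123456789abcdef0"
)

client = DryRunEC2()
move_route_to_cloud_wan(client, ROUTE_TABLE_ID, CORE_NETWORK_ARN)
```

To run this against a real account, you would pass `boto3.client("ec2")` instead of `DryRunEC2()`. Because the route is replaced rather than deleted and re-created, the destination always has a target.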
Once all routes have updated, you can remove the Transit Gateways. Figure 5 shows the end result of this process in a single Region.
Here, the global architecture and traffic flows are much simpler. Traffic between VPCs in the same segment can use Cloud WAN to connect. Figure 6 shows the post-migration traffic flow, including the core network edges that Cloud WAN deployed in each Region as part of your core network.
Traffic from VPC 1 (prod in Region A) to VPC 3 (prod in Region B) follows these steps:
- VPC 1 sends all its traffic to the core network over the Cloud WAN attachment.
- Cloud WAN applies the routing rules in the prod segment route table and forwards the traffic directly to VPC 3 in Region B over the appropriate VPC attachment.
Cloud WAN hybrid architectures
Cloud WAN attachments are connections or resources that you want to add to your core network. Supported attachments currently include VPC, Site-to-Site VPN, Transit Gateway (TGW) route table attachments, and Connect (SD-WAN/GRE) attachments. Today, Cloud WAN does not support native integration with AWS Direct Connect. For use cases that require AWS Site-to-Site VPN connections over Direct Connect using private IP addresses, you must connect Cloud WAN with a Transit Gateway.
Site-to-Site VPN and Connect (SD-WAN) integration with Cloud WAN
Cloud WAN natively supports AWS Site-to-Site VPN and SD-WAN/GRE attachments (using Transit Gateway Connect), providing native capability to connect on-premises resources to AWS using these connections. In Figure 7, we show on-premises resources connected over VPN and SD-WAN/GRE attachments to a Cloud WAN core network. We mapped both attachments to the hybrid segment in their own Regions.
Traffic flowing from VPC 1 to the physical premises follows these steps:
- VPC 1 sends all traffic to Cloud WAN based on its default route, using its attachment to the Cloud WAN core network. The VPC 1 attachment on Cloud WAN is mapped to the prod segment, which is used to look up how to send traffic to on-premises resources using VPN and SD-WAN/GRE.
- The Cloud WAN prod segment is explicitly configured using segment sharing to know about all the routes for on-premises resources, and forwards traffic to on-premises resources using its VPN attachment. Similarly, for SD-WAN/GRE, its respective Connect attachment is used to forward traffic to on-premises resources.
- For the return path, VPN (and SD-WAN/GRE) does a lookup for VPC 1 in its route table and returns the traffic over its attachment to the Cloud WAN core network to the hybrid segment. The Cloud WAN hybrid segment knows about all of the routes for attached VPCs and forwards traffic to VPC 1 using its attachment.
Direct Connect integration with Cloud WAN
Cloud WAN currently does not natively support AWS Direct Connect attachments. In order to use Direct Connect with Cloud WAN, you need a Transit Gateway attached to a Direct Connect Gateway. This approach is similar to the federated model described earlier, where both Cloud WAN and Transit Gateway co-exist. Note that if you want SD-WAN/GRE (Connect) from your premises to route over Direct Connect you can do so using this federated Transit Gateway model.
The diagram that follows (figure 8) shows the architecture where on-premises resources are connected to Cloud WAN using Direct Connect, a Direct Connect Gateway, and Transit Gateway. In order for on-premises resources to reach VPC 1 in Region A, VPC 2 in Region B, and VPC 3 in Region C, Cloud WAN must be peered with a Transit Gateway. Cloud WAN has three segments (dev, prod, and hybrid). We attached VPC 1 and VPC 3 to the prod segment, VPC 2 to the dev segment, and Transit Gateway to the hybrid segment. In this configuration, the traffic flow is North-South, meaning it moves from your premises, to AWS, and back.
Traffic flowing from VPC 1 to your premises follows these steps:
- VPC 1 sends all traffic to Cloud WAN based on its default route, using its attachment to the Cloud WAN core network. The VPC 1 attachment to Cloud WAN is mapped to the prod segment, which is used to look up how to send traffic to on-premises resources through TGW A.
- The Cloud WAN prod segment is explicitly configured using segment sharing to know about all routes to on-premises resources. It forwards traffic to TGW A using its route table attachment over the peering connection. TGW A forwards the traffic to on-premises resources using the Direct Connect Gateway.
- For the return path, TGW A does a lookup for VPC 1 in its route table and returns the traffic over its attachment to the Cloud WAN core network in the hybrid segment.
- The Cloud WAN hybrid segment knows about all routes by sharing routes from other segments, including attached VPCs, and forwards traffic to VPC 1 using its attachment.
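The segment sharing used in these steps is expressed in the core network policy as a `share` segment action. The following sketch adds such an action to a policy dictionary. Treat the exact field values as assumptions to verify against the policy reference, in particular the expectation that routes are exchanged in both directions between hybrid and each listed segment.

```python
def share_segment(policy, segment, share_with):
    """Return a copy of the policy with a share action appended."""
    action = {
        "action": "share",
        "mode": "attachment-route",
        "segment": segment,
        "share-with": list(share_with),
    }
    updated = dict(policy)
    existing = list(policy.get("segment-actions", []))
    updated["segment-actions"] = existing + [action]
    return updated


# Minimal starting policy with the three segments from the post.
base_policy = {
    "version": "2021.12",
    "segments": [{"name": "prod"}, {"name": "dev"}, {"name": "hybrid"}],
}

# Share hybrid (on-premises routes via TGW A) with prod and dev.
shared_policy = share_segment(base_policy, "hybrid", ["prod", "dev"])
```

The function returns a new dictionary and leaves `base_policy` untouched, which makes it easier to compare policy versions before you apply them.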
Cloud WAN integration with AWS Network Firewall
Centralized Ingress-Egress (North-South) Firewall Inspection architecture
If you want to inspect and filter your egress traffic, you can incorporate AWS Network Firewall with NAT gateway in your centralized egress architecture (Figure 9). In centralized egress inspection, traffic from private subnets within a VPC is routed through Cloud WAN and then over a separate Egress VPC with AWS Network Firewall. Then it is routed through a NAT gateway that performs network address translation (NAT) for the traffic that flows out to the internet. Return packets follow the same path in reverse.
In the diagram that follows (figure 9), we use this pattern for both prod and dev traffic that is following the centralized egress path. This architecture is for outbound/egress connections only, as the NAT gateway cannot accept inbound connections from the internet. Centralized Ingress inspection works similarly, where all traffic into your AWS network is proxied through an Ingress VPC hosting the firewall. From there, it gets forwarded to the target application in another spoke VPC using Cloud WAN. For more details on ingress inspection architectures, refer to the Design your firewall deployment for internet ingress traffic flows blog post.
Egress traffic flowing from VPC 1 to the internet follows these steps:
- VPC 1 sends all traffic to Cloud WAN based on its default route, using its attachment to the Cloud WAN core network. The VPC 1 attachment on Cloud WAN is mapped to the prod segment, which is used to look up how to send traffic to the internet.
- The Cloud WAN prod segment knows about all routes for egress, and forwards the traffic directly to the Egress VPC using the appropriate VPC attachment. Then, that traffic is routed through a NAT gateway (not shown in Figure 9) that performs network address translation (NAT) on the traffic that flows out to the internet.
- Return packets follow the same path in reverse. Egress VPC attachment to Cloud WAN is mapped to the Security segment that is used to lookup how to send response traffic back to VPC 1.
- Cloud WAN applies the routing rules in the Security segment route table and forwards the traffic directly to VPC 1 over the appropriate VPC attachment.
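For the prod segment to send internet-bound traffic to the Egress VPC, it needs a default route that points at that attachment. One way to do this, sketched below, is a `create-route` segment action in the core network policy. The attachment ID is a hypothetical placeholder, and your design may rely on segment sharing instead of or in addition to a static route.

```python
def add_static_route(policy, segment, cidr_blocks, attachment_ids):
    """Return a copy of the policy with a create-route action appended."""
    action = {
        "action": "create-route",
        "segment": segment,
        "destination-cidr-blocks": list(cidr_blocks),
        "destinations": list(attachment_ids),
    }
    updated = dict(policy)
    existing = list(policy.get("segment-actions", []))
    updated["segment-actions"] = existing + [action]
    return updated


# Hypothetical ID of the Egress VPC attachment on the core network.
EGRESS_ATTACHMENT_ID = "attachment-0123456789abcdef0"

egress_policy = {"version": "2021.12", "segment-actions": []}
for segment_name in ("prod", "dev"):
    # Both prod and dev follow the centralized egress path.
    egress_policy = add_static_route(
        egress_policy, segment_name, ["0.0.0.0/0"], [EGRESS_ATTACHMENT_ID]
    )
```

After the loop, `egress_policy` holds one default route per segment, both targeting the same Egress VPC attachment.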
Considerations
- Cloud WAN and Transit Gateway each apply data processing fees to data coming in to the respective service (Cloud WAN pricing, Transit Gateway pricing). However, when traffic arrives from Transit Gateway to Cloud WAN or vice versa, you do not pay any processing fees. If you peer Transit Gateway with Cloud WAN, you only pay for data transfer once.
- During migration, there will be times when traffic flows asymmetrically (called out above). Make sure there are no components that maintain state in the path (e.g., NAT or stateful firewalls), as they would drop asymmetric flows.
- To configure routing on Cloud WAN segments you must use core network policies, not covered in this blog post. To learn more about how they work refer to our Create a core network policy version documentation or the Introducing AWS Cloud WAN blog post.
- Cloud WAN segments are similar in concept to Transit Gateway route tables. To make more complex architectures work (e.g., firewalling or Direct Connect), you must use segment sharing to ‘leak’ relevant routes between segments or configure static routes in each segment.
- Cloud WAN and Transit Gateway both support IPv4 and IPv6.
AWS Cloud WAN makes it easy to build, manage, and monitor a unified global network that connects resources running across your cloud and on-premises environments. It provides a central dashboard for attaching the connections between your branch offices, data centers, and Amazon VPCs to the AWS global network—in just a few clicks.
This post discussed different architectures covering interoperability, migration, and federated models that meet the needs of common use cases. We also shared examples of how to integrate your existing Transit Gateway-based network with Cloud WAN. To get started on AWS Cloud WAN today, visit our documentation.
For more information about AWS Cloud WAN, you can refer to the following resources:
- Introducing AWS Cloud WAN (Preview) blog post
- AWS re:Invent 2021 breakout session (video) – Introducing AWS Cloud WAN and AWS Direct Connect SiteLink
- AWS Twitch Networking & Content Delivery Office Hours | The routing loop – Cloud WAN
- AWS Cloud WAN Workshop