Advanced hybrid routing scenarios with AWS Cloud WAN and AWS Direct Connect

Introduction

In this post, we review advanced global routing scenarios with AWS Cloud WAN and AWS Direct Connect and dive into how you can control routing to build connectivity between AWS and on-premises locations. We also share best practices for optimizing routing in multi-Region hybrid networks and review common high-availability settings and failover scenarios.

Customers with hybrid workloads across multiple AWS Regions can deploy AWS Cloud WAN global networks and connect to their on-premises locations through AWS Direct Connect, AWS Site-to-Site VPN, or SD-WAN solutions. AWS Cloud WAN provides the flexibility to add Regions to, or remove Regions from, a global network in a matter of minutes and ensures end-to-end dynamic routing on AWS. Through the integration with AWS Direct Connect, you can build dedicated, highly available connectivity with on-premises locations worldwide.

Prerequisites

This is a 300-level post. We assume that you are familiar with networking constructs on AWS, including Amazon Virtual Private Cloud (Amazon VPC), AWS Transit Gateway, AWS Direct Connect, and AWS Cloud WAN. We won’t focus on defining these services, but we do outline their capabilities and how you can use them for the highlighted routing scenarios. We also recommend you review the best path selection algorithms for AWS Transit Gateway and AWS Cloud WAN.

Scenarios

We cover three common global routing scenarios with AWS Cloud WAN and AWS Direct Connect. In each scenario, we review Border Gateway Protocol (BGP) route advertisements, route selection, traffic flows, and an example of a failure scenario:

  1. In Scenario 1, we focus on a single data center location, with two AWS Direct Connect connections associated with a single AWS Region, and how connectivity is established with a multi-Region environment on AWS. The on-premises location advertises an IP prefix to AWS. Commonly, the prefixes advertised from on premises are either summaries of on-premises routes, RFC 1918 routes (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), or a default route.
  2. In Scenario 2, we expand the on-premises footprint to two locations, with two AWS Direct Connect connections each. The AWS Direct Connect connections are associated with two AWS Regions, and we review how connectivity can be established with a multi-Region environment on AWS. The number of on-premises locations can be easily expanded, as well as the number of Regions on AWS. Each on-premises location advertises one or more unique, nonoverlapping IP prefixes to AWS. These IP prefixes are usually summary routes specific to each on-premises location.
  3. In Scenario 3, we maintain the two on-premises locations outlined in Scenario 2. The difference is that both on-premises locations advertise the same IP prefix to AWS. Commonly, the routes advertised from on premises are either the IPv4 RFC 1918 prefixes, IPv6 summary routes, or default routes (0.0.0.0/0; ::/0). This scenario is common for customers who maintain connectivity between the on-premises locations through a data center interconnect. Connectivity between on-premises locations is important in this routing scenario because it ensures that traffic can be routed from AWS to specific destinations on premises.

These three scenarios can be expanded to use cases where you have multiple on-premises locations and a larger number of AWS Regions, following the same routing principles described in this post.

Considerations

Here are some initial considerations that guide our approach to the three scenarios, and the routing configurations we show:

  1. Currently, AWS Cloud WAN does not support AWS Direct Connect gateway attachments. In order to integrate AWS Direct Connect with AWS Cloud WAN, you must peer Transit Gateway with AWS Cloud WAN. Routes are dynamically propagated between Transit Gateway and AWS Cloud WAN, and no additional data processing charges are incurred on the peering connection between the two.
  2. We recommend using a Transit Gateway for AWS Direct Connect connectivity in each AWS Region where AWS Cloud WAN is deployed. Each Transit Gateway can be attached to one or more AWS Direct Connect gateways, depending on the scenario. This ensures that workloads in each Region send and receive traffic directly to and from the AWS Direct Connect gateway without creating dependencies on other AWS Regions.
  3. AWS Cloud WAN automatically deploys and manages routing components of your global network on AWS. The core network edges (CNEs) are the Regional connection points deployed and managed by AWS, similar to AWS Transit Gateways. Each CNE has an individual BGP Autonomous System Number (ASN) and is peered in a full mesh using external BGP (eBGP) with all other CNEs.
  4. When multiple AWS Direct Connect connections are available, we recommend using the supported AWS Direct Connect BGP community tags and Multi-Exit Discriminator (MED) to influence how the AWS Direct Connect gateway selects the best path for traffic from AWS to on premises. Setting these attributes does not influence how AWS Transit Gateway and AWS Cloud WAN route traffic. BGP community tags are specific to AWS Direct Connect, and MED is a nontransitive BGP attribute that does not propagate beyond the AWS Direct Connect gateway.
  5. For routes advertised from on premises to AWS with the same destination IP address, AWS Cloud WAN CNEs prefer the shortest AS_PATH. In this post, we use BGP local preference communities to influence best path selection and ensure we have AS_PATH of equal length on all routes advertised from on-premises. AWS Cloud WAN CNEs receive on-premises routes from the peered Transit Gateways and from all other CNEs. Here, AS_PATH through the directly peered Transit Gateway is shorter. Therefore, each CNE prefers the route received from the Transit Gateway in the local Region rather than a route received from a CNE in a remote Region. This helps avoid suboptimal routing in the global network.
  6. IP prefix overlaps are not allowed when multiple transit gateways are associated with an AWS Direct Connect gateway. For instance, if one transit gateway has an allowed prefix list containing 10.1.0.0/16, and another transit gateway has a list that includes 10.2.0.0/16 and 0.0.0.0/0, it is not possible to associate the second transit gateway with the AWS Direct Connect gateway, because 0.0.0.0/0 overlaps 10.1.0.0/16.
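
To make the overlap rule in the last consideration concrete, here is a minimal Python sketch that checks allowed prefix lists before you attempt an association. It uses only the standard ipaddress module and does not call any AWS API. The transit gateway names (tgw-1, tgw-2) and the find_overlaps helper are hypothetical and exist only for this illustration.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(allowed_prefixes):
    """Return overlapping prefix pairs found across different associations.

    allowed_prefixes maps a (hypothetical) transit gateway name to the list
    of allowed prefixes planned for its Direct Connect gateway association.
    """
    overlaps = []
    for (tgw_a, prefixes_a), (tgw_b, prefixes_b) in combinations(
        allowed_prefixes.items(), 2
    ):
        for a in map(ip_network, prefixes_a):
            for b in map(ip_network, prefixes_b):
                # Compare only prefixes of the same IP version.
                if a.version == b.version and a.overlaps(b):
                    overlaps.append((tgw_a, str(a), tgw_b, str(b)))
    return overlaps


# The example from the consideration above: 0.0.0.0/0 covers 10.1.0.0/16.
associations = {
    "tgw-1": ["10.1.0.0/16"],
    "tgw-2": ["10.2.0.0/16", "0.0.0.0/0"],
}
print(find_overlaps(associations))
# [('tgw-1', '10.1.0.0/16', 'tgw-2', '0.0.0.0/0')]
```

Running a check like this against your planned prefix lists can surface a conflict, such as a default route in one list, before the association request is rejected.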

Baseline architecture

For this post, we start with the baseline architecture shown in the following diagram (Figure 1). We deploy AWS Cloud WAN in three AWS Regions (ap-southeast-2, us-west-2, and us-east-1) and an AWS Transit Gateway in each. VPCs in each Region are attached to AWS Cloud WAN and associated with the default segment. Throughout this post, we use only the default segment for simplicity; however, the same principles apply when working with multiple segments.

Figure 1: Baseline architecture used in this post

Scenario 1: A single on-premises location with two AWS Direct Connect connections associated with a single AWS Region. The on-premises location advertises a unique IP prefix to AWS.

We show the setup for this scenario in the following diagram (Figure 2). For simplicity, we are not showing a maximum availability connectivity scenario using AWS Direct Connect, but this design is easily expanded to accommodate the desired redundancy.

Figure 2: Scenario 1 – A single on-premises location, with two AWS Direct Connect connections associated with a single AWS Region

1. BGP route advertisements from on premises to AWS

In this scenario, we configured two transit virtual interfaces (VIFs), one on each Direct Connect connection, and associated them with an AWS Direct Connect gateway. The following diagram (Figure 3) shows the BGP route advertisements from on premises to AWS and how routes propagate in the AWS global network:

Figure 3: Scenario 1 – On premises to AWS route propagation

  • (A), (B) – We advertise the same IPv4 and IPv6 prefixes on both Transit VIFs (10.0.0.0/8 – IPv4 and 2001:db8::/32 – IPv6). Our intent is to use one VIF (VIF-1) as main, and a second VIF (VIF-2) as backup, so we use BGP community tags to achieve this:
    • (A) On transit VIF-1, we advertise routes with BGP community “7224:7300” (high preference).
    • (B) On transit VIF-2, we advertise routes with BGP community “7224:7100” (low preference).

If you set the same BGP community on the advertised routes, you can load balance traffic over both VIFs.

  • (C) – The Direct Connect gateway receives all routes and advertises all prefixes to the associated transit gateways. The BGP communities are kept unmodified, and the AS_PATH for all routes is updated with the Direct Connect gateway ASN. The transit gateways ignore the BGP communities, and perform best path selection, installing one route in their route table. The next hop of the routes is the Direct Connect gateway, so it doesn’t matter if (A) or (B) is selected.
  • (D) – Each transit gateway advertises the best path installed in its route table to the peered CNE in the same Region. The AS_PATH for each route is updated to include the transit gateway ASN. Each CNE learns the IPv4 and IPv6 prefixes with the next hop as the peered transit gateway.
  • (E) – Each CNE advertises the IPv4 and IPv6 routes to all other CNEs in remote Regions after adding its own ASN to the BGP AS_PATH. When selecting the best path, each CNE prefers the route with the shortest AS_PATH, which is the route received from on premises through the peered transit gateway.
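
The propagation in steps (C) through (E) can be sketched in a few lines of Python. This is a simplified model that covers only the AS_PATH length comparison, not the full best path selection algorithm. All ASNs are made-up private values chosen for illustration, and the candidate_paths and best_path helpers are hypothetical names.

```python
# All ASNs below are made-up private ASNs used only for illustration.
ON_PREM_ASN = 65000
DXGW_ASN = 64512
TGW_ASN = {"us-west-2": 64513, "us-east-1": 64514, "ap-southeast-2": 64515}
CNE_ASN = {"us-west-2": 64520, "us-east-1": 64521, "ap-southeast-2": 64522}


def candidate_paths(region):
    """Return {source: AS_PATH} for the on-premises prefix at the CNE in `region`."""
    paths = {}
    for r in TGW_ASN:
        # (C) and (D): the Direct Connect gateway ASN and then the transit
        # gateway ASN are added in front of the on-premises ASN.
        via_tgw = [TGW_ASN[r], DXGW_ASN, ON_PREM_ASN]
        if r == region:
            paths[f"tgw:{r}"] = via_tgw
        else:
            # (E): a remote CNE adds its own ASN before advertising the route.
            paths[f"cne:{r}"] = [CNE_ASN[r]] + via_tgw
    return paths


def best_path(region):
    """Pick the source whose AS_PATH is shortest (the only rule modeled here)."""
    paths = candidate_paths(region)
    return min(paths, key=lambda source: len(paths[source]))


for region in TGW_ASN:
    print(region, "->", best_path(region))
# us-west-2 -> tgw:us-west-2
# us-east-1 -> tgw:us-east-1
# ap-southeast-2 -> tgw:ap-southeast-2
```

In this model the route through the local transit gateway carries three ASNs, while any route learned from a remote CNE carries four, so each CNE keeps traffic in its own Region.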

2. BGP route advertisements from AWS to on premises

We addressed VPCs in AWS Regions with IPv4 and IPv6 CIDRs that are easy to summarize. This best practice simplifies route advertisements on the AWS Direct Connect gateway and helps maintain route tables of manageable sizes. The following diagram (Figure 4) shows the BGP route advertisements from AWS to on premises:

Figure 4: Scenario 1 – VPC CIDR propagation from AWS to on-premises

  • (A) – All VPC CIDRs (IPv4 and IPv6) are automatically propagated in the default segment according to the AWS Cloud WAN policy configuration.
  • (B) – VPC CIDRs are propagated to the transit gateways peered in each Region in the attached route table.
  • (C) – When you associate a transit gateway with the Direct Connect gateway, you must configure a prefix list for routes that are advertised to on premises. In this example, each transit gateway association prefix list references the summarized CIDRs for the respective AWS Region.
  • (D) – The Direct Connect gateway advertises all prefixes configured in the transit gateway associations to the associated VIFs. On the on-premises routers, you must set BGP attributes such that traffic originating from on premises prefers Transit VIF-1 over Transit VIF-2.
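
As an illustration of why summarizable addressing helps in step (C), the following sketch collapses a Region's VPC CIDRs into the prefixes you might place in a transit gateway association prefix list. The VPC CIDRs shown are hypothetical (the post does not list them), and summarize is an illustrative helper built on the Python standard library.

```python
from ipaddress import collapse_addresses, ip_network


def summarize(cidrs):
    """Merge adjacent or nested CIDRs into the fewest equivalent prefixes.

    IPv4 and IPv6 are collapsed separately because collapse_addresses
    does not accept mixed IP versions.
    """
    networks = [ip_network(c) for c in cidrs]
    summaries = []
    for version in (4, 6):
        same_version = [n for n in networks if n.version == version]
        summaries.extend(str(n) for n in collapse_addresses(same_version))
    return summaries


# Hypothetical VPC CIDRs for one Region, allocated from a contiguous block.
region_vpc_cidrs = ["172.16.0.0/24", "172.16.1.0/24", "172.16.2.0/23"]
print(summarize(region_vpc_cidrs))  # ['172.16.0.0/22']
```

Because collapse_addresses only merges prefixes that are exactly adjacent or nested, the result never advertises address space you have not allocated. CIDRs scattered across unrelated ranges stay as separate entries, which is the situation this addressing practice avoids.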

3. Traffic flows

AWS to on-premises traffic flows

Traffic flows between VPCs and on-premises resources are shown in the following diagram (Figure 5):

Figure 5: Scenario 1 – Traffic flows between AWS and on-premises resources

  • (1) – VPCs route traffic to on-premises through the AWS Cloud WAN attachment based on their route table configuration.
  • (2) – Each CNE routes traffic to on premises based on the best path, with the next hop being the local transit gateway.
  • (3) – Transit gateways in each Region send traffic to on premises through the Direct Connect gateway.
  • (4) – The Direct Connect gateway performs a route lookup and forwards traffic through the Primary VIF (VIF-1).

Traffic from on premises to AWS is forwarded symmetrically, following the same paths shown in the previous figure (Figure 5).

4. Example of a failure scenario

If the primary VIF (VIF-1) becomes unavailable, the backup VIF (VIF-2) is the preferred path, and traffic continues to be routed between AWS and on premises, avoiding downtime. In the failover state, routes to on premises change next-hop only on the Direct Connect gateway from the primary VIF to the secondary VIF. Transit gateways and AWS Cloud WAN continue to route traffic, as previously depicted in Figure 5. Once the primary VIF recovers, traffic switches back to the primary VIF.
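
The failover behavior can be modeled with a short Python sketch. It assumes a simplified view in which the Direct Connect gateway compares only the local preference community of the routes that are currently available. The active_vifs helper and the route dictionaries are hypothetical. The 7224:7200 (medium preference) community is not used in this post but is included for completeness.

```python
# AWS Direct Connect local preference communities, ranked low to high.
# 7224:7200 (medium) is not used in this post; it is listed for completeness.
PREFERENCE = {"7224:7100": 1, "7224:7200": 2, "7224:7300": 3}


def active_vifs(routes, prefix):
    """Return the VIFs that would carry traffic from AWS toward `prefix`.

    routes is a list of dicts with the keys vif, prefix, community, and up.
    Only available routes compete; ties on preference share the traffic.
    """
    candidates = [r for r in routes if r["prefix"] == prefix and r["up"]]
    if not candidates:
        return []
    best = max(PREFERENCE[r["community"]] for r in candidates)
    return sorted(
        r["vif"] for r in candidates if PREFERENCE[r["community"]] == best
    )


routes = [
    {"vif": "VIF-1", "prefix": "10.0.0.0/8", "community": "7224:7300", "up": True},
    {"vif": "VIF-2", "prefix": "10.0.0.0/8", "community": "7224:7100", "up": True},
]
print(active_vifs(routes, "10.0.0.0/8"))  # ['VIF-1']

routes[0]["up"] = False  # the primary VIF fails
print(active_vifs(routes, "10.0.0.0/8"))  # ['VIF-2']
```

If both routes carried the same community, the function would return both VIFs, which corresponds to the load-balancing option mentioned earlier in this scenario.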

Scenario 2: Two on-premises locations, with two AWS Direct Connect connections each, associated with two AWS Regions. Each on-premises location advertises unique, nonoverlapping IP prefixes to AWS.

In this scenario, we deploy AWS Cloud WAN across multiple AWS Regions. We configured a Transit Gateway for AWS Direct Connect connectivity in each Region, following the baseline shown in Figure 1. The two on-premises locations are connected to AWS using two AWS Direct Connect connections each, associated with two of the AWS Regions where workloads reside. We deployed a Transit VIF on each AWS Direct Connect connection, and we advertised unique IP prefixes from each on-premises location, as shown in the following diagram (Figure 6):

Figure 6: Scenario 2 – Two on-premises locations, with two AWS Direct Connect connections each, associated with two AWS Regions. Each on-premises location advertises unique, nonoverlapping IP prefixes to AWS.

1. BGP route advertisements from on premises to AWS

We advertise unique prefixes from each on-premises location and use local preference BGP community tags to achieve a primary/secondary architecture similar to scenario 1. Figure 7 shows the route advertisements:

Figure 7: Scenario 2 – BGP route advertisements from on premises to AWS

Route advertisements from on premises make use of BGP communities to ensure that the Direct Connect gateway chooses the primary or backup paths for each prefix:

  • (A) 10.1.0.0/16 on VIF 1 and 10.2.0.0/16 on VIF 3, each with “7224:7300” (high preference)
  • (B) 10.1.0.0/16 on VIF 2 and 10.2.0.0/16 on VIF 4, each with “7224:7100” (low preference)

Routes are propagated from the Direct Connect gateway to the transit gateways and to the CNEs, following the same route propagation mechanism explained in Scenario 1.

2. BGP route advertisements from AWS to on premises

The routes from AWS to on premises are advertised the same as in Scenario 1.

3. Traffic flows

AWS to on-premises traffic

The following diagram (Figure 8) shows the traffic flows between a VPC and on-premises resources with IP addresses in the IP ranges of each on-premises location:

Figure 8: Scenario 2 – Traffic flows from AWS to on premises

  • (1) – VPC routes traffic to on premises through the AWS Cloud WAN attachment based on its route table configuration.
  • (2) – CNE performs a route table lookup and forwards traffic towards the local transit gateway.
  • (3) – Transit gateway performs a route table lookup and forwards traffic to the Direct Connect gateway.
  • (4) – Direct Connect gateway performs a route table lookup and forwards traffic destined to the US West on-premises location to VIF-1 and traffic destined to the US East on-premises location to VIF-3. The Direct Connect gateway makes the routing decision based on the destination IP address, and the primary VIF for each on-premises location is selected based on the BGP community tags.

On-premises to AWS traffic is forwarded symmetrically, following the same paths.
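
The lookup in step (4) combines two decisions: match the destination to a prefix, then choose among the VIFs advertising that prefix by community. The sketch below models this under a stated assumption: 10.1.0.0/16 belongs to the US West location (VIF-1 and VIF-2) and 10.2.0.0/16 to the US East location (VIF-3 and VIF-4). The ROUTES table and the lookup helper are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Assumed mapping of prefixes to VIFs and communities for this scenario.
ROUTES = [
    ("10.1.0.0/16", "VIF-1", "7224:7300"),  # US West, primary
    ("10.1.0.0/16", "VIF-2", "7224:7100"),  # US West, backup
    ("10.2.0.0/16", "VIF-3", "7224:7300"),  # US East, primary
    ("10.2.0.0/16", "VIF-4", "7224:7100"),  # US East, backup
]
# Communities ordered from most to least preferred.
HIGH_TO_LOW = ["7224:7300", "7224:7200", "7224:7100"]


def lookup(destination, down=()):
    """Return the VIF chosen for `destination`, skipping VIFs listed in `down`."""
    dest = ip_address(destination)
    live = [
        (ip_network(prefix), vif, community)
        for prefix, vif, community in ROUTES
        if vif not in down and dest in ip_network(prefix)
    ]
    if not live:
        return None
    # Longest prefix match first, then the highest local preference community.
    longest = max(net.prefixlen for net, _, _ in live)
    matching = [(vif, c) for net, vif, c in live if net.prefixlen == longest]
    return min(matching, key=lambda item: HIGH_TO_LOW.index(item[1]))[0]


print(lookup("10.1.1.1"))                  # VIF-1
print(lookup("10.2.2.2"))                  # VIF-3
print(lookup("10.1.1.1", down={"VIF-1"}))  # VIF-2
```

The last call shows that a failure of VIF-1 affects only traffic to the US West prefix. Traffic to 10.2.0.0/16 keeps using VIF-3.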

4. Example of a failure scenario

If the primary VIF for an on-premises location (VIF-1 or VIF-3) fails, traffic for that location is routed over its backup VIF (VIF-2 or VIF-4). We recommend designing AWS Direct Connect connectivity for your availability needs, following the AWS Direct Connect resiliency guide.

Scenario 3: Two on-premises locations, with two AWS Direct Connect connections each, associated with two AWS Regions. Each on-premises location advertises the same IP prefixes to AWS, and there is connectivity between on-premises locations.

Advertising the same summary IP prefix from multiple on-premises locations to AWS creates an “anycast” routing setup. The connectivity between on-premises locations is important in this routing scenario, as it ensures traffic can be routed from AWS to on-premises destinations using the nearest AWS Direct Connect Point of Presence (POP). When needed, traffic can traverse the on-premises backbone to reach the final destination.

We introduce an additional AWS Direct Connect gateway to control how traffic flows from AWS to on premises:

  • We advertise 10.0.0.0/8 – IPv4 and 2001:db8::/32 – IPv6 from both on-premises locations on all four Transit VIFs. No other more specific routes are advertised from on premises to AWS.
  • We associate VIFs in each AWS Direct Connect POP to a separate Direct Connect gateway – VIF-1 and VIF-2 are associated with Direct Connect gateway A, while VIF-3 and VIF-4 are associated with Direct Connect gateway B.
  • We configure BGP local preference communities on routes from on premises to ensure the Direct Connect gateways choose primary/backup VIFs. You can also load balance traffic across both VIFs in an AWS Direct Connect POP by setting identical BGP attributes.

We call these behavior-based Direct Connect gateways because they accommodate a specific routing configuration for traffic from AWS to on premises. You can define your own desired behavior for each Region through a dedicated behavior-based Direct Connect gateway.

The setup for this scenario is shown in the following diagram (Figure 9):

Figure 9: Scenario 3 – Multi-Region AWS Cloud WAN setup with a transit gateway in each Region, two Direct Connect gateways, multiple AWS Direct Connect connections associated with two Regions, and a backbone network connecting the colocation facilities.

1. BGP route advertisements from on premises to AWS

Each on-premises location is configured with unique, nonoverlapping IP prefixes, but we advertise one common IP prefix (for example, an RFC 1918 IPv4 summary route) to AWS on all four Transit VIFs. There are various reasons you may choose to advertise the same prefix from multiple on-premises locations, including IP addressing schemes that don’t allow for easy summarization.

The behavior-based Direct Connect gateways choose their best path based on the BGP communities. In this example, VIF 1 is primary and VIF 2 is backup for Direct Connect gateway A, while VIF 3 is primary and VIF 4 is backup for Direct Connect gateway B. We don’t use AS_PATH prepending, according to the considerations highlighted at the beginning of this post.

Each Regional transit gateway is attached to a Direct Connect gateway with a specific routing behavior based on its geographic location and the desired routing path for traffic from the Region to on-premises. In this example, the AWS Regions that are geographically close to the US West on-premises location use Direct Connect gateway A, while the AWS Regions that are geographically close to the US East on-premises location use Direct Connect gateway B.

2. BGP route advertisements from AWS to on premises

The route advertisements from AWS to on premises are configured the same as in Scenario 1. In this scenario, each on-premises location learns the prefixes of AWS Regions outside of its geographical proximity through the on-premises backbone connectivity.

3. Traffic flows

Let’s consider the first example shown in the following diagram (Figure 10), where traffic flows between a VPC in ap-southeast-2 and an on-premises resource with the IP address 10.1.1.1 in the US West on-premises location:

Figure 10: Scenario 3 – Traffic flow from AWS to a resource with IP address 10.1.1.1 in US West on-premises location

AWS to on-premises traffic

  • (1) – VPC routes traffic to on premises through the AWS Cloud WAN attachment based on its route table configuration.
  • (2) – CNE forwards traffic to the local transit gateway in ap-southeast-2.
  • (3) – Transit gateway sends traffic to Direct Connect gateway A.
  • (4) – Direct Connect gateway A sends the traffic to the primary VIF (VIF-1).

On-premises to AWS traffic is forwarded symmetrically, following the same path.

Let’s consider a second example, where the traffic flows between the same AWS Region (ap-southeast-2) and an on-premises resource with an IP address 10.2.2.2 in the US East on-premises location. In this case, traffic traverses the backbone between on-premises locations, as depicted in Figure 11:

Figure 11: Scenario 3 – Traffic flow from AWS to a resource with IP address 10.2.2.2 in US East on-premises location

AWS to on-premises traffic

  • (1) – VPC routes traffic to on-premises through the AWS Cloud WAN attachment based on its route table configuration.
  • (2) – CNE forwards traffic to the local transit gateway in ap-southeast-2.
  • (3) – Transit gateway sends the traffic to Direct Connect gateway A.
  • (4) – Direct Connect gateway A sends the traffic to the primary VIF (VIF-1).
  • (5) – From the US West on-premises location, traffic flows over the on-premises backbone to the US East location.

On-premises to AWS traffic is forwarded symmetrically, following the same path.
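
The two traffic flows above can be reproduced with a small path-tracing sketch. Several values are assumptions made for illustration: the mapping of us-west-2 to Direct Connect gateway A and us-east-1 to Direct Connect gateway B (the post states only the geographic-proximity principle and the ap-southeast-2 example), and the per-location prefixes 10.1.0.0/16 and 10.2.0.0/16, which are consistent with the example addresses. The trace helper is hypothetical.

```python
from ipaddress import ip_address, ip_network

# Assumed Region-to-gateway mapping (only ap-southeast-2 is stated in the post).
REGION_TO_DXGW = {
    "ap-southeast-2": "DXGW-A",
    "us-west-2": "DXGW-A",
    "us-east-1": "DXGW-B",
}
# Primary VIF and the on-premises location it lands in, per gateway.
DXGW_PRIMARY = {"DXGW-A": ("VIF-1", "US West"), "DXGW-B": ("VIF-3", "US East")}
# Assumed address ranges that are local to each on-premises location.
SITE_PREFIX = {
    "US West": ip_network("10.1.0.0/16"),
    "US East": ip_network("10.2.0.0/16"),
}


def trace(region, destination):
    """Return the hops from a VPC in `region` to an on-premises destination."""
    dest = ip_address(destination)
    final_site = next(
        (site for site, prefix in SITE_PREFIX.items() if dest in prefix), None
    )
    if final_site is None:
        raise ValueError(f"{destination} is not in a known on-premises range")
    dxgw = REGION_TO_DXGW[region]
    vif, entry_site = DXGW_PRIMARY[dxgw]
    hops = [f"CNE {region}", f"TGW {region}", dxgw, vif, entry_site]
    if final_site != entry_site:
        # AWS sees only the common summary route, so the last leg is on premises.
        hops.append(f"backbone to {final_site}")
    return hops


print(trace("ap-southeast-2", "10.1.1.1"))
print(trace("ap-southeast-2", "10.2.2.2"))
```

The first trace ends at the US West location, matching Figure 10. The second adds a backbone hop to US East, matching Figure 11. AWS makes the same forwarding decision in both cases because it sees only the common summary prefix.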

Summary

Key considerations to help you select the architecture that fits your use case are:

  • Number of AWS Direct Connect POP locations and Regions:
    • Scenario 1 depicts AWS Direct Connect connections in one Region.
    • Scenarios 2 and 3 depict AWS Direct Connect connections in more than one Region.
  • If you have AWS Direct Connect connections in more than one Region, do you advertise a common IP prefix to AWS, or do you advertise specific prefixes? 
    • Scenario 2 shows a configuration for advertising specific prefixes from each on-premises location.
    • Scenario 3 shows a configuration for advertising a common IP prefix from multiple on-premises locations.

Conclusion

In this post, we outlined three global hybrid architectures showing how to connect multiple on-premises facilities to AWS across different Regions using AWS Direct Connect and AWS Cloud WAN. These architectures can be expanded to a larger number of AWS Regions and on-premises facilities following the same design patterns. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

About the authors

Jordan Rojas Garcia

Jordan is a Networking Specialist Solutions Architect within the Worldwide Specialist Organization at AWS. He began his career in traditional Data Centre Networks and transitioned to AWS in 2018. At AWS, he specializes in designing cloud networking solutions and offers guidance and best practices on how to build networks in the AWS cloud. Beyond work, Jordan finds joy in traveling, exploring new culinary delights, hiking, and nurturing his passion for driving vehicles with two or four wheels.

Alexandra Huides

Alexandra Huides is a Principal Networking Specialist Solutions Architect within Strategic Accounts at Amazon Web Services. She focuses on helping customers build and develop networking architectures for highly scalable and resilient AWS environments. Alex is also a public speaker for AWS, and is helping customers adopt IPv6. Outside work, she loves sailing, especially catamarans, traveling, discovering new cultures, and reading.

Mohamed Motasem

Mohamed, a Solution Architect at AWS, brings over 18 years of experience as a systems and network engineer, focusing on building large-scale networks for enterprises and media broadcasting. As part of the Cloud Optimization Success team, he concentrates on optimizing customer workloads and ensuring they are well-architected on AWS. Beyond work, he enjoys the challenge of various fishing techniques, including fly fishing, trolling and deep-sea.