Influencing Traffic over Hybrid Networks using Longest Prefix Match

Introduction

Many organizations use hybrid networks to connect on-premises data centers to the cloud. These networks often use both AWS Direct Connect and private WAN MPLS links to connect data centers to cloud resources and to each other. With multiple connections, organizations need to control the path that network traffic follows between endpoints. A previous blog post, Creating active/passive BGP connections over AWS Direct Connect, shows how the BGP path attributes LOCAL_PREF and AS_PATH (through prepending) can be configured in a data center to create active/passive connections when using Direct Connect. In this post, we show how the Longest Prefix Match (LPM) routing algorithm can be used to control the network path from AWS to an on-premises data center and vice versa.

Hybrid Cloud Networks

Building Blocks

Direct Connect provides a private, dedicated network connection between your network and the AWS global network backbone at over 100 Direct Connect locations around the world. In this post we use Direct Connect Gateway and Transit Gateway as building blocks for our example networks. A Direct Connect Gateway can be associated with up to three Regional Transit Gateways. These associations allow the Direct Connect Gateway to establish connectivity between Direct Connect and the Amazon Virtual Private Clouds (VPCs) that are attached to the Transit Gateways. A Transit Gateway can in turn peer with Transit Gateways in other AWS Regions to provide inter-Region connectivity.

Scenario 1: Single Direct Connect Gateway

In Figure 1, there are two on-premises data centers; one located in the western United States (designated PHX), and another in the east (designated ATL).

Diagram depicting connectivity using a single Direct Connect Gateway

Figure 1: Connectivity using a single Direct Connect Gateway

Because of their proximity, PHX is connected to the US West (N. California) Region (us-west-1), and ATL is connected to the US East (N. Virginia) Region (us-east-1). The Transit Gateways located in each Region are peered with one another for inter-Region connectivity. The diagrams in this post show a single Direct Connect connection in each data center, but AWS recommends following the AWS Direct Connect Resiliency Recommendations for production environments.

Note that there are two possible paths from the AWS Regions to either data center. For example, let’s suppose we want to establish a network connection from PHX to US-West-1 as shown in Figure 2:

Diagram depicting establishing Connection from PHX to US-West-1

Figure 2: Establishing Connection from PHX to US-West-1

With a single Direct Connect Gateway, the return traffic can travel directly from the Direct Connect Gateway and then to the Direct Connect in PHX (Path A). Or it can travel to the Direct Connect location in ATL and then back through the MPLS connection (Path B) as shown in Figure 3:

Diagram depicting Two Possible Return Paths with Single Direct Connect Gateway

Figure 3: Two Possible Return Paths with Single Direct Connect Gateway

In Figure 3, we see the possibility of asymmetric routing where traffic initiated from an on-premises data center to an AWS Region can return via two possible paths. If traffic is initiated through the local Direct Connect but returns through the MPLS connection, it can be problematic for stateful firewalls that are located in one of the data centers that need to track the traffic flow in both directions.

With two possible paths from the Direct Connect Gateway, how do you control which path the traffic follows? It turns out there is no simple way to do so with a single Direct Connect Gateway, because network administrators do not have direct access to a Direct Connect Gateway’s routing decisions.

Scenario 2: Dual Direct Connect Gateways

In order to influence traffic from AWS to our on-premises data centers, we need to deploy dual Direct Connect Gateways as shown in Figure 4.

Diagram depicting connectivity using Dual Direct Connect Gateways

Figure 4: Connectivity using Dual Direct Connect Gateways

Here each Direct Connect connection reaches AWS through its own Direct Connect Gateway. With this design, we have two separate paths between AWS and the on-premises data centers. As shown in Figure 5, traffic originating from US-West-1 can reach PHX by traveling directly to the local Direct Connect (Path A). Or it can take an alternative path, first to US-East-1 via Transit Gateway peering, then to the Direct Connect in ATL, and then across the MPLS connection (Path B).

Diagram depicting Dual Paths from an AWS Region to a Datacenter

Figure 5: Dual Paths from an AWS Region to a Datacenter

The return traffic from PHX to US-West-1 can also take two possible paths, as shown in Figure 6:

Diagram depicting Dual Return Paths from Datacenter to AWS Region

Figure 6: Dual Return Paths from Datacenter to AWS Region

Given two possible network paths from an AWS Region to an on-premises data center, how do we control which path is chosen? Customers typically prefer to use a data center’s local Direct Connect as the primary connection to and from the AWS Region, for better performance and lower latency, and to use an alternate path only if the primary path is disrupted.

For example, let’s suppose that the local Direct Connect connection in PHX is disrupted. We would now like to make sure that a resource in US-West-1 can still reach a destination in PHX via the secondary path through US-East-1 and ATL as shown in Figure 7:

Diagram depicting Alternate Path from US-West-1 to PHX

Figure 7: Alternate Path from US-West-1 to PHX

How do we ensure that the traffic will take a secondary path only if the primary path is disrupted as in Figure 7?

Solution: Longest Prefix Match

There are multiple ways to do BGP traffic engineering with Direct Connect. In this example, we use the Longest Prefix Match (LPM) routing algorithm to influence path selection to and from AWS. In LPM, when several routes in the route table match a packet’s destination, the router selects the one with the longest (most specific) IP prefix and sends the packet out of that route’s outbound interface.
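The lookup rule can be sketched in a few lines of Python using the standard `ipaddress` module. This is only an illustration: the route entries and interface names (`eth0`, `eth1`, `eth2`) are hypothetical, and real routers use tries or hardware lookups rather than a linear scan.

```python
import ipaddress

def longest_prefix_match(routes, destination):
    """Return the (prefix, next_hop) pair with the most specific matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # no route covers this destination
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop

# Hypothetical route table: all three entries match 172.16.0.101.
routes = [
    ("0.0.0.0/0", "eth0"),
    ("172.16.0.0/16", "eth1"),
    ("172.16.0.0/17", "eth2"),
]
print(longest_prefix_match(routes, "172.16.0.101"))  # ('172.16.0.0/17', 'eth2')
```

The /17 entry wins because it has the most prefix bits in common with the destination, even though the /16 and the default route also match.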

The data centers in our diagram use Internal BGP (iBGP) to exchange routes with each other and External BGP (eBGP) to exchange routes with AWS over Direct Connect. The customer data centers are in their own autonomous system (ASN 65000). The Direct Connect Gateways in our example are assigned separate ASNs 65001 and 65002 respectively.

We can control the specificity of the routes in the US-West-1 Transit Gateway route table by controlling the length of the IP prefixes advertised from the data center routers to AWS over Direct Connect. We configure BGP on the data center routers to advertise specific IP prefixes, which are then propagated to the Transit Gateway associated with the Direct Connect Gateway. The Transit Gateway in turn chooses the route entry in its route table with the longest prefix, that is, the most specific match.

AWS to Data Center

Going back to our example, let’s assume a resource in US-West-1 needs to reach a server in the PHX data center with destination IP address 172.16.0.101. The Transit Gateway route table in US-West-1 starts with this entry as shown in Table 1:

Destination (Prefix)    Target (Next Hop)
172.16.0.0/16           Direct Connect Gateway-attach | 65001

Table 1: US-West-1 Transit Gateway Route Table

To send traffic to 172.16.0.0/16 in PHX, we route to the local Direct Connect in PHX. This route to 172.16.0.0/16 is dynamically advertised from the PHX router into the US-West-1 Transit Gateway route table via BGP. If we want an alternate route, we enter a second static route with the Transit Gateway peering attachment as the next hop as shown in Table 2:

Destination (Prefix)    Target (Next Hop)
172.16.0.0/16           Direct Connect Gateway-attach | 65001
172.16.0.0/16           Transit Gateway-attach | Transit Gateway-us-east-1

Table 2: US-West-1 Transit Gateway Route Table

We now have two routes to the same IP prefix. The first is a dynamic route advertised through Direct Connect Gateway. The second is a static route to the other Transit Gateway via peering attachment. Normally, the static route would take priority over the dynamic route. However, in our case, we want to prioritize the path via Direct Connect Gateway over Transit Gateway. To do this, we need to advertise more specific routes from the customer router in PHX as shown in Figure 8.

Diagram depicting advertising more specific routes (/17) from PHX

Figure 8: Advertise more specific routes (/17) from PHX
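The split of the /16 into two /17 prefixes can be checked with Python's standard `ipaddress` module. This is a small sketch for verification only; the variable names are illustrative.

```python
import ipaddress

original = ipaddress.ip_network("172.16.0.0/16")

# Adding one prefix bit splits the /16 into its two /17 halves.
halves = list(original.subnets(prefixlen_diff=1))
print([str(n) for n in halves])  # ['172.16.0.0/17', '172.16.128.0/17']

# Together the halves cover exactly the same addresses as the /16.
same_range = (
    sum(n.num_addresses for n in halves) == original.num_addresses
    and all(n.subnet_of(original) for n in halves)
)
print(same_range)  # True

# The example server falls in the lower half.
server = ipaddress.ip_address("172.16.0.101")
owner = next(n for n in halves if server in n)
print(owner)  # 172.16.0.0/17
```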

This would result in the following routes in the US-West-1 Transit Gateway Route Table as shown in Table 3:

Destination (Prefix)    Target (Next Hop)
172.16.0.0/17           Direct Connect Gateway-attach | 65001
172.16.128.0/17         Direct Connect Gateway-attach | 65001
172.16.0.0/16           Transit Gateway-attach | Transit Gateway-us-east-1

Table 3: US-West-1 Transit Gateway Route Table

Here we have replaced the 172.16.0.0/16 route over the Direct Connect Gateway attachment with two more specific /17 routes. Together, these /17 IP prefixes advertised from PHX cover the same address range as the original /16 prefix, including our 172.16.0.101 destination.

The third route entry, to the 172.16.0.0/16 prefix by way of the Transit Gateway peering attachment, also matches our destination of 172.16.0.101. But it has a shorter IP prefix and is less specific. Therefore, the US-West-1 Transit Gateway will prefer the more specific /17 paths via the Direct Connect Gateway as shown in Figure 9:

Diagram depicting using Longest Prefix Match to Select Primary Path

Figure 9: Use Longest Prefix Match to Select Primary Path
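The difference between Table 2 and Table 3 can be modeled with a simplified route selection function. This sketch assumes only two rules, longest prefix first and then static over propagated for equal prefixes; it does not model the rest of the Transit Gateway route evaluation order. The `Route` type and the target labels are hypothetical.

```python
import ipaddress
from typing import NamedTuple, Optional

class Route(NamedTuple):
    prefix: str
    target: str
    static: bool  # True for manually entered routes, False for BGP-propagated ones

def select_route(table, destination) -> Optional[Route]:
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in table if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    # Longest prefix wins; a static route breaks ties between equal prefixes.
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen, r.static),
    )

table_2 = [
    Route("172.16.0.0/16", "dxgw-attach", static=False),
    Route("172.16.0.0/16", "tgw-peering-us-east-1", static=True),
]
table_3 = [
    Route("172.16.0.0/17", "dxgw-attach", static=False),
    Route("172.16.128.0/17", "dxgw-attach", static=False),
    Route("172.16.0.0/16", "tgw-peering-us-east-1", static=True),
]

print(select_route(table_2, "172.16.0.101").target)  # tgw-peering-us-east-1
print(select_route(table_3, "172.16.0.101").target)  # dxgw-attach
```

With equal /16 prefixes the static peering route is selected, which is why the more specific /17 advertisements are needed to make Direct Connect the primary path.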

What would happen if the Direct Connect connection in PHX becomes disrupted as shown in Figure 10?

Diagram depicting PHX Direct Connect Disruption

Figure 10: PHX Direct Connect Disruption

If that were to occur, the PHX router would no longer be able to send its BGP announcements toward US-West-1. The /17 routes it had previously advertised would be withdrawn, and the US-West-1 Transit Gateway would fail over traffic to the next matching route, 172.16.0.0/16, over the Transit Gateway peering attachment as shown in Figure 11:

Diagram depicting Failover to Secondary Path

Figure 11: Failover to Secondary Path

Note that in the above scenario (Figure 11) for this traffic path to work, the ATL data center must also advertise the 172.16.0.0/16 prefix to US-East-1 via BGP.
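The failover behavior can be simulated by removing the routes learned through the Direct Connect Gateway attachment and repeating the lookup. This is a toy model: the `withdraw` helper and the target labels are hypothetical and stand in for BGP route withdrawal.

```python
import ipaddress

def lookup(table, destination):
    """Return the target of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in table
        if dest in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1] if matches else None

def withdraw(table, target):
    """Drop every route learned through `target`, as when its BGP session goes down."""
    return [(prefix, t) for prefix, t in table if t != target]

tgw_us_west_1 = [
    ("172.16.0.0/17", "dxgw-attach"),
    ("172.16.128.0/17", "dxgw-attach"),
    ("172.16.0.0/16", "tgw-peering"),
]

before = lookup(tgw_us_west_1, "172.16.0.101")
after = lookup(withdraw(tgw_us_west_1, "dxgw-attach"), "172.16.0.101")
print(before, "->", after)  # dxgw-attach -> tgw-peering
```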

Data Center to AWS

We can also use LPM to influence traffic in the reverse direction. Let’s suppose we want traffic originating from PHX to reach a destination resource in a US-West-1 VPC (10.0.0.0/16), with the local Direct Connect as the primary path and MPLS as the secondary. The destination resource has the IP address 10.0.0.101.

In this case, we want the routes advertised to PHX through its Direct Connect Gateway to be more specific than the routes learned across MPLS. To do this, we go to the AWS console and enter 10.0.0.0/17 and 10.0.128.0/17 as the IP prefixes to be advertised from AWS to PHX via Allowed prefixes under the Direct Connect gateway association, as shown in Figure 12:

Diagram depicting using LPM to prefer local Direct Connect as Primary to US-West-1

Figure 12: Use LPM to prefer local Direct Connect as Primary to US-West-1
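Because a mistyped prefix changes what is advertised, it can help to validate the allowed prefixes before entering them. The sketch below is a hypothetical helper, not part of any AWS tooling. The `{"cidr": ...}` entry shape is chosen to resemble what the Direct Connect API accepts for allowed prefixes (for example the `addAllowedPrefixesToDirectConnectGateway` parameter of `update_direct_connect_gateway_association` in boto3); treat that mapping as an assumption to verify, since this post configures the prefixes in the console.

```python
import ipaddress

def allowed_prefixes(vpc_cidr, cidrs):
    """Validate prefixes against the VPC range and return {"cidr": ...} entries."""
    vpc = ipaddress.ip_network(vpc_cidr)
    entries = []
    for cidr in cidrs:
        if "/" not in cidr:
            # ip_network() would silently treat a bare address as a /32.
            raise ValueError(f"{cidr!r} has no prefix length")
        net = ipaddress.ip_network(cidr)  # strict mode rejects host bits
        if not net.subnet_of(vpc):
            raise ValueError(f"{cidr!r} is outside {vpc_cidr}")
        entries.append({"cidr": str(net)})
    return entries

print(allowed_prefixes("10.0.0.0/16", ["10.0.0.0/17", "10.0.128.0/17"]))
# [{'cidr': '10.0.0.0/17'}, {'cidr': '10.0.128.0/17'}]

try:
    allowed_prefixes("10.0.0.0/16", ["10.0.0.0/17", "10.0.128.17"])
except ValueError as err:
    print("rejected:", err)
```

The explicit check for a missing prefix length matters because a bare address such as 10.0.128.17 is otherwise parsed as a valid /32 network.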

The resulting route table on the PHX customer router would then look as shown in Table 4:

Destination (Prefix)    Target (Next Hop)
10.0.0.0/17             Direct Connect
10.0.128.0/17           Direct Connect
10.0.0.0/16             MPLS

Table 4: PHX Route Table

Because the routes using the /17 prefix are more specific, the PHX router will use its local Direct Connect as its primary path to reach the 10.0.0.101 destination in US-West-1. If the local Direct Connect in PHX is disrupted, the PHX router will fail over to the next matching path (10.0.0.0/16) over MPLS, and then through US-East-1 to reach its destination in US-West-1 as shown in Figure 13:

Diagram depicting Failover to MPLS to reach US-West-1

Figure 13: Failover to MPLS to reach US-West-1

The return traffic from US-West-1 would follow the same path in the reverse direction as shown in Figure 14:

Diagram depicting Return Traffic from US-West-1 to PHX

Figure 14: Return Traffic from US-West-1 to PHX
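To see that both directions stay on the same path, we can run the lookup on the PHX route table and on the US-West-1 Transit Gateway route table together. This is a simplified sketch: the `paths` function and next-hop labels are hypothetical, and a Direct Connect disruption is modeled as the /17 routes disappearing from both tables.

```python
import ipaddress

def lookup(table, destination):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in table
        if dest in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1] if matches else None

def paths(dx_up):
    """Return (forward, return) next hops for PHX <-> US-West-1 traffic."""
    phx = [("10.0.0.0/16", "MPLS")]
    tgw = [("172.16.0.0/16", "TGW peering")]
    if dx_up:
        # The /17 routes exist only while the PHX Direct Connect is healthy.
        phx += [("10.0.0.0/17", "Direct Connect"), ("10.0.128.0/17", "Direct Connect")]
        tgw += [("172.16.0.0/17", "Direct Connect"), ("172.16.128.0/17", "Direct Connect")]
    return lookup(phx, "10.0.0.101"), lookup(tgw, "172.16.0.101")

print(paths(dx_up=True))   # ('Direct Connect', 'Direct Connect')
print(paths(dx_up=False))  # ('MPLS', 'TGW peering')
```

In the healthy case both directions use the local Direct Connect. In the failure case both directions use the alternate path through ATL and US-East-1, so the flow remains symmetric.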

One thing to remember is that BGP is used to advertise routes between AWS and the data centers, but not between the Transit Gateway peers. Therefore, static routes need to be manually entered into each Transit Gateway route table in order to route traffic between Transit Gateway peers.
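As a rough sketch, the static routes for this example could be described as data and turned into request parameters. The resource IDs below are placeholders, and the us-east-1 entry for 10.0.0.0/16 is inferred from the failover path described above rather than stated explicitly in the tables. The parameter names follow the EC2 CreateTransitGatewayRoute API; the actual call is left as a comment so the snippet runs without AWS credentials.

```python
# Placeholder IDs; replace with the route table and peering attachment IDs
# from your own account.
PEERING_ROUTES = [
    {   # US-West-1 reaches PHX through US-East-1 on the secondary path.
        "region": "us-west-1",
        "route_table": "tgw-rtb-WEST-EXAMPLE",
        "attachment": "tgw-attach-WEST-EXAMPLE",
        "cidr": "172.16.0.0/16",
    },
    {   # US-East-1 forwards traffic for the US-West-1 VPC across the peering.
        "region": "us-east-1",
        "route_table": "tgw-rtb-EAST-EXAMPLE",
        "attachment": "tgw-attach-EAST-EXAMPLE",
        "cidr": "10.0.0.0/16",
    },
]

def to_request(route):
    """Build keyword arguments in the shape of EC2 CreateTransitGatewayRoute."""
    return {
        "DestinationCidrBlock": route["cidr"],
        "TransitGatewayRouteTableId": route["route_table"],
        "TransitGatewayAttachmentId": route["attachment"],
    }

for route in PEERING_ROUTES:
    print(route["region"], to_request(route))
    # With boto3 installed and credentials configured, this could be sent as:
    # boto3.client("ec2", region_name=route["region"]) \
    #     .create_transit_gateway_route(**to_request(route))
```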

Summary

Global organizations use hybrid connectivity models to connect on-premises data centers to cloud resources around the world. These hybrid networks can provide multiple paths between destinations using both Direct Connect and private WAN links such as MPLS. In this post, we showed how Longest Prefix Match can be used to influence which network path traffic will take between AWS and on-premises destinations. We showed how a route to an IP prefix, for example 172.16.0.0/16, can be broken into two more specific prefixes, 172.16.0.0/17 and 172.16.128.0/17. By advertising more specific routes on the preferred path, we can influence the traffic path decisions.

Nolan Chen

Nolan Chen is a Solutions Architect at AWS, where he helps customers build innovative and cost-efficient solutions using the cloud. Prior to AWS, Nolan specialized in data security and helping customers deploy high performing wide area networks. In his spare time, Nolan enjoys reading history books while thinking about what the future might hold.

 

Rizwan Mushtaq

Rizwan is a Senior Solutions Architect at AWS, where he helps customers design innovative, resilient, and cost-effective solutions using various AWS services. He holds an MS in Electrical Engineering from Wichita State University.