Hybrid security inspection architectures with AWS Cloud WAN and AWS Direct Connect
AWS Cloud WAN makes it easy to build and operate wide area networks that connect your data centers and branch offices, as well as your Amazon Virtual Private Clouds (VPCs). With Cloud WAN, you connect to AWS through your choice of local network providers, then use a central dashboard and network policies to create a unified network that connects your locations and network types. For hybrid connectivity, you can also use AWS Transit Gateway and AWS Direct Connect to securely connect your cloud resources with your on-premises data centers. Transit Gateway connects your VPCs and on-premises networks through a central hub, acting as a highly scalable cloud router. Direct Connect, in turn, provides a private network connection that follows the shortest path to your AWS resources over the AWS global network.
In a previous blog post, Inspecting network traffic between Amazon VPCs with AWS Cloud WAN, we covered centralized architectures for native East-West (VPC-to-VPC) inspection both within and across Regions with Cloud WAN. In this blog post (part 2) we will cover hybrid traffic flows and security inspection architectures with Cloud WAN.
Network segments play an important role in how traffic gets routed through the core network and between Cloud WAN attachments. Segments are dedicated routing domains, so by default only attachments within the same segment can communicate. Using network segments, you can divide your global network into separate isolated networks. For example, you can isolate traffic between production and development, or between business units.
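As a minimal sketch of how segments appear in a core network policy, the following Python builds a policy document with two isolated segments and prints it as JSON. The segment names, ASN range, and edge locations are illustrative assumptions, and the field names follow our understanding of the Cloud WAN policy document format, so check them against the Cloud WAN documentation before use.

```python
import json


def build_segments(names):
    """Return the `segments` list of a core network policy document."""
    return [
        {
            "name": name,
            # Attachments inside one segment can reach each other by default.
            "isolate-attachments": False,
            "require-attachment-acceptance": False,
        }
        for name in names
    ]


policy = {
    "version": "2021.12",
    "core-network-configuration": {
        "asn-ranges": ["64512-65534"],       # assumed private ASN range
        "edge-locations": [
            {"location": "us-east-1"},       # assumed Region 1
            {"location": "eu-west-1"},       # assumed Region 2
        ],
    },
    # Two segments and no sharing between them: production and
    # development traffic stay in separate routing domains.
    "segments": build_segments(["production", "development"]),
}

print(json.dumps(policy, indent=2))
```

Because no sharing is defined between the two segments, attachments mapped to `production` and `development` would not exchange routes in this sketch.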
To get the most out of this post, you should be familiar with the following Cloud WAN components:
- Global network
- Core network
- Core network policy
- Core Network Edge
- Network segment
Hybrid Traffic Flow Architectures with Cloud WAN and Direct Connect
Before we look at the hybrid security inspection architectures, it is important to understand the network traffic flows between on-premises and VPCs connected using Cloud WAN, Transit Gateway, and Direct Connect. Today, in order to establish hybrid connectivity using Direct Connect between on-premises and AWS Cloud, the Direct Connect link must terminate at a Transit Gateway that is then peered with Cloud WAN. Peering connections allow you to interconnect your core network edge with a Transit Gateway in the same Region. Peering connections between Cloud WAN and transit gateways support dynamic routing with automatic exchange of routes using Border Gateway Protocol (BGP). You can use route table attachments on the peering connection to selectively exchange routes between a specific transit gateway route table and a Cloud WAN network segment for end-to-end segmentation and network isolation. The peering connection supports policy-based routing to implement segment isolation across peering connections. Using this capability, routes are selectively propagated between a route table in the transit gateway and a core network segment. Refer to the Cloud WAN documentation on peering and routing for further details.
For customers accessing critical workloads over Direct Connect, we strongly recommend configuring for maximum resiliency with redundant Direct Connect connections terminating on separate devices in more than one colocation facility. At the least, consider having one connection each at two different colocation facilities to achieve high resiliency. In this blog, we take high resiliency configuration as an example and explain traffic flows for the following two scenarios:
- Active/Active: Both on-premises locations are using Direct Connect to connect with a single Direct Connect gateway (DXGW), and advertising the same prefixes and BGP attributes toward AWS Cloud WAN through Transit Gateway (peered with Cloud WAN). No BGP traffic engineering is applied.
- Active/Passive: Both on-premises locations are connected using Direct Connect to a single DXGW and advertising the same prefixes. By using BGP traffic engineering, one path is configured as active/primary and the other path as passive/secondary toward on-premises locations through the Transit Gateway (peered with Cloud WAN) and DXGW.
Note that setup and configuration of Transit Gateway and Direct Connect to establish hybrid connectivity, and Cloud WAN peering with Transit Gateway, are important topics, but we do not cover them in this post.
Note that a DXGW is a global resource that can terminate virtual interfaces from Direct Connect locations homed to multiple AWS Regions. There are a few use cases where you could use more than one DXGW, such as:
- To hairpin traffic through a device in your data center for VPC-to-VPC connectivity.
- To hairpin traffic through a Transit Gateway for Data Center-to-Data Center connectivity. This can now also be done using the Direct Connect SiteLink feature.
- To work around some Direct Connect quotas, for example, Transit gateways per AWS Direct Connect gateway.
BGP traffic engineering (achieving Active/Active or Active/Passive) between virtual interfaces works when they terminate at a single DXGW and therefore, we use a single DXGW in this blog to show the traffic flows. To learn more about BGP traffic engineering over Direct Connect, refer to the Creating active/passive BGP connections over AWS Direct Connect blog post.
Scenario 1 – Both hybrid connections are Active/Active
In this scenario, both Data Centers 1 and 2 are connected via Direct Connect and advertising the same 0.0.0.0/0 prefix. No BGP traffic engineering has been applied to either Direct Connect connection. We show this in the following diagram (Figure 1a).
(A) Each VPC advertises its local VPC CIDR to the Cloud WAN CNE in its Region through its VPC attachment (Prod VPC 1 in Region 1: 10.1.0.0/16 and Prod VPC 2 in Region 2: 172.20.0.0/16). Both Prod VPC 1 and Prod VPC 2 are attached to the Production Segment. Both VPC route tables point to Cloud WAN as the next hop.
(B) Data Center 1 connected to Direct Connect Location 1 (Associated Home Region) and Data Center 2 connected to Direct Connect Location 2 (Associated Home Region) are both advertising a default route (0.0.0.0/0) to the DXGW.
(C) Both Transit Gateways 1 and 2 are peered with Cloud WAN and attached to the Hybrid Segment.
(D) The Cloud WAN Production and Hybrid Segments are shared with each other via segment sharing. As a result, each segment learns the routes of the other.
(E) Since the Production Segment is shared with the Hybrid Segment, to which both Transit Gateways 1 and 2 are attached, both Transit Gateways learn the Prod VPC 1 (10.1.0.0/16, Region 1) and Prod VPC 2 (172.20.0.0/16, Region 2) CIDRs through their respective local Region Cloud WAN peering attachments.
(F) Both Transit Gateways 1 and 2 learn the on-premises prefix 0.0.0.0/0 through DXGW over BGP.
(G) The DXGW Allowed Prefix List for the Transit Gateway 1 association is set up to include the Prod VPC 1 CIDR 10.1.0.0/16 in Region 1. Similarly, the DXGW Allowed Prefix List for the Transit Gateway 2 association is set up to include the Prod VPC 2 CIDR 172.20.0.0/16 in Region 2.
(H) DXGW advertises both Prod VPC 1 10.1.0.0/16 and Prod VPC 2 172.20.0.0/16 CIDRs back to on-premises via BGP.
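Steps (G) and (H) can be illustrated with a small toy model: the prefixes that reach on-premises are the union of the allowed prefixes configured on each Transit Gateway association. The dictionary keys and the function name below are hypothetical, chosen only for this sketch.

```python
import ipaddress

# Allowed prefixes configured on each Transit Gateway association (step G).
# The association names are illustrative labels, not real resource IDs.
ALLOWED_PREFIXES = {
    "transit-gateway-1": ["10.1.0.0/16"],
    "transit-gateway-2": ["172.20.0.0/16"],
}


def prefixes_advertised_on_premises(allowed):
    """Return the sorted union of allowed prefixes, modeling what the
    DXGW advertises to on-premises routers over BGP (step H)."""
    networks = {
        ipaddress.ip_network(prefix)
        for prefix_list in allowed.values()
        for prefix in prefix_list
    }
    return [str(network) for network in sorted(networks)]


print(prefixes_advertised_on_premises(ALLOWED_PREFIXES))
```

In this model, a VPC CIDR that is missing from every allowed prefix list would never be advertised to on-premises, even if the Transit Gateway has learned it from Cloud WAN.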
The following steps describe a packet walkthrough (Figure 1b):
1) Traffic from Prod VPC 1 in Region 1 (10.1.0.0/16) destined to the on-premises CIDR 0.0.0.0/0 is routed via the CNE 1 attachment in Region 1 to Cloud WAN. Cloud WAN routes the traffic to Transit Gateway 1 using the preferred local Region 1 peering attachment.
2) Similarly, traffic from Prod VPC 2 in Region 2 (172.20.0.0/16) destined to the on-premises CIDR 0.0.0.0/0 is routed via the CNE 2 attachment in Region 2 to Cloud WAN. Cloud WAN routes the traffic to Transit Gateway 2 using the preferred local Region 2 peering attachment. Since the Cloud WAN Production Segment is shared with the Hybrid Segment, Transit Gateways 1 and 2 both learn the Prod VPC 1 CIDR 10.1.0.0/16 and the Prod VPC 2 CIDR 172.20.0.0/16.
3) Both Transit Gateways 1 and 2 are attached to the same DXGW. Transit Gateway 1 sends CIDR 10.1.0.0/16 and Transit Gateway 2 sends 172.20.0.0/16 to the DXGW.
4) DXGW will route traffic from Prod VPC 1 Region 1 CIDR 10.1.0.0/16 to Data Center 1 0.0.0.0/0 connected to Direct Connect Location 1 (Associated Home Region) (Green dashed line in Figure 1b).
5) DXGW will route traffic from Prod VPC 2 Region 2 CIDR 172.20.0.0/16 to Data Center 2 0.0.0.0/0 connected to Direct Connect Location 2 (Associated Home Region) (Green dashed line in Figure 1b).
Return traffic follows a similar path in the opposite direction (Red dotted line in Figure 1b).
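The walkthrough above relies on Cloud WAN preferring the peering attachment that is local to the source Region. The following is a simplified sketch of that one tie-breaking rule, not the full Cloud WAN route evaluation order. The attachment and Region labels are hypothetical, and falling back to the first next hop when no local one exists is an assumption made only to keep the example complete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NextHop:
    attachment: str
    region: str


# Both peering attachments offer the on-premises default route (0.0.0.0/0).
DEFAULT_ROUTE_NEXT_HOPS = [
    NextHop("transit-gateway-1-peering", "region-1"),
    NextHop("transit-gateway-2-peering", "region-2"),
]


def select_next_hop(source_region, next_hops):
    """Prefer a next hop in the source Region.

    Assumption: when no next hop is local, this sketch simply returns
    the first candidate; the real service applies further criteria.
    """
    for hop in next_hops:
        if hop.region == source_region:
            return hop
    return next_hops[0]


for region in ("region-1", "region-2"):
    chosen = select_next_hop(region, DEFAULT_ROUTE_NEXT_HOPS)
    print(f"{region} -> {chosen.attachment}")
```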
Scenario 2 – Hybrid connections are Active/Passive
In this scenario, both Data Center 1 and 2 are connected using Direct Connect and advertising a default route (0.0.0.0/0). BGP traffic engineering has been applied to Data Center 1 to make it the preferred path using BGP Communities. We show this in the following diagram (Figure 2a).
Steps (A) and (C) through (H) from Scenario 1 apply unchanged to Scenario 2 (Figure 2a). The only difference is in step (B), where we apply BGP traffic engineering using communities.
(B) Data Center 1 is connected to Direct Connect Location 1 and advertises the 0.0.0.0/0 prefix to the DXGW with the 7224:7300 (high preference) BGP community. Similarly, Data Center 2 is connected to Direct Connect Location 2 and advertises the 0.0.0.0/0 prefix to the DXGW with the 7224:7100 (low preference) BGP community. This makes the Data Center 1 path active and the Data Center 2 path passive for traffic leaving AWS toward on-premises.
The following steps describe a packet walkthrough (Figure 2b):
Traffic follows the same path as described in the Active/Active section until it reaches the Direct Connect gateway. When traffic arrives at the Direct Connect gateway:
1) Because the Data Center 1 path carries the 7224:7300 (high preference) BGP community and is therefore preferred, the DXGW routes traffic from both Prod VPC 1 (Region 1, 10.1.0.0/16) and Prod VPC 2 (Region 2, 172.20.0.0/16) to Data Center 1 (0.0.0.0/0), which is connected to Direct Connect Location 1 (Green dashed lines in Figure 2b). The Data Center 2 path (0.0.0.0/0) remains passive because of the 7224:7100 (low preference) BGP community applied to it (Gray line in Figure 2b).
2) If the Data Center 1 network path becomes unavailable, the DXGW uses the passive/backup path to Data Center 2, which carries the 7224:7100 (low preference) BGP community (Gray line in Figure 2b).
Return traffic follows a similar path in the opposite direction (Red dotted line in Figure 2b).
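The active/passive behavior can be sketched as a toy path selection over the Direct Connect local preference communities (7224:7100 low, 7224:7200 medium, 7224:7300 high). The `Path` class and `active_path` function are hypothetical names for illustration. The model ignores every other BGP attribute, so treat it as a sketch of the preference and failover logic only.

```python
from dataclasses import dataclass

# Direct Connect local preference communities, ranked low to high.
LOCAL_PREFERENCE_RANK = {"7224:7100": 1, "7224:7200": 2, "7224:7300": 3}


@dataclass(frozen=True)
class Path:
    name: str
    community: str
    available: bool = True


def active_path(paths):
    """Return the available path with the highest preference, or None."""
    candidates = [path for path in paths if path.available]
    if not candidates:
        return None
    return max(candidates, key=lambda path: LOCAL_PREFERENCE_RANK[path.community])


normal = [Path("Data Center 1", "7224:7300"), Path("Data Center 2", "7224:7100")]
failed = [Path("Data Center 1", "7224:7300", available=False),
          Path("Data Center 2", "7224:7100")]

print(active_path(normal).name)   # preferred path while both are up
print(active_path(failed).name)   # backup path after Data Center 1 fails
```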
Hybrid Inspection Architecture
Now that we understand how network traffic gets routed in a hybrid network, let’s look at how to inspect and protect your traffic as it moves between your premises and Amazon VPCs.
The diagrams that follow (Figure 3a and Figure 3b) show hybrid inspection architectures. We built this architecture on top of Scenario 1: Active-Active Configuration. While we focus on inspecting and protecting the traffic in an Active-Active scenario, the steps remain the same for Active-Passive configuration.
Since the intent here is to inspect the traffic flowing between on-premises resources and Amazon VPCs, besides Production Segment and Hybrid Segment, we create an Inspection Segment. We also create Inspection VPC A and Inspection VPC B, one for each Region, Region 1 and 2 respectively. These Inspection VPCs are then mapped to the global Inspection Segment.
To allow on-premises resources to communicate with Amazon VPC resources through a firewall, using the core network policy parameter segment-action, we share the Inspection Segment with Production Segment and Hybrid Segment. We show this as (1) Segment Sharing in Figure 3a and Figure 3b.
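A minimal sketch of the corresponding `segment-actions` entry is shown below as a Python structure that serializes to policy JSON. The segment names are assumptions, and the field names reflect our understanding of the core network policy format. The helper encodes a further assumption, that sharing works in both directions between the named segment and each entry in `share-with` but is not transitive between those entries. Confirm both against the Cloud WAN documentation.

```python
import json

# Share the Inspection Segment with the Production and Hybrid Segments.
segment_actions = [
    {
        "action": "share",
        "mode": "attachment-route",
        "segment": "inspection",
        "share-with": ["production", "hybrid"],
    }
]


def directly_shared_with(segment, actions):
    """Return the segments that exchange routes directly with `segment`.

    Assumption: a share is two-way between `segment` and each
    `share-with` entry, and is not transitive between those entries.
    """
    peers = set()
    for action in actions:
        if action["action"] != "share":
            continue
        if action["segment"] == segment:
            peers.update(action["share-with"])
        elif segment in action["share-with"]:
            peers.add(action["segment"])
    return sorted(peers)


print(json.dumps({"segment-actions": segment_actions}, indent=2))
print(directly_shared_with("production", segment_actions))
```

Under these assumptions, the Production and Hybrid Segments each exchange routes with the Inspection Segment only, which is what steers hybrid traffic through the firewalls.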
The following steps describe a forward traffic packet walkthrough when instance 1 in Prod VPC 1 communicates with an on-premises resource:
(A) When Instance 1 in Prod VPC 1 starts a connection to an on-premises resource, it does a VPC (App Subnet) route table lookup. The packet matches the default route entry with the Core Network ARN as the target and the packet gets routed to the Core Network.
(B) When the packet arrives at the core network, because Prod VPC 1 is associated with the Production Segment, it does a Production Segment Route Table lookup. The packet matches the default entry with two next hops: Inspection VPC A attachment and Inspection VPC B attachment. Inspection VPC A attachment is preferred (local to the Region) and the packet gets routed to Inspection VPC A.
(C) When the packet arrives at the Inspection VPC A attachment, it does a VPC (CWAN Subnet) route table lookup. The packet matches the default route with Firewall Endpoint 1 as the target and the packet gets routed to a firewall, through the firewall’s endpoint, for inspection.
(D) The firewall inspects the traffic, compares it to its security policy, and allows it through. The firewall routes the packet back to the firewall’s endpoint, where it does a VPC (Firewall Subnet) route table lookup. The packet matches the default route entry with the Core Network ARN as the target, and the packet gets routed to the Core Network.
(E) When the packet arrives at the core network, because Inspection VPC A is associated with the Inspection Segment, it does an Inspection Segment Route Table lookup. The packet matches the on-premises CIDR entry with two next hops: Transit Gateway 1 attachment and Transit Gateway 2 attachment. Transit Gateway 1 Attachment is preferred (local to the Region) and the packet gets routed to Transit Gateway 1 Attachment.
(F) When the packet arrives at Transit Gateway 1, it does a Transit Gateway 1 Hybrid Route Table lookup. The packet matches the on-premises CIDR entry with the Direct Connect gateway as the target, and the packet gets routed to the Direct Connect gateway.
(G) When the packet arrives at the Direct Connect gateway, there are two next hops: Data Center 1 and Data Center 2. Data Center 1 is preferred because it is local to the Region.
Return traffic (shown in Figure 3b below) follows the same path in the opposite direction.
When Instance 3 in Prod VPC 2 in Region 2 communicates with an on-premises resource, steps (A) through (G) remain the same, except that the packet traverses the firewalls in Inspection VPC B and Transit Gateway 2 in its own Region.
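The forward path in steps (A) through (G) can be summarized as a chain of route table lookups. The following toy model maps each location of the packet to the next hop chosen there and traces the path for Instance 1 in Region 1. The hop labels are taken from the walkthrough, while the dictionary and function names are hypothetical.

```python
# Each entry maps "where the packet is" to "where the matching route sends it".
FORWARD_LOOKUPS_REGION_1 = {
    "Prod VPC 1 (App Subnet)": "Core Network (Production Segment)",          # (A)
    "Core Network (Production Segment)": "Inspection VPC A (CWAN Subnet)",   # (B)
    "Inspection VPC A (CWAN Subnet)": "Firewall Endpoint 1",                 # (C)
    "Firewall Endpoint 1": "Core Network (Inspection Segment)",              # (D)
    "Core Network (Inspection Segment)": "Transit Gateway 1",                # (E)
    "Transit Gateway 1": "Direct Connect gateway",                           # (F)
    "Direct Connect gateway": "Data Center 1",                               # (G)
}


def trace(start, lookups):
    """Follow next hops from `start` until no further lookup applies."""
    path = [start]
    seen = {start}
    while path[-1] in lookups:
        next_hop = lookups[path[-1]]
        if next_hop in seen:
            raise ValueError(f"routing loop detected at {next_hop}")
        path.append(next_hop)
        seen.add(next_hop)
    return path


forward = trace("Prod VPC 1 (App Subnet)", FORWARD_LOOKUPS_REGION_1)
# Return traffic retraces the same hops in reverse order, so the same
# firewall sees both directions of the flow.
return_path = list(reversed(forward))

print(" -> ".join(forward))
```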
Considerations
- In order to keep the traffic flows symmetric for stateful network appliances, you must enable Cloud WAN appliance mode on the inspection VPC attachment for hybrid security inspection.
- Since Prod VPC attachments mapped to the Production Segment are not isolated and Prod VPCs can communicate with each other without traversing the firewall, you only need one Inspection Segment. We show this in Figure 3a and Figure 3b. For a use case that requires east-west traffic inspection between VPCs that are mapped to the same or different segments, refer to the Inspecting network traffic between Amazon VPCs with AWS Cloud WAN blog post.
- Cloud WAN, Transit Gateway and Direct Connect support both IPv4 and IPv6.
- As a best practice, use a separate subnet with a small CIDR, for example /28, for each core network attachment so that you have more addresses available to use with other VPC resources. Keep the inbound and outbound network ACLs associated with the core network attachment subnets open.
- As a best practice and to achieve deterministic routing between Cloud WAN and on-premises, use a separate Transit Gateway per Region to terminate the Direct Connect connections.
- Although the Cloud WAN Core Network Edge (CNE) is represented by a single endpoint per subnet/Availability Zone (AZ) in our diagrams, it is highly available and based on AWS Hyperplane. The same is true for the Transit Gateway.
- Inspection architectures shown in this post equally apply to security appliances deployed behind Gateway Load Balancer (GWLB) or AWS Network Firewall. For simplicity, the diagrams in this post only show the endpoints.
- Refer to Cloud WAN Route evaluation, Direct Connect Routing policies and BGP communities, and Transit Gateway Route evaluation in order to understand how each service evaluates routes and in what order.
- Refer to the Cloud WAN, Transit Gateway, and Direct Connect quota documentation to understand service quotas as they apply to each service.
- A Direct Connect gateway is a globally available resource. You can create the Direct Connect gateway in any Region and access it from all other Regions, except AWS China Regions.
- We recommend redundant Direct Connect connections from at least two diverse colocation facilities for high resiliency. Refer to AWS Direct Connect Resiliency Recommendations for details.
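Two of the considerations above, small dedicated attachment subnets and appliance mode on the inspection VPC attachment, can be sketched in a few lines of Python. The VPC CIDR is an assumed example value. The `ApplianceModeSupport` key is meant to mirror the VPC attachment options in the Network Manager API and should be checked against the API reference. The sketch only builds values locally and makes no AWS calls.

```python
import ipaddress
from itertools import islice


def attachment_subnets(vpc_cidr, count):
    """Carve the first `count` /28 subnets out of the VPC CIDR, one per
    Availability Zone used by the core network attachment."""
    network = ipaddress.ip_network(vpc_cidr)
    return [str(subnet) for subnet in islice(network.subnets(new_prefix=28), count)]


def inspection_attachment_options():
    """Options for the inspection VPC attachment.

    Appliance mode keeps both directions of a flow on the same
    Availability Zone, which stateful firewalls require.
    """
    return {"ApplianceModeSupport": True}


# Assumed inspection VPC CIDR, for illustration only.
print(attachment_subnets("100.64.0.0/24", 2))
print(inspection_attachment_options())
```

The small /28 ranges leave the rest of the VPC CIDR free for firewall endpoints and other resources.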
In this blog post, we discussed hybrid traffic flows with Direct Connect, along with security inspection architecture patterns for hybrid connectivity using Cloud WAN, Transit Gateway, and Direct Connect. Cloud WAN makes it easy to build and operate wide area networks that connect your data centers and branch offices, as well as your Amazon Virtual Private Clouds (VPCs). With Cloud WAN, you connect to AWS through your choice of local network providers, then use a central dashboard and network policies to create a unified network that connects your locations and network types.
For more information, refer to the following resources:
- AWS Cloud WAN documentation
- AWS Transit Gateway documentation
- AWS Direct Connect documentation
- AWS Cloud WAN traffic inspection blogs: