Designing resilient ground and cloud networks for SMPTE-2022-7 video workflows

The purpose of this post is to create a blueprint for a fault-tolerant network between ground networks and your Amazon Virtual Private Cloud (Amazon VPC) for 24/7 broadcast environments. This design guide is vendor-neutral for on-premises networking.

Components:

On-premises networking
Amazon VPC
AWS Direct Connect

Overview

From the time the first pair of 2022-7 packets leaves the contribution encoder (referred to throughout this document as “the sender”) until they reach their destination (referred to throughout this document as “the receiver”), we need to influence the path so that the two copies never go through the same network infrastructure. We can control this with our network design from the first hop at the access layer to the edge of our VPC.

On-premises networking

To create diverse paths at the access layer, the sender needs to be configured with two interfaces, each in its own subnet and VLAN. Each of the sender’s interfaces needs to be patched into separate access-layer devices, a “path A” first-hop router and a “path B” first-hop router.

First-hop path diversity

Static routes may be required to direct the flows to the appropriate first-hop router. In this example workflow, the two flow destinations are in subnets 10.10.92.0/23 and 10.10.94.0/23. To direct one flow to each first-hop router, the following static routes need to be configured:

10.10.92.0/23 via 10.0.10.1 on device eth1
10.10.94.0/23 via 10.0.20.1 on device eth2
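
As a quick sanity check, the route selection above can be modeled with Python's standard `ipaddress` module. This is only an illustrative sketch: the route table mirrors the two static routes in this example, while the `first_hop` helper and the receiver addresses 10.10.92.10 and 10.10.94.10 are hypothetical.

```python
import ipaddress

# Static routes from the example above: (destination, next hop, interface).
STATIC_ROUTES = [
    (ipaddress.ip_network("10.10.92.0/23"), "10.0.10.1", "eth1"),
    (ipaddress.ip_network("10.10.94.0/23"), "10.0.20.1", "eth2"),
]


def first_hop(destination):
    """Return (next_hop, interface) for the longest-prefix match, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [route for route in STATIC_ROUTES if addr in route[0]]
    if not matches:
        return None
    _, hop, dev = max(matches, key=lambda route: route[0].prefixlen)
    return hop, dev


# Hypothetical receiver addresses, one in each destination block.
flow_a = first_hop("10.10.92.10")
flow_b = first_hop("10.10.94.10")
print(flow_a, flow_b)
```

If both flows resolved to the same next hop, the two copies would share a first-hop router and the diversity requirement would already be broken at the access layer.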

Between the first-hop routers and the network’s edge, be sure that the paths meet the following criteria:

Physical diversity – both paths should never transit through the same routers or fiber.
Undersubscription – the total bandwidth of traffic expected to transit the paths should not exceed the bandwidth of any link in the path.
Pre-determined – the path taken by each flow should be pre-determined and verifiable, even if you use a dynamic routing protocol. For this reason, using overlay networking or tunneling is discouraged as it significantly complicates meeting these criteria and future troubleshooting.
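
The first two criteria lend themselves to an automated check against a network inventory. The following is a minimal sketch, assuming each path is recorded as a list of the routers and fiber spans it crosses together with their capacities. The function name, element names, and capacities are all hypothetical.

```python
def check_paths(path_a, path_b, expected_mbps):
    """Check two paths for shared elements and oversubscribed elements.

    Each path is a list of (element_name, capacity_mbps) tuples covering
    every router and fiber span the flow crosses.
    """
    shared = {name for name, _ in path_a} & {name for name, _ in path_b}
    oversubscribed = [
        name for name, capacity in path_a + path_b if expected_mbps > capacity
    ]
    return {"shared": sorted(shared), "oversubscribed": sorted(oversubscribed)}


# Hypothetical inventory: path B crosses a 1-Gbps device.
path_a = [("access-a", 10000), ("core-a", 10000), ("edge-fw-1", 10000)]
path_b = [("access-b", 10000), ("core-b", 1000), ("edge-fw-2", 10000)]

result = check_paths(path_a, path_b, expected_mbps=2000)
print(result)
```

An empty `shared` list indicates physical diversity at the level of detail recorded in the inventory. Any name in `oversubscribed` is an element that cannot carry the expected traffic.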

On-premises path diversity

Amazon VPC

Ideal VPC configuration

Similar to the ground network configuration, the VPC should have two distinct Classless Inter-Domain Routing (CIDR) blocks. It is critical to understand that this is different from creating subnets. Direct Connect Gateways advertise entire CIDR blocks, not individual subnets. You need to configure Border Gateway Protocol (BGP) import filters later on based on the CIDR blocks, or else your Direct Connects won’t have path diversity. Additionally, it is important to work with your AWS Solutions Architect (AWS SA) and your network carriers to guarantee that path diversity exists from your location to and from AWS. We have specific AWS Direct Connect Point of Presence (POP) recommendations for guaranteed diversity to associated Regions. Please connect with your AWS SA to discuss how to build diverse systems.

Configured properly, this allows you to create Amazon Elastic Compute Cloud (Amazon EC2) instances or AWS Elemental Media Services configurations with two elastic network interfaces in the same Availability Zone but diverse CIDR blocks.

Configuration instructions

In the AWS Management Console, create a new VPC. Initially the wizard allows you to add a single CIDR block. After it is created, you must select your VPC and in the “Actions” menu, select Edit CIDRs and create the additional CIDR.

Adding an additional CIDR Block

Once you save this configuration, you see “2 CIDRs” on your VPC overview page.

VPC overview with two CIDR blocks

Next, create a subnet in each Availability Zone (AZ) in each CIDR. For a Region that has four AZs, you end up with eight subnets.

Configured Subnets
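
The subnet layout can be planned before touching the console. This sketch assumes the two CIDR blocks used in this example and splits each block evenly across four Availability Zones. The AZ names are placeholders, and the resulting /25 subnet size is an assumption rather than a recommendation.

```python
import ipaddress

VPC_CIDRS = ["10.10.92.0/23", "10.10.94.0/23"]  # CIDR blocks from this example
AZS = ["az-1", "az-2", "az-3", "az-4"]  # placeholder Availability Zone names


def plan_subnets(cidrs, azs):
    """Return (az, subnet) pairs with one subnet per AZ in every CIDR block."""
    # Number of extra prefix bits needed to get at least len(azs) subnets.
    extra_bits = (len(azs) - 1).bit_length()
    plan = []
    for cidr in cidrs:
        block = ipaddress.ip_network(cidr)
        for az, subnet in zip(azs, block.subnets(prefixlen_diff=extra_bits)):
            plan.append((az, str(subnet)))
    return plan


plan = plan_subnets(VPC_CIDRS, AZS)
for az, subnet in plan:
    print(az, subnet)
```

With two CIDR blocks and four AZs, the plan contains eight subnets, matching the count described above.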

Create a virtual private gateway and attach it to your VPC. For most customers, leaving the default Autonomous System Number (ASN) is fine.

Attached VGW

Finally, locate the routing table for your VPC and enable propagation from your virtual private gateway.

Route propagation enabled

At this point, your VPC is configured and ready. Before launching instances, you may want to stage additional configuration for your specific needs, including modifying DHCP options sets, or staging security groups to be used later.

AWS Direct Connect

The final piece of the design is to connect the properly configured ground and cloud networks. Generally speaking, the design will adhere to the AWS Direct Connect Resiliency Recommendations guide, Maximum Resiliency option. In addition to this, selecting the right Direct Connect POPs where AWS guarantees path diversity on top of resiliency is important.

Work with your AWS Account Manager or Solutions Architect to determine which two Direct Connect locations are geographically proximate to your location and can provide path diversity (zero shared physical infrastructure between the POP and the AWS data center).

Once you select two Direct Connect locations, source service providers for the point-to-point circuits to connect your on-premises routers or firewalls to the Direct Connect handoff. Ensure that these service providers do not share physical pathways to avoid total loss of connectivity in the event of a fiber cut or similar event. Note that the process of securing these circuits often takes several months from initial quote to completed construction.

You can now begin configuring your Direct Connects in the AWS Management Console. The service providers need the Letter of Authorization generated in these steps to complete the connection.

In the Direct Connect service in your console, create a new Maximum Resiliency connection with the connection wizard.

Creating a Maximum Resiliency Direct Connect Order

Select the bandwidth for each site, the locations to be used, and the service providers you selected. Note that your “Bandwidth” is the speed of a single link at each location. This configuration creates four 10-Gbps links. The bandwidth you select should exceed the total bandwidth required for one of the diverse paths of your 2022-7 flows.

Configuring the Direct Connect order connection settings
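
The sizing rule above is simple arithmetic: each diverse path carries one copy of every 2022-7 flow, so a single link must be faster than the sum of the flow bitrates. A minimal sketch follows, assuming a hypothetical workload of twenty 100-Mbps flows. The function name and the flow list are illustrative only.

```python
def path_headroom_mbps(link_mbps, flow_mbps):
    """Return spare capacity (Mbps) on one link after carrying one copy of every flow."""
    return link_mbps - sum(flow_mbps)


LINK_MBPS = 10_000  # one 10-Gbps link, as in this example
FLOWS_MBPS = [100] * 20  # hypothetical: twenty 100-Mbps contribution flows

headroom = path_headroom_mbps(LINK_MBPS, FLOWS_MBPS)
print(headroom)  # 8000
```

A result of zero or less means the selected bandwidth is too small for one path.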

Finally, review your configuration choices and complete the order.

Confirming the order details

The next screen allows you to download all of the Letters of Authorization (LoA) to provide to the service providers.

Download LoAs

Hand off the LoAs to the appropriate service providers and wait for them to complete your cross connections.

Once complete, configure your ground-to-cloud connectivity.

First, create a Direct Connect Gateway. This is the object that connects your four Direct Connects to the VPC.

Creating the Direct Connect gateway

Next, associate the Direct Connect Gateway with the Virtual Private Gateway you attached to your VPC.

Associating the Direct Connect gateway with the VGW

Next, configure private virtual interfaces for each of the four Direct Connects. In the configuration settings, ensure they are connected to the gateway you just created.

Creating the virtual interfaces

In the additional settings, manually select the IP addresses of the virtual interfaces and the interfaces that are configured on your edge device, as well as manually setting the BGP authentication key. For IP address management purposes, choosing your own IPs is recommended.

Created virtual interface

Download a sample configuration for common vendors from the Actions menu, or configure BGP peering yourself based on the values shown here.

When the BGP session is configured successfully, the BGP status flips from down to up. Note that there may be a delay of several minutes before the status updates.

Finally, you are ready to enforce path diversity over your Direct Connects. In order to do this, your edge routers must filter the incoming BGP advertisements from the Direct Connects. As illustrated in the following diagram, Customer Edge Firewall 1 receives advertisements for both CIDR blocks, 10.10.92.0/23 and 10.10.94.0/23. Configure it to import 10.10.92.0/23, but reject 10.10.94.0/23. To enforce path diversity in the other direction, Customer Edge Firewall 1 must advertise only one of the sender’s subnets, in this example 10.0.10.0/24.

Configure the other edge device to import the remaining routes and advertise the other sender subnet.
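
The filtering logic can be modeled in a few lines to confirm that the two policies do not overlap. This is a vendor-neutral sketch, not router configuration: the device names are placeholders, and 10.0.20.0/24 is assumed to be the second sender subnet, inferred from the 10.0.20.1 first-hop address used earlier.

```python
import ipaddress

# Import/export policy per edge device. Device names are placeholders, and
# 10.0.20.0/24 is an assumed second sender subnet.
POLICY = {
    "edge-fw-1": {"import": ["10.10.92.0/23"], "export": ["10.0.10.0/24"]},
    "edge-fw-2": {"import": ["10.10.94.0/23"], "export": ["10.0.20.0/24"]},
}

# Both VPC CIDR blocks are advertised to every edge device.
ADVERTISED = ["10.10.92.0/23", "10.10.94.0/23"]


def accepted_routes(device, advertised):
    """Return the advertised prefixes that the device's import filter accepts."""
    allowed = {ipaddress.ip_network(p) for p in POLICY[device]["import"]}
    return [p for p in advertised if ipaddress.ip_network(p) in allowed]


def is_diverse(policy, direction):
    """True if no prefix appears in more than one device's filter for this direction."""
    seen = set()
    for rules in policy.values():
        prefixes = set(rules[direction])
        if seen & prefixes:
            return False
        seen |= prefixes
    return True


accepted = {device: accepted_routes(device, ADVERTISED) for device in POLICY}
print(accepted)
print(is_diverse(POLICY, "import"), is_diverse(POLICY, "export"))
```

If either `is_diverse` call returned `False`, a prefix would be reachable through both edge devices and both copies of a flow could end up on the same Direct Connect.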

Inspect the routing tables on the edge devices and in your VPC to confirm that all of the routes are propagating as expected.

If the edge devices and your VPC routing table show all of the expected routes, your configuration is complete and you can begin failover testing and (finally!) production workflows in the environment.

Completed 2022-7 diverse path topology

Additional notes on the workflow

From a network engineering perspective, it can seem counterintuitive to intentionally reduce the number of available paths in an effort to guarantee diversity. However, in practice, this can be preferable. If both copies of the 2022-7 flow are routed through a device that shuts down ungracefully, both flows lose at least the packets on the wire at the time of the crash, in addition to any packets that cannot be buffered while the routing protocol selects the next-best path. Because both flows briefly disappear, the 2022-7 receiver is unable to recover any packets and there is a visible disruption to the content.

Traditional redundancy scenario

During the time the dynamic routing protocol recovers, you may lose tens or hundreds of packets (worse, if your routing protocol is not tuned properly). The visual interruption likely lasts only a few frames.

If the same failure occurred in the fully diverse network design, you would likely lose millions or tens of millions of packets, but there would be no visual disruption because of the ability of the 2022-7 receiver to select any missing packets from the non-impacted flow.

2022-7 redundancy scenario
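
The difference between the two scenarios can be illustrated with a toy model of the receiver's merge step. This sketch is heavily simplified: it treats each flow as a list of sequence numbers and ignores the receiver's bounded buffer, timing, and RTP sequence wraparound. The packet counts and outage windows are hypothetical.

```python
def deliver(total, outage):
    """Return the sequence numbers that arrive on a path whose outage drops `outage`."""
    return [seq for seq in range(total) if seq not in outage]


def missing_after_merge(path_a, path_b, total):
    """Return sequence numbers that arrived on neither path."""
    received = set(path_a) | set(path_b)
    return [seq for seq in range(total) if seq not in received]


TOTAL = 1000

# Shared-device failure: both copies lose the same 50 packets while routing reconverges.
shared = missing_after_merge(
    deliver(TOTAL, range(400, 450)), deliver(TOTAL, range(400, 450)), TOTAL
)

# Diverse paths: path A stays down for the rest of the run, path B is untouched.
diverse = missing_after_merge(
    deliver(TOTAL, range(400, TOTAL)), deliver(TOTAL, range(0)), TOTAL
)

print(len(shared), len(diverse))  # 50 0
```

In the shared case the loss is short but unrecoverable, because the same packets are missing from both copies. In the diverse case path A loses far more packets, yet the merged output is complete.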

In this post, we shared a blueprint for a fault-tolerant network between ground networks and your VPC for 24/7 broadcast environments. This design guide is vendor-neutral for on-premises networking.

If you have questions, feedback, or would like to get involved in discussions with other community members, visit the AWS Developer Forums: Media Services.

AWS for M&E Blog