Automate Networking foundation in multi-account environments
As AWS customers adopt multi-account strategies, they need cross-account networking in their AWS environment. They also need to extend their network across multiple AWS Regions when creating multi-Region applications or disaster recovery environments. AWS has many services and features that allow you to do exactly that with great flexibility. But for users who want to get started with a networking foundation, or who are only running tests, an automated way of creating a simple multi-account, multi-Region network is helpful. The solution presented in this post uses AWS Infrastructure as Code (IaC) and AWS Organizations’ capabilities to automate the setup of an interconnected hub-and-spoke network. It uses the reference architecture described at the 2019 re:Invent conference, AWS Transit Gateway reference architectures for many VPCs, to build a scalable multi-VPC environment across accounts and Regions. This approach automatically provisions VPCs that use non-overlapping Classless Inter-Domain Routing (CIDR) ranges and connects them to a central AWS Transit Gateway.
Figure 1: Network infrastructure provisioned
The diagram in Figure 1 depicts the multi-VPC architecture created by this solution, spanning multiple Regions and multiple accounts. It follows the best practice of setting up common networking resources in a shared Networking Account, shown as the Admin account. In each specified Region, VPCs are created in the accounts under the specified organizational unit (OU) and are connected in a hub-and-spoke topology, with a Transit Gateway as the hub and each VPC as a spoke. The Regional transit gateways are interconnected using Transit Gateway peering to allow cross-Region communication.
Future accounts added to that OU are provisioned and connected automatically. This solution takes advantage of AWS Organizations and CloudFormation StackSets integrations for that purpose.
Small private subnets are created and used to connect each VPC with the corresponding Transit Gateway using VPC attachments. By default, no additional resources are created inside the VPC. This allows individual account owners the freedom to set up the infrastructure in the VPCs as needed for their workloads.
Pre-requisites for Deployment
You’ll need to make sure the following pre-requisites are true before deploying the solution:
- You have a Networking Account created to host all the shared networking resources, as per landing zone best practice.
- The Networking Account is registered as a delegated administrator account for managing CloudFormation StackSets. This allows you to deploy the solution in an account other than the management account (in this case, the Networking Account).
- Trusted access with CloudFormation StackSets is enabled in AWS Organizations. This allows CloudFormation StackSets to deploy across the organization with service-managed permissions.
- Organization-level sharing is enabled in AWS Resource Access Manager (RAM) from the management account. This allows the Transit Gateway to be shared across accounts, so VPCs can be attached to it and communicate with each other.
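The last three prerequisites can also be set up programmatically. The following is a minimal sketch, not part of the solution itself: it lists the API calls in the order they would be run from the management account and applies them through any client factory, such as `boto3.client`. The helper names `prerequisite_calls` and `apply_calls` and the account ID are hypothetical, and the use of `activate_organizations_access` to enable trusted access is an assumption about how you choose to perform that step.

```python
from typing import Any, Callable, Dict, List, Tuple

# Service principal used by CloudFormation StackSets for delegated administration.
STACKSETS_PRINCIPAL = "member.org.stacksets.cloudformation.amazonaws.com"

Call = Tuple[str, str, Dict[str, Any]]  # (service name, client method, kwargs)


def prerequisite_calls(networking_account_id: str) -> List[Call]:
    """Return the API calls, in order, to run from the management account."""
    return [
        # Trusted access between CloudFormation StackSets and AWS Organizations.
        ("cloudformation", "activate_organizations_access", {}),
        # Register the Networking Account as delegated administrator for StackSets.
        (
            "organizations",
            "register_delegated_administrator",
            {
                "AccountId": networking_account_id,
                "ServicePrincipal": STACKSETS_PRINCIPAL,
            },
        ),
        # Organization-level sharing in AWS RAM.
        ("ram", "enable_sharing_with_aws_organization", {}),
    ]


def apply_calls(calls: List[Call], client_factory: Callable[[str], Any]) -> None:
    """Run each call using a client obtained from client_factory."""
    for service, method, kwargs in calls:
        getattr(client_factory(service), method)(**kwargs)


# Usage with management-account credentials (the account ID is a placeholder):
#   import boto3
#   apply_calls(prerequisite_calls("111122223333"), boto3.client)
```

Keeping the call list separate from its execution makes it easy to review what will be changed in the organization before running anything.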
Deploying the solution
In order to deploy the solution, sign in to the AWS Management Console and navigate to the CloudFormation console. Then, using the Networking Account of your AWS environment, deploy the solution template using Create Stack.
The solution takes the following parameters:
- NetworkCidr: The CIDR used to create a network spanning the selected Regions and OU.
- OrganizationUnitArn: The Amazon Resource Name (ARN) of the target OU. All accounts in that OU will be a part of the network.
- SecondaryRegions: In case of a multi-Region network, you can add additional Regions as a comma delimited list. Leave it empty for a single-Region network.
- BaselineTemplate: An optional template to run on provisioned accounts to set up the VPC.
This creates a network using the NetworkCidr, spanning the accounts within OrganizationUnitArn. If SecondaryRegions are specified, additional AWS Transit Gateways are created in each of those Regions and connected with the central AWS Transit Gateway in the primary Region (the Region used to launch the solution).
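If you prefer to launch the stack from a script instead of the console, the parameters above map directly onto the `Parameters` list of CloudFormation's `CreateStack` call. The sketch below only builds that list. The helper name `stack_parameters`, the OU ARN, the stack name, the template URL, and the capabilities shown in the usage comment are all placeholders or assumptions, not values taken from the solution. BaselineTemplate is left out so that its default applies.

```python
from typing import Dict, List, Sequence


def stack_parameters(
    network_cidr: str,
    organization_unit_arn: str,
    secondary_regions: Sequence[str] = (),
) -> List[Dict[str, str]]:
    """Build the Parameters list in the shape CreateStack expects."""
    values = {
        "NetworkCidr": network_cidr,
        "OrganizationUnitArn": organization_unit_arn,
        # Comma-delimited list; an empty string means a single-Region network.
        "SecondaryRegions": ",".join(secondary_regions),
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


params = stack_parameters(
    "10.0.0.0/16",
    # Placeholder OU ARN, not a real organization.
    "arn:aws:organizations::111122223333:ou/o-exampleorgid/ou-exampleouid",
    ["us-west-2", "eu-west-1"],
)

# Usage from the Networking Account, in the Region you want as primary.
# Stack name, template URL, and capabilities are assumptions:
#   import boto3
#   boto3.client("cloudformation").create_stack(
#       StackName="network-foundation",
#       TemplateURL="https://<your-bucket>.s3.amazonaws.com/<solution-template>",
#       Parameters=params,
#       Capabilities=["CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
#   )
```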
Using the BaselineTemplate parameter, you can define the URL of a custom CloudFormation template. That template is provisioned alongside the VPCs in each member account and Region, and can create additional resources such as additional subnets, NAT gateways, and others.
A default baseline is provided as an example to aid in testing. It adds the required VPC endpoints for Session Manager, creates a private subnet, and configures an EC2 instance so you can test by pinging instances on other VPCs.
If you use the example template, which is set by default as the BaselineTemplate, you can test the network with no additional effort: you should be able to ping an EC2 instance in one account and Region from an EC2 instance in a different account and/or Region. To do so, follow these steps:
1. Log in to one of the AWS accounts in the OU and navigate to the EC2 console.
2. Locate the EC2 instance deployed in the VPC set up by this solution. The name of the VPC will be the same as the name of the CloudFormation stack.
3. Note the “Private IPv4 Address” of the EC2 instance.
4. Choose a different account and/or Region and repeat steps 1 and 2, but instead of noting the IP address, use Session Manager to log in to the instance remotely. With the instance selected, click “Connect” and connect using “Session Manager”.
5. Once the console session starts, ping the IP address of the other instance that you noted in step 3.
The ping test should run successfully between all instances spread across multiple accounts and Regions. This shows that the CloudFormation solution deployed a multi-account, multi-Region network that you can use for your applications, without the need to configure the network manually.
How it works
Figure 2: The CloudFormation resource creation process
The most basic configuration is a single-Region network. This occurs when the SecondaryRegions and BaselineTemplate parameters are empty. The preceding diagram (Figure 2: CloudFormation resource creation process) shows the sequence of events in that scenario. This is a two-step process:
- A nested CloudFormation stack is deployed in the primary Region of the Networking Account to set up the central transit gateway.
- A StackSet is deployed that targets member accounts in the defined OU and Region. This StackSet creates a VPC in each account and connects it with the central transit gateway.
Let’s take a closer look at what the nested stack creates in the Networking Account:
- A Transit Gateway (TGW) with auto-accept enabled.
- A resource share in RAM that shares the TGW with the specified OU.
- An Amazon DynamoDB table that holds the network CIDRs used for CIDR allocation across many VPCs.
- A Lambda function that is used by CloudFormation Custom Resources.
- An InitMetadata Custom Resource that initializes the DynamoDB Table with metadata (more details follow below).
- An IAM Role to allow accounts from the specified OU to access the DynamoDB Table.
The solution then creates the StackSet with service-managed permissions, which are available because trusted access with AWS Organizations is enabled. The StackSet instantiates stack instances in all accounts under the specified OU and creates the following relevant resources:
- A Lambda Function that CloudFormation’s Custom Resources will use.
- A MemberRegistration Custom Resource that registers the account in the DynamoDB Table and generates an assigned CIDR used to create the VPC (more details follow below).
- A MemberMetadata Custom Resource that retrieves information about the network it is part of (that is, the ID for the Regional Transit Gateway it should connect to, the NetworkCidr, etc.).
- A VPC with the non-overlapping CIDR generated.
- An AzSubnets Custom Resource that creates a minimal private subnet (/28 bitmask) in each Availability Zone (AZ) of that Region.
- A VPC attachment connecting the shared TGW with the newly created VPC.
- A default route in the VPC’s default route table, pointing to the TGW VPC attachment.
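To illustrate the AzSubnets step in the list above, here is a minimal sketch of carving one /28 subnet per Availability Zone out of a VPC CIDR. The function name `attachment_subnets` is hypothetical, and taking the first /28 blocks of the VPC range is an assumption. The solution's Custom Resource may place these subnets elsewhere in the range.

```python
import ipaddress
from typing import Dict, Sequence


def attachment_subnets(vpc_cidr: str, azs: Sequence[str]) -> Dict[str, str]:
    """Map each AZ to one /28 subnet taken from the start of the VPC CIDR.

    Using the first blocks of the range is an assumption made for illustration.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=28)  # consecutive /28 networks inside the VPC
    result: Dict[str, str] = {}
    for az in azs:
        try:
            result[az] = str(next(blocks))
        except StopIteration:
            raise ValueError(
                f"{vpc_cidr} is too small for {len(azs)} /28 subnets"
            ) from None
    return result


subnets = attachment_subnets(
    "10.0.16.0/20", ["us-east-1a", "us-east-1b", "us-east-1c"]
)
```

Because each attachment subnet only uses 16 addresses, almost all of the VPC range stays free for workload subnets that account owners add later.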
The metadata initialization calculates the bitmask that VPCs should use when defining their CIDR. It takes the specified NetworkCidr and finds the bitmask that allows it to be split into at least the defined maximum number of accounts (the maximum is 16 by default; edit the CloudFormation template file to customize it).
Each VPC that is created is assigned a CIDR based on the calculated VPC bitmask and a uniquely generated ID. The ID is generated using a DynamoDB atomic counter, so VPCs can be created in parallel with no concurrency issues.
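A DynamoDB atomic counter is typically implemented with an `UpdateItem` call that uses an `ADD` update expression. The sketch below shows that general pattern, not the solution's actual Lambda code. The function name `next_vpc_id`, the key shape, the attribute name `NextId`, and the table name in the usage comment are all hypothetical.

```python
from typing import Any, Dict


def next_vpc_id(table: Any, counter_key: Dict[str, Any], attribute: str = "NextId") -> int:
    """Atomically increment a counter item and return a zero-based ID.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    response = table.update_item(
        Key=counter_key,
        # ADD treats a missing attribute as 0, then increments it atomically.
        UpdateExpression="ADD #n :one",
        ExpressionAttributeNames={"#n": attribute},
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    # UPDATED_NEW returns the value after the increment, so subtract 1.
    return int(response["Attributes"][attribute]) - 1


# Usage (table name and key are placeholders):
#   import boto3
#   table = boto3.resource("dynamodb").Table("network-metadata")
#   vpc_id = next_vpc_id(table, {"pk": "vpc-counter"})
```

Because the increment and the read happen in a single request, two stack instances that register at the same time cannot receive the same ID.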
For example, a NetworkCidr of 10.0.0.0/16 would result in a VPC bitmask of 20, so VPC CIDRs would look like 10.0.0.0/20, 10.0.16.0/20, etc. A NetworkCidr of 10.0.0.0/8 would result in a VPC bitmask of 12, but since the largest VPC CIDR allowed is a /16, VPC CIDRs would look like 10.0.0.0/16, 10.1.0.0/16, etc. That means that small NetworkCidr values or large MaxAccounts values result in fewer VPCs.
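The arithmetic in these examples can be reproduced with Python's standard `ipaddress` module. This is a sketch of the calculation as described in the text, not the solution's own code. The function names are hypothetical, IDs are assumed to be zero-based, and the /28 lower bound is the general VPC size limit, which the solution may or may not check explicitly.

```python
import ipaddress

LARGEST_VPC_PREFIX = 16   # a VPC CIDR cannot be larger than a /16
SMALLEST_VPC_PREFIX = 28  # a VPC CIDR cannot be smaller than a /28


def vpc_prefix(network_cidr: str, max_accounts: int = 16) -> int:
    """Prefix length that splits network_cidr into at least max_accounts blocks."""
    network = ipaddress.ip_network(network_cidr)
    bits = (max_accounts - 1).bit_length()  # bits needed to number the accounts
    prefix = max(network.prefixlen + bits, LARGEST_VPC_PREFIX)
    if prefix > SMALLEST_VPC_PREFIX:
        raise ValueError(f"{network_cidr} is too small for {max_accounts} VPCs")
    return prefix


def vpc_cidr(network_cidr: str, vpc_id: int, max_accounts: int = 16) -> str:
    """CIDR of the VPC with the given zero-based ID."""
    network = ipaddress.ip_network(network_cidr)
    prefix = vpc_prefix(network_cidr, max_accounts)
    block_size = 2 ** (network.max_prefixlen - prefix)
    block_count = 2 ** (prefix - network.prefixlen)
    if not 0 <= vpc_id < block_count:
        raise ValueError(f"vpc_id {vpc_id} does not fit in {network_cidr}")
    base = int(network.network_address) + vpc_id * block_size
    return f"{ipaddress.ip_address(base)}/{prefix}"


print(vpc_prefix("10.0.0.0/16"), vpc_cidr("10.0.0.0/16", 1))
print(vpc_prefix("10.0.0.0/8"), vpc_cidr("10.0.0.0/8", 1))
```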
You can optionally define a baseline template URL that is used to set up the workload infrastructure inside the VPCs. This creates an additional StackSet that is provisioned after the VPC is created. The default value for the BaselineTemplate parameter points to a template that creates a subnet and an EC2 instance and sets up Session Manager, so you can test connectivity. The example is based on a public CloudFormation template that automatically configures Session Manager and creates an EC2 instance. It shows how to adapt templates to leverage this solution’s components.
If you operate in multiple Regions and specify SecondaryRegions, the solution creates an additional StackSet in the central account that targets those Regions, along with the roles needed to complete that task. Here, the NetworkCidr is first divided into equal-size Regional CIDRs, enough to accommodate the total number of Regions. The VPC bitmask for the member accounts is then calculated against the Regional CIDR, and VPCs are created within that Regional CIDR. Static routes for the Regional CIDRs are added to the Transit Gateway route tables to connect the Regions.
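The Regional split can be sketched in the same way. The function below divides a network CIDR into equal blocks, rounding the number of blocks up to a power of two so that every Region fits. The name `regional_cidrs` is hypothetical, and assigning blocks in list order with the primary Region first is an assumption. The solution may order them differently.

```python
import ipaddress
from typing import Dict, Sequence


def regional_cidrs(network_cidr: str, regions: Sequence[str]) -> Dict[str, str]:
    """Split network_cidr into equal-size blocks and assign one per Region."""
    if not regions:
        raise ValueError("at least one Region is required")
    network = ipaddress.ip_network(network_cidr)
    bits = (len(regions) - 1).bit_length()  # bits needed to number the Regions
    blocks = network.subnets(new_prefix=network.prefixlen + bits)
    # zip stops at the last Region; leftover blocks stay unassigned.
    return {region: str(block) for region, block in zip(regions, blocks)}


# Primary Region first, followed by the SecondaryRegions list.
cidrs = regional_cidrs("10.0.0.0/16", ["us-east-1", "us-west-2", "eu-west-1"])
```

With three Regions, two bits are needed, so each Region receives a /18. Applying the default of 16 accounts to a /18 would then give /22 VPC CIDRs within each Region.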
The solution described in this post provides a template for creating a simple network blueprint for a multi-account, multi-Region network topology. Using CloudFormation and AWS Lambda helps automate the network setup, freeing up valuable time for network administrators. The goal is to abstract as much complexity as possible to make the user experience simple for a set of common use cases. You can further customize the solution to suit your company’s requirements for network architecture. Explore the template in the CloudFormation console, or download it and customize it for your environment!