Understanding Amazon VPC from a VMware NSX Engineer’s Perspective

By Anuj Dewangan, Solutions Architect at AWS

With VMware Cloud on AWS, you can deploy applications in a fully-managed VMware environment that runs on the Software-Defined Data Center (SDDC) stack directly on bare-metal Amazon Web Services (AWS) infrastructure.

Organizations can simplify their hybrid IT operations by using the same VMware technologies—including vSphere, vSAN, NSX, and vCenter—across their on-premises datacenters and on the AWS cloud.

VMware Cloud on AWS also brings native integration with AWS infrastructure and platform capabilities such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Kinesis, and Amazon Redshift, among others. I recommend reviewing this blog post to understand native integration of AWS services with workloads deployed in VMware Cloud on AWS for centralized access, security, content acceleration, and data analytics.

If you are looking to integrate native AWS services with your enterprise applications, knowledge of Amazon Virtual Private Cloud (Amazon VPC) networking becomes especially important.

In this post, and my follow-up, I will explain the major components of Amazon VPC for engineers and architects who build and operate VMware NSX networks, and who are building solutions on VMware Cloud on AWS. I will explain VPC in terminology and concepts that are familiar to you as NSX experts. If you come from an NSX background like me and are new to Amazon VPC, please read on!

If you have a traditional network engineering background and would like to learn more about Amazon VPC, I recommend reading this blog series first.

Amazon VPC and VMware NSX

Through Amazon VPC, you can launch AWS resources in a virtual network in AWS regions. On the other hand, VMware NSX—starting with on-premises datacenters and now with VMware Cloud on AWS—enables a software-defined networking (SDN) solution to seamlessly extend on-premises networks to AWS. This allows VMware customers to migrate applications to AWS, create a disaster recovery solution, or extend the capacity of their datacenters.

Let’s look at the physical and logical components in an Amazon VPC, including the forwarding, control, and management planes.

Physical Infrastructure: AWS Regions and Availability Zones

NSX Transport Zones (TZs) determine the physical scope of networking components like Logical Switches and Distributed Logical Routers (DLRs) in an NSX network. TZs can include multiple vSphere clusters, such as compute and edge clusters. An NSX Transport Zone with multiple clusters is shown in Figure 1.

Figure 1 – NSX Transport Zone with compute and edge clusters.

A Logical Switch created in the Transport Zone will span all the vSphere hosts that are part of the clusters associated with the TZ. Similarly, a DLR associated with Logical Switches in this TZ will create a DLR instance on each of the vSphere hosts in the cluster.

As with NSX Transport Zones, when learning about the logical components in an Amazon VPC, it’s important to understand their physical scope within the AWS global infrastructure. Region and Availability Zone (AZ) constructs govern the physical scope of Amazon VPC components like ENIs, subnets, route tables, security groups, VPC Gateways, and Network ACLs. Figure 2 shows the relationship between AWS Regions and AZs.

Figure 2 – AWS Regions and Availability Zones (AZs).

Each AWS Region is built in a separate geographic area and is completely independent from all other Regions. Each Region has multiple, isolated locations known as Availability Zones, and each AZ has one or more physical datacenters that are fault isolated from all other AZs in the Region. The AZs within a Region are connected through low-latency networking links. To achieve high availability, an application is deployed across multiple AZs.

Tenant’s Logical Network: Amazon VPC

For NSX, a logically isolated network for a tenant is characterized by one or more dedicated Logical Switches, typically one DLR and one or more Edge Service Gateways (ESGs). Figure 3 shows switching and routing plane components of the NSX network for a three-tier application deployment.

Figure 3 – NSX logical network for a three-tier application.

An Amazon VPC represents a virtual isolated network in the AWS cloud, and encapsulates all the networking components required to make communication possible within the VPC. Figure 4 shows an Amazon VPC and related components for a three-tier application deployment, and represents an equivalent construct to a tenant network.

Figure 4 – A three-tier application deployed in an Amazon VPC.

Just like an NSX tenant network can contain multiple subnets, Amazon VPC is also a container for multiple subnets. The scope of a VPC is a single AWS Region and spans all the AZs in that Region. Each Region in your AWS account gets a default VPC. You can also create your own VPC as described in this post.

You can use Internet connectivity, virtual private network (VPN), and AWS Direct Connect to connect your VPC networks to networks outside of AWS. VPC peering allows you to connect to other VPCs in your AWS account or other AWS accounts. Amazon VPC also provides private connectivity to several AWS and Partner services through VPC endpoints, allowing you to connect to these services without the need for Internet connectivity.

Additionally, Amazon VPC infrastructure includes a VPC DNS server, reachable at the base of the primary Classless Inter-Domain Routing (CIDR) range plus two (for example, 10.0.0.2 in a 10.0.0.0/16 VPC), to resolve both private and public hostnames. Having a built-in DNS server reduces the administrative burden for deploying workloads in an Amazon VPC.
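As a rough illustration of where the resolver address sits, the sketch below uses Python's standard ipaddress module to compute the base-plus-two address of a primary CIDR, which is how AWS documentation describes the resolver location. The function name and the 10.0.0.0/16 range are assumptions made for this example.

```python
import ipaddress


def vpc_dns_resolver(primary_cidr: str) -> ipaddress.IPv4Address:
    """Return the base-plus-two address of the VPC's primary IPv4 CIDR."""
    network = ipaddress.ip_network(primary_cidr)
    return network.network_address + 2


print(vpc_dns_resolver("10.0.0.0/16"))  # 10.0.0.2
```

Only the primary CIDR matters here; secondary CIDRs added later do not move the resolver.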

VPC Addressing

An Amazon VPC is associated with a primary IPv4 CIDR, and you can add additional IPv4 CIDRs to extend the addressing space. Unlike IPv4 addresses in an NSX network, which can be Internet routable or private, all IPv4 addresses in an Amazon VPC are private and need to be mapped to public IPv4 addresses for Internet connectivity. You can still use IPv4 CIDRs outside RFC 1918/6598 address space in your Amazon VPC, but IPv4 CIDR addresses are never used to directly communicate with the Internet. IPv4 addresses need to have Network Address Translation (NAT) between private IPv4 addresses and public IPv4 addresses to enable IPv4 based Internet connectivity.

You can also add an IPv6 CIDR to your Amazon VPC. With IPv6, AWS assigns a globally routable /56 prefix to the Amazon VPC.
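To get a feel for how much room a /56 gives you, here is a small sketch that splits one into /64 blocks. The prefix comes from the IPv6 documentation range, and the /64 subnet size is an assumption for this example rather than something stated above.

```python
import ipaddress

# Hypothetical /56 taken from the IPv6 documentation range (2001:db8::/32).
vpc_ipv6 = ipaddress.ip_network("2001:db8:1234:1a00::/56")

# Assumption for this sketch: each subnet receives a /64 from the VPC prefix.
subnet_blocks = list(vpc_ipv6.subnets(new_prefix=64))

print(len(subnet_blocks))   # 256
print(subnet_blocks[0])     # 2001:db8:1234:1a00::/64
print(subnet_blocks[-1])    # 2001:db8:1234:1aff::/64
```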

VPC Subnets

In an NSX network, an NSX Logical Switch represents a Layer 2 network and is associated with a subnet in the tenant network. In an Amazon VPC, you directly create subnets and the VPC control and forwarding planes enable communication between the subnet’s network interfaces. Amazon VPC users do not directly work with the underlying components which enable traffic flow within a subnet. The section in this post about Amazon VPC forwarding and control planes provides more details on forwarding architecture.

While creating a subnet, you will assign a unique IPv4 and optionally an IPv6 CIDR from the VPC CIDR range. The scope of a subnet in a VPC is within an Availability Zone, and there can be multiple subnets per AZ per VPC. From a workload perspective, for higher availability, application tiers are deployed across multiple subnets and multiple AZs, as is shown in Figure 4.
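The subnet layout described above can be sketched with the standard ipaddress module. This is a planning aid only, not an AWS API call; the tier names, AZ names, /24 subnet size, and the plan_subnets helper are all assumptions chosen for the example.

```python
import ipaddress
from itertools import product


def plan_subnets(vpc_cidr, tiers, azs, new_prefix=24):
    """Assign one non-overlapping block per (tier, AZ) pair from the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-size blocks
    plan = {}
    for tier, az in product(tiers, azs):
        plan[(tier, az)] = next(blocks)
    return plan


plan = plan_subnets("10.0.0.0/16", ["web", "app", "db"], ["us-east-1a", "us-east-1b"])
for (tier, az), block in plan.items():
    print(f"{tier:<4} {az}  {block}")
```

Each tier ends up with one subnet per AZ, which mirrors the layout in Figure 4. The helper raises StopIteration if the VPC CIDR is too small for the requested number of blocks.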

Subnets can be public or private. A public subnet has direct access to the Internet via a route to the Internet Gateway, and network interfaces in public subnets are mapped to public IPv4 addresses using the built-in 1:1 stateless NAT of the Internet Gateway. A private subnet does not have direct access to the Internet; it can connect to on-premises datacenters through VPN or Direct Connect, or to the Internet using VPC-based NAT. We will learn more about external connectivity for a VPC in my next post.

In the three-tier application shown in Figure 4, the web, app, and database tiers are deployed in private subnets, whereas the nodes for the Elastic Load Balancer (ELB) are deployed in public subnets, enabling direct access of incoming Internet traffic to the ELB nodes. Public subnets associated with the ELB nodes are not shown in Figure 4 for simplicity.

Elastic Network Interfaces (ENI)

Once you have created a subnet in the VPC, you can create and associate Elastic Network Interfaces (ENI) with the subnet. Because an ENI is associated with a subnet, the scope of the ENI is the same as that of the subnet, which is the AZ in which the subnet is created.

An ENI is conceptually similar to a VMware virtual network interface card (vnic). Just like vnics are associated with VMware Virtual Machines (VMs), ENIs are associated with Amazon Elastic Compute Cloud (Amazon EC2) instances, Load Balancer nodes, VPC interface endpoints, and instances of managed services like Amazon RDS.

VMware vnics, in an NSX network, are associated with VXLAN-backed distributed port groups which link to a Logical Switch. This, in turn, represents a network subnet. Similarly, each ENI is associated with exactly one subnet, and the ENI borrows its IP addresses from the subnet CIDR block.

Unlike vnics, ENIs encapsulate Layer 3 properties of the network interface. You can assign IPv4 and IPv6 addresses to an ENI automatically or manually. You can also configure secondary IP addresses on an ENI, and all IP addresses configured on an ENI must be in the same CIDR range as the associated subnet. These IP addresses remain assigned to the ENIs for the lifetime of the ENIs.

Another difference from vnics is that ENIs can exist independent of their association with instances. An ENI can be created through the AWS API, AWS Command Line Interface (CLI), or the AWS Management Console. ENIs can then be associated or disassociated with an instance. You can associate multiple ENIs with an instance and move an ENI to another instance as well. This is where the “elasticity” of the network interface becomes evident.
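The ENI behavior described in the last few paragraphs can be captured in a toy model. This is a conceptual sketch, not the AWS API: the Eni class, its methods, and the identifiers are hypothetical and exist only to show that addresses belong to the ENI and travel with it between instances.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Eni:
    """Toy model of an ENI: bound to one subnet, optionally attached to an instance."""
    eni_id: str
    subnet_cidr: str
    addresses: List[str] = field(default_factory=list)
    attached_to: Optional[str] = None

    def assign(self, address: str) -> None:
        # Every address on the ENI must come from the subnet's CIDR block.
        if ipaddress.ip_address(address) not in ipaddress.ip_network(self.subnet_cidr):
            raise ValueError(f"{address} is outside subnet {self.subnet_cidr}")
        self.addresses.append(address)

    def attach(self, instance_id: str) -> None:
        if self.attached_to is not None:
            raise RuntimeError(f"{self.eni_id} is already attached to {self.attached_to}")
        self.attached_to = instance_id

    def detach(self) -> None:
        self.attached_to = None


eni = Eni("eni-example", "10.0.1.0/24")
eni.assign("10.0.1.10")      # primary address
eni.assign("10.0.1.11")      # secondary address
eni.attach("i-first")
eni.detach()
eni.attach("i-second")       # the addresses move with the ENI
print(eni.attached_to, eni.addresses)
```

Note that the ENI object exists, with its addresses, before it is attached to anything. That independence is the main contrast with a vnic.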

Each IPv4 address in an ENI can be mapped with a public or Elastic IP address if the ENIs are in a public subnet and need to exchange traffic directly with the Internet. Public IP address and Elastic IP address are Internet-routable IPv4 addresses that allow an instance with a private IP address to communicate with the Internet. Because the IPv6 CIDRs are publicly routable, there is no need for any IP address mapping. As long as the subnet is public, IPv6 Internet connectivity will work.
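A minimal sketch of that one-to-one mapping follows. It only models the address rewrite; the Packet type, the mapping tables, and every address are hypothetical (the public addresses come from documentation ranges), and nothing here reflects how the Internet Gateway is actually implemented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical mapping of private ENI addresses to public or Elastic IPs.
# 203.0.113.0/24 is a documentation range, not a real allocation.
PRIVATE_TO_PUBLIC = {"10.0.0.10": "203.0.113.10"}
PUBLIC_TO_PRIVATE = {pub: priv for priv, pub in PRIVATE_TO_PUBLIC.items()}


def translate_outbound(packet: Packet) -> Packet:
    """Rewrite the private source address on the way out to the Internet."""
    return replace(packet, src=PRIVATE_TO_PUBLIC[packet.src])


def translate_inbound(packet: Packet) -> Packet:
    """Rewrite the public destination address on the way into the VPC."""
    return replace(packet, dst=PUBLIC_TO_PRIVATE[packet.dst])


out = translate_outbound(Packet(src="10.0.0.10", dst="198.51.100.7"))
back = translate_inbound(Packet(src="198.51.100.7", dst="203.0.113.10"))
print(out)   # Packet(src='203.0.113.10', dst='198.51.100.7')
print(back)  # Packet(src='198.51.100.7', dst='10.0.0.10')
```

Because the translation is a fixed table lookup in each direction, no per-connection state is needed. The instance itself only ever sees its private address.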

Once an instance is associated with an ENI, the operating system running on the instance receives the IPv4 and IPv6 addresses for the associated network interface using Dynamic Host Configuration Protocol (DHCP). IP addresses received by the instance are the addresses associated with the ENI. DHCP also provides the instance with a default gateway, IP addresses of DNS servers, DNS domain, and other parameters like Network Time Protocol (NTP) servers. DHCP parameters are configured at a VPC level. This documentation provides more details on DHCP parameters.

The Amazon VPC network provides the DHCP server in the virtualization infrastructure, and the DHCP packets are handled locally without any broadcasts. So as a network engineer, you don’t need to think about how broadcast packets will be handled on the physical network, as Amazon VPC takes care of it.

Route Tables

Similar to NSX—where you create DLRs to enable Layer 3 routing between the Logical Switches and, consequently, enable routing for the related subnets—in Amazon VPC you associate VPC subnets with VPC route tables.

Each subnet within a VPC is associated with exactly one route table, which determines the Layer 3 forwarding rules for all packets originating from ENIs associated with the subnet. Each route table can be associated with multiple subnets of the VPC, and route tables have a scope of the entire VPC. Hence, you can associate a Route Table with any subnet of the VPC, irrespective of what Availability Zone the subnet is created in.

Each VPC can have multiple route tables to govern different forwarding behavior for various subnets of the VPC. For example, private subnets might have route tables with a default route to a NAT Gateway, and public subnets might be associated with route tables with a default route to the VPC Internet Gateway. This is analogous to having multiple DLRs in an NSX tenant network to govern different routing behavior for traffic from different logical switches.

A route table in a VPC thus represents a logical router which performs packet forwarding for all subnets associated with it. Just like the DLRs, the route table function is embedded in the virtualization infrastructure and can handle traffic sent to the default gateway of the VPC subnets.

In the three-tier application shown in Figure 4, the web, app, and database tiers are deployed in private subnets. By definition, the route table associated with these private subnets (route table 2) does not have a route to the Internet Gateway, and the application instances in these tiers do not have direct access to the Internet. Similarly, packets from the Internet cannot directly reach the application instances. On the other hand, ELB nodes are deployed in public subnets which are associated with a route table (route table 1) that has a route to the Internet Gateway, providing access to the ELB nodes.
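The relationship between subnets, route tables, and the public/private distinction can be sketched with plain dictionaries. All identifiers below are hypothetical stand-ins for the resources in Figure 4, and the "igw-" check simply mirrors the prefix AWS uses for Internet Gateway IDs.

```python
# Hypothetical identifiers; "igw-" mirrors the prefix AWS uses for Internet Gateway IDs.
route_tables = {
    "rtb-1": [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")],
    "rtb-2": [("10.0.0.0/16", "local")],
}

# Each subnet maps to exactly one route table; a route table may serve many subnets.
subnet_associations = {
    "subnet-elb-a": "rtb-1",
    "subnet-elb-b": "rtb-1",
    "subnet-web-a": "rtb-2",
    "subnet-app-a": "rtb-2",
    "subnet-db-a": "rtb-2",
}


def is_public(subnet_id: str) -> bool:
    """A subnet is public when its route table has a route to an Internet Gateway."""
    routes = route_tables[subnet_associations[subnet_id]]
    return any(target.startswith("igw-") for _, target in routes)


public = sorted(s for s in subnet_associations if is_public(s))
print(public)  # ['subnet-elb-a', 'subnet-elb-b']
```

Nothing on the subnet itself marks it as public or private. The label follows entirely from the routes in the associated route table.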

An important distinction from NSX-based networking topologies is that each route table always has a local route to the VPC CIDR and this route cannot be removed or modified. Longest prefix match (LPM) rules do not apply to the VPC CIDR route.

Consequently, all traffic in the VPC with a Layer 3 destination in the VPC CIDR is always forwarded directly to the destination. This applies to all traffic destined to IP addresses of the interfaces associated with instances, Internet, and VPN Gateway, or VPC interface endpoints. This means that for the three-tier application in Figure 4, despite being associated with different route tables, the ELB nodes and instances deployed in the web, app, and database tiers can communicate with each other. This enables normal application traffic flow from the Internet to the ELB nodes, and from the ELB nodes to the instances in the web tier, and vice versa.
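To make the lookup order concrete, here is a small sketch of the forwarding decision as this post describes it: the local route is checked first, and only destinations outside the VPC CIDR fall through to the static routes. The CIDR, the route entries, the target names, and the "drop" result are all assumptions for illustration.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")  # hypothetical VPC range

# Hypothetical static routes of one route table: (destination, target).
STATIC_ROUTES = [
    ("0.0.0.0/0", "nat-example"),
    ("192.168.0.0/16", "vgw-example"),
    ("192.168.10.0/24", "eni-example"),
]


def next_hop(destination: str) -> str:
    """Return the target for a destination, as modeled in this post."""
    address = ipaddress.ip_address(destination)
    # The local route wins for anything inside the VPC CIDR.
    if address in VPC_CIDR:
        return "local"
    # Everything else follows the most specific matching static route.
    matches = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in STATIC_ROUTES
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return "drop"
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop("10.0.3.25"))     # local
print(next_hop("192.168.10.9"))  # eni-example
print(next_hop("192.168.99.1"))  # vgw-example
print(next_hop("8.8.8.8"))       # nat-example
```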

You can still use VPC security groups to segment traffic within a VPC subnet and VPC Network ACLs to filter traffic to/from a VPC subnet. However, you cannot insert a firewall appliance or load balancer appliance as a next-hop for any traffic between interfaces in the VPC.

Another point of note is that the same route table can handle both IPv4 and IPv6 routes. You can modify the forwarding behavior for the route table by adding static routes to it. As I discussed previously, the static routes cannot conflict with the VPC CIDR. It is possible to route traffic for destination prefixes to next-hops within the VPC, like Internet Gateway (IGW), VPN Gateway, VPC endpoints, VPC peers, or other ENIs in the VPC. We will discuss many of these VPC components in my next blog post.
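The "no conflict with the VPC CIDR" rule can be expressed as a simple pre-check. This sketch interprets "conflict" as a destination that equals or falls inside the VPC CIDR, which is an assumption about the rule as described in this post; the validate_static_route helper is hypothetical and not part of any AWS SDK.

```python
import ipaddress


def validate_static_route(vpc_cidr: str, destination: str) -> None:
    """Reject a static route whose destination falls inside the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    dest = ipaddress.ip_network(destination)
    # Only same-family prefixes can conflict; subnet_of() also covers an exact match.
    if dest.version == vpc.version and dest.subnet_of(vpc):
        raise ValueError(f"{destination} conflicts with the local route for {vpc_cidr}")


validate_static_route("10.0.0.0/16", "0.0.0.0/0")       # accepted
validate_static_route("10.0.0.0/16", "192.168.0.0/16")  # accepted
validate_static_route("10.0.0.0/16", "::/0")            # accepted, IPv6 default route
```

A default route is accepted even though it covers the VPC CIDR, because it is less specific than the local route rather than inside it.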

Under the Hood: Amazon VPC Forwarding and Control Planes

So how do the forwarding and control planes for Amazon VPC work? The AWS re:Invent session titled Another Day, Another Billion Flows describes the control and forwarding planes of VPC, interaction of VPC with external networks, and flow tracking capabilities implemented within a VPC.

All communication within the VPC is unicast. Similar to the Address Resolution Protocol (ARP) suppression mechanism in NSX Logical Switches, all ARP requests are suppressed and replied to locally in Amazon VPC. Again, just like how DHCP packets are handled, you don’t need to think about how ARP broadcast packets will be handled on the physical network.

The VPC forwarding mechanism looks up the packet from the ENIs and, using the Mapping Service, makes a forwarding decision to the destination. Like the NSX controllers, the Mapping Service holds the forwarding information used by VPC forwarding components in the virtualization infrastructure. All packets are encapsulated using VPC encapsulation at the virtualization layer. The VPC encapsulation adds information such as the source and destination ENIs and the VPC ID to the encapsulated packet.
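The lookup-then-encapsulate flow can be pictured with a toy model. This is purely conceptual: AWS does not publish the Mapping Service interface or the encapsulation format, so the table layout, field names, host names, and the forward function below are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Mapping:
    eni_id: str
    host: str  # physical host that currently runs the destination ENI


@dataclass(frozen=True)
class Encapsulated:
    vpc_id: str
    src_eni: str
    dst_eni: str
    outer_dst_host: str
    inner_payload: bytes


# Hypothetical mapping table keyed by (VPC ID, destination IP).
MAPPINGS: Dict[Tuple[str, str], Mapping] = {
    ("vpc-blue", "10.0.1.10"): Mapping("eni-web", "host-17"),
    ("vpc-red", "10.0.1.10"): Mapping("eni-other", "host-42"),
}


def forward(vpc_id: str, src_eni: str, dst_ip: str, payload: bytes) -> Encapsulated:
    """Look up the destination and wrap the packet with VPC metadata."""
    mapping = MAPPINGS.get((vpc_id, dst_ip))
    if mapping is None:
        raise LookupError(f"no mapping for {dst_ip} in {vpc_id}")
    return Encapsulated(vpc_id, src_eni, mapping.eni_id, mapping.host, payload)


frame = forward("vpc-blue", "eni-app", "10.0.1.10", b"hello")
print(frame.dst_eni, frame.outer_dst_host)  # eni-web host-17
```

Keying the table on the VPC ID as well as the IP address is what lets two tenants use the same private address without ever seeing each other's traffic.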

You can now relate to the control, forwarding, and encapsulation techniques used in Amazon VPC. Suddenly, VPC networking looks much more familiar, doesn’t it?

Amazon VPC Management Plane

NSX networks are managed through VMware NSX Manager APIs and through the Networking and Security plugin within vCenter. Similarly, management of Amazon VPC is programmatic and API-driven as well. Amazon VPCs are managed using the AWS Management Console, Amazon VPC APIs, and AWS Software Development Kits (SDKs). AWS has region-specific HTTPS API endpoints available for managing Amazon VPC and its components.
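To show what "region-specific HTTPS API endpoints" looks like in practice, the sketch below builds the URL for a DescribeVpcs query. VPC actions are part of the Amazon EC2 API, and the ec2.<region>.amazonaws.com pattern applies to the standard commercial Regions. The describe_vpcs_url helper and the API version string are assumptions for this example, and the request is unsigned, so AWS would reject it without Signature Version 4 authentication. In practice you would let an SDK or the CLI handle this.

```python
from urllib.parse import urlencode


def describe_vpcs_url(region: str) -> str:
    """Build an unsigned DescribeVpcs query URL for a Region's EC2 endpoint."""
    endpoint = f"https://ec2.{region}.amazonaws.com/"
    query = urlencode({"Action": "DescribeVpcs", "Version": "2016-11-15"})
    return f"{endpoint}?{query}"


print(describe_vpcs_url("us-west-2"))
# https://ec2.us-west-2.amazonaws.com/?Action=DescribeVpcs&Version=2016-11-15
```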

Conclusion

Now that you have a good understanding of the major VPC components, I recommend you create a VPC from scratch using the documentation here, which will help you consolidate many of the concepts introduced in this blog.

In my next post, I will discuss external connectivity options for connecting services deployed within a VPC to the Internet, corporate networks using VPN technologies, other VPCs using VPC peering, and to AWS and other partner services using VPC endpoints. I will also discuss security components within a VPC.

See you next time!