Securing Workloads on VMware Cloud on AWS Using Native AWS Services
By Haider Witwit, Senior Solutions Architect at AWS
With the recent launch of VMware Cloud on AWS, you can now run workloads on VMware-managed Software-Defined Data Center (SDDC) clusters installed on special bare metal hardware provided by the Amazon Elastic Compute Cloud (Amazon EC2) service.
VMware Cloud on AWS brings VMware’s SDDC to the AWS Cloud. It is delivered, sold, and supported by VMware as an on-demand, elastically-scalable cloud service that removes barriers to cloud migration and cloud portability, increases IT efficiency, and opens up new opportunities to leverage a hybrid cloud environment. The service is currently available in the AWS US West (Oregon), US East (N. Virginia), and London Regions.
This post describes a solution for securing workloads on VMware Cloud on AWS that we demonstrated at VMworld 2017. VMware workloads that run in the SDDC cluster can leverage different levels of AWS network and application protection capabilities with minimum to no changes to their application settings.
The solution focuses on three primary goals:
- Provide centralized and secure external application access.
- Accelerate and optimize content delivery for distributed users with advanced application protection.
- Provide data analytics to increase visibility and gain better insights.
We leveraged Application Load Balancer as a single point of management to distribute traffic to workloads running in both customer-managed virtual private cloud (VPC) and SDDC clusters.
Figure 1 shows our overall solution with sample applications running in the VMware-managed SDDC cluster on the right. The cluster runs on special bare metal Amazon EC2 instances (i3p.16xlarge) in a single-tenant VPC owned and operated by VMware. The smallest SDDC cluster starts with four nodes and can scale up to 32 nodes, with future capacity to scale up to 64 cluster nodes.
Each host has 64 vCPU, 512 GB memory, and 14 TB raw storage (10.4 TB of usable capacity after vSAN caching and metadata). The NSX platform builds an overlay network on top of Amazon EC2 to create logical customer networks and manage connectivity within SDDC boundaries.
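Given the per-host figures above, the aggregate capacity of a cluster is a simple multiplication. The quick sketch below tallies the minimum four-node and maximum 32-node configurations; the per-host numbers come straight from the text, so treat the totals as illustrative:

```python
# Per-host resources for the bare metal SDDC hosts, as quoted above.
VCPU_PER_HOST = 64
MEMORY_GB_PER_HOST = 512
USABLE_STORAGE_TB_PER_HOST = 10.4  # after vSAN caching and metadata overhead

def cluster_capacity(hosts):
    """Aggregate compute and usable-storage figures for an SDDC cluster."""
    return {
        "vcpus": hosts * VCPU_PER_HOST,
        "memory_gb": hosts * MEMORY_GB_PER_HOST,
        "usable_storage_tb": round(hosts * USABLE_STORAGE_TB_PER_HOST, 1),
    }

print(cluster_capacity(4))   # smallest supported cluster
print(cluster_capacity(32))  # current maximum cluster size
```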
As part of the SDDC creation process, customers create a network mapping to a customer-managed VPC for integration with VPC workloads and other AWS services. This mapping is provided by an Elastic Network Interface (ENI) that operates in the customer's VPC and is utilized by the NSX edge appliance installed in the linked SDDC for east-west traffic (VPC to SDDC workloads).
Now, let’s get into the details of how we achieved the three architecture goals.
1. Enable Secure and Centralized Access to External Applications
AWS offers three types of load balancers, each providing the high availability, automatic scaling, and robust security necessary for application fault tolerance. For this solution, we chose Application Load Balancer as the single point of management because it can distribute traffic to workloads running in both the customer-managed VPC and the SDDC cluster.
Application Load Balancer acts as a line of defense between the internet and the workloads behind it. It is one of the services that helps build DDoS resiliency, and it scales to handle unexpected volumes of traffic within a given AWS Region.
Application Load Balancer accepts only valid TCP connections, so DDoS attacks such as UDP and SYN floods cannot reach the protected resources behind it. It also supports SSL offloading using custom certificates or certificates generated by the AWS Certificate Manager (ACM) service.
Name resolution is provided by Amazon Route 53, a highly available and scalable DNS service that includes many advanced features like latency-based routing, Geo DNS, health checks, and monitoring. Route 53 uses shuffle sharding, anycast striping, and integration with AWS Shield for DDoS mitigation.
In our design, we have two applications running in the SDDC and a third running inside the customer's VPC to show how a single Application Load Balancer instance can route traffic to different applications across the two environments. The Application Load Balancer is configured with a single HTTPS (443) listener and multiple rules that route traffic to the appropriate target group based on the host header value.
Each target group points to a specific application and supports both EC2 instance and IP-based targets. We used the former for the VPC apps and the latter to directly add targets from the SDDC cluster; IP-based targets are a newer feature of the Application Load Balancer. Figure 2 shows the host-based routing setup in Application Load Balancer, with rules that route incoming traffic to the proper target group based on the requested domain name.
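To make the host-based routing concrete, the dispatch logic that the listener rules implement can be sketched in plain Python. The hostnames and target-group names below are hypothetical; in a real deployment these would be ALB listener rules and target groups defined through the console or the ELBv2 API:

```python
# Hypothetical listener-rule table mirroring the setup described above.
# 'ip' target groups hold SDDC workload addresses added directly by IP;
# 'instance' target groups hold EC2 instances in the customer VPC.
RULES = {
    "app1.example.com": {"target_group": "sddc-app1", "target_type": "ip"},
    "app2.example.com": {"target_group": "sddc-app2", "target_type": "ip"},
    "app3.example.com": {"target_group": "vpc-app3", "target_type": "instance"},
}

def route(host_header):
    """Return the target group a request would be forwarded to based on
    its Host header, or None (the ALB default rule would then apply)."""
    return RULES.get(host_header.lower())

print(route("App1.example.com"))
print(route("app3.example.com"))
```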
2. Accelerate and Optimize Content Delivery for Distributed Users with Advanced Application Protection
Distributed users can suffer slow content delivery and a poor experience due to high network latency and low throughput to where the data resides. To address this, we used Amazon CloudFront to accelerate and optimize content delivery. CloudFront, like Application Load Balancer, supports SSL termination and accepts only well-formed connections. It can prevent many common DDoS attacks by geographically isolating them at the infrastructure edge, closer to their source, and then scaling to absorb the attack traffic.
In addition, CloudFront adds further layers of application protection by integrating with AWS WAF for Layer 7 mitigation of common web exploits like SQL injection and cross-site scripting (XSS) attacks.
In our solution, we have a CloudFront distribution with an Application Load Balancer origin that supports HTTPS connections only. For the AWS WAF configuration, we used the fully automated solution described in AWS WAF Security Automations.
The solution adds basic AWS WAF rules to block common SQL injection and XSS attacks. It uses an AWS Lambda function to automatically parse CloudFront access logs for suspicious activity, such as a high volume of requests or errors from a single IP, and then automatically blocks the source addresses. A second function leverages third-party reputation lists to block requests from known malicious IP addresses. This integration is demonstrated in Figure 3.
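The log-parsing part of that automation can be sketched as follows: scan CloudFront access-log lines (tab-separated W3C format, where the client IP is the fifth field), count requests per source address, and flag any source that exceeds a threshold. This is a deliberate simplification for illustration; the actual AWS WAF Security Automations solution runs this logic in Lambda and updates WAF IP sets. The threshold and sample data are assumptions:

```python
from collections import Counter

REQUEST_THRESHOLD = 400  # assumed per-window limit; tune to your traffic

def flag_noisy_ips(log_lines, threshold=REQUEST_THRESHOLD):
    """Count requests per client IP in CloudFront access-log lines and
    return the set of IPs exceeding the threshold. In CloudFront web
    distribution logs, c-ip is the fifth tab-separated field."""
    counts = Counter()
    for line in log_lines:
        if line.startswith("#"):  # skip the W3C header directives
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4:
            counts[fields[4]] += 1
    return {ip for ip, n in counts.items() if n > threshold}

# Synthetic sample: one source floods 500 requests, another sends one.
sample = ["#Version: 1.0"] + [
    "2017-08-01\t12:00:00\tIAD12\t2390\t203.0.113.9\tGET"
] * 500 + [
    "2017-08-01\t12:00:01\tIAD12\t2390\t198.51.100.7\tGET"
]
print(flag_noisy_ips(sample))  # only the high-volume source is flagged
```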
3. Provide Data Analytics Capabilities to Increase Visibility
Collecting and analyzing logs is an important practice for understanding application behavior, troubleshooting issues, and identifying abnormal activities. In this section, we shed some light on the different infrastructure and application logs you can utilize in our solution:
- Amazon VPC flow logs capture information about IP traffic going to and from network interfaces in Amazon VPC. Information includes IP addresses, protocols, packet and byte counts, timestamps, and the action taken.
- Application Load Balancer captures detailed information about requests. Each log contains information such as the time the request was received, client IP, latencies, request paths, and server responses.
- Amazon CloudFront provides detailed information about every user request it receives for data such as date, time, edge location, number of bytes, client IP, HTTP method, and HTTP status codes.
- AWS CloudTrail provides event history of the AWS account activity, including actions taken through the AWS Management Console, AWS Software Development Kits (SDKs), command line tools, and other services.
- For Amazon EC2 applications, you can publish log data from instances running Linux or Windows Server to Amazon CloudWatch Logs using the CloudWatch Logs agent.
- For SDDC applications, you can use the Amazon Kinesis Firehose agent to collect and send logs. Kinesis Firehose is a fully managed service for delivering real-time streaming data to destinations like Amazon S3.
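As a small illustration of working with the first log source in the list above, a VPC flow log record in the default format is a space-separated line covering the interface, the source and destination addresses and ports, packet and byte counts, and the action taken. A minimal parser might look like this (the field layout follows the default flow log format; the sample record is made up):

```python
# Field order of the default VPC flow log record format.
FLOW_LOG_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(record):
    """Split a default-format VPC flow log record into a field dict."""
    return dict(zip(FLOW_LOG_FIELDS, record.split()))

# A made-up record: a rejected inbound SSH attempt (protocol 6 = TCP).
rec = ("2 123456789012 eni-abc123de 203.0.113.12 172.31.16.139 "
       "39812 22 6 20 4249 1418530010 1418530070 REJECT OK")
parsed = parse_flow_log(rec)
print(parsed["srcaddr"], parsed["dstport"], parsed["action"])
```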
Since all of these logs contain the requester's address, learning the physical locations from which our infrastructure and applications are accessed can help analyze access patterns and identify suspicious activities. There are multiple methods to add GeoIP metadata to our logs.
In our solution, we used a serverless approach with AWS Lambda to transform the logs into a single format and enrich the data with GeoIP coordinates. Figure 4 shows the data flow from the different log sources to AWS Lambda for transformation and enrichment, to Amazon S3 for long-term storage, and then to the two data analytics approaches.
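The enrichment step can be sketched as a Lambda-style transform that takes an already-parsed log record and attaches coordinates from a GeoIP lookup. Here the lookup is a stubbed in-memory table; a real implementation would query a GeoIP database (such as MaxMind GeoLite2) inside the Lambda function. All names and data below are illustrative:

```python
import json

# Stubbed GeoIP table standing in for a real GeoIP database lookup.
GEOIP_TABLE = {
    "203.0.113.9": {"lat": 47.61, "lon": -122.33, "country": "US"},
}

def enrich_record(record):
    """Return a copy of a log record with GeoIP data attached,
    keyed on the record's client_ip field."""
    enriched = dict(record)
    enriched["geoip"] = GEOIP_TABLE.get(record.get("client_ip"))  # None if unknown
    return enriched

event = {"client_ip": "203.0.113.9", "request_path": "/index.html"}
print(json.dumps(enrich_record(event)))
```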
The enriched data is then stored in Amazon S3 with the necessary lifecycle policies to meet log retention requirements. From there, we suggest two approaches to analyze the log data.
First, deliver the enriched data through Kinesis Firehose to an Amazon Elasticsearch Service domain for use cases like log analytics, full-text search, real-time application monitoring, and clickstream analytics. For visualization and exploration, we utilized the built-in Kibana endpoint included in the managed Elasticsearch domain.
Next, you can start analyzing data immediately with Amazon Athena. There is no need to load data into Athena; it works directly with data stored in Amazon S3. Athena is ideal for quick, ad hoc querying and integrates with Amazon QuickSight for easy visualization, with no servers to set up or manage.
You can choose either one or both of these approaches based on your analytics requirements. An automated process to provision the AWS Lambda function for log enrichment and data analytics services can be launched from this template.
AWS has a rich portfolio of security, network, storage, and other services that can be utilized by workloads running in the SDDC clusters provided by VMware Cloud on AWS. These capabilities help customers consolidate external access to applications running across VPC and SDDC environments and deploy advanced levels of application protection at the infrastructure edge closer to end users.
The tight integration between AWS and VMware environments enables you to enhance the security, availability, and reliability of your applications with the hybrid platform capabilities.
For additional resources and to get started:
- VMware Cloud on AWS website
- Getting Started guide
- Sign up for our VMware Cloud on AWS mailing list
- Learn more from VMware