Governance Patterns to Manage Private Workloads through Cloud Operations Services

Introduction

For enterprises, one of the larger obstacles when adopting and migrating to the cloud is how to establish a well-thought-out cloud governance model to meet internal or regulatory compliance requirements. One common inhibitor in the field is that enterprises seek to come up with a one-size-fits-all approach to cloud governance for all workloads. We don’t recommend this approach for enterprises with highly diverse workloads.

In this post, we will establish a governance framework for protecting private (non-consumer facing) workloads with highly sensitive data. This governance framework will help large organizations scale their governance processes by establishing common guardrails by workload type rather than developing a one-size-fits-all approach for workloads running on AWS.

We will be covering the following five key patterns:

Use AWS Organizations to set distinct environment boundaries
Use Amazon VPC Endpoints to secure network traffic
Use the right type of encryption
Use AWS Service Catalog to enable approved AWS resources
Use AWS CloudTrail and AWS Config for compliance monitoring

Overall Solution Architecture

The following architecture depicts a common pattern for routing network traffic privately between on-premises environments and AWS, in either direction, by combining AWS Direct Connect with interface endpoints. Interface endpoints take their private IP addresses from the subnet in which they are placed, which makes them routable from your corporate data center and effectively an extension of it.

Reference Architecture highlighting secure governance patterns to manage private workloads

Note: The architecture above shows a single Availability Zone (AZ) for brevity. We recommend that customers adopt a Multi-AZ design for such architectures.

Using the illustration above, let’s break down the patterns in more detail.

1. Use AWS Organizations to set workload boundaries

Today, enterprises face ongoing challenges in effectively managing their growing cloud environments. As they expand their footprint on the cloud, they must also work out how to govern these workloads. In some cases, the requirements are internal, where your information security team scrutinizes cloud-based workloads more heavily. In other cases, they are industry-specific compliance requirements, such as PCI-DSS and HIPAA, or federal requirements such as CMMC.

We will use the following tenet to guide this pattern: how do we empower builders to be creative within a secure environment without compromising security?

AWS Organizations lets you establish environment boundaries that group similar workloads together. It lets customers set up and manage multiple accounts as one organization. At the top level, you can define Organizational Units (OUs) to establish segmented environments by workload type. Customers often model their OUs by business unit. For example, you may want to create an OU for all of your consumer-facing workloads, another OU for private line-of-business (LOB) applications, and one for workloads subject to industry-specific regulations such as PCI-DSS. Refer here for a deep dive on design principles for organizing your AWS account.

To achieve this pattern, AWS Organizations uses Service Control Policies (SCPs). SCPs are different from IAM policies in that SCPs are applied at a higher level: Organization (root), OUs, and individual accounts, whereas IAM policies are applied only to IAM identities (users, groups, and roles). You can also think of SCPs as a type of environment “guardrail”. Setting guardrails lets enterprises enforce key governance controls to comply with specific policies required by your compliance teams or to align with industry specific frameworks.

For example, let’s say you create an OU named LOB for all of your internal-facing workloads. You can then create fine-grained policies that apply to all of the accounts within this OU. These policies might deny actions such as creating and associating public IPs with Amazon Elastic Compute Cloud (EC2) instances or creating internet gateways for your VPCs, and they might require encryption for Amazon Elastic Block Store (EBS) volumes or Amazon Simple Storage Service (S3) buckets. You can even establish trusted CIDR ranges for these workloads, so that an API call coming from outside of these ranges is blocked automatically. For more details on SCPs and understanding policy elements, see this post.
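As a rough illustration of the guardrails described above, the following sketch assembles an SCP document as a plain Python dictionary. The CIDR ranges, the function name, and the statement IDs are hypothetical placeholders, and the choice of denied actions is one possible reading of the restrictions in the text rather than a recommended baseline.

```python
import json

# Hypothetical trusted public CIDR ranges (documentation ranges); replace with your own.
TRUSTED_CIDRS = ["192.0.2.0/24", "203.0.113.0/24"]


def build_lob_scp(trusted_cidrs):
    """Return an SCP document (as a dict) for an OU that hosts private workloads."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Block the creation and attachment of internet gateways.
                "Sid": "DenyInternetGateways",
                "Effect": "Deny",
                "Action": ["ec2:CreateInternetGateway", "ec2:AttachInternetGateway"],
                "Resource": "*",
            },
            {
                # Block allocating and associating Elastic IP addresses.
                "Sid": "DenyElasticIps",
                "Effect": "Deny",
                "Action": ["ec2:AllocateAddress", "ec2:AssociateAddress"],
                "Resource": "*",
            },
            {
                # Block the creation of EBS volumes that are not encrypted.
                "Sid": "DenyUnencryptedEbsVolumes",
                "Effect": "Deny",
                "Action": "ec2:CreateVolume",
                "Resource": "*",
                "Condition": {"Bool": {"ec2:Encrypted": "false"}},
            },
            {
                # Block direct API calls from outside the trusted ranges.
                # Assumption: requests arriving through VPC endpoints do not carry
                # aws:SourceIp, so a real policy would need an additional
                # condition to account for that traffic.
                "Sid": "DenyCallsOutsideTrustedCidrs",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": trusted_cidrs},
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            },
        ],
    }


scp = build_lob_scp(TRUSTED_CIDRS)
print(json.dumps(scp, indent=2))
```

The printed JSON is the kind of document you would attach to the LOB OU as an SCP. Test any such policy in a non-production OU first, since a broad deny statement can lock out legitimate automation.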

Architecture highlighting concepts of AWS Organizations to manage large-scale accounts.

Key takeaway: AWS Organizations lets enterprises establish well-defined environment boundaries. Similar workloads, such as LOB applications, can be segmented away from consumer-facing workloads. Following this pattern means you can scale your governance based on workload characteristics. To get started using this pattern, AWS recommends beginning with AWS Control Tower. AWS Control Tower automates landing zone creation, along with best practices blueprints that configure AWS Organizations for customers that require a multi-account environment.

2. Use VPC Endpoints to secure network traffic

VPC Endpoints and their importance

Many customers want to set up VPC endpoints (VPCE) to privatize traffic between their corporate data centers and AWS. VPC Endpoints let customers securely access AWS services through a private connection, thereby avoiding routing through public IP address space and avoiding VPC-attached internet gateways.

AWS provides two types of VPC Endpoints:

Interface endpoint (powered by PrivateLink): This is a collection of one or more elastic network interfaces (ENIs), each with a private IP address, that serves as an entry point for traffic destined to a supported service. In other words, when an interface endpoint is deployed, it receives a private IP address from the CIDR block assigned to its subnet. As a result, if that subnet is connected to your on-premises environment, then the service is routable on your private network.

Gateway endpoint: These differ from Interface endpoints in that they’re not routable from your on-premises environment. Gateway endpoints are targets for specific routes in your VPC route table, which makes sure that the traffic stays on the AWS network. Only Amazon DynamoDB and Amazon S3 support Gateway endpoints. To keep S3 traffic private in a hybrid model, we recommend using both interface endpoints and gateway endpoints. This pattern also helps optimize costs, because gateway endpoints do not incur charges.
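To make the routability point concrete, here is a small sketch using only Python's standard `ipaddress` module. It checks that an interface endpoint address belongs to its subnet and that the subnet falls within a range reachable from on-premises. The function name, the addresses, and the idea of a list of "advertised" ranges are illustrative assumptions, not part of any AWS API.

```python
import ipaddress


def endpoint_reachable_from_on_prem(endpoint_ip, subnet_cidr, advertised_cidrs):
    """Return True if the endpoint's subnet sits inside a range that is
    routed to on-premises (for example over AWS Direct Connect)."""
    ip = ipaddress.ip_address(endpoint_ip)
    subnet = ipaddress.ip_network(subnet_cidr)
    if ip not in subnet:
        # An interface endpoint ENI always takes its address from its own subnet.
        raise ValueError(f"{endpoint_ip} is not inside {subnet_cidr}")
    return any(
        subnet.subnet_of(ipaddress.ip_network(cidr)) for cidr in advertised_cidrs
    )


# Hypothetical values: an endpoint ENI in a /24 subnet of a VPC whose /16 is
# advertised to the corporate network.
print(endpoint_reachable_from_on_prem("10.20.1.37", "10.20.1.0/24", ["10.20.0.0/16"]))
```

A gateway endpoint has no such address to check, which is why it cannot be reached from on-premises in the same way.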

For private workloads that require no internet connectivity, this is a highly desirable construct. Keeping the network private reduces the attack surface, since traffic isn’t routed over the public internet. From a governance standpoint, you can set secure boundaries by creating VPC Endpoint policies, which control which principals, actions, and resources can be accessed through the endpoint.
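As one possible shape for such an endpoint policy, the sketch below builds a policy document for an S3 endpoint that allows access only to a single bucket and only for principals that belong to your organization. The bucket name, organization ID, function name, and the selected actions are hypothetical placeholders.

```python
import json


def build_s3_endpoint_policy(bucket_name, org_id):
    """Return a VPC endpoint policy (as a dict) scoped to one bucket and one organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOrgPrincipalsToApprovedBucket",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # Only principals from the given AWS Organizations ID are allowed.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


# Hypothetical identifiers used purely for illustration.
endpoint_policy = build_s3_endpoint_policy("example-private-workload-bucket", "o-exampleorgid")
print(json.dumps(endpoint_policy, indent=2))
```

Because an endpoint policy only limits what can pass through the endpoint, it complements rather than replaces bucket policies and IAM policies.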

Lastly, VPC Endpoints also support Security Groups, which provide another level of protection. We recommend using AWS Firewall Manager to control security groups centrally within your Enterprise. As illustrated in the following section, Firewall Manager is also integrated with AWS Organizations, so you can set baseline security groups for OUs containing private workloads. We recommend placing Firewall Manager in one of your core infrastructure accounts that can be centrally managed for your tenant accounts.

FIPS Endpoints

For customers that require FIPS validated endpoints, one important property is that VPCEs on their own can’t be “FIPS validated”, because a VPCE’s purpose is simply to provide a path that passes traffic to the target service endpoint within the AWS infrastructure. For example, in AWS GovCloud (US) Regions, Amazon S3 provides both a FIPS validated endpoint and a non-FIPS endpoint. If you require FIPS validated endpoints for compliance reasons, then we recommend referring to the AWS FIPS compliance page for additional details.

Key takeaway: Use VPC Endpoints for your private workloads to keep network traffic private between your datacenter and AWS. You should also deploy AWS Firewall Manager to establish baseline security policies by workload type.

3. Use the right type of encryption

In this section, we will discuss the role of encryption and distinguish application-level encryption from data-at-rest encryption.

Application-Level Encryption: So far, we have discussed how to protect data in transit by using VPC Endpoints, which route traffic privately and where FIPS-enabled endpoints play an important role. Application-level (client-side) encryption is another factor to consider when storing or processing highly regulated data. For application-level encryption, consider client-side encryption with keys managed in AWS KMS. Many of the AWS SDKs support varying encryption levels, as outlined on the AWS Security Blog. A good starting point is the AWS Encryption SDK, a client-side encryption library that lets developers focus on their application functionality while relying on industry standards and best practices for client-side encryption.

A good use case for encrypting data client-side is when processing data using AWS Lambda. A Lambda function can store data in several places, such as Amazon S3, Amazon Elastic File System (EFS), or the /tmp directory. There may be scenarios where your function must temporarily store data in the /tmp directory. This directory is provided as part of the Lambda execution environment, and it is ephemeral, with 512 MB available by default. Since this directory is not encrypted, sensitive data should first be encrypted client-side. One solution is to encrypt the data with the AWS Encryption SDK and AWS KMS before writing it, and to decrypt it within your function when it is needed.

Data at Rest: When storing regulated data at rest on data stores such as Amazon S3, EFS, or EBS, data should be encrypted using AWS KMS. AWS KMS uses hardware security modules (HSMs) that have been validated under FIPS 140-2. Enterprises that have numerous accounts, or are concerned about enforcing these policies, should consider using IAM condition keys or defining service control policies within AWS Organizations. For example, to enforce encryption of Amazon EFS file systems, you can use the elasticfilesystem:Encrypted IAM condition key in AWS Identity and Access Management (IAM) identity-based policies to control whether users can create Amazon EFS file systems that are not encrypted at rest. Similar controls can be applied to both Amazon S3 and EBS.
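The EFS condition key mentioned above could be used along the following lines. This is a minimal sketch that builds the policy document as a Python dictionary; the function name and statement ID are hypothetical, and the same document shape could serve as an identity-based policy or as an SCP.

```python
import json


def build_efs_encryption_policy():
    """Return a policy (as a dict) that denies creating unencrypted EFS file systems."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedEfsFileSystems",
                "Effect": "Deny",
                "Action": "elasticfilesystem:CreateFileSystem",
                "Resource": "*",
                # The request is denied when encryption at rest is not requested.
                "Condition": {"Bool": {"elasticfilesystem:Encrypted": "false"}},
            }
        ],
    }


efs_policy = build_efs_encryption_policy()
print(json.dumps(efs_policy, indent=2))
```

Using an explicit deny means the control still holds even if another policy grants broad EFS permissions.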

4. Use AWS Service Catalog to enable approved AWS resources

Enterprises have increasingly been adopting a centralized cloud delivery model. In this model, the central body is responsible for setting up the cloud footprint (e.g., AWS Control Tower, AWS Landing Zone, VPC), usage guardrails, and centralized security and monitoring systems. On the other hand, the business units are responsible for developing their application within the boundaries set by the common control teams.

The pattern covered here is how to use the AWS Service Catalog to enable automated enforcement of approved AWS resources for the organization. AWS Service Catalog lets organizations create and manage catalogs of IT services that are approved for use on AWS.

Provisioning secure baselined accounts for workloads: After the enterprise Cloud Center of Excellence (CCoE) has established a pattern for a compliant workload, business teams can easily consume that pattern through the AWS Service Catalog. This provides an internal, controlled list of application templates that teams can deploy. Similar functionality can be found in services such as AWS Proton, which provides approved container workload patterns.

Achieving Event-Driven Security and Compliance through Catalog

AWS recognizes that security is paramount for customers and that it shapes how they define their architectures for regulated workloads. AWS provides services that make it easy for customers to adopt DevSecOps and maintain a strong security posture.

Using AWS Service Catalog, you can let your builder teams quickly choose from a curated set of approved resources, because security is factored in when each service is approved for the catalog. For example, a builder may choose a specific database service from the AWS Service Catalog menu and be assured that its security configuration conforms to InfoSec policies. This also saves the time otherwise spent on repetitive approvals every time that a service is used. Moreover, product owners can use the AWS Service Catalog to manage costs, for example by limiting the sizes of resources that can be provisioned in sandbox environments.
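One way to limit resource sizing is a Service Catalog template constraint, which expresses rules over the parameters of a product's CloudFormation template. The sketch below builds such a rules document in Python. The parameter name `InstanceType`, the rule name, and the list of allowed sizes are hypothetical and would need to match your own product template.

```python
import json

# Hypothetical list of instance sizes permitted in sandbox environments.
ALLOWED_SANDBOX_INSTANCE_TYPES = ["t3.micro", "t3.small"]


def build_template_constraint(allowed_types, parameter_name="InstanceType"):
    """Return a template constraint rules document (as a dict) that restricts
    one template parameter to a list of approved values."""
    return {
        "Rules": {
            "SandboxInstanceSizing": {
                "Assertions": [
                    {
                        "Assert": {
                            "Fn::Contains": [allowed_types, {"Ref": parameter_name}]
                        },
                        "AssertDescription": (
                            "Instance type must be one of: " + ", ".join(allowed_types)
                        ),
                    }
                ]
            }
        }
    }


constraint = build_template_constraint(ALLOWED_SANDBOX_INSTANCE_TYPES)
print(json.dumps(constraint, indent=2))
```

With a constraint like this attached to a product in a sandbox portfolio, builders can still self-serve, but only within the approved sizes.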

For example, a security portfolio can be automatically set up with AWS Service Catalog controls, where detective controls run continually to detect noncompliant runtime resources, mirroring the checks run as part of the preventative controls. Another example is a database portfolio set up with approved data governance rules, a choice of data engines, and approved configurations based on the organization’s impact assessment levels.

Baseline Security Profile: The AWS Service Catalog can be used to quickly spin up new accounts with a preset security baseline. This is achieved through an “account vending machine” approach, where accounts are provisioned through a governed and standardized approach. For more information, visit here.

AWS Service Catalog AppRegistry: Enterprises often use Artifactory, or a similar artifact repository, to highlight approved products. Extending that pattern to the cloud, AppRegistry lets organizations understand the application context of their AWS resources. AppRegistry creates a repository of the applications that you use within your enterprise, their associated resources, and the information that describes them.

Control Plane: Control Plane services (such as Amazon EC2 Control Plane and Amazon EKS Control Plane) are designed to manage other services or constructs within AWS and don’t store application data. The AWS Service Catalog falls under this construct. Therefore, it should be segregated from the data controls of the organization. Refer here for more details on control plane architectures.

5. Use CloudTrail and Config for compliance monitoring

The pattern covered here is how to establish continuous compliance monitoring for your environment. We’ll cover this in two sections: monitoring data events and detective controls.

Monitoring Data Events

AWS accounts that will be storing or processing sensitive data should also consider enabling CloudTrail data events. Data events provide visibility into the resource operations performed on or within a resource, which are typically high-volume activities. Data events focus on data plane operations, whereas standard CloudTrail trails focus on control plane operations. The following types of data activity can be recorded when enabling CloudTrail data events:

Amazon S3 object-level API activity (for example GetObject, DeleteObject, and PutObject API operations)
AWS Lambda function execution activity (the Invoke API)
Amazon DynamoDB object-level API activity on tables (for example PutItem, DeleteItem, and UpdateItem API operations).

Note that enabling data events incurs additional costs. AWS recommends enabling data events only when required.
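To show how the three data event types listed above map onto a trail configuration, the sketch below builds an event selector list in the shape accepted by the CloudTrail PutEventSelectors API. The bucket name and function name are hypothetical. Scoping S3 logging to a single bucket, as done here, is one way to keep data event volume and cost down.

```python
import json


def build_data_event_selectors(bucket_name):
    """Return CloudTrail event selectors that log S3, Lambda, and DynamoDB data events."""
    return [
        {
            "ReadWriteType": "All",
            "IncludeManagementEvents": True,
            "DataResources": [
                # Object-level activity for one specific bucket only.
                {"Type": "AWS::S3::Object", "Values": [f"arn:aws:s3:::{bucket_name}/"]},
                # Invoke activity for all Lambda functions in the account.
                {"Type": "AWS::Lambda::Function", "Values": ["arn:aws:lambda"]},
                # Item-level activity for all DynamoDB tables in the account.
                {"Type": "AWS::DynamoDB::Table", "Values": ["arn:aws:dynamodb"]},
            ],
        }
    ]


# Hypothetical bucket that holds sensitive data.
selectors = build_data_event_selectors("example-sensitive-data-bucket")
print(json.dumps(selectors, indent=2))
```

In practice, you would narrow the Lambda and DynamoDB entries to specific ARNs as well if only a few resources handle sensitive data.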

CloudTrail management events (also known as “control plane operations”) show management operations that are performed on resources in your AWS account. Taking IAM as an example, management events record details such as the time of each action, role assumptions, and the actions performed by each user or role. These events are available through AWS CloudTrail and should be used diligently; for example, avoid sharing roles across sensitive workloads so that actions remain attributable. This should be supplemented with enhanced auditing processes for sensitive data applications, which can be automated via CloudWatch alarms and alerts.

Auditing and Compliance

Your company will likely want to establish a governance pattern to continually audit your workloads to make sure that they stay compliant. This process should also let you view historical information on how AWS resources change over time. AWS Config provides continuous compliance monitoring for your environment. We recommend using conformance packs as a starting point to see what rules you would want to establish by workload type. If you’re grouping certain workloads by specific compliance standards, such as PCI-DSS, then you can deploy the conformance pack for PCI-DSS to the corresponding OU. This means that all of the workloads that fall under this OU will be monitored per the rules defined in this conformance pack. Alternatively, AWS Security Hub has built-in monitoring for certain compliance standards, such as CIS and PCI-DSS. For a deep dive on AWS Config best practices, see this post.

To achieve auditing and compliance at scale, you should create different permissions and rules for each OU. You can use AWS CloudFormation StackSets to deploy these rule sets to specific OUs. This means that as rules need to be added or modified, they can be easily managed through StackSets. For example, if you have an internal workload that processes sensitive employee information, you may want to deploy a new rule that verifies that all of the S3 buckets storing this information are encrypted and that public read/write access is disabled.
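The S3 example above could be expressed as a CloudFormation template containing AWS Config managed rules, which a StackSet would then deploy to the target OUs. The following sketch generates such a template in Python. The helper names are hypothetical, and the three managed rule identifiers are one plausible selection for checking encryption and public access.

```python
import json

# AWS Config managed rule identifiers chosen for this illustration.
S3_MANAGED_RULES = [
    "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
    "S3_BUCKET_PUBLIC_READ_PROHIBITED",
    "S3_BUCKET_PUBLIC_WRITE_PROHIBITED",
]


def logical_id(identifier):
    """Turn 'S3_BUCKET_X' into an alphanumeric CloudFormation logical ID."""
    return "".join(part.capitalize() for part in identifier.split("_"))


def build_config_rules_template(identifiers):
    """Return a CloudFormation template (as a dict) with one Config rule per identifier."""
    resources = {}
    for identifier in identifiers:
        resources[logical_id(identifier)] = {
            "Type": "AWS::Config::ConfigRule",
            "Properties": {
                "ConfigRuleName": identifier.lower().replace("_", "-"),
                "Scope": {"ComplianceResourceTypes": ["AWS::S3::Bucket"]},
                "Source": {"Owner": "AWS", "SourceIdentifier": identifier},
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


template = build_config_rules_template(S3_MANAGED_RULES)
print(json.dumps(template, indent=2))
```

Keeping the rule list as data makes it straightforward to maintain a different list per OU and regenerate the template when rules are added or changed.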

Key takeaway: Enable data events for AWS accounts hosting workloads with sensitive data. For auditing and compliance, create targeted config rules that match your workload profiles. Start with conformance packs as a baseline, and deploy using StackSets to the target OUs.

Conclusion

In this post, we established a governance framework using five key patterns to protect private workloads that contain sensitive data. Using the five patterns, enterprises can scale their governance and compliance requirements by establishing common guardrails to protect these workload types hosted on AWS. Key enablers for this framework included (1) using AWS Organizations to set logical boundaries for private workloads, (2) using VPC Endpoints to communicate privately between your on-premises environment and AWS, (3) using the right encryption, (4) utilizing the AWS Service Catalog to establish a trusted set of products that can be consumed by application teams, and (5) using AWS CloudTrail and AWS Config to establish a continuous compliance monitoring framework. By using this framework, enterprises can confidently secure their private workloads by setting distinct workload boundaries using AWS Organizations and make sure that exposure to the public internet is minimized through VPC Endpoints.
