AWS Security Blog
Establishing a data perimeter on AWS: Require services to be created only within expected networks
November 13, 2024: This post has been updated to reflect the usage of resource control policies (RCPs) to establish your organization’s data perimeter.
Welcome to the fifth post in the Establishing a data perimeter on AWS series. Throughout this series, we’ve discussed how a set of preventative access controls can create an always-on boundary to help ensure that your trusted identities are accessing your trusted resources over expected networks. In a previous post, we demonstrated how you can help prevent access from unexpected locations, even for authorized users. For example, you might not want non-public corporate data to be accessed from outside the corporate network. In this post, we demonstrate how to use preventative controls to help ensure that your resources are deployed within your Amazon Virtual Private Cloud (Amazon VPC), so that you can effectively enforce the network perimeter controls. We also explore detective controls you can use to detect the lack of adherence to this requirement.
Let’s begin with a quick refresher on the fundamental concept of data perimeters using Figure 1 as a reference. Customers generally prefer establishing a high-level perimeter to help prevent untrusted entities from coming in and data from going out. The perimeter defines what access customers expect within their AWS environment, and therefore which access patterns among your identities, resources, and networks should always be blocked. Using those three elements, an assertion can be made to define your perimeter’s goal: access can only be allowed if the identity is trusted, the resource is trusted, and the network is expected. If any of these conditions is false, then the access inside the perimeter is unintended and should be denied. The perimeter is composed of access controls implemented on your identities, resources, and networks to ensure that these conditions hold.
Now, let’s consider a scenario to understand the problem statement this post is trying to solve. Assume a setup like the one in Figure 2, where an application needs to access an Amazon Simple Storage Service (Amazon S3) bucket using its temporary AWS Identity and Access Management (IAM) credentials over an Amazon S3 VPC endpoint.
From our previous posts in this series, we’ve learned that we can use the following set of capabilities to build a network perimeter to achieve our control objectives for this sample scenario.
| Control objective | Implemented using | Applicable IAM capability |
| --- | --- | --- |
| My identities can access resources only from expected networks. For example, in Figure 2, my application’s temporary credential can only access my S3 bucket when my application is within my expected network space. | Service control policies (SCPs) | aws:SourceIp, aws:SourceVpc, aws:SourceVpce |
| My resources can only be accessed from expected networks. For example, in Figure 2, my S3 bucket can only be accessed from my expected network space. | Resource control policies (RCPs) | aws:SourceIp, aws:SourceVpc, aws:SourceVpce |
Note: If you need to enforce network perimeter controls on resources that are currently not supported by RCPs, you can use resource-based policies, which are policies that are attached to resources directly. For a list of services that support RCPs and resource-based policies, see Resource control policies and AWS services that work with IAM, respectively.
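To make the second control objective more concrete, the following Python sketch builds a simplified RCP document that denies Amazon S3 access unless the request comes from an expected IP range or VPC. It is an illustration under stated assumptions, not a production policy: the function name, the CIDR range, and the VPC ID are placeholders, the statement is scoped to `s3:*` only, and a real perimeter policy typically needs further exceptions (for example, for AWS services acting on your behalf) that are only partly represented here.

```python
import json


def network_perimeter_rcp(expected_cidrs, expected_vpc_ids):
    """Build a simplified RCP that denies S3 access from unexpected networks.

    Hypothetical helper for illustration; the inputs are placeholders.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EnforceNetworkPerimeter",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's public IP is outside the expected ranges...
                    "NotIpAddressIfExists": {"aws:SourceIp": list(expected_cidrs)},
                    # ...and the request did not arrive through an expected VPC.
                    "StringNotEqualsIfExists": {"aws:SourceVpc": list(expected_vpc_ids)},
                    # Leave calls made by AWS services out of scope of this deny.
                    "BoolIfExists": {
                        "aws:PrincipalIsAWSService": "false",
                        "aws:ViaAWSService": "false",
                    },
                },
            }
        ],
    }


# Placeholder values: a documentation CIDR range and a made-up VPC ID.
policy = network_perimeter_rcp(["203.0.113.0/24"], ["vpc-0123456789abcdef0"])
print(json.dumps(policy, indent=2))
```

All condition blocks in a statement must match for the deny to apply, so a request is blocked only when it comes from neither an expected IP range nor an expected VPC and is not made by an AWS service.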
There are certain AWS services that allow for different network deployment models, such as providing the choice of associating the service resources with either an AWS managed VPC or a customer managed VPC. For example, an AWS Lambda function always runs inside a VPC owned by the Lambda service (AWS managed VPC) and by default isn’t connected to VPCs in your account (customer managed VPC). For more information, see Connecting Lambda functions to your VPC.
This means that if your application code is deployed as a Lambda function that isn’t connected to your VPC, then the function cannot access your resources while standard network perimeter controls are enforced. Let’s understand this situation better using Figure 3, where a Lambda function isn’t configured to connect to the customer VPC. This function cannot access your S3 bucket over the internet because of how the recommended data perimeter in the preceding table is defined: your bucket is accessible only from a known network segment (the customer VPC and IP CIDR range), and the IAM role associated with the Lambda function can access the bucket only from known networks. The function also cannot access your S3 bucket through your S3 VPC endpoint because the function isn’t associated with the customer VPC. Lastly, unless other compensating controls are in place, this function might be able to access untrusted resources, because the standard data perimeter controls enforced through VPC endpoint policies won’t be in effect. This might not meet your company’s security requirements.
This means that for the Lambda function to conform to your data perimeter, it must be associated with your network segment (customer VPC) as shown in Figure 4.
To make sure that your Lambda functions are deployed into your networks so that they can access your resources under the purview of data perimeter controls, it’s preferable to have a way to automatically prevent deployment or configuration errors. Additionally, if you have a large deployment of Lambda functions across hundreds or even thousands of accounts, you want an efficient way to enforce conformance of these functions to your data perimeter.
To solve this problem and make sure that an application team or a developer cannot create a function that’s not associated with your VPC, you can use the lambda:VpcIds or lambda:SubnetIds IAM condition keys (for more information, see Using IAM condition keys for VPC settings). With these keys, you can permit function creation and updates only when the required VPC settings are satisfied.
Consider an SCP that denies the Lambda create and update actions whenever the lambda:VpcIds condition key is null. An IAM principal that is subject to this SCP can create or update a Lambda function only if the function is associated with a VPC (customer VPC). When the customer VPC isn’t specified, the lambda:VpcIds condition key has no value (it is null), so the policy denies creating or updating the function. For more information about how the Null condition operator functions, see Condition operator to check existence of condition keys.
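The SCP described here can be sketched as follows. The policy listing did not survive in this copy of the post, so the statement below is a reconstruction based on the description above, and the `Sid` and the variable name are assumptions. The Python wrapper only builds the document and prints it as JSON.

```python
import json

# Deny creating or updating a Lambda function when no VPC is specified,
# that is, when the lambda:VpcIds condition key is absent (null).
lambda_vpc_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnforceVPCFunction",  # illustrative statement ID
            "Effect": "Deny",
            "Action": [
                "lambda:CreateFunction",
                "lambda:UpdateFunctionConfiguration",
            ],
            "Resource": "*",
            "Condition": {"Null": {"lambda:VpcIds": "true"}},
        }
    ],
}

print(json.dumps(lambda_vpc_scp, indent=2))
```

Because an SCP only limits permissions, principals still need an identity-based policy that allows the Lambda actions. The deny statement takes effect only for requests that carry no VPC configuration.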
Additionally, you can use variations of the preceding example and create more fine-grained controls using these condition keys. For more such examples, see Example policies with condition keys for VPC settings.
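As one possible variation, the sketch below denies function creation and updates unless the function uses one of a set of approved VPCs. The helper name and the VPC ID are placeholders, and the statement is an assumption for illustration rather than a policy taken from this post. It relies on the documented behavior that a negated operator such as StringNotEquals matches when the condition key is absent, so the same statement should also deny functions that have no VPC configuration at all. Test that behavior in your own environment before relying on it.

```python
import json


def approved_vpc_scp(approved_vpc_ids):
    """Deny Lambda create/update calls that don't target an approved VPC.

    Hypothetical helper; approved_vpc_ids is a list of VPC ID strings.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EnforceApprovedVpcs",  # illustrative statement ID
                "Effect": "Deny",
                "Action": [
                    "lambda:CreateFunction",
                    "lambda:UpdateFunctionConfiguration",
                ],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"lambda:VpcIds": list(approved_vpc_ids)}
                },
            }
        ],
    }


# Placeholder VPC ID for illustration only.
scp = approved_vpc_scp(["vpc-0123456789abcdef0"])
print(json.dumps(scp, indent=2))
```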
AWS services such as AWS Glue and Amazon SageMaker behave similarly and provide similar condition keys. For example, the glue:VpcIds condition key lets you require that AWS Glue jobs are created only in your VPC. For further details and an example policy, see Control policies that control settings using condition keys.
Similarly, Amazon SageMaker Studio, SageMaker notebook instances, SageMaker training, and deployed inference containers have internet access enabled by default. You can use the sagemaker:VpcSubnets condition key to require that these resources are launched in a VPC. For more information, see Condition keys for Amazon SageMaker, Connect to Resources From Within a VPC, and Run Training and Inference Containers in Internet-Free Mode.
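The same null-check pattern can be sketched for AWS Glue and SageMaker. The helper below is hypothetical, and the action lists are assumptions chosen for illustration. Before using a statement like this, confirm in the Service Authorization Reference which actions of each service actually support the glue:VpcIds and sagemaker:VpcSubnets condition keys.

```python
import json


def deny_without_vpc(sid, actions, condition_key):
    """Return a Deny statement that applies when condition_key is absent."""
    return {
        "Sid": sid,
        "Effect": "Deny",
        "Action": list(actions),
        "Resource": "*",
        "Condition": {"Null": {condition_key: "true"}},
    }


vpc_only_scp = {
    "Version": "2012-10-17",
    "Statement": [
        # Assumed action list: Glue job creation.
        deny_without_vpc("EnforceVpcGlueJobs", ["glue:CreateJob"], "glue:VpcIds"),
        # Assumed action list: a few SageMaker create calls.
        deny_without_vpc(
            "EnforceVpcSageMaker",
            [
                "sagemaker:CreateNotebookInstance",
                "sagemaker:CreateTrainingJob",
                "sagemaker:CreateModel",
            ],
            "sagemaker:VpcSubnets",
        ),
    ],
}

print(json.dumps(vpc_only_scp, indent=2))
```

Keeping one statement per service makes it easier to roll the control out service by service and to see which statement caused a denial.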
Detective controls
The AWS Well-Architected Framework recommends applying a defense-in-depth approach with multiple security controls (see Security Pillar). This is why, in addition to the preventative controls discussed in this post in the form of condition keys, you should also consider using AWS-native, fully managed governance tools to help you manage your environment’s deployed resources and their conformance to your data perimeter (see Management and Governance on AWS).
For example, AWS Config provides managed rules to check for Lambda functions inside a VPC and SageMaker notebooks inside a VPC. You can also use the built-in checks of AWS Security Hub to detect and consolidate findings, such as [Lambda.3] Lambda functions should be in a VPC and [SageMaker.2] SageMaker notebook instances should be launched in a custom VPC.
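To illustrate the kind of condition such a detective check looks for, the following sketch evaluates Lambda function configurations locally. It is not how the AWS Config managed rule or the Security Hub control is implemented. It assumes each configuration is a dictionary shaped like the Lambda GetFunctionConfiguration response, with an optional `VpcConfig` entry, and the function names and sample data are hypothetical.

```python
def is_vpc_attached(function_config):
    """Return True if the configuration references a VPC and at least one subnet."""
    vpc_config = function_config.get("VpcConfig") or {}
    return bool(vpc_config.get("VpcId")) and bool(vpc_config.get("SubnetIds"))


def find_nonconforming(function_configs):
    """Return the names of functions that aren't attached to a VPC."""
    return [
        config["FunctionName"]
        for config in function_configs
        if not is_vpc_attached(config)
    ]


# Hypothetical sample data for illustration only.
sample_configs = [
    {
        "FunctionName": "orders-api",
        "VpcConfig": {
            "VpcId": "vpc-0123456789abcdef0",
            "SubnetIds": ["subnet-0aaa1111bbbb2222c"],
            "SecurityGroupIds": ["sg-0ddd3333eeee4444f"],
        },
    },
    # No VpcConfig at all: the function isn't connected to a customer VPC.
    {"FunctionName": "reports-export"},
]

print(find_nonconforming(sample_configs))  # prints ['reports-export']
```

In practice, the managed rules and Security Hub controls named above perform this evaluation for you across accounts. A local check like this one is mainly useful for ad hoc audits or tests.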
You can also use similar detective controls for AWS services that don’t currently offer built-in preventative controls. For example, OpenSearch Service has an AWS Config managed rule for OpenSearch in VPC only and a Security Hub check for [Opensearch.2] OpenSearch domains should be in a VPC.
Conclusion
In this post, we discussed how you can require that resources of specific AWS services are created only in a way that adheres to your data perimeter. We used a sample scenario to dive into AWS Lambda and its network deployment options. We then used IAM condition keys as preventative controls to enforce predictable creation of Lambda functions that conform to our security standard. We also discussed additional AWS services that behave similarly and to which the same concepts apply. Finally, we briefly discussed some AWS-provided managed rules and security checks that you can use as supplementary detective controls to help verify that your preventative controls are in effect as expected.
Additional resources
The following are some additional resources that you can use to further explore data perimeters.
- Data Perimeter Policy Examples
- Building a Data Perimeter on AWS
- Data perimeters on AWS
- Data Perimeter Workshop
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.