AWS Compute Blog

Announcing improved VPC networking for AWS Lambda functions

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.

Update – August 5, 2020: We have fully rolled out the changes to the following additional Regions to those mentioned below. These improvements are now available in the AWS China (Beijing) Region, operated by Sinnet and the AWS China (Ningxia) Region, operated by NWCD. For a complete list of Regions, please see the Region Table.

Update – March 31, 2020: We have fully rolled out the changes to the following additional Regions to those mentioned below: AWS GovCloud (US-West) and AWS GovCloud (US-East).

Update – November 28, 2019: We have fully rolled out the changes to the following additional Region to those mentioned below: Middle East (Bahrain).

Update – November 25, 2019: We have fully rolled out the changes to the following additional Regions to those mentioned below: US East (N. Virginia), US West (Oregon), Canada (Central), EU (London), EU (Stockholm), and Asia Pacific (Hong Kong). All AWS accounts in these Regions will see the improvements outlined in the original post.

Update – November 6, 2019: We have fully rolled out the changes to the following additional Regions to those mentioned below: US West (N. California), EU (Ireland), EU (Paris), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), and South America (San Paulo). All AWS accounts in these Regions will see the improvements outlined in the original post.

Update – September 27, 2019: We have fully rolled out the changes to the following Regions: US East (Ohio), EU (Frankfurt), and Asia Pacific (Tokyo). All AWS accounts in these Regions will see the improvements outlined in the original post.

Known Issues (updated October 4, 2019): If you use HashiCorp Terraform, VPC resources, such as subnets, security groups, and VPCs, can fail to be destroyed due to the change in how ENIs work in this new model. You can find the steps to take to resolve this, as well as more information here.


Original post from September 3, 2019:

We’re excited to announce a major improvement to how AWS Lambda functions work with your Amazon VPC networks. With today’s launch, you will see dramatic improvements to function startup performance and more efficient usage of elastic network interfaces. These improvements are rolling out to all existing and new VPC functions, at no additional cost. Roll out begins today and continues gradually over the next couple of months across all Regions.

Lambda first supported VPCs in February 2016, allowing you to access resources in your VPCs or on-premises systems using an AWS Direct Connect link. Since then, we’ve seen customers widely use VPC connectivity to access many different services:

  • Relational databases such as Amazon RDS
  • Datastores such as Amazon ElastiCache for Redis or Amazon Elasticsearch Service
  • Other services running in Amazon EC2 or Amazon ECS

Today’s launch brings enhancements to how this connectivity between Lambda and your VPCs works.

Previous Lambda integration model for VPCs

It helps to understand some basics about the way that networking with Lambda works. Today, all of the compute infrastructure for Lambda runs inside of VPCs owned by the service.Lambda runs in a VPC controlled by the service

When you invoke a Lambda function from any invocation source and with any execution model (synchronous, asynchronous, or poll based), it occurs through the Lambda API. There is no direct network access to the execution environment where your functions run:Invoke access only through the Lambda API

By default, when your Lambda function is not configured to connect to your own VPCs, the function can access anything available on the public internet such as other AWS services, HTTPS endpoints for APIs, or services and endpoints outside AWS. The function then has no way to connect to your private resources inside of your VPC.

When you configure your Lambda function to connect to your own VPC, it creates an elastic network interface in your VPC and then does a cross-account attachment. These network interfaces allow network access from your Lambda functions to your private resources. These Lambda functions continue to run inside of the Lambda service’s VPC and can now only access resources over the network through your VPC.

All invocations for functions continue to come from the Lambda service’s API and you still do not have direct network access to the execution environment. The attached network interface exists inside of your account and counts towards limits that exist in your account for that Region.lambda mapped to a customer eni

With this model, there are a number of challenges that you might face. The time taken to create and attach new network interfaces can cause longer cold-starts, increasing the time it takes to spin up a new execution environment before your code can be invoked.

As your function’s execution environment scales to handle an increase in requests, more network interfaces are created and attached to the Lambda infrastructure. The exact number of network interfaces created and attached is a factor of your function configuration and concurrency.many ENIs mapped to Lambda environments

Every network interface created for your function is associated with and consumes an IP address in your VPC subnets. It counts towards your account level maximum limit of network interfaces.

As your function scales, you have to be mindful of several issues:

  • Managing the IP address space in your subnets
  • Reaching the account level network interface limit
  • The potential to hit the API rate limit on creating new network interfaces

What’s changing

Starting today, we’re changing the way that your functions connect to your VPCs. AWS Hyperplane, the Network Function Virtualization platform used for Network Load Balancer and NAT Gateway, has supported inter-VPC connectivity for offerings like AWS PrivateLink, and we are now leveraging Hyperplane to provide NAT capabilities from the Lambda VPC to customer VPCs.

The Hyperplane ENI is a managed network resource that the Lambda service controls, allowing multiple execution environments to securely access resources inside of VPCs in your account. Instead of the previous solution of mapping network interfaces in your VPC directly to Lambda execution environments, network interfaces in your VPC are mapped to the Hyperplane ENI and the functions connect using it.
VPC to VPC NAT

Hyperplane still uses a cross account attached network interface that exists in your VPC, but in a greatly improved way:

  • The network interface creation happens when your Lambda function is created or its VPC settings are updated. When a function is invoked, the execution environment simply uses the pre-created network interface and quickly establishes a network tunnel to it. This dramatically reduces the latency that was previously associated with creating and attaching a network interface at cold start.
  • Because the network interfaces are shared across execution environments, typically only a handful of network interfaces are required per function. Every unique security group:subnet combination across functions in your account requires a distinct network interface. If a combination is shared across multiple functions in your account, we reuse the same network interface across functions.
  • Your function scaling is no longer directly tied to the number of network interfaces and Hyperplane ENIs can scale to support large numbers of concurrent function executions

Hyperplane ENIs are tied to a security group:subnet combination in your account. Functions in the same account that share the same security group:subnet pairing use the same network interfaces. This way, a single application with multiple functions but the same network and security configuration can benefit from the existing interface configuration.

Several things do not change in this new model:

  • Your Lambda functions still need the IAM permissions required to create and delete network interfaces in your VPC.
  • You still control the subnet and security group configurations of these network interfaces. You can continue to apply normal network security controls and follow best practices on VPC configuration.
  • You still have to use a NAT device(for example VPC NAT Gateway) to give a function internet access or use VPC endpoints to connect to services outside of your VPC.
  • Nothing changes about the types of resources that your functions can access inside of your VPCs.

As mentioned earlier, Hyperplane now creates a shared network interface when your Lambda function is first created or when its VPC settings are updated, improving function setup performance and scalability. This one-time setup can take up to 90 seconds to complete. If your Lambda function is invoked while the shared network interface is still being created, the invocation still succeeds but may see longer cold-start times. When setup is complete, your Lambda function no longer requires network interface creation at invocation time. If Lambda functions in an account go idle for consecutive weeks, the service will reclaim the unused Hyperplane resources and so very infrequently invoked functions may still see longer cold-start times.

As I mentioned earlier, you don’t have to enable this new capability yourself, and there are no new associated costs. Starting today, we are gradually rolling this out across all AWS Regions over the next couple of months. We will update this post on a Region by Region basis after the rollout has completed in a given Region.

If you have existing VPC-configured Lambda workloads, you will soon see your applications scale and respond to new invocations more quickly. You can see this change using monitoring and profiling tools for Lambda like AWS X-Ray or those from APN Partners.

The following screenshots show a before/after example taken with AWS X-Ray:

The first shows a function that was launched in a VPC, with a full cold start with network interface creation at invocation.

In this second image, you see the same function executed but this time in an account that has the new VPC networking capability. You can see the massive difference that this has made in function execution duration, with it dropping to 933 ms from 14.8 seconds!

Conclusion

On the AWS Lambda team, we consider “Simplicity” a core tenet to how we think about building services for our customers. We’re constantly evaluating ways that we can improve and simplify the experience that you see in building serverless applications with Lambda. These changes in how we connect with your VPCs improve the performance and scale for your Lambda functions. They enable you to harness the full power of serverless architectures.