Optimizing AWS Lambda function performance for Java
This post is written by Mark Sailes, Senior Specialist Solutions Architect.
This blog post shows how to optimize the performance of AWS Lambda functions written in Java, without altering any of the function code. It shows how Java virtual machine (JVM) settings affect the startup time and performance. You also learn how you can benchmark your applications to test these changes.
When a Lambda function is invoked for the first time, or when Lambda is horizontally scaling to handle additional requests, an execution environment is created. The first phase in the execution environment’s lifecycle is initialization (Init).
For Java managed runtimes, a new JVM is started and your application code is loaded. This is called a cold start. Subsequent requests then reuse this execution environment. This means that the Init phase does not need to run again. The JVM will already be started. This is called a warm start.
In latency-sensitive applications such as customer facing APIs, it’s important to reduce latency where possible to give the best possible experience. Cold starts can increase the latency for APIs when they occur.
How can you improve cold start latency?
Changing the tiered compilation level can help you to reduce cold start latency. By setting the tiered compilation level to 1, the JVM uses the C1 compiler. This compiler quickly produces optimized native code but it does not generate any profiling data and never uses the C2 compiler.
Tiered compilation is a feature of the Java virtual machine (JVM). It allows the JVM to make best use of both of the just-in-time (JIT) compilers. The C1 compiler is optimized for fast start-up time. The C2 compiler is optimized for the best overall performance but uses more memory and takes a longer time to achieve it.
There are five different levels of tiered compilation. Level 0 is where Java byte code is interpreted. Level 4 is where the C2 compiler analyses profiling data collected during application startup. It observes code usage over a period of time to find the best optimizations. Choosing the correct level can help you optimize your performance.
Changing the tiered compilation level to 1 can reduce cold start times by up to 60%. Thanks to changes in the Lambda execution environment, you can do this in one step with an environment variable for all Java managed runtimes.
Language-specific environment variables
Lambda supports the customization of the Java runtime via language-specific environment variables. The environment variable JAVA_TOOL_OPTIONS allows you to specify additional command line arguments to be used when Java is launched. Using this environment variable, you can change various aspects of the JVM configuration including garbage collection functionality, memory settings as well as the configuration for tiered compilation. To change the tiered compilation level to 1 you would set the value of JAVA_TOOL_OPTIONS to “-XX:+TieredCompilation -XX:TieredStopAtLevel=1”. When the Java managed runtime starts any value set will be included in the program arguments. For more information on how you can collect and analyses garbage collection data read our Field Notes: Monitoring the Java Virtual Machine Garbage Collection on AWS Lambda.
Customer facing APIs
The following diagram is an example architecture that might be used to create a customer-facing API. Amazon API Gateway is used to manage a REST API and is integrated with Lambda to handle requests. The Lambda function reads and writes data to Amazon DynamoDB to serve the requests.
This is an example use case, which would benefit from optimization. The shorter the duration of each request made to the API the better the customer experience will be.
You can explore the code for this example in the GitHub repo: https://github.com/aws-samples/aws-lambda-java-tiered-compilation-example. The project includes the Lambda function source code, infrastructure as code template, and instructions to deploy it to your own AWS account.
Measuring cold starts
Before you add the environment variable to your Lambda function, measure the current duration for a request. One way to do this is by using the test functionality in the Lambda console.
The following screenshot is a summary from a test invoke, run from the console. You can see that it is a cold start because it includes an Init duration value. If the summary doesn’t include an Init duration, it is a warm start. In this case, the duration is 5,313ms.
Applying the optimization
Using the AWS Management Console:
- Navigate to the AWS Lambda console.
- Choose Functions and choose the Lambda function to update.
- From the menu, choose the Configuration tab and Environment variables. Choose Edit.
- Choose Add environment variable. Add the following:
– Key: JAVA_TOOL_OPTIONS – Value: -XX:+TieredCompilation -XX:TieredStopAtLevel=1
- Choose Save. You can verify that the changes are applied by invoking the function and viewing the log events in Amazon CloudWatch. The log line Picked up _JAVA_OPTIONS: -XX:+TieredCompilation -XX:TieredStopAtLevel=1 is added by the JVM during startup.
Checking if performance has improved
Invoke the Lambda function again to see if performance has improved.
The following screenshot shows the results of a test for a function with tiered compilation set to level 1. The duration is 2,169 ms. The cold start duration has decreased by 3,144 ms (59%).
Other use cases
This optimization can be applied to other use cases. Examples could include image resizing, document generation and near real-time ETL pipelines. The common trait being that they do a small number of discrete pieces of work in each execution.
The function code doesn’t have as many candidates for further optimization with the C2 compiler. Even if the C2 compiler did make further optimizations there wouldn’t be enough usage of those optimizations to decrease the total execution time. Instead of allowing this extra compilation to happen, you can tell the JVM not to use the C2 compiler and only use C1.
This optimization may not be suitable if a Lambda function is running for minutes or is repeating the same piece of code thousands of times within the same execution. Frequently executed sections of code are called hot spots, and are prime candidate for further optimization with the C2 compiler.
The C2 compiler analyses profiling data collected as the application runs, and produce a more efficient way to execute that piece of code. After the optimization by the C2 compiler that section of code would execute quicker. Because it is repeated thousands of times in a single Lambda invocation, the overhead of the optimization is worth it overall. An example use case where this would happen is in Monte Carlo simulations. Simulations of random events are calculated thousands, millions, or even billions of times to analyze the most likely outcomes.
In this post, you learn how to improve Lambda cold start performance by up to 60% for functions running the Java runtime. Thanks to the recent changes in the Java execution environment, you can implement these optimizations by adding a single environment variable.
This optimization is suitable for Java workloads such as customer-facing APIs, just-in-time image resizing, near real-time data processing pipelines, and other short-running processes. For more information on tired compilation, read about Tiered Compilation in JVM.
For more serverless learning resources, visit Serverless Land.