Start Spring Boot applications faster on AWS Fargate using SOCI

About a year ago, we published a post on how to Optimize your Spring Boot application for AWS Fargate, where we went into different optimization techniques to speed up the startup time of Spring Boot applications for AWS Fargate. We started the post with “Fast startup times are key to quickly react to disruptions and demand peaks, and they can increase the resource efficiency”. Seekable OCI (SOCI) is a new and simple way to reduce startup times for Java workloads running on AWS Fargate. It can be combined with the earlier optimizations, or you can use SOCI alone for a simple win. Customers running applications on Amazon Elastic Container Service (Amazon ECS) with AWS Fargate can now use SOCI to lazily start containers, in other words: to start without waiting for the entire container image to be downloaded. SOCI starts your application immediately and downloads data from the container registry when it is requested by the application. This improves the overall container startup time. A great deep dive about SOCI and AWS Fargate can be found here.

In this post, we’ll dive into techniques to optimize your Java applications using SOCI that don’t require you to change a single line of Java code. In our Spring Boot example application, this improves application startup time by about 25%, and the improvement should grow as the container image size gets larger. In addition, we’ll take a closer look at benchmarks for two different frameworks and approaches. You don’t have to rebuild your images to use SOCI. However, during our tests we also identified optimizations for Java applications that only require small modifications to the build process and Dockerfile (i.e., the actual application doesn’t need any adjustments) and reduce the startup time of an AWS Fargate task even further. While we’re focused on Java applications today, we expect SOCI to be helpful for any case where customers deploy large container images. This testing was carried out with a sample application, and the layered JAR with SOCI approach may not improve launch times for all Spring Boot applications. We recommend testing this approach with your application and measuring the impact in your environment.

Solution overview

In this section, we’ll introduce the sample application and the AWS architecture used in the benchmarking. If you want to see the application code and the AWS Cloud Development Kit (AWS CDK) application used to deploy the architecture, you can visit this GitHub repository.

Our example application is a simple REST-based Create Read Update Delete (CRUD) service that implements basic customer management functionalities. All data is persisted in an Amazon DynamoDB table accessed using the AWS SDK for Java V2.

The REST-functionality is located in the class CustomerController, which uses the Spring Boot RestController-annotation. This class invokes the CustomerService, which uses the Spring data repository implementation CustomerRepository. This repository implements the functionalities to access an Amazon DynamoDB table with the AWS SDK. All user-related information is stored in a Plain Old Java Object (POJO) called Customer.

The application has several dependencies, including the AWS Software Development Kit (AWS SDK) for Java, the DynamoDB enhanced client, and Lombok. Project Lombok is a code generation library for Java that minimizes boilerplate code. The Amazon DynamoDB enhanced client is a high-level library that’s part of the AWS SDK for Java V2 and offers a straightforward way to map client-side classes to Amazon DynamoDB tables. In addition, we use Tomcat as the web container.

In our Dockerfile, we use a multi-stage build approach with a target image based on Amazon Corretto 17, which is a no-cost, multiplatform, production-ready distribution of the Open Java Development Kit (OpenJDK).


Figure 1: Infrastructure for the example application

The Java application runs as an Amazon Elastic Container Service (Amazon ECS) service in an Amazon ECS cluster with AWS Fargate. We use the Amazon ECS service to run and maintain a specified number of instances of a task definition simultaneously in our Amazon ECS cluster. A security group is defined in the Amazon ECS task definition, and this controls the network traffic allowed for the resources in your virtual private cloud (VPC). The container image containing the application is stored in Amazon Elastic Container Registry (Amazon ECR). As indicated earlier, the state of the application is stored in an Amazon DynamoDB table.


Performance considerations

We compared the effect of SOCI with two different approaches to packaging the application: Uber JARs and layered JARs. These packaging approaches are relevant because they affect the number of files in the filesystem that represent our application.

In containerd, the component that manages the container’s filesystem is called a snapshotter. The default snapshotter is overlayfs. It pulls and decompresses the entire container image before a container can be started. With lazy-loading snapshotters (such as stargz or the SOCI snapshotter), the container starts without downloading the entire container image. Java applications often have packaged dependencies that aren’t used at all. According to this study, “only 6.4% of the data transferred by a pull is actually needed before a container can begin useful work”. This is how lazy loading improves container startup time: it downloads the files that are actually needed to start the container and avoids downloading redundant ones up front.
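To use lazy loading, the image needs a SOCI index stored next to it in the registry. As a rough sketch of how such an index can be created (assuming the soci CLI from the soci-snapshotter project and containerd's image store; the repository URL is a placeholder, and pulling from Amazon ECR requires registry authentication):

```shell
# Pull the image into containerd's content store; the soci CLI reads from there.
IMAGE="123456789012.dkr.ecr.eu-west-1.amazonaws.com/customer-service:latest"
sudo ctr image pull "$IMAGE"

# Build the SOCI index locally, then push it to the same registry as the image.
sudo soci create "$IMAGE"
sudo soci push "$IMAGE"
```

Alternatively, AWS provides a CloudFormation-based SOCI Index Builder that creates indexes automatically when images are pushed to Amazon ECR.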

The Uber JAR approach archives the application and dependencies in a single file

Uber JARs contain not only the Java application, but also its dependencies, in a single file. If the class with the main method is defined in the Manifest file, a simple command like java -jar myApplication.jar is sufficient to start the application. With the Uber JAR-based approach, lazy loading always loads and extracts the complete file. Our Uber JAR-based container image has a compressed size of 875 MB. The size of the image affects the pull time of the image from the registry as well as the startup time of the application.

The layered JAR approach separates the application and dependencies into layers

Container images are created using layers. Layered JAR files separate the application and its dependencies so that each part can be stored in a dedicated container image layer. This has the advantage that the cached layers can be reused during the build of the application, which significantly speeds up the rebuild of the container image.

With version 2.3, Spring Boot introduced support for layered JAR files. One of the essential requirements for this is the layers.idx file. This file contains a list of layers and the parts of the JAR that should be contained in each. The order of the layers is important, as it has a significant influence on how caches are used during the build process of the container image. The default order is dependencies, spring-boot-loader, snapshot-dependencies, and application. For our benchmarking, we changed the Dockerfile and the Maven build. The container image has a compressed size of 790 MB.

The following snippet from the pom.xml-file shows how layering has been activated:
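A minimal sketch of that configuration (assuming the standard spring-boot-maven-plugin layers setting; note that with Spring Boot 2.4 and later, layering is enabled by default):

```xml
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <layers>
            <enabled>true</enabled>
        </layers>
    </configuration>
</plugin>
```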


With the following command, you can list the layers of the JAR that can be extracted:

$ java -Djarmode=layertools -jar CustomerService-0.0.1.jar list


The layers index file layers.idx contains a list of different layers and corresponding parts of the JAR which should be included:

$ unzip -p CustomerService-0.0.1.jar BOOT-INF/layers.idx

- "dependencies":
    - "BOOT-INF/lib/"
- "spring-boot-loader":
    - "org/"
- "snapshot-dependencies":
- "application":
    - "BOOT-INF/classes/"
    - "BOOT-INF/classpath.idx"
    - "BOOT-INF/layers.idx"
    - "META-INF/"

In the next step of the build, we have to extract the layers. This can also be automated in the Dockerfile using:

$ java -Djarmode=layertools -jar CustomerService-0.0.1.jar extract

All of the steps described so far form the first stage of our multi-stage build process. In the second stage, we copy the extracted files from the build container image into the target image as separate layers. The complete Dockerfile is shown below:

FROM maven:3-amazoncorretto-17 as builder

COPY ./pom.xml ./pom.xml
COPY src ./src/
RUN mvn -Dmaven.test.skip=true clean package && \
    cd target && \
    java -Djarmode=layertools -jar CustomerService-0.0.1.jar extract

FROM amazoncorretto:17

RUN yum install -y shadow-utils

WORKDIR application

RUN groupadd --system spring
RUN adduser spring -g spring

USER spring:spring

COPY --from=builder target/dependencies/ ./
COPY --from=builder target/spring-boot-loader/ ./
COPY --from=builder target/snapshot-dependencies/ ./
COPY --from=builder target/application/ ./

ENTRYPOINT ["java", "org.springframework.boot.loader.JarLauncher"]

Measurement and results

We want to find out the impact of lazy loading on the startup time of the tasks, so we measure the task readiness duration shown below for the AWS Fargate task. This can be calculated using the timestamp of the runTask-API call in AWS CloudTrail and the timestamp of the ApplicationReadyEvent in our Spring Boot-application.

To measure the startup time, we use a combination of data from the task metadata endpoint and API calls to the control plane of Amazon ECS. Among other things, this endpoint returns the task ARN and the cluster name. We need this data for describeTasks-calls to the Amazon ECS control plane in order to receive the following metrics:

  • PullStartedAt: The Unix timestamp for the time when the container image pull began.
  • PullStoppedAt: The Unix timestamp for the time when the container image pull completed.
  • CreatedAt: The Unix timestamp for the time when the task was created. This parameter is omitted if the container has not been created yet.
  • StartedAt: The Unix timestamp for the time when the task started. This parameter is omitted if the container has not started yet.
  • SpringDuration: The time period between the start of the JVM and the ApplicationReadyEvent (the application is ready to serve requests)

The logic for pulling the necessary metrics is implemented in the EcsMetaDataService class.
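To illustrate the arithmetic, the durations we plot are plain differences between these timestamps. The following is a simplified sketch with made-up example timestamps (the class and method names are ours; the EcsMetaDataService class in the repository does the real work against the metadata endpoint and describeTasks):

```java
import java.time.Duration;
import java.time.Instant;

public class StartupMetrics {

    // Pull duration: time spent downloading the image (PullStartedAt -> PullStoppedAt).
    static Duration pullDuration(Instant pullStartedAt, Instant pullStoppedAt) {
        return Duration.between(pullStartedAt, pullStoppedAt);
    }

    // Task readiness: runTask API call (from CloudTrail) -> ApplicationReadyEvent.
    static Duration taskReadiness(Instant runTaskCalledAt, Instant applicationReadyAt) {
        return Duration.between(runTaskCalledAt, applicationReadyAt);
    }

    public static void main(String[] args) {
        // Made-up example timestamps, not measurements from the benchmark.
        Instant runTask = Instant.parse("2023-07-01T10:00:00Z");
        Instant ready = Instant.parse("2023-07-01T10:00:42Z");
        System.out.println(taskReadiness(runTask, ready).toSeconds()); // 42
    }
}
```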

The different states and measured durations of our AWS Fargate tasks are shown in the following diagram.


Figure 2: Different states of our AWS Fargate tasks

Figure 3 shows the results of our Spring Boot application with a 1 vCPU and 2 GB memory task configuration in the AWS Region eu-west-1. The upper, blue chart shows the absolute durations of 500 container startups as a box plot. Each box represents the interquartile range (IQR) between the 25th and 75th percentiles of the measured durations. The white line in between is the median. The vertical black lines represent measured durations within 1.5 times the IQR; measured durations outside this range are plotted as circles. The lower, orange chart shows the median change relative to the baseline performance of the Uber JAR approach without SOCI.


Figure 3: Spring Boot results
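The box statistics can be reproduced with a few lines. This is a sketch using the simple nearest-rank percentile method, an approximation of what plotting libraries compute; the duration values are made up for illustration, not our measurements:

```java
import java.util.Arrays;

public class BoxPlotStats {

    // Nearest-rank percentile over a sorted array of durations.
    static double percentile(double[] sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    public static void main(String[] args) {
        // Made-up startup durations in seconds, for illustration only.
        double[] durations = {40, 41, 42, 43, 44, 45, 46, 47};
        Arrays.sort(durations);
        double q1 = percentile(durations, 25);     // lower edge of the box
        double median = percentile(durations, 50); // white line in the box
        double q3 = percentile(durations, 75);     // upper edge of the box
        double iqr = q3 - q1;
        // Whiskers cover measured values within 1.5 * IQR of the box edges.
        System.out.println(q1 + " " + median + " " + q3 + " " + iqr);
    }
}
```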

When we take a closer look at the complete startup time, beginning with the runTask API call and ending with the ApplicationReadyEvent, we see a consistent improvement of 25.6% in application startup time when we compare the non-SOCI Uber JAR version with the SOCI layered version.

The effect on Quarkus based applications

We also implemented an application version based on Quarkus to test the impact of lazy loading on a container-first platform that offers optimized startup times and low memory consumption by default. The compressed size of our container image for the Quarkus application in Amazon ECR is about 175 MB. Large container images (> 250 MB) usually see the greatest benefit from SOCI. Nevertheless, we wanted to measure whether a Quarkus application could also benefit from SOCI. When an application is set up with Quarkus, different Dockerfiles are already created under src/main/docker. The Dockerfile.legacy-jar file creates a classic Uber JAR, which showed a 13% improvement in total startup time when using SOCI.

The Dockerfile for the layered approach (Dockerfile.jvm) has the following content:



FROM registry.access.redhat.com/ubi8/openjdk-17
# (the file generated by Quarkus pins a specific tag of this base image)

# We make four distinct layers so if there are application changes the library layers can be re-used
COPY --chown=185 target/quarkus-app/lib/ /deployments/lib/
COPY --chown=185 target/quarkus-app/*.jar /deployments/
COPY --chown=185 target/quarkus-app/app/ /deployments/app/
COPY --chown=185 target/quarkus-app/quarkus/ /deployments/quarkus/

USER 185
ENV JAVA_OPTS=" -Djava.util.logging.manager=org.jboss.logmanager.LogManager"
ENV JAVA_APP_JAR="/deployments/quarkus-run.jar"

This is very similar to the contents of the Spring Boot Dockerfile from above. Here, too, files are arranged in four different layers, which means that rebuilding the container image is quick thanks to layer caching, and that the corresponding indices can be used when pulling from Amazon ECR.

We measured the results for 500 tasks with the same configuration as with our Spring Boot-based application. For Quarkus, we use the StartupEvent, which isn’t exactly equivalent to Spring Boot’s ApplicationReadyEvent. As we only compare different Quarkus-based packaging approaches, this is sufficient for our purpose.


Figure 4: Quarkus results

The performance gains with the Quarkus application aren’t quite as high as with Spring Boot, which is understandable: for container images of less than 250 MB, the initial lazy-loading overhead may be greater than the time taken to pull the full container image using traditional methods. However, we can see a consistent performance improvement of 14% in application startup time when we compare the Uber JAR non-SOCI version with the layered JAR SOCI version.


Considerations

SOCI shortens the pull time with the tradeoff of a potentially longer application startup phase, as the files are loaded lazily. Some parts don’t need to be read or transferred until the application needs them, if at all. However, SOCI eventually pulls everything in the background. Thus, you should still keep the size of your build artifacts and the number of extraneous files low to reduce storage cost, data transfer cost, and startup times. This is especially true because optimization benefits add up with the number of versions of your applications that you build over time.

Currently, SOCI doesn’t work with zstd-compressed images. Every container image in the task definition must have a SOCI index in the same container registry as the image itself. If a single image in the task definition is missing a SOCI index, then the task launches without SOCI.


Conclusion

In this post, we showed you the impact of SOCI on the startup time of a Spring Boot and a Quarkus application running on Amazon ECS with AWS Fargate. We tested these on AWS Fargate, but we expect layered JARs to be a generally useful approach for improving the launch time of large Java applications deployed via containers.

Initially, we started with an Uber JAR-based approach for our applications, and SOCI improved the startup time of the application by almost 17%. When we changed the layout of the application and the Dockerfile to a layered JAR, we saw a 25% reduction in startup time for the Spring Boot application. For our Quarkus application, we achieved a 14% better startup time using a layered approach, compared to a 13% improvement using an Uber JAR with SOCI. This comes essentially for free: if you still use an Uber JAR, you only have to change your Maven or Gradle build and your Dockerfile for your Spring Boot application. If you use a standard Quarkus application, you don’t have to change anything.

We hope we’ve given you some ideas on how you can optimize your existing Java application to reduce startup time. Feel free to submit enhancements to the example application in the source repository.

Sascha Moellering


Sascha Möllering has been working for more than eight years as a Solutions Architect and Solutions Architect Manager at Amazon Web Services EMEA in the German branch. He shares his expertise, with a focus on Automation, Infrastructure as Code, Distributed Computing, Containers, and JVM, in regular contributions to various IT magazines and posts.

Steffen Grunwald


Steffen Grunwald is a Principal Solutions Architect at Amazon Web Services. He supports customers in solving their sustainability challenges through the cloud. Having a long software engineering background, he loves to dive deep into application architectures and development processes to improve sustainability, performance, cost, and operational efficiency, and to increase the speed of innovation.