Under the hood: Lazy Loading Container Images with Seekable OCI and AWS Fargate
November 2023: AWS Fargate now supports having both SOCI-enabled and non-SOCI-enabled containers in the same Amazon ECS task, so the “All container images within an Amazon ECS Task need a SOCI Index Manifest” restriction no longer applies. To learn more, see the What’s New post.
AWS Fargate, a serverless compute engine for containerized workloads, now supports lazy loading container images that have been indexed using Seekable OCI (SOCI). Lazy loading container images with SOCI reduces the time taken to launch Amazon Elastic Container Service (Amazon ECS) Tasks on AWS Fargate. Donnie Prakoso’s launch post provides details on how to get started with AWS Fargate and SOCI, and is recommended reading before this post.
In this post, we’ll dive into SOCI and how it can index a container image without modifying its contents or requiring a change to existing tools or workflows. We will discuss the SOCI snapshotter, a remote containerd snapshotter that leverages SOCI Indexes to lazy load container images. And finally, we will cover some of the caveats when using SOCI on AWS Fargate.
How does Seekable OCI work?
In containerd the component that manages the container’s filesystem is called a snapshotter. The default snapshotter, overlayfs, pulls and decompresses the entire container image before a container can be started. With lazy loading snapshotters (such as stargz or SOCI snapshotter), the container starts without downloading the entire container image and instead lazily loads files from an OCI compatible registry, like Amazon Elastic Container Registry (Amazon ECR). As the container is started without waiting for the full container image to be downloaded, the launch time is often shorter when compared to overlayfs. With overlayfs there is a correlation between the time taken to pull an image and the size of the container image. Therefore, with lazy loading snapshotters the speedup relative to overlayfs increases as the container image size increases.
Before the SOCI snapshotter can lazily load a container image it needs to have metadata about the image’s contents. Container images consist of several container image layers, stored in an OCI compatible registry as compressed tarballs. For the SOCI snapshotter to be able to lazy load the container image, it needs to know which files are in each layer, where within the compressed tarball they are stored, and how to decompress just the files that the application needs. In SOCI all this metadata is stored in a SOCI Index.
In the launch post, Donnie showed how the soci create command indexes a container image and creates a SOCI Index Manifest. The soci push command pushes the SOCI Index Manifest to an OCI compatible registry, ready to be used by the SOCI snapshotter. Let’s dive into those two steps in more detail.
Creating a SOCI Index Manifest
When you run soci create, behind the scenes a zTOC (a piece of SOCI metadata) is created for each container image layer. This zTOC is broken up into two parts:
- Table of Contents (TOC) – This table contains a list of all the files in that layer and an offset to where in the tarball that file can be found.
- zInfo – In SOCI the compressed tarball is split into spans, which are logical chunks. The list of spans is stored in a table, known as the zInfo. Each span can be independently retrieved through a ranged GET request to the registry, and subsequently decompressed using the data stored in zInfo. Using spans, the SOCI snapshotter can download only the files that it needs from the registry before the background process has retrieved the full tarball.
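To make the two parts of a zTOC concrete, here is a minimal Python sketch of how a TOC entry and a zInfo table could be combined to work out a ranged GET. The class names, field names, offsets, and the file path are all hypothetical, and the model is simplified: it assumes TOC offsets refer to the uncompressed tarball and that each span records both its compressed byte range and its uncompressed starting offset. It is not the on-disk zTOC format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One zInfo entry: a chunk that can be fetched and decompressed on its own."""
    compressed_start: int    # first byte of the span in the compressed tarball
    compressed_end: int      # last byte (inclusive) in the compressed tarball
    uncompressed_start: int  # where the span begins in the uncompressed tarball


@dataclass(frozen=True)
class TocEntry:
    """One TOC entry: where a file's data sits in the uncompressed tarball."""
    offset: int
    size: int


def spans_for_file(entry: TocEntry, spans: list[Span]) -> list[Span]:
    """Return the spans (sorted by offset) that cover the file's bytes."""
    starts = [s.uncompressed_start for s in spans]
    first = bisect_right(starts, entry.offset) - 1
    last = bisect_right(starts, entry.offset + max(entry.size, 1) - 1) - 1
    return spans[first:last + 1]


def range_header(needed: list[Span]) -> str:
    """Build the HTTP Range header for one ranged GET over contiguous spans."""
    return f"bytes={needed[0].compressed_start}-{needed[-1].compressed_end}"


# Hypothetical layer: three spans, each covering 4 MiB of uncompressed data.
MIB = 1024 * 1024
zinfo = [
    Span(0, 1_499_999, 0),
    Span(1_500_000, 2_899_999, 4 * MIB),
    Span(2_900_000, 4_199_999, 8 * MIB),
]
toc = {"/app/server": TocEntry(offset=5 * MIB, size=4 * MIB)}

needed = spans_for_file(toc["/app/server"], zinfo)
print(range_header(needed))
```

With these made-up numbers the file straddles the second and third spans, so the sketch prints bytes=1500000-4199999 and the first span is never requested.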
For each indexed container image, the soci CLI will create a SOCI Index Manifest including all the zTOCs for the container image, along with a reference to the container image it relates to. The SOCI Index Manifest can be viewed locally with the soci index info command.
The two notable parts of the SOCI Index Manifest are:
- layers – Instead of a list of container image layers, the layers in a SOCI Index are a list of all the zTOCs for the corresponding container image. An annotation on each entry shows which container image layer that zTOC corresponds to.
- subject – This identifies the container image that the SOCI Index refers to by specifying the digest, media type, and size of the container image manifest. It is up to the client, which in this case is the SOCI snapshotter, to read the subject field and know which image the SOCI Index refers to.
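The following Python sketch shows how a client might read those two fields. The manifest is trimmed and entirely hypothetical: the digests are fake, the sizes are made up, and the annotation key is an assumption that should be checked against the output of soci index info for a real image.

```python
# Assumed annotation key tying a zTOC entry to its image layer; verify it
# against the output of `soci index info` for your own image.
LAYER_DIGEST_ANNOTATION = "com.amazon.soci.image-layer-digest"


def fake_digest(char: str) -> str:
    """Build an obviously fake sha256 digest for illustration."""
    return "sha256:" + char * 64


# Trimmed, hypothetical SOCI Index Manifest with only the two fields discussed.
soci_index_manifest = {
    "layers": [
        {"digest": fake_digest("a"), "size": 1024,
         "annotations": {LAYER_DIGEST_ANNOTATION: fake_digest("1")}},
        {"digest": fake_digest("b"), "size": 2048,
         "annotations": {LAYER_DIGEST_ANNOTATION: fake_digest("2")}},
    ],
    "subject": {
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "digest": fake_digest("c"),
        "size": 1500,
    },
}


def ztoc_for_layer(manifest: dict) -> dict[str, str]:
    """Map each image layer digest to the digest of the zTOC that indexes it."""
    return {
        entry["annotations"][LAYER_DIGEST_ANNOTATION]: entry["digest"]
        for entry in manifest["layers"]
    }


def indexes_image(manifest: dict, image_manifest_digest: str) -> bool:
    """Check the subject field, as a client would, before using the index."""
    return manifest["subject"]["digest"] == image_manifest_digest


print(ztoc_for_layer(soci_index_manifest))
print(indexes_image(soci_index_manifest, fake_digest("c")))
```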
The diagram below shows the relationship between a SOCI Index Manifest and a Container Image Manifest.
Pushing a SOCI Index Manifest
soci push pushes a SOCI Index Manifest and all of the zTOCs (one for each container image layer) to an OCI compatible registry. Alongside the SOCI Index Manifest a second manifest, an OCI Image Index, is pushed to the registry. If the OCI Image Index already exists in the registry, it is updated. The OCI Image Index manifest has the tag sha-<digest-of-container-image> and contains a list of all the SOCI Indexes associated with the container image.
This OCI Image Index allows client-managed references for a container image, which can be used when registries only support the OCI 1.0 distribution specification. When the OCI 1.1 image and distribution specifications have been released, the soci CLI will stop pushing this second artifact for registries that support the referrers API.
Running a Container with SOCI Index
Before a container with an indexed container image starts, the SOCI snapshotter will download the SOCI Index Manifest and all zTOCs to the container host. It will also create a FUSE filesystem for each container image layer. Additionally, after a container has started, the snapshotter will start a background process to move the full container image to local storage. When the workload attempts to access a file that does not yet exist locally, the snapshotter will do the following walk:
- The SOCI snapshotter first does a lookup to find the layer that contains the file.
- The snapshotter retrieves the file’s position within the layer tarball from the TOC.
- The snapshotter uses the offset and the zInfo table to find the set of spans (i.e. logical chunks of the compressed tarball) that contain that file’s data.
- Finally, it downloads and decompresses just the necessary spans and returns the file’s data. Once a span has been downloaded, SOCI uses local caching to ensure it only downloads and decompresses each span once.
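The walk above can be sketched in a few lines of Python. This is a toy model, not the snapshotter's implementation: slicing an in-memory bytes object stands in for a ranged GET plus decompression, spans have a fixed uncompressed size, and the rule that the uppermost layer wins is an assumption borrowed from how layered filesystems usually resolve paths. All names and data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Simplified stand-in for one indexed image layer."""
    toc: dict[str, tuple[int, int]]  # path -> (offset, size) in the uncompressed tarball
    data: bytes                      # stands in for the remote layer blob
    span_size: int = 4               # uncompressed bytes per span (tiny, for the demo)


@dataclass
class LazyReader:
    layers: list[Layer]              # ordered from lowest to uppermost layer
    cache: dict[tuple[int, int], bytes] = field(default_factory=dict)
    fetches: int = 0

    def _span(self, layer_idx: int, span_idx: int) -> bytes:
        """Return one span, fetching it only the first time it is needed."""
        key = (layer_idx, span_idx)
        if key not in self.cache:
            layer = self.layers[layer_idx]
            start = span_idx * layer.span_size
            # Stand-in for a ranged GET plus decompression using zInfo.
            self.cache[key] = layer.data[start:start + layer.span_size]
            self.fetches += 1
        return self.cache[key]

    def read(self, path: str) -> bytes:
        # 1. Find the layer that contains the file (uppermost layer wins).
        for layer_idx in range(len(self.layers) - 1, -1, -1):
            layer = self.layers[layer_idx]
            if path in layer.toc:
                break
        else:
            raise FileNotFoundError(path)
        # 2. Look up the file's position in the TOC.
        offset, size = layer.toc[path]
        if size == 0:
            return b""
        # 3. Work out which spans cover [offset, offset + size).
        first = offset // layer.span_size
        last = (offset + size - 1) // layer.span_size
        # 4. Fetch (or reuse) those spans and cut the file's bytes out of them.
        chunk = b"".join(self._span(layer_idx, i) for i in range(first, last + 1))
        skip = offset - first * layer.span_size
        return chunk[skip:skip + size]


base = Layer(toc={"/etc/motd": (2, 5)}, data=b"..hello.....")
app = Layer(toc={"/app/run": (0, 3)}, data=b"run-----")
reader = LazyReader(layers=[base, app])

print(reader.read("/etc/motd"), reader.fetches)
print(reader.read("/etc/motd"), reader.fetches)
```

Both reads return b'hello', and the fetch counter stays at 2 on the second read because the two spans covering the file are served from the cache.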
Automating the generation of SOCI Index Images
One of the main goals of the SOCI project was to enable lazy loading without customers having to change their existing workflows and tooling. Indexing a container image with SOCI does not modify the container image data, preserving the chain of trust that exists with existing signing and digest verification processes. We also knew customers would want to automate the generation of SOCI Indexes. Therefore, alongside the launch of lazy loading container images on AWS Fargate with SOCI, we have released the SOCI Index Builder project as part of AWS Infrastructure Automation.
The SOCI Index Builder provides a blueprint to automate the creation of a SOCI Index when a container image is pushed to Amazon ECR. The source code of this project is open source and available on GitHub. The blueprint is an AWS CloudFormation template, consisting of an Amazon EventBridge rule and two AWS Lambda functions. When a container image is successfully pushed to an Amazon ECR repository, the Amazon EventBridge rule triggers the AWS Lambda functions. The first function validates the container image, and the second generates a SOCI Index and pushes it to Amazon ECR to sit alongside the container image. This provides a completely hands-off way to create SOCI Indexes.
To get started with the project, and deploy it into your account, see the documentation hosted here.
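As a rough illustration of the trigger described above, the Python sketch below shows an EventBridge-style event pattern for successful Amazon ECR image pushes and a tiny matcher that mimics how such a pattern is evaluated. The pattern follows the documented shape of ECR image action events, but whether the blueprint's rule looks exactly like this is an assumption, and the matcher only handles the exact-match subset of EventBridge pattern syntax. The repository name and tag are hypothetical.

```python
# Event pattern for successful image pushes (assumed to resemble the blueprint's rule).
ECR_PUSH_PATTERN = {
    "source": ["aws.ecr"],
    "detail-type": ["ECR Image Action"],
    "detail": {"action-type": ["PUSH"], "result": ["SUCCESS"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Exact-match subset of EventBridge pattern matching: lists mean 'any of'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical event for a pushed image.
push_event = {
    "source": "aws.ecr",
    "detail-type": "ECR Image Action",
    "detail": {
        "action-type": "PUSH",
        "result": "SUCCESS",
        "repository-name": "my-app",
        "image-tag": "latest",
    },
}

print(matches(ECR_PUSH_PATTERN, push_event))
```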
Alongside the SOCI Index Builder, we have also created a SOCI snapshotter on AWS Fargate toolbox repository to showcase how the generation of SOCI Indexes could fit into other existing workflows. Inside of the sample repository there are two tools:
- A containerized index builder – Customers leveraging the Docker Engine as part of their development workflow, for example those running Docker Desktop, will find their container images are not found by the soci CLI. This is because the Docker Engine by default stores the container images in the Docker Engine image store, not the containerd image store. Work is being done in the Moby project to move the Docker Engine image store to containerd, but at the time of writing the containerd image store is not yet the default. To help customers create a SOCI Index when leveraging the Docker Engine, this containerized index builder can be used instead. The tool works by running containerd inside a container (like Docker in Docker), pulling down a container image from a remote registry, indexing the container image, and pushing the index back to the OCI compatible registry.
- AWS CodePipeline demo – Customers often build their production container images as part of a continuous integration and continuous delivery (CI/CD) pipeline. In the toolbox repository there is a blueprint for AWS CodePipeline that shows how a container image can be built and indexed as part of a CI/CD pipeline. This example is intended to provide inspiration on how to integrate SOCI into existing CI/CD pipelines.
Lazy Loading Container Images on AWS Fargate
Lazy Loading container images on AWS Fargate has been shown to reduce the time taken to start new Amazon ECS Tasks; however, not all workloads and container images will see a benefit. During internal testing we’ve seen that large (> 250 MB) container images see the greatest benefit from SOCI. However, if the workload frequently accesses filesystem metadata, or if the workload needs to access all the image data quickly after application startup, the benefits of SOCI are reduced. We continue to iterate to improve performance here, increasing the number of workloads that can take advantage of SOCI.
In this section, we’ll dive into some of the specifics of the Lazy Loading Container Images on AWS Fargate implementation.
All container images within an Amazon ECS Task need a SOCI Index Manifest
The SOCI snapshotter on AWS Fargate is enabled for all Amazon ECS Tasks deployed onto Linux platform version 1.4. There is no need to set a flag to enable or disable SOCI within a Task Definition. Instead, Fargate checks if a SOCI Index Manifest exists in the OCI compatible registry for each of the container images defined within a Task Definition. If Indexes are found, then AWS Fargate lazily loads the container images. If any container image is missing a SOCI Index, AWS Fargate will default to pulling all of the container images entirely before starting the containers.
To reiterate this point: at the time of writing, if there is just one container in the Task, generate a SOCI Index for this one container image. If there is one workload container and a logging sidecar container in the Task, generate a SOCI Index for both the workload container image and the logging sidecar container image.
Identifying if the container image has been lazy loaded
Within an Amazon ECS Task there is a Task Metadata endpoint. The Task Metadata Endpoint contains useful information about the Amazon ECS Task, including the container specification, the usage metrics, and now which containerd snapshotter has been used to launch the containers. The endpoint can be used within a running Task to determine if it has been lazy loaded or not.
$ curl -s $ECS_CONTAINER_METADATA_URI_V4 | jq -r '.Snapshotter'
soci
$ curl -s $ECS_CONTAINER_METADATA_URI_V4/task | jq -r '.Containers[].Snapshotter'
soci
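For applications that would rather check this from code than shell out to curl, the same lookup can be sketched in Python using only the standard library. The function names are hypothetical, the sample response is made up and trimmed to the fields used, and the "unknown" fallback is simply this sketch's choice for responses that lack a Snapshotter field.

```python
import json
import os
import urllib.request


def snapshotters(task_metadata: dict) -> dict[str, str]:
    """Map each container name to the snapshotter reported for it."""
    return {
        c["Name"]: c.get("Snapshotter", "unknown")
        for c in task_metadata.get("Containers", [])
    }


def fetch_task_metadata() -> dict:
    """Fetch task metadata; only works inside a running Amazon ECS Task."""
    base = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    with urllib.request.urlopen(f"{base}/task", timeout=2) as resp:
        return json.load(resp)


if "ECS_CONTAINER_METADATA_URI_V4" in os.environ:
    print(snapshotters(fetch_task_metadata()))
else:
    # Outside Amazon ECS, exercise the parsing with a trimmed, made-up response.
    sample = {"Containers": [{"Name": "web", "Snapshotter": "soci"},
                             {"Name": "log-router", "Snapshotter": "overlayfs"}]}
    print(snapshotters(sample))
```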
To prevent customers from having to modify their applications to consume this endpoint, an example init container that queries this endpoint and puts the information into Amazon CloudWatch Logs can be found in the SOCI snapshotter on AWS Fargate toolbox repository.
Container Image sizes
As a container image increases in size, so does the time taken to fully download the image, which is why larger container images proportionally increase the AWS Fargate launch time. In our internal testing SOCI has the largest impact on reducing Task launch times when used with large (> 250 MB) container images.
For small container images, SOCI will have less of an impact, and may even increase the time taken to launch AWS Fargate Tasks. This is because there is an overhead to downloading the SOCI artifacts and setting up a FUSE filesystem. In our testing we have found SOCI starts to have a noticeable impact when container images are greater than 250 MB compressed. For container images smaller than 250 MB, the initial lazy loading overhead may be greater than the time taken to pull the full container image using traditional methods.
Container Image layer sizes
When creating a SOCI Index with soci create, there’s a parameter that controls the minimum size of a container image layer that the client will generate a zTOC for. By default this minimum is 10 MB, and it can be tuned with the --min-layer-size flag. When the client detects a layer smaller than this while creating a SOCI Index, it skips the generation of a zTOC for that layer, and that layer won’t be lazy loaded at runtime. Internal testing has shown that it is more performant to simply download the entire small layer at launch time, rather than lazily loading it. If all layers in a small container image are under 10 MB, then no SOCI Index is created.
This is worth remembering when creating SOCI Indexes for sidecar containers. All container image layers in a logging or monitoring container image may be under 10 MB, so this parameter may need to be tuned to create a SOCI Index for them.
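The effect of the threshold can be sketched as a small Python function. This only models the rule described above and is not the soci CLI's code: the function name and layer names are hypothetical, the sizes are made up, and it assumes that "10 MB" means 10 MiB and that a layer exactly at the threshold is indexed.

```python
DEFAULT_MIN_LAYER_SIZE = 10 * 1024 * 1024  # assumption: "10 MB" means 10 MiB


def layers_to_index(layer_sizes: dict[str, int],
                    min_layer_size: int = DEFAULT_MIN_LAYER_SIZE) -> list[str]:
    """Return the layers that would get a zTOC; an empty list means no SOCI Index."""
    return [name for name, size in layer_sizes.items() if size >= min_layer_size]


MIB = 1024 * 1024
# Hypothetical logging sidecar image where every layer is small.
sidecar_layers = {"layer-0": 3 * MIB, "layer-1": 6 * MIB}

print(layers_to_index(sidecar_layers))                          # default threshold
print(layers_to_index(sidecar_layers, min_layer_size=1 * MIB))  # lowered threshold
```

With the default threshold nothing qualifies, so no index would be created for this sidecar. Lowering the threshold to 1 MiB makes both layers eligible.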
In this post we have gone under the hood into Seekable OCI and the SOCI snapshotter. We have dived into the SOCI Index, how it inventories the contents of a container image, and investigated ways it can be automated both in AWS and in existing CI/CD pipelines. Finally, we explored some of the caveats when using SOCI on AWS Fargate today.
The SOCI snapshotter builds on the foundational work in lazy loading container images first discussed in Google’s CRFS project and then later in the stargz snapshotter. With this week’s launch on AWS Fargate, customers can now lazily load container images in Amazon ECS Tasks on AWS Fargate. In the future, we’ll make it easier to run SOCI with other container orchestrators. We encourage you to get involved and follow the ongoing developments in the soci-snapshotter project on GitHub.