AWS Fargate launches platform version 1.4.0

AWS Fargate is a managed service for running containers. Fargate allows customers to use Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS) to launch applications without the undifferentiated heavy lifting of maintaining, patching, scaling, securing, and life-cycling the infrastructure. While Amazon EC2 abstracts away hypervisors and physical servers from customers, AWS Fargate does the same for container runtimes and EC2 instances. If you want to read more about the role of Fargate in the container world, check out this blog post.

While Fargate makes the infrastructure disappear in the sense that the customer doesn’t need to think about it, the infrastructure still exists and it’s being managed by AWS. The way the infrastructure features surface to the end users today is through the notion of a Fargate platform version. You can read more about it in the Fargate documentation or you can read the Fargate platform versions primer blog post. The primer blog post goes into more detail about the philosophy behind why we introduced Fargate platform versions and, for example, the practical reasons why we are not tagging platform version 1.4.0 as LATEST just yet.

Today we are launching platform version 1.4.0 of AWS Fargate.

In this blog post, we are going to provide you with a summary of the Fargate features we are enabling with this release and some of the changes we are making underneath. These underlying changes don’t necessarily have a direct relationship with the new customer-visible features but they are just as important.

What’s new in Fargate platform version 1.4.0?

Platform version 1.4.0 introduces some new AWS Fargate capabilities. We are going to describe them in this section of the blog post.

Unless otherwise noted, the new features discussed in this blog post are relevant to the native Fargate platform and are directly consumable by the ECS orchestrator. If you want to read more about the role of Fargate in the container world and specifically its relation to ECS and EKS check out this blog post.

Note that EKS itself has a notion of platform versions, which is a mechanism that is used in EKS to track the various cluster features and configurations. EKS platform versions go above and beyond tracking the Kubernetes versions and include enhancements and additional features support. This includes features inherited by the native Fargate platform, some of which are being discussed in this blog.

Let’s get started.

Fargate tasks now support Elastic File System (EFS) endpoints

With platform version 1.4.0, we are introducing support for mounting persistent EFS storage inside Fargate tasks. This enables a full set of new use cases for AWS Fargate. This feature request had more than 1000 reactions on our open source container roadmap. In the true spirit of “customer obsession,” we acted and delivered.

For background, over the years, Amazon ECS customers have implemented custom scripting and solutions to provision zonal (e.g. EBS) and regional (e.g. EFS) persistent storage to EC2 container instances and configure tasks to consume that storage. The automation around making this work was a classic example of undifferentiated heavy lifting. AWS Fargate customers did not have the option to deploy stateful workloads because with Fargate there are no EC2 instances that you could access and configure.

Starting today, the ECS task definitions (for both EC2 and Fargate) support the new EFSVolumeConfiguration parameter. This means that:

ECS customers using the EC2 launch type no longer need to take care of the heavy lifting of configuring and automating storage on EC2 container instances.
AWS Fargate customers can now start running stateful workloads inside Fargate tasks, something that they couldn’t do before.

If you want to know more about how to use this capability please refer to this blog post.
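As a rough illustration of the new parameter, here is a minimal sketch of a task definition payload that mounts an EFS file system. It is written as a plain Python dictionary of the kind you could pass to an SDK call that registers task definitions. The family name, container name, image, mount path, and file system ID are all placeholders chosen for this example, not values from this launch.

```python
import json


def efs_task_definition(file_system_id: str) -> dict:
    """Build a task definition payload that mounts an EFS file system."""
    return {
        "family": "efs-demo",  # hypothetical family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "256",
        "memory": "512",
        "volumes": [
            {
                "name": "shared-data",  # hypothetical volume name
                "efsVolumeConfiguration": {
                    "fileSystemId": file_system_id,
                    "rootDirectory": "/",
                },
            }
        ],
        "containerDefinitions": [
            {
                "name": "app",  # hypothetical container name
                "image": "nginx:latest",  # placeholder image
                "essential": True,
                # The mount point refers to the volume by name.
                "mountPoints": [
                    {"sourceVolume": "shared-data", "containerPath": "/mnt/data"}
                ],
            }
        ],
    }


task_def = efs_task_definition("fs-12345678")  # placeholder file system ID
print(json.dumps(task_def, indent=2))
```

The volume is declared once at the task level and then referenced by name from each container that needs it, so several containers in the same task can share the same file system.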

Fargate tasks now have a consolidated 20GB ephemeral volume

Up to platform version 1.3.0, Fargate had two ephemeral local volumes: a 4GB volume that could be used as an ephemeral staging area shared by containers running in the same ECS task, and a 10GB volume to host container images.

With platform version 1.4.0, we are unifying these volumes into a single 20GB volume. Not only does this increase the total storage available but it also provides more flexibility for the users to consume this capacity as they prefer. For example, this unified larger volume is especially helpful for data processing applications that pull and process large files from Amazon S3.

As a reminder, these volumes are ephemeral. This means that when the task is stopped, data stored on the local file system of the task is lost and is unavailable for future use. If you are looking for a persistent storage option, please have a look at the newly introduced Fargate and EFS integration.

This is one of the few changes that applies directly to EKS pods running on Fargate. Note that the actual usable storage for EKS pods is going to be less than 20GB (around 19GB) because some of this space is used by the Kubelet as well as other Kubernetes modules that are loaded inside the pod when it is deployed on Fargate.

Task elastic network interface (ENI) now runs additional traffic flows

Fargate tasks run on a fleet of virtual machines that AWS manages on behalf of the customer. These VMs are connected to AWS owned VPCs via so called “Fargate ENIs”. When a user launches a task on Fargate, the task is assigned an ENI and this ENI is connected to the customer owned VPC. We call this ENI the “Task ENI.”

In addition to standard application traffic, a task has other types of network traffic such as logging, pulling images, and sourcing secrets from AWS Secrets Manager or AWS Systems Manager Parameter Store. Each traffic type has both a network path that controls which ENI is used and an associated IAM policy that needs to be specified as either a Task IAM Role or a Task IAM Execution Role.

Starting with Fargate platform version 1.4.0 we are making changes to two of these traffic flows to allow relevant traffic to stay confined inside the customer VPC. We are not changing the permission model.

This table summarizes the situation and calls out the changes we are making with Fargate platform version 1.4.0:

This change shifts visibility for those traffic flows. For example, previously, the Fargate ENI was used to fetch secrets from Secrets Manager and Systems Manager, so that traffic was outside of your visibility. A number of customers told us that they wanted more control and visibility for these flows. Starting with 1.4.0, that traffic flows through the Task ENI. Task ENIs inherit the networking connectivity patterns you have enabled in your own VPCs. This is important for customers that, for example, want to see that specific traffic in VPC Flow Logs.

In terms of network connectivity, either your VPCs now need to allow outbound traffic to reach the same public endpoints, or you need to configure AWS PrivateLink endpoints for said services so that your Task ENI can reach them from within your VPC. A practical example is that, previously on Fargate, if you were using private links for ECR, you only needed to set up the ecr.dkr endpoint. With platform version 1.4.0, you also need to set up the api.ecr endpoint.
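If you manage many VPCs, a small check like the following sketch can flag the ones that are missing an ECR endpoint before you move tasks to 1.4.0. It assumes the usual com.amazonaws.<region>.<service> naming for interface endpoint service names, under which the ECR API endpoint appears as ecr.api. The function name and the input set are hypothetical; in practice you would build the set from whatever inventory of VPC endpoints you already have.

```python
from typing import Iterable, List


def missing_ecr_endpoints(region: str, configured: Iterable[str]) -> List[str]:
    """Return the ECR endpoint service names that are not configured yet."""
    # Assumed naming convention: com.amazonaws.<region>.<service>
    required = [
        f"com.amazonaws.{region}.ecr.dkr",  # image pulls (needed before 1.4.0 too)
        f"com.amazonaws.{region}.ecr.api",  # ECR API calls (newly needed with 1.4.0)
    ]
    present = set(configured)
    return [name for name in required if name not in present]


# Example: a VPC that was set up for platform version 1.3.0 only.
existing = {"com.amazonaws.us-east-1.ecr.dkr"}
print(missing_ecr_endpoints("us-east-1", existing))
```

This only covers the two ECR endpoints named above. Other flows that now use the Task ENI, such as fetching secrets, need the same kind of review.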

Network performance metrics are available in CloudWatch Container Insights

When we launched Amazon CloudWatch Container Insights in 2019, we announced support for Amazon ECS, Amazon EKS, and AWS Fargate. Up until platform version 1.3.0, Fargate tasks could not report network performance metrics back to Container Insights. This was due to a limitation that exists at the intersection of the awsvpc networking mode and the ECS agent.

With Fargate platform version 1.4.0, we are shipping a revised stack (including the new Fargate agent) that enables tasks to report network performance metrics to Container Insights.

There is nothing that customers need to do other than start their Fargate tasks with this new platform version on an ECS cluster that has been enabled to use Container Insights. You now have full access to CPU, memory, disk, and network metrics for your AWS Fargate tasks provided you are running platform version 1.4.0.
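Because 1.4.0 is not tagged as LATEST yet, you have to ask for it explicitly when you start a task. The sketch below builds the request parameters as a plain dictionary; the cluster name, task definition, subnet, and security group are placeholders, and the helper function itself is only an illustration.

```python
from typing import Iterable


def run_task_request(
    cluster: str,
    task_definition: str,
    subnets: Iterable[str],
    security_groups: Iterable[str],
) -> dict:
    """Build run-task parameters that pin Fargate platform version 1.4.0."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "platformVersion": "1.4.0",  # explicit, since LATEST does not point here yet
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "DISABLED",
            }
        },
    }


request = run_task_request(
    "demo-cluster",  # placeholder cluster name
    "demo-task:1",  # placeholder task definition
    ["subnet-0123456789abcdef0"],  # placeholder subnet ID
    ["sg-0123456789abcdef0"],  # placeholder security group ID
)
print(request["platformVersion"])
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the ECS client's run_task call.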

Fargate tasks now support the CAP_SYS_PTRACE Linux capability

In 2017, we introduced support to add Linux capabilities to ECS Tasks. When we launched AWS Fargate the same year, we decided to disable these options because we wanted to minimize the surface area of attack to offer a secure platform.

We have since received feedback from customers (as well as partners that focus on security and compliance) that they could make good use of some of these capabilities, specifically CAP_SYS_PTRACE. A number of observability tools rely on it to give compliance-focused customers the visibility they need. For example, some of these customers have expressed a need to run tools such as strace.

It is for this reason that, starting with the availability of Fargate platform version 1.4.0, we are allowing customers to enable this specific capability in their Fargate task definitions (across all available Fargate platform versions). Note that CAP_SYS_PTRACE is the only capability that can be added to Fargate tasks at this time. Other capabilities remain supported for tasks running on the EC2 launch type.
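In a task definition, capabilities are added per container under linuxParameters, and the value is written without the CAP_ prefix. The following sketch shows one way to add the capability to an existing container definition held as a Python dictionary. The helper name and the sample container are made up for this example.

```python
import copy


def with_ptrace(container_definition: dict) -> dict:
    """Return a copy of a container definition with SYS_PTRACE added."""
    updated = copy.deepcopy(container_definition)  # leave the input untouched
    capabilities = updated.setdefault("linuxParameters", {}).setdefault(
        "capabilities", {}
    )
    added = capabilities.setdefault("add", [])
    if "SYS_PTRACE" not in added:
        added.append("SYS_PTRACE")
    return updated


# Hypothetical container definition used only for illustration.
container = {"name": "debug-tools", "image": "example/strace:latest"}
print(with_ptrace(container)["linuxParameters"])
```

Working on a copy keeps the original definition reusable for tasks that should not carry the extra capability.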

This has already created a lot of excitement in the partner community. For example, Sysdig is already taking advantage of this feature in their Falco product.

Network stats are now available in Fargate via the new task metadata v4

Starting with Fargate platform version 1.4.0, querying the metadata service inside the task returns network metadata as well as network stats for the task itself. You can get this information by querying the newly introduced task metadata endpoint version 4.

Up until platform version 1.3.0, network metadata and network stats were not available via the metadata service. Now Fargate customers have an easy way to access this information. You can query the metadata endpoint v4 to extract stats, including network stats, using this command from a container running in a Fargate task:

curl ${ECS_CONTAINER_METADATA_URI_V4}/task/stats

These network statistics include the number of bytes and packets transmitted and received by the task network stack, as well as packet errors. These metrics are similar to what you could source with CloudWatch Container Insights, but their availability through the metadata endpoint now allows third-party tools deployed as sidecars to export them and make them available for further analysis. Last year we could send logs natively from Fargate to Datadog. Now, Datadog has leveraged this new feature launching today to give customers added visibility over the network. Check out this updated blog post for more details about this new integration.
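A sidecar that exports these numbers could look roughly like the sketch below. It assumes the response is a JSON object keyed by container ID, where each value carries a Docker-stats-style networks object with counters such as rx_bytes and tx_bytes per interface. Treat that shape, and the helper names, as assumptions to verify against the documentation for the endpoint.

```python
import json
import os
import urllib.request
from typing import Dict, Optional

COUNTERS = ("rx_bytes", "tx_bytes", "rx_packets", "tx_packets", "rx_errors", "tx_errors")


def fetch_task_stats(timeout: float = 2.0) -> dict:
    """Fetch /task/stats from the v4 endpoint (only works inside a task)."""
    base = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    with urllib.request.urlopen(f"{base}/task/stats", timeout=timeout) as response:
        return json.load(response)


def network_counters(task_stats: Dict[str, Optional[dict]]) -> Dict[str, Dict[str, int]]:
    """Sum the network counters across interfaces for each container."""
    result: Dict[str, Dict[str, int]] = {}
    for container_id, stats in task_stats.items():
        if not stats:  # a container may not have stats yet
            continue
        totals = {key: 0 for key in COUNTERS}
        for interface in (stats.get("networks") or {}).values():
            for key in COUNTERS:
                totals[key] += interface.get(key, 0)
        result[container_id] = totals
    return result
```

Inside a running task you would call network_counters(fetch_task_stats()). The counters are kept per container rather than added together, because containers in a task share a network stack and may report the same interface.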

The Availability Zone (AZ) attribute is now available in the task metadata

Fargate tasks that use platform version 1.4.0 can now retrieve the AZ they are deployed to by querying the task metadata endpoint for all metadata versions.

Up until version 1.3.0 the attribute that specified the AZ was not available in the JSON returned by the metadata query for tasks running on Fargate. Now with the new platform version we have lifted this limitation and you will be able to introspect the AZ in which your Fargate task is running.
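As an illustration, the sketch below reads the AZ from the task-level metadata document using the v4 endpoint. It assumes the attribute is exposed under the AvailabilityZone key of that document; the helper names are made up for this example.

```python
import json
import os
import urllib.request
from typing import Optional


def fetch_task_metadata(timeout: float = 2.0) -> dict:
    """Fetch the task metadata document (only works inside a task)."""
    base = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    with urllib.request.urlopen(f"{base}/task", timeout=timeout) as response:
        return json.load(response)


def availability_zone(task_metadata: dict) -> Optional[str]:
    """Return the AZ from task metadata, or None if it is not reported."""
    # Assumed key name; on platform versions before 1.4.0 it is absent.
    return task_metadata.get("AvailabilityZone")
```

Inside a running task you would call availability_zone(fetch_task_metadata()). Returning None instead of raising keeps the same code usable on older platform versions, where the attribute is not reported.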

Containerd is replacing Docker as the container runtime

This is not a customer consumable feature per se but it’s a technology swap that will allow Fargate to innovate even faster. To date, Docker Engine has been used as a de facto standard container runtime. Over time, Docker Engine has built an impressive stack of capabilities that turned a simple runtime into a platform. However, Fargate provides most of that functionality natively already and it doesn’t need all the bells and whistles that Docker provides. Because of this, the runtime below can remain simple and using Docker Engine was deemed to be unnecessary. This is why we are swapping Docker Engine with Containerd. Interestingly enough, Docker Engine builds on top of Containerd to provide its advanced functionality (which Fargate doesn’t need). So this isn’t technically a swap but more like a reduction of the code footprint, if you will, which means far fewer things to keep secure.

If you want to dive deeper into this change, please refer to this blog post.

The Fargate agent is replacing the ECS Agent

Up until platform version 1.3.0, the agent that Fargate was using was the standard ECS agent. With the availability of the 1.4.0 platform version, we are introducing a new agent that’s purpose-built for the Fargate environment and will allow us to drive innovation faster for customers. The combination of containerd and this new agent will also enable the new architecture based on Firecracker.

If you are interested in understanding more about how we are making Fargate evolve under the covers, please watch this deep dive from the Fargate engineering team.

Some error messages are changing

Due to the changes in the container runtime and in the agent running inside the VM, we are making changes specific to some of the exit reasons for containers and tasks. For example, because we no longer run a Docker runtime, a DockerTimeoutError error message doesn’t make sense and we are replacing it with ContainerRuntimeTimeoutError. For the complete list of Fargate error messages, refer to the documentation.
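If you have automation that matches on these messages, it may need to recognize both names while tasks on older platform versions are still running. The following sketch shows one way to do that for the timeout error mentioned above. The function name and the sample reason strings are hypothetical, and only this one renamed error is covered.

```python
# Old and new names for the same condition, per the example in this post.
RUNTIME_TIMEOUT_ERRORS = ("DockerTimeoutError", "ContainerRuntimeTimeoutError")


def is_runtime_timeout(stop_reason: str) -> bool:
    """Return True if a stop reason mentions either timeout error name."""
    return any(name in stop_reason for name in RUNTIME_TIMEOUT_ERRORS)


# Hypothetical reason strings, for illustration only.
print(is_runtime_timeout("DockerTimeoutError: timed out waiting for container"))
print(is_runtime_timeout("ContainerRuntimeTimeoutError: timed out waiting for container"))
```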

Task ENIs support jumbo frames for improved networking efficiency

Network interfaces are configured with a maximum transmission unit (MTU), which is the size of the largest payload that fits in a single frame. The larger the MTU, the more application payload can fit in a single frame, reducing per-frame overhead and increasing efficiency.

Previously, task ENIs would send and receive traffic with the standard Ethernet frame size of 1500 bytes. Starting with platform version 1.4.0, task ENIs support jumbo frames like all other VPC ENIs. This increases efficiency and reduces compute overhead whenever the network path between source and destination supports jumbo frames, such as all traffic that remains within the VPC.
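To get a feel for the difference, here is a back-of-the-envelope sketch. It assumes a jumbo MTU of 9001 bytes and 40 bytes of IPv4 and TCP headers per packet with no options. Both numbers are assumptions for the arithmetic, not figures from this launch.

```python
import math

HEADER_BYTES = 40  # assumed: 20 bytes IPv4 + 20 bytes TCP, no options
STANDARD_MTU = 1500
JUMBO_MTU = 9001  # assumed jumbo frame MTU


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Number of packets needed to carry a payload at a given MTU."""
    per_packet = mtu - HEADER_BYTES  # application bytes per packet
    return math.ceil(payload_bytes / per_packet)


def header_share(mtu: int) -> float:
    """Fraction of each full-size packet spent on headers."""
    return HEADER_BYTES / mtu


for mtu in (STANDARD_MTU, JUMBO_MTU):
    print(mtu, packets_needed(1_000_000, mtu), f"{header_share(mtu):.2%}")
```

Under these assumptions, moving one million bytes takes 685 packets at an MTU of 1500 and 112 packets at an MTU of 9001, so the sender and receiver handle roughly one sixth as many packets.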

Conclusions

In this blog post, we have summarized the new features and changes we have introduced with Fargate platform version 1.4.0.

As always, we are eager to hear what you think about what we build as well as what you want us to do next. If you are a Fargate customer and would like to submit a Fargate feature request, please do so on our public container roadmap. If you are new to AWS Fargate, this is a good starting point.
