AWS Storage Blog

How Continental uses Mountpoint for Amazon S3 in autonomous driving development – accelerating simulation performance by 20%

Continental and AWS have been collaborating to create the Continental Automotive Edge (CAEdge) framework – a modular hardware and software environment that connects the vehicle to the cloud. The platform features virtual workbenches that offer numerous options to develop, supply, and maintain software-intensive system functions. It supports a wide range of automotive software development use cases, one of which is the development and validation of Advanced Driver-Assistance Systems/Autonomous Vehicle (ADAS/AV) functions.

The validation of ADAS/AV vehicle function development requires large sets of input data stored in Amazon Simple Storage Service (Amazon S3) to be re-simulated in compute workloads running on AWS Batch or Amazon Elastic Kubernetes Service (Amazon EKS). A validation run replays recordings that were collected in test vehicles to the system under test and evaluates the outputs. For example, an emergency brake assist function developed by Continental would be validated on many hundreds of thousands to millions of miles recorded in test vehicles.

Mountpoint for Amazon S3 is a new open source file client that makes it easy to mount an S3 bucket on your compute instance and access it as a local file system. It translates local file system API calls to REST API calls on S3 objects. Using Mountpoint for Amazon S3, you can achieve high single-instance throughput to finish jobs faster, saving on compute costs. You can reliably read large datasets across several instances, making it an ideal choice to process petabytes of data in ADAS/AV use cases. Mountpoint for Amazon S3 supports sequential and random read operations on existing files, and sequential write operations for creating new files.
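
To illustrate the file interface, the following minimal example mounts a bucket and reads objects through standard file system tools. The bucket name and file paths are placeholders for this sketch, not Continental's actual setup.

mkdir -p /mnt/recordings
mount-s3 DOC-EXAMPLE-BUCKET /mnt/recordings

# objects now appear as regular files under the mount point
ls /mnt/recordings/
md5sum /mnt/recordings/drive-001/front_camera.mf4

# unmount when finished
umount /mnt/recordings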

In this post, we explore how Mountpoint for Amazon S3 improves Continental’s existing large-scale simulation and Amazon S3 integration architecture, raising overall I/O performance for simulation workloads and reducing operational complexity by removing the need for external data movement orchestration. Furthermore, the architecture’s cost efficiency is improved, as Mountpoint for Amazon S3 allows Continental to move away from larger Amazon Elastic Compute Cloud (Amazon EC2) instance families (specifically configured for high I/O throughput) and switch to general purpose EC2 instance families instead. In summary, the new architecture enabled by Mountpoint for Amazon S3 delivers 20% higher simulation performance and 30% higher cost efficiency.

Improvement of simulation architecture

In a previous post, we explored architecture options for an efficient integration of Amazon S3 with compute workloads on AWS Batch. The proposed approach, while efficient, requires high-level orchestration to move data from Amazon S3 to the compute environment before processing, store validation results, and clean up data after the validation run. To ensure high I/O performance, workloads are executed on the M5d instance family using a RAID0 striping setup, which requires a customized setup procedure on the underlying host during instance launch.

Even though the architecture provides performance benefits, the operational complexity and costs of the approach are higher compared to general purpose instance families and direct Amazon S3 integration. Furthermore, with average object sizes of 100 GB, the data movement operations add an average of 15 minutes per simulation run.

With the introduction of Mountpoint for Amazon S3, we can revise the proposed architecture to directly interface with data on Amazon S3 without having to move data to the simulation workloads. This architecture removes the need for data movement orchestration and for the underlying instance to have fast local storage (RAID0 drive setup), thereby enabling the usage of general purpose EC2 instance families. With this, the overall architecture is more cost efficient and performant as compared to the previous approach.

In the following sections, we explore the solution in detail.


Figure 1: Solution overview for integration with Mountpoint for Amazon S3

The proposed solution in the preceding figure has three key components:

  1. Data storage and access management: All raw sensor data, including Robot Operating System (ROS) bag and MDF4 formats, is stored in the Amazon S3 Intelligent-Tiering storage class. Extracted metadata is stored in Amazon DynamoDB, and AWS Lake Formation provides fine-grained access control at row-level or column-level to AD function developers and data analysts (a sketch of querying this metadata follows the list).
  2. Simulation control plane and scalable compute backend: This post illustrates how the CAEdge framework orchestrates ADAS simulation workloads with Amazon Managed Workflows for Apache Airflow (Amazon MWAA) and AWS Batch for elastic, highly scalable, and customizable compute needs. An alternative approach with Amazon MWAA and Amazon EKS is implemented on top of the modules of the Industry Data Framework (IDF), which provide the core infrastructure for the Autonomous Driving Data Framework (ADDF). Both IDF and ADDF are open source projects.
  3. Cost dashboard with Amazon QuickSight and usage data from AWS Cost and Usage Reports (CUR): The CUR contains the most comprehensive set of cost and usage data available. These reports break down your costs by the hour, day, or month, by product or product resource, or by tags that you define. This information can be visualized in Amazon QuickSight, which also offers generative BI capabilities in QuickSight Q through Amazon Bedrock.
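
As a brief illustration of component 1, recordings for a validation run could be selected from the DynamoDB metadata store as shown below. The table name, key schema, and attribute names are assumptions for this sketch, not Continental’s actual data model.

import boto3
from boto3.dynamodb.conditions import Key

# hypothetical metadata table and key schema, for illustration only
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("recording-metadata")

# query all recordings captured by a given test vehicle (assumed partition key)
response = table.query(
    KeyConditionExpression=Key("vehicle_id").eq("test-vehicle-042")
)

# collect the S3 locations of the matching recordings for the simulation run
recording_uris = [item["s3_uri"] for item in response["Items"]]
print(f"Selected {len(recording_uris)} recordings for re-simulation")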

AWS Batch integration in Continental’s CAEdge framework

To validate ADAS/AV vehicle functions on Continental’s CAEdge framework, we must run thousands of containers with fine-grained control over the underlying compute instance’s configuration. Furthermore, we must provide the applications access to objects in Amazon S3 through a file interface.

In production, typical validation workloads process approximately 3,000 to 5,000 recordings in Amazon S3 in parallel, with multiple Docker containers per recording running on AWS Batch on Amazon EC2 and average recording sizes of 100 GB. In this scenario, a total of up to 500 TB of input data in Amazon S3 must be processed in parallel for a single validation job. To achieve this efficiently, we rely on AWS Batch’s array job feature to run highly parallel jobs, such as simulations, parametric sweeps, or large rendering jobs.
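
Within an array job, AWS Batch injects each child job’s index through the AWS_BATCH_JOB_ARRAY_INDEX environment variable. The following sketch shows how a simulation container could use this index to pick its recording; the manifest file and its format are illustrative assumptions, not Continental’s actual implementation.

import json
import os

# AWS Batch sets this variable for each child job of an array job (0 .. size-1)
index = int(os.environ["AWS_BATCH_JOB_ARRAY_INDEX"])

# hypothetical manifest listing the S3 prefixes of all recordings in the validation run
with open("/config/recordings_manifest.json") as f:
    recordings = json.load(f)

# each child job processes exactly one recording
recording_prefix = recordings[index]
print(f"Child job {index} re-simulates recording at {recording_prefix}")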

From our simulation containers on AWS Batch, we use Mountpoint for Amazon S3 to integrate with Amazon S3.

Prerequisites

To use Mountpoint for Amazon S3 from a Docker container, we follow the installation instructions in the Mountpoint for Amazon S3 GitHub README. Depending on your Amazon S3 access patterns, you can choose between two variants.

If your simulation containers must access dynamic locations on Amazon S3, then you can modify the Docker container and mount dynamic Amazon S3 locations for each container individually (see variant 1 in the following section). Alternatively, if all your Docker containers on AWS Batch must access the same location or prefix on Amazon S3, then you can modify the underlying host on Amazon EC2 to mount a static Amazon S3 location, which the containers on AWS Batch can access through the file system (see variant 2 in the following section).

Variant 1 – Access dynamic Amazon S3 locations from Docker containers on AWS Batch

For containers to access dynamic locations on Amazon S3, we must add the Mountpoint for Amazon S3 client installation to the Dockerfile. At runtime, we can pass an environment variable containing the target Amazon S3 location to the container, mount the location, and then access the bucket’s content through the file system. An advantage of this approach is the ability to dynamically mount Amazon S3 locations for individual containers. However, a disadvantage is that the container must be modified to include the Mountpoint for Amazon S3 client and must run in privileged mode.

The following is an example Dockerfile snippet for DEB-based distributions (Debian, Ubuntu). Additional examples for other distributions can be found in this GitHub doc.

FROM debian:bullseye-slim

RUN apt-get update && apt-get install -y \
    fuse \
    libfuse2 \
    wget \
 && rm -rf /var/lib/apt/lists/* \
 # download the Mountpoint for Amazon S3 package for x86_64
 && wget https://s3.amazonaws.com/mountpoint-s3-release/x86_64/latest/mount-s3.deb \
 # download the Mountpoint for Amazon S3 package for ARM64 instead
 # && wget https://s3.amazonaws.com/mountpoint-s3-release/arm64/latest/mount-s3.deb \
 && apt-get install -y ./mount-s3.deb \
 # remove the downloaded package to keep the image small
 && rm ./mount-s3.deb \
 && mount-s3 --version

ENV S3_LOCATION=none

Using the preceding Dockerfile, build and upload a Docker image to Amazon Elastic Container Registry (Amazon ECR). Create an AWS Batch job definition referencing the Docker image and enable privileged mode in the Linux and logging settings, as shown in the following figure:

Figure 2: Enabling privileged mode in the Linux and logging settings of the AWS Batch job definition
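
If you prefer to script this step instead of using the console, the job definition can also be registered with boto3. The image URI, job definition name, log group, and resource values below are placeholders for this sketch.

import boto3

batch = boto3.client("batch")

# register a job definition with privileged mode and awslogs logging enabled
response = batch.register_job_definition(
    jobDefinitionName="my_job_definition",
    type="container",
    containerProperties={
        "image": "123456789012.dkr.ecr.eu-central-1.amazonaws.com/simulation:latest",
        "resourceRequirements": [
            {"type": "VCPU", "value": "4"},
            {"type": "MEMORY", "value": "8192"},
        ],
        "privileged": True,  # required for FUSE mounts inside the container
        "environment": [{"name": "S3_LOCATION", "value": "DOC-EXAMPLE-BUCKET"}],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {"awslogs-group": "/aws/batch/simulation"},
        },
    },
)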

With the following Python snippet, you can now submit AWS Batch jobs referencing the new job definition with environment variable overrides using boto3:

job_name = "my_job"
    job_definition = "my_job_definition"
    job_queue = "my_job_queue"
    
    env_variables = [
        {"name": "S3_LOCATION", "value": "DOC-EXAMPLE-BUCKET"},
    ]

    array_size = 500

    client = boto3.client("batch")
    response = client.submit_job(
        jobName=job_name,
        arrayProperties={"size": array_size},
        jobQueue=job_queue,
        jobDefinition=job_definition,
        containerOverrides={"environment": env_variables},
        retryStrategy={"attempts": 5},
    )

Then, at runtime, the Docker container can evaluate the environment variable and mount the Amazon S3 location to the folder /mnt:

mount-s3 "$S3_LOCATION" /mnt

With this approach, containerized applications can access dynamic Amazon S3 locations from /mnt within the container’s file system.
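
A minimal container entrypoint for this variant could look as follows; the simulation command is a placeholder for your actual workload.

#!/bin/bash
set -euo pipefail

# mount the S3 location passed in through the job's environment variable
mount-s3 "$S3_LOCATION" /mnt

# run the simulation against the mounted recordings (placeholder command)
run-simulation --input /mnt --output /tmp/results

# unmount before the container exits
umount /mnt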

Variant 2 – Access a static Amazon S3 location from Docker containers on AWS Batch

If all your Docker containers running on AWS Batch must access the same, static Amazon S3 location, then you can use an Amazon EC2 launch template to provide a custom user data section, in which you can install the Mountpoint for Amazon S3 client according to the preceding installation instructions and mount an Amazon S3 location. The launch template is used to create an AWS Batch compute environment. After that, the mounted file path can be passed to Docker containers in AWS Batch by providing a volume configuration in an AWS Batch job definition. One advantage of this approach is that the Docker containers are unaware of the mounting procedure and don’t need to be modified. However, this comes at the cost of additional operational complexity in the user data section.

The following is an example AWS Cloud Development Kit (AWS CDK) snippet for an EC2 launch template and corresponding AWS Batch compute environment construct:

USR_DATA_STR = """MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

set -x

# install Mountpoint for Amazon S3 client
sudo yum install fuse fuse-devel wget

# download mountpoint x86 binary
wget https://s3.amazonaws.com/mountpoint-s3-release/x86_64/latest/mount-s3.rpm

# download mountpoint ARM64 binary
# wget https://s3.amazonaws.com/mountpoint-s3-release/arm64/latest/mount-s3.rpm

# install binary
sudo yum install -y ./mount-s3.rpm

# mount static S3 location
mkdir /mnt
mount-s3 DOC-EXAMPLE-BUCKET /mnt
"""        
launch_template_data = aws_ec2.CfnLaunchTemplate.LaunchTemplateDataProperty(
    user_data=core.Fn.base64(USR_DATA_STR),
    monitoring=aws_ec2.CfnLaunchTemplate.MonitoringProperty(enabled=True)
)

launch_template = aws_ec2.CfnLaunchTemplate(
    self, "ec2-launch-template", launch_template_data=launch_template_data,
)

launch_template_props = aws_batch.CfnComputeEnvironment.LaunchTemplateSpecificationProperty(
    launch_template_id=launch_template.ref
)

compute_environment = aws_batch.CfnComputeEnvironment(
    self,
    id=id,
    type="MANAGED",
    compute_resources=aws_batch.CfnComputeEnvironment.ComputeResourcesProperty(
        type=compenvtype,
        allocation_strategy="BEST_FIT_PROGRESSIVE",
        ec2_configuration=[aws_batch.CfnComputeEnvironment.Ec2ConfigurationObjectProperty(
            image_type='ECS_AL2',
        )],
        instance_role=self.instance_profile.attr_arn,
        instance_types=instancetype,
        launch_template=launch_template_props,
        maxv_cpus=maxcpu,
        minv_cpus=mincpu,
        subnets=list_subnets,
        security_group_ids=[self.default_sg.security_group_id]
    ),
    state="ENABLED"
)

With the completion of this step, the compute environment creates EC2 instances that install the Mountpoint for Amazon S3 client during launch and mount the S3 bucket DOC-EXAMPLE-BUCKET to the path /mnt. After creating the compute environments, we must pass the mounted folder on the underlying host to Docker containers on AWS Batch. For this we must add a volume configuration to the AWS Batch job definition, as shown in the following figure:

Figure 3: Adding a volume configuration to the AWS Batch job definition

After this, the containerized applications can access the mounted S3 bucket’s content in /mnt.
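
As an alternative to the console configuration in the preceding figure, the volume and mount point can also be declared when registering the job definition with boto3; the image URI, names, and resource values below are placeholders for this sketch.

import boto3

batch = boto3.client("batch")

# pass the host path mounted by Mountpoint for Amazon S3 through to the container
response = batch.register_job_definition(
    jobDefinitionName="my_job_definition_static",
    type="container",
    containerProperties={
        "image": "123456789012.dkr.ecr.eu-central-1.amazonaws.com/simulation:latest",
        "resourceRequirements": [
            {"type": "VCPU", "value": "4"},
            {"type": "MEMORY", "value": "8192"},
        ],
        "volumes": [{"name": "s3-data", "host": {"sourcePath": "/mnt"}}],
        "mountPoints": [
            {"sourceVolume": "s3-data", "containerPath": "/mnt", "readOnly": True}
        ],
    },
)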

Amazon EKS integration as ADDF module

ADDF is a ready-to-use, open source framework for ADAS/AV workloads that offers pre-built sample data, centralized data storage, data processing pipelines, visualization mechanisms, a search interface, simulation workloads, analytics interfaces, and prebuilt dashboards. With ADDF’s Amazon EKS module, users can deploy a fully managed, configurable EKS cluster with pre-installed standard add-ons/plugins. This module avoids the undifferentiated heavy lifting of deploying Amazon EKS infrastructure and identifying and installing standard plugins, and enables users to run their containerized workloads at scale on the ingested data. The ADDF module “simulations/batch-managed” provides high-level workflow orchestration using Amazon MWAA and delegates compute-intensive simulation tasks to AWS Batch, a dedicated service optimized for scalable parallel processing. Another ADDF module, “simulations/k8s-managed”, provides the same workflow orchestration using Amazon MWAA and Amazon EKS.

In the coming months, ADDF will extend the module “simulations/k8s-managed” with additional Helm charts to deploy a container image from Amazon ECR with Mountpoint for Amazon S3 pre-installed, similar to Variant 1 in the preceding section. The container downloads a file from Amazon S3, performs sample processing, and writes the results back to Amazon S3.

Conclusion

In this post, by using Mountpoint for Amazon S3, we improved simulation performance by 20% and made the architecture 30% more cost efficient. These improvements were made possible by directly interfacing with Amazon S3 using Mountpoint for Amazon S3. This new approach removed the need for high-level orchestration for data movement and management. Additionally, with the high throughput performance of Mountpoint for Amazon S3, Continental could switch to general purpose Amazon EC2 instance families, making the architecture more cost efficient.

To improve performance, lower cost for your workloads, and get started with Mountpoint for Amazon S3, visit the documentation page.


Devabrat Kumar

Devabrat Kumar is a Senior Product Manager Technical - External Services on the Amazon S3 team. In his free time, Devabrat likes to fly drones and play cricket.

Hendrik Schoeneberg

Hendrik is a Principal Data Architect at AWS ProServe and helps customers with ADAS/AV platforms, large-scale simulation frameworks and virtual engineering workbenches. He is passionate about Big Data and Data Analytics and loves his job for its challenges and the opportunity to work with inspiring customers and colleagues.

Junjie Tang

Junjie is a Principal Consultant at AWS Professional Services. As a global technical lead, he heads a big data community of AWS Professional Services consultants to develop data strategies and build accumulated expertise in data analytics across verticals.

Srinivas Reddy Cheruku

Srinivas Reddy is a Senior DevOps Consultant working with AWS Customer Engineering at AWS ProServe, building open source data analytics solutions with a specialization in containers (Kubernetes). He is currently focusing on building the core platform that enables automotive customers to run their automated driving systems. He loves to travel during his time off.

The An Binh Nguyen

Binh is Product Owner for Cloud Simulation at Continental, Autonomous Mobility - Engineering Platform. He is passionate about technologies around driver-assistance and automated-driving systems. In his current role, he is responsible for utilizing cloud-based systems to boost simulation at scale, thereby facilitating the development and validation process.