AWS for Industries

Develop and deploy a customized workflow using Autonomous Driving Data Framework (ADDF) on AWS

Autonomous vehicles (AVs) must be driven hundreds of millions of miles – and sometimes hundreds of billions of miles – to demonstrate their reliability in terms of fatalities and injuries. Alternative methods are needed to supplement real-world testing, including virtual testing and simulation, mathematical modeling and analysis, and scenario and behavior testing. In an AWS re:Invent session, we learned how BMW Group collects over 1 billion kilometers of anonymized perception data from its worldwide connected fleet of customer vehicles to develop safe and capable automated driving systems.

To support our automotive customers in addressing these pain points, we first created a reference architecture for an advanced driver-assistance systems (ADAS) data lake, described in this AWS Architecture blog post. We also developed a blog series with GitHub repositories covering the key aspects.

The Autonomous Driving Data Framework (ADDF) now industrializes the reference solution and offers pre-built sample data, centralized data storage, data processing pipelines, visualization mechanisms, a search interface, simulation workloads, analytics interfaces, and prebuilt dashboards. The goal is to process and provide searchable, high-accuracy, labeled, scenario-based data for downstream workloads, including model training, synthetic data generation, and simulation.

Use Case and Solution Overview

The first release of ADDF covers the following four use cases (Figure 1):

  1. Scene Detection and Search: After data ingestion, metadata will be extracted from each ingested file and a scene detection pipeline determines scenes of interest, such as person-in-lane scenarios. The detected scene metadata is stored in Amazon DynamoDB and made available through Amazon OpenSearch Service, which enables users to find and locate relevant input data in the data lake based on scene metadata.
  2. Data Visualization: ADDF provides a Webviz-based data visualization module that can stream ROS bag files from the data lake and visualize them in a browser. The visualization module supports streaming of specific scenes detected in the previous step and enables users to verify or debug ROS bag files.
  3. Simulation: With ADDF, users can run their containerized workloads at scale on the ingested data. A simulation module provides high-level workflow orchestration using Amazon Managed Workflows for Apache Airflow (Amazon MWAA), which delegates compute-intensive simulation tasks to dedicated services optimized for scalable parallel processing, such as AWS Batch or the managed Kubernetes service, Amazon Elastic Kubernetes Service (Amazon EKS).
  4. Develop and deploy: Bootstrapping, development, and deployment of modules 1–3 are enabled through the use of AWS open-source projects CodeSeeder and SeedFarmer. CodeSeeder utilizes AWS CodeBuild to remotely deploy individual modules. This enables modules to be developed using common infrastructure as code and deployment mechanisms like AWS Cloud Development Kit (AWS CDK), AWS CloudFormation, Terraform, and others. SeedFarmer utilizes declarative manifests to define an ADDF deployment and orchestrates module deployment, destruction, change detection, and state management. SeedFarmer enables automated GitOps management of ADDF deployments.

Figure 1: ADDF use cases

This solution architecture (Figure 2) has six key components:

  1. User interfaces for code development (AWS Cloud9), KPI reporting (Amazon QuickSight), a web application for scenario search and visualization, deployment (SeedFarmer CLI), and modeling (Jupyter Notebook).
  2. Three pre-built workflows: Scene Detection and Search, ROS bag file visualization, and Simulation with Amazon EKS. Three additional workflows are on the roadmap: Model Training, Automatic Labeling, and KPI Calculation.
  3. The orchestration service is Amazon MWAA with a flexible compute backend (AWS Batch, Amazon EKS, and Amazon EMR).
  4. Metadata storage includes the AWS Glue Data Catalog for drive data, Amazon Neptune for file and data lineage, Amazon DynamoDB for drive metadata, and Amazon OpenSearch Service for OpenSCENARIO search.
  5. Amazon Simple Storage Service (Amazon S3) stores the raw data, and Amazon Redshift stores the numeric sensor data.
  6. CI/CD automation leverages the AWS CDK, AWS CodeBuild, and CodeSeeder.

Figure 2: ADDF solution overview

Deploying a Non-production ADDF Environment with a Demo Notebook

Prerequisites

To simplify deployment and reduce the number of dependencies and prerequisites, the ADDF makes use of two AWS open-source projects: CodeSeeder to enable remote execution of Python code in AWS CodeBuild and SeedFarmer to orchestrate the deployment of ADDF modules by CodeSeeder.

By utilizing CodeSeeder and SeedFarmer, we are able to reduce the local prerequisites to the tools used in the walkthrough below: git, Python 3 (with pip and venv), valid AWS credentials with a default Region, the AWS CDK CLI, and the jq utility used by the helper scripts.

Deployment

Step 1: Clone the ADDF repository from GitHub. We recommend checking out the most recent official release branch. Since ADDF is intended to be managed by automated CI/CD processes tied to a customer's own git repositories, we also suggest setting our GitHub repository as the upstream remote.

git clone --origin upstream --branch release/0.1.0 \
  https://github.com/awslabs/autonomous-driving-data-framework

Step 2: Create a Python Virtual Environment and install the dependencies.

cd autonomous-driving-data-framework
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

Ensure you have valid AWS credentials, set a default Region, and bootstrap the CDK. Bootstrapping of the CDK is only required if the AWS CDK has not previously been used in the Account/Region.

export AWS_DEFAULT_REGION=<<REGION>>
cdk bootstrap aws://<<ACCOUNT_NUMBER>>/<<REGION>>

Replace ACCOUNT_NUMBER and REGION in the above command with your own values before bootstrapping.
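
If you are unsure which account your current credentials point to, one way to look it up is with the AWS CLI; this is a small sketch, and the Region shown is only an example:

# Print the account ID associated with the current credentials
aws sts get-caller-identity --query Account --output text

# Example: bootstrap with the printed account ID and your chosen Region
cdk bootstrap aws://123456789012/us-west-2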

Step 3: Set up the secrets in AWS Secrets Manager to be used by ADDF; these are for the following:

  • JupyterHub
  • VS Code
  • OpenSearch Proxy
  • Docker Hub (optional)

We provide scripts for easy setup (the first script uses the jq tool):

source ./scripts/setup-secrets-example.sh  # This sets up the first three credentials.
./scripts/setup-secrets-dockerhub.sh  # This will prompt for username and password.
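
To double-check that the secrets were created, you can list them with the AWS CLI; a minimal sketch (the exact secret names, such as jh-credentials, are defined by the helper scripts):

# List the names of the secrets stored in this account and Region
aws secretsmanager list-secrets --query "SecretList[].Name" --output text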

Step 4: Select the modules and start the deployment. ADDF consists of different modules, and you select which modules to turn on in a so-called manifest. We provide a few sample manifests in the ./manifests folder. We recommend creating environment-specific copies of the sample manifests.

cp -R manifests/example-dev manifests/demo
sed -i .bak "s/example-dev/demo/g" manifests/demo/deployment.yaml

Note that the sed -i .bak syntax above is the BSD/macOS form; on GNU/Linux, use sed -i.bak (without the space). We will use the demo folder for the walkthrough, with the deployment.yaml manifest as the driver for deploying modules. By adapting the manifest, you can select which modules to include in your ADDF deployment. For more detailed configuration options, refer to the documentation in the repository. For the sake of simplicity, we use the provided manifest and deploy it:

seedfarmer apply ./manifests/demo/deployment.yaml

You can pass an optional --debug flag to the above command to get debug-level output, or an optional --dry-run flag to preview the action plan before the modules are actually deployed.
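
For example, a common pattern is to preview the action plan first and then deploy with verbose logging:

# Preview the action plan without deploying anything
seedfarmer apply ./manifests/demo/deployment.yaml --dry-run

# Deploy with debug-level output
seedfarmer apply ./manifests/demo/deployment.yaml --debug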

When the previous command completes successfully, you will see the ADDF modules deployed as shown in Figure 3.

Figure 3: Overview of deployed modules
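
For CDK- and CloudFormation-based modules, you can also spot-check the deployment from the command line by listing the stacks that carry the deployment prefix; this is a sketch and assumes the addf-demo naming prefix used elsewhere in this walkthrough:

# List successfully deployed CloudFormation stacks belonging to the demo deployment
aws cloudformation list-stacks \
  --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE \
  --query "StackSummaries[?starts_with(StackName, 'addf-demo')].StackName" \
  --output table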

Demo Use Case

Log into JupyterHub

After the successful deployment, you can access the JupyterHub module deployed on Amazon EKS. Follow the instructions below to access the JupyterHub dashboard.

JupyterHub is recommended only for demo workloads; it is not recommended for production-grade interaction with data or services involving production data. We recommend Amazon EMR Studio for any non-demo workloads.

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
  2. On the navigation bar, choose the Region in which you deployed ADDF to retrieve the DNS name of the load balancer created for JupyterHub (alternatively, see the CLI sketch after this list).
  3. Select the load balancer whose name starts with k8s-jupyter, as shown below:
    k8s-jupyterh-jupyterh-XXXXXXXXX.us-west-2.elb.amazonaws.com
  4. Copy the DNS name of the load balancer and append /jupyter to it, so that it looks like the following:
    k8s-jupyterh-jupyterh-XXXXXXXXX.us-west-2.elb.amazonaws.com/jupyter
  5. You will then be prompted for the JupyterHub username and password that were created earlier by the helper script setup-secrets-example.sh and stored in AWS Secrets Manager. To retrieve them, open the AWS Secrets Manager console, search for jh-credentials, and view the secret value.
  6. Once you have logged into the JupyterHub environment, you can create a sample notebook.
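
If you prefer the command line, the following sketch shows one way to look up the load balancer DNS name and retrieve the JupyterHub credentials; it assumes the load balancer name starts with k8s-jupyterh and that the secret is named jh-credentials, as created by the helper script:

# Find the DNS name of the JupyterHub load balancer
aws elbv2 describe-load-balancers \
  --query "LoadBalancers[?starts_with(LoadBalancerName, 'k8s-jupyterh')].DNSName" \
  --output text

# Retrieve the JupyterHub credentials stored by setup-secrets-example.sh
aws secretsmanager get-secret-value --secret-id jh-credentials \
  --query SecretString --output text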

Run the Scene-Detection Pipeline

  1. Download the two publicly available sample ROS bag files (file 1 and file 2) and copy them into the raw bucket deployed by the datalake-buckets module. You can identify the bucket by the naming pattern addf-demo-raw-bucket-<<hash>> and copy the files to the prefix rosbag-scene-detection (a CLI sketch for this upload follows at the end of this section).
  2. Once the files are uploaded, the scene-detection module, driven by AWS Step Functions, is triggered, and the module-specific DynamoDB tables Rosbag-BagFile-Metadata and Rosbag-Scene-Metadata are populated with the results. Figure 4 shows a successful run of the scene-detection pipeline.

    Figure 4: Scene Detection Pipeline with AWS Step Functions

  3. Before running the commands below inside the JupyterHub notebook, install a few dependencies:

    pip install boto3 pandas awscli

  4. Query the OpenSearch domain to get the list of indices:

    wget -qO- \
      --no-check-certificate \
      "https://vpc-addf-example-dev-core-opens-<<XXXXXXXX>>.<<region>>.es.amazonaws.com/_cat/indices?h=index"

    Replace the OpenSearch domain in the above command with the physical ID deployed in your account and Region.

  5. Select the index (copy its name) that starts with rosbag-metadata-scene-search-<<date>> and construct the below query:

    wget -q \
      --output-document=query_results.json \
      --no-check-certificate \
      "https://vpc-addf-example-dev-core-opens-<<XXXXXXXX>>.<<region>>.es.amazonaws.com/rosbag-metadata-scene-search-<<date>>/_search?pretty=true"

    Replace the OpenSearch domain and the index name in the above command with the physical IDs deployed in your account and Region.

This use case provides an end-to-end scene detection pipeline for ROS bag files: it ingests the ROS bag files from Amazon S3, transforms the topic data into Parquet format, performs scene detection in PySpark on Amazon EMR, and then exposes the scene descriptions via DynamoDB to Amazon OpenSearch Service.
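
As referenced in step 1 above, the upload and a quick check of the resulting metadata can also be done from the command line. The sketch below uses placeholder local file names and the bucket hash placeholder; adjust them to your environment:

# Upload the sample ROS bag files to the raw bucket under the expected prefix
aws s3 cp ./sample1.bag s3://addf-demo-raw-bucket-<<hash>>/rosbag-scene-detection/sample1.bag
aws s3 cp ./sample2.bag s3://addf-demo-raw-bucket-<<hash>>/rosbag-scene-detection/sample2.bag

# Once the Step Functions pipeline has finished, peek at the detected scenes
aws dynamodb scan --table-name Rosbag-Scene-Metadata --max-items 5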

Visualize the ROS bag file

You can query a private REST API powered by Amazon API Gateway to generate the Webviz endpoint, constructing the final URL by appending the query string parameters scene_id and record_id to get the signed URL.

wget -qO- \
  "https://XXXXXX.execute-api.<<region>>.amazonaws.com/get-url?scene_id=small2__2020-11-19-16-21-22_4_PersonInLane_1.6058245149E9&record_id=small2__2020-11-19-16-21-22_4"

Replace the private REST API endpoint in the above command with the physical ID deployed in your account and Region.
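
If you have jq available (it was already used by the secrets helper script), a convenient way to extract the signed URL from the response might be:

# Call the get-url endpoint and pull the signed Webviz URL out of the JSON response
wget -qO- "https://XXXXXX.execute-api.<<region>>.amazonaws.com/get-url?scene_id=<<scene_id>>&record_id=<<record_id>>" | jq -r '.url'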

Then copy the value of the url key from the response body and open it in a browser (Google Chrome is preferred). Once the URL is loaded, custom layouts for Webviz can be imported through JSON configs. The custom layout contains the topic configurations and window layouts specific to our ROS bag format and should be modified according to your ROS bag topics. Follow these steps:

  1. Select Config → Import/Export Layout
  2. Copy and paste the contents of layout.json from modules/visualization/layout.json into the pop-up window and play the content.

You should see a sample detected scene, as in Figure 5:

Figure 5: Example visualization of ROS bag file streamed from Amazon S3

Customize the Pipelines

The code base is segmented for manageability. A deployment comprises groups, which in turn are made up of modules. The modules contain the code, whereas the groups and the deployment provide logical separation. Modules can depend on modules in other groups (for example, a networking module can be reused by other modules), but modules declared within a group cannot depend on other modules in the same group. The groups are deployed in the specified order and destroyed in the reverse order, which natively handles dependencies. Ordering is therefore critical when deciding the list of modules (for example, deploy your networking module before the compute resources that require its VPC). Each module can define input parameters for customization and output parameters to be consumed by other modules, which gives the modules a level of abstraction from one another. In other words, we can modify a deployed module (for example, add new functionality or change an input parameter) without impacting other modules in the same deployment; the SeedFarmer CLI detects the changeset and executes it accordingly without impacting the dependent modules.

The manifests determine the inputs for each module in the project. There is a primary manifest (deployment.yaml in Figure 6) that drives the name of the deployment (also known as the project), the groups and their order in the deployment, and where to find the manifests for each group.

Figure 6: Primary manifest file

Each group manifest defines the modules, where the code is located, and their input parameters in key-value format (this is where you can also reference a module’s output from another group).

Figure 7: Manifest file of a given module

Now that you have a basic understanding, let's explore why this is an advantage. The module rosbag-scene-detection is defined in the manifest manifests/demo/rosbag-modules.yaml (Figure 7), and its code is located at modules/analysis/rosbag-scene-detection. If you want to modify the code after the module has been deployed, for example to add a new data processing step, make the changes and save them. For the changes to take effect and be deployed, rerun the deployment from a terminal:

seedfarmer apply ./manifests/demo/deployment.yaml

SeedFarmer detects that the code base for that module has changed and redeploys it. In the background, SeedFarmer compares the deployment.yaml content with what is already deployed and applies the changes in the order specified in deployment.yaml. Since each module is abstracted from the others, only the modules that have changed are redeployed; the unchanged modules are left alone.
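
Because CodeSeeder executes each module deployment in AWS CodeBuild, you can also follow the redeployment there. The sketch below first previews the change, then lists recent builds; the CodeBuild project name codeseeder-addf is an assumption, so check the CodeBuild console for the actual name in your account:

# Preview which modules SeedFarmer plans to change before applying
seedfarmer apply ./manifests/demo/deployment.yaml --dry-run

# List recent CodeBuild builds for the CodeSeeder project (project name is an assumption)
aws codebuild list-builds-for-project --project-name codeseeder-addf --max-items 5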

Cleanup

To destroy the modules of a given deployment (here named demo), run the command below:

seedfarmer destroy demo

Replace the string demo with your deployment name if you have customized it. You can pass an optional --debug flag to the above command to get debug-level output.
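
For example, to tear down the walkthrough deployment with verbose logging:

seedfarmer destroy demo --debug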

Outlook and Conclusion

ADDF is a ready-to-use, open-source framework for ADAS workloads. First, we described its architecture and the use cases it covers. Second, we showed how to deploy ADDF from scratch to get started quickly. Third, we described how to customize the scene detection pipeline to meet individual needs.

The test strategy blueprint provided by the ASAM Test Specification study group defines the test methods and use cases needed to validate AVs and driving functions safely and reliably. ADDF covers these test methods, including scenario-based testing and fault injection. We are committed to extending and further developing ADDF to meet the demands of our customers: more workflows, multi-tenancy, and synthetic scene generation, to name a few.

OEMs, Tier-N suppliers, and startups can benefit from this open-source solution. We are strong believers in open source and value feedback and contributions from the community.

Junjie Tang

As a Principal Consultant at AWS Professional Services, Junjie leads the design of data-driven solutions for global clients, bringing over 10 years of experience in cloud computing, big data, AI, and IoT. Junjie heads the Autonomous Driving Data Framework (ADDF) open-source project, designed to enable scalable data processing, model training, and simulation for automated driving systems. Junjie is passionate about creating innovative solutions that improve quality of life and the environment.

Chauncy McCaughey

Chauncy McCaughey is a Principal Data Architect at AWS Professional Services. He has a background in software engineering, data and analytics, and consulting. Chauncy is currently focused on the development of complex open-source solutions for AWS customers and the data and analytics community.

Derek Graeber

Derek is a Sr. Engineer focusing on Analytics and AI/ML workloads. As an engineer and architect, he focuses on designing, developing, and delivering open-source software solutions for customers, and dabbles in robotics.

Hendrik Schoeneberg

Hendrik is a Principal Data Architect at AWS ProServe and helps customers with ADAS/AV platforms, large-scale simulation frameworks and virtual engineering workbenches. He is passionate about Big Data and Data Analytics and loves his job for its challenges and the opportunity to work with inspiring customers and colleagues.

Srinivas Reddy Cheruku

Srinivas Reddy is a Senior DevOps Consultant with the Products & Solutions team at AWS ProServe, building open-source data analytics solutions with a specialization in Kubernetes. He is currently focused on building the core platform that enables automotive customers to run their automated driving systems. He loves to travel during his time off.

Tae Won Ha

Tae Won Ha is an Engagement Manager at AWS with a software developer background. He leads AWS ProServe teams in multiple engagements and has been helping customers deliver tailored solutions on AWS to achieve their business goals. In his free time, Tae Won is an active open-source developer.