AWS HPC Blog

Accelerating agent-based simulation for autonomous driving

Driving innovation has always been at the core of AWS, and one of the realms where that innovation is most visible is the burgeoning industry of autonomous vehicles. By accelerating agent-based model (ABM) simulations with high performance computing, AWS is at the forefront of this technological shift, and nowhere is that more evident than in its support of the CARLA RAI Challenge and its commitment to Responsible AI (RAI).

In this post, we’ll tell you about the challenge itself, the tools we’ve leveraged to set it up, and show you how you can participate.

The CARLA RAI challenge and autonomous driving

The CARLA Responsible AI Challenge is an initiative led by the Oxford Robotics Institute, building upon the previous CARLA Challenges by extending the Leaderboard with additional RAI related metrics while leveraging the CARLA open-source autonomous driving simulator.

We’re hoping to stimulate creative minds to build capable artificial intelligence (AI) agents with the ability to navigate diverse and complex driving scenarios. The challenge has a strong emphasis on the principles of Responsible AI to promote safety in AI systems through rigorous assessment of models under robustness, environmental sustainability, and transparency lenses.

The role of AI in autonomous driving has never been more crucial, and the CARLA RAI Challenge serves to highlight this fact. Participants are encouraged to push the limits of technology and create AI agents that can navigate in all possible driving conditions – not just ideal ones. From bustling city traffic to desolate country roads, from bright sunny days to foggy nights – the AI agent should be prepared to handle it all.


Figure 1: The main goal of the challenge remains the assessment of the driving proficiency of autonomous agents in realistic traffic situations, as defined in the leaderboard mechanics. Teams will have to complete 10 routes in 2 weather conditions through 5 repetitions. Source: CARLA Challenge Simulator.

The competition goes beyond AI innovation alone. By challenging participants to ensure their AI systems can adapt to varying traffic, weather, and hardware conditions, it drives robustness into the driving models themselves.

But Responsible AI is important, too. Participants are encouraged to develop transparent or explainable models to facilitate accountability. This allows for a better understanding of AI behavior and, in turn, improvements to it.

Finally, the challenge promotes environmental sustainability by encouraging model development activities that cut down on carbon dioxide (CO2) emissions.

The timeline for the challenge is as follows:

  • The challenge will be opened on April 10th, 2024.
  • The closing date for the challenge is July 9th, 2024.
  • Teams are required to submit their video presentation by July 21st, 2024.

Role of AWS in agent-based simulations

Using AWS for agent-based simulations helps to provide a robust and scalable environment for testing – and refining – the AI agents participating in the challenge. AWS offers expansive computational resources via its cloud-based infrastructure.

AWS operates at a huge scale, making it possible to run the resource-intensive simulations needed for testing the AI agents. Each simulation is a high-fidelity virtual representation of various driving scenarios, involving many variables and needing a lot of processing power. These simulations are not just one-off events, but need to be run repeatedly as the AI agents are continuously tweaked and improved.

But it’s not just about resources: collaboration is critical for innovation. The cloud provides an environment where researchers, developers, and AI enthusiasts can come together to share ideas, learn from each other, and drive forward the development of autonomous driving technology.

This means that AWS is a collaborator and enabler in the world of autonomous driving, making significant contributions to the advancement of agent-based simulations and, by extension, the entire field of autonomous vehicles.

The power of agent-based models in autonomous driving

Agent-based models (ABM) are changing the world of autonomous driving. Their power lies in their ability to replicate real-world driving conditions at a local level, with remarkable accuracy and detail – the perfect environment for testing AI agents.

An agent-based model is a complex system in which each entity (agent) operates independently according to a set of rules. In the context of autonomous driving, these agents might represent other vehicles, pedestrians, or even elements of the environment like traffic lights and road markings. By simulating the actions and interactions of these agents, ABMs can recreate a wide range of driving scenarios – from rush hour in a busy city to a quiet country road at night.
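
To make this concrete, here is a minimal, hypothetical sketch of an agent-based simulation loop in Python. The Vehicle and Pedestrian classes and their rules are illustrative placeholders, not part of CARLA or the challenge code.

```python
import random

class Agent:
    """Base class: every agent advances one time step according to its own rules."""
    def step(self, world):
        raise NotImplementedError

class Vehicle(Agent):
    def __init__(self, position, speed):
        self.position = position
        self.speed = speed

    def step(self, world):
        # Rule: brake towards a stop at a red light, otherwise accelerate up to a cap.
        if world["traffic_light"] == "red":
            self.speed = max(0.0, self.speed - 2.0)
        else:
            self.speed = min(15.0, self.speed + 1.0)
        self.position += self.speed

class Pedestrian(Agent):
    def __init__(self, position):
        self.position = position

    def step(self, world):
        # Rule: random walk along the sidewalk.
        self.position += random.choice([-1, 0, 1])

def run_simulation(agents, steps=100):
    world = {"traffic_light": "green"}
    for t in range(steps):
        # Environment elements (like the light cycle) follow rules of their own.
        world["traffic_light"] = "red" if (t // 30) % 2 else "green"
        for agent in agents:
            agent.step(world)
    return agents

agents = run_simulation([Vehicle(position=0.0, speed=10.0), Pedestrian(position=5.0)])
```

Adding a new agent type, or changing a rule, is just a matter of defining another class or editing a `step` method – which is what makes these models so easy to extend.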

Agent-based models are also flexible. We can easily modify them to include new types of agents, or adjust the behavior of existing ones, allowing researchers to simulate virtually any driving scenario imaginable. This adaptability is key for developing AI systems that must handle whatever they encounter on real roads.

But the true power of ABMs is that they can learn and evolve. As the agents navigate the virtual environment, they learn. Each run provides data that can be used to fine-tune the agent and optimize its algorithm. This iterative process of learning and adaptation is what makes agent-based models such a powerful tool for autonomous driving research.

How the CARLA RAI Challenge runs on AWS

Running multiple simulations simultaneously drastically reduces the time spent testing and refining the agents, clearly a boon for developers, who can iterate faster. That means we need to have a scalable mechanism to grow the resources when we need to do a lot of work.

Once a developer creates and containerizes their agent, they can deploy it on AWS. In Figure 2, you can see that the first step involves using the EvalAI website to submit Docker containers to the Amazon Elastic Container Registry (Amazon ECR), which makes it easy to store, share, and deploy container images.
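
As a rough illustration of that first step, the sketch below uses boto3 and the Docker CLI to authenticate against ECR and push an agent image. The region, repository name, and image tag are hypothetical placeholders, not the challenge's actual values.

```python
import base64
import subprocess
import boto3

REGION = "us-east-1"                  # placeholder region
REPOSITORY = "carla-rai-agents"       # hypothetical ECR repository name
LOCAL_IMAGE = "my-team-agent:latest"  # hypothetical local image tag

ecr = boto3.client("ecr", region_name=REGION)

# Fetch a temporary Docker credential for the registry.
auth = ecr.get_authorization_token()["authorizationData"][0]
username, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
registry = auth["proxyEndpoint"].replace("https://", "")

subprocess.run(["docker", "login", "--username", username, "--password", password, registry], check=True)

# Tag the local agent image for the remote repository and push it.
remote_image = f"{registry}/{REPOSITORY}:latest"
subprocess.run(["docker", "tag", LOCAL_IMAGE, remote_image], check=True)
subprocess.run(["docker", "push", remote_image], check=True)
```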

In step 2, the user submits the simulation configuration data, which EvalAI forwards to an Amazon Simple Queue Service (Amazon SQS) queue. This in turn triggers an Amazon EventBridge event (step 3); EventBridge is a serverless event bus that connects applications with data from diverse sources.
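
Conceptually, that hand-off might look like the following boto3 snippet, with the submission configuration serialized as JSON and placed on the SQS queue. The queue URL, account ID, and payload fields are placeholders for illustration.

```python
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# Hypothetical submission configuration forwarded by EvalAI.
config = {
    "team_id": "team-42",
    "image_uri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/carla-rai-agents:latest",
    "track": "SENSORS",
}

# Placeholder queue URL; the real queue belongs to the challenge infrastructure.
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/carla-rai-submissions",
    MessageBody=json.dumps(config),
)
```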

The event initiated by SQS leads to the execution of an AWS Step Functions pipeline (step 4), which carries out a series of tasks based on the established deployment graph. In step 5, the workflow invokes an AWS Lambda function that stores user information in Amazon DynamoDB, a NoSQL database that offers excellent scalability.
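
Step 5 could look roughly like the Lambda handler below, which writes submission metadata to a DynamoDB table. The table name and item attributes are assumptions for illustration, not the challenge's real schema.

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("carla-rai-submissions")  # hypothetical table name

def lambda_handler(event, context):
    # Persist the submission metadata handed over by the Step Functions workflow.
    table.put_item(
        Item={
            "submission_id": event["submission_id"],
            "team_id": event["team_id"],
            "image_uri": event["image_uri"],
            "status": "QUEUED",
        }
    )
    return {"statusCode": 200, "submission_id": event["submission_id"]}
```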

Step 6 is the simulation itself, which runs on Amazon Elastic Kubernetes Service (Amazon EKS). EKS facilitates the scaling of containerized applications across a cluster, and that elasticity also allows unused resources to be shut down automatically. All the operational logs and simulation results are sent to Amazon CloudWatch for later analysis and diagnostics.
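
One way to picture step 6 is as a Kubernetes Job submitted to the EKS cluster, sketched here with the official Kubernetes Python client. The namespace, job name, image URI, and GPU request are assumptions made for the example.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster
batch = client.BatchV1Api()

container = client.V1Container(
    name="carla-simulation",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/carla-rai-agents:latest",  # placeholder
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
)

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="carla-rai-sim-team-42"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(containers=[container], restart_policy="Never")
        ),
        backoff_limit=1,
    ),
)

# Submit the simulation run; the cluster autoscaler grows or shrinks capacity as jobs come and go.
batch.create_namespaced_job(namespace="carla-rai", body=job)
```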

Finally – at step 7 – another Lambda function stores the simulation results in an Amazon Simple Storage Service (Amazon S3) bucket, which provides durable, scalable storage for the result files. The last step also returns the results to the user through the EvalAI web interface via a REST API.
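
A hypothetical version of that final Lambda function is sketched below: it writes the results object to S3 and returns a payload that EvalAI can surface to the user. The bucket name, key layout, and event fields are illustrative only.

```python
import json
import boto3

s3 = boto3.client("s3")
RESULTS_BUCKET = "carla-rai-results"  # hypothetical bucket name

def lambda_handler(event, context):
    # Write the simulation results produced on EKS to S3, keyed by submission id.
    key = f"results/{event['submission_id']}.json"
    s3.put_object(
        Bucket=RESULTS_BUCKET,
        Key=key,
        Body=json.dumps(event["results"]).encode("utf-8"),
        ContentType="application/json",
    )
    # The returned payload is what flows back to the user through the EvalAI REST API.
    return {"statusCode": 200, "results_key": key}
```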


Figure 2 – The elastic cloud architecture powering the CARLA RAI Challenge. This architecture was introduced in the previous CARLA Challenge. Each stage is designed to scale, which means multiple simulations can run simultaneously. This drastically reduces the time spent on testing and refining the agents.

Challenge breakdown

Here’s a breakdown of the challenge.

  1. Getting Started with CARLA: Participants need to download and set up a specific version of CARLA (0.9.10.1) on their computers. This version is necessary because it matches the environment used in the online servers where the agents are assessed.
  2. Configuration and Setup: After installing CARLA, participants must adjust some configurations to link the additional components specific to the challenge. These include the Leaderboard system, the Scenario Runner, and the routes:
    1. Leaderboard: the control center for the challenge. It runs the autonomous agent created by participants through a series of tests across different routes and traffic conditions. It evaluates how well the agent performs in each scenario.
    2. Scenarios: These are predefined traffic situations that the autonomous agent must navigate through successfully. There are ten types of scenarios, each with different parameters. These simulate real-world traffic situations in the virtual towns available in CARLA.
    3. Routes: These are paths from one point to another that the agent must follow. Routes have start and end points and may specify the weather conditions to simulate along the way. Participants can train their agents using the provided routes, but the routes used for the final evaluation in the challenge are kept secret.
  3. Testing the Agent: Before submitting, we encourage participants to test their agents using a basic setup. This test lets them manually control an agent through a simulation to get a clear understanding of what the agent will face during the challenge (a minimal agent skeleton is sketched after this list).
  4. Final Submission and Evaluation: Once their agent is complete, we’ll ask participants to package it as a Docker image, which will undergo rigorous testing by the Leaderboard on undisclosed routes and scenarios. It’s crucial that the agent can successfully navigate these routes while following traffic laws and handling a variety of traffic situations. We’ll evaluate the performance of the agent in these tasks, focusing on how it copes in degraded driving and traffic conditions. Special metrics in the challenge include robustness against harsh environmental conditions, data drift, sensor failures, and the environmental impact of running the agents.
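
For orientation, here is a minimal agent skeleton in Python, assuming the AutonomousAgent interface from the CARLA Leaderboard repository. The class name, sensor suite, and the trivial "always brake" policy are placeholders; check the Leaderboard documentation for the exact signatures expected in your submission.

```python
# Minimal sketch of a Leaderboard agent; signatures assumed from the CARLA
# Leaderboard's AutonomousAgent base class and subject to change between versions.
import carla
from leaderboard.autoagents.autonomous_agent import AutonomousAgent, Track

def get_entry_point():
    # The Leaderboard uses this to locate the agent class inside your module.
    return "MyRAIAgent"

class MyRAIAgent(AutonomousAgent):
    def setup(self, path_to_conf_file):
        self.track = Track.SENSORS  # sensor-only track, no privileged map access

    def sensors(self):
        # Declare the sensor suite the agent requests from the simulator.
        return [
            {"type": "sensor.camera.rgb", "x": 0.7, "y": 0.0, "z": 1.6,
             "roll": 0.0, "pitch": 0.0, "yaw": 0.0,
             "width": 800, "height": 600, "fov": 100, "id": "Center"},
        ]

    def run_step(self, input_data, timestamp):
        # Called every simulation tick with the latest sensor data.
        # A real agent would run perception and planning here; this one just brakes.
        return carla.VehicleControl(throttle=0.0, steer=0.0, brake=1.0)
```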

What’s next? AWS and autonomous driving

As the landscape of autonomous driving continues to advance, we see AWS advancing with it. Its ongoing dedication to delivering robust, high performance computing solutions, combined with its support for forward-thinking initiatives like the CARLA RAI Challenge, is a promising combination.

That’s because rapid progress in autonomous driving demands powerful, scalable, and reliable computational resources – definitely the cloud’s wheelhouse.

Beyond this, the contribution AWS makes to promoting a culture of collaboration and innovation within the AI and autonomous driving communities is an important driver. The cloud gives researchers, developers, and AI enthusiasts a place to exchange ideas and learn from each other. And all of this collectively pushes the boundaries of what’s possible.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this blog.

Ilan Gleiser

Ilan Gleiser is a Principal Emerging Technologies Specialist on the AWS WWSO Advanced Computing team, focusing on Circular Economy, Agent-Based Simulation, and Climate Risk. He is an Expert Advisor on Digital Technologies for Circular Economy with the United Nations Environment Programme. Ilan’s background is in Quant Finance and Machine Learning.

Lars Kunze

Lars Kunze is a Departmental Lecturer in Robotics in the Oxford Robotics Institute (ORI) and the Department of Engineering Science at the University of Oxford. In the ORI, he leads the Cognitive Robotics Group (CRG). Lars is also the Technical Lead at the Responsible Technology Institute (RTI), an international centre of excellence focused on responsible technology at Oxford University; and a Programme Fellow of the Assuring Autonomy International Programme (AAIP).

Daniel Omeiza

Daniel Omeiza is a postdoctoral researcher at the Oxford Robotics Institute (ORI) where he researches explainability in robotic systems. He is also developing new quantitative metrics for assessing the overall health of autonomous driving agents. He spent the past 5 years as an active machine learning and explainable AI researcher across organisations such as the University of Oxford, Carnegie Mellon University, and IBM Research.

Ross Pivovar

Ross has over 15 years of experience in a combination of numerical and statistical method development for both physics simulations and machine learning. Ross is a Senior Solutions Architect at AWS focusing on development of self-learning digital twins, multi-agent simulations, and physics ML surrogate modeling.