AWS for Industries
Accelerating HiL Testing for AV/ADAS with a Hybrid Cloud Approach – AWS and NetApp
Developing autonomous vehicle (AV) and advanced driver assistance systems (ADAS) requires automotive OEMs and Tier 1 suppliers to collect petabytes of real-world driving data. They typically upload this data to a cloud provider such as AWS, where it is analyzed, used to train AI models, used for Software-in-the-Loop (SiL) testing, and archived. From it, they curate a “golden dataset” to help validate system behavior.
To help ensure their systems function correctly, teams must complement SiL testing in the cloud with validation on the same hardware that is used in the vehicle. This process, known as Hardware-in-the-Loop (HiL) testing, involves the system under test (SUT) – the target hardware running the software to be tested – and the HiL rig, which exposes the SUT to test inputs and measures and monitors its response at the hardware level. HiL testing requires transferring the golden dataset (ranging from a few terabytes to multiple petabytes, depending on the project) from the cloud to on-premises test rigs for playback. For regression testing, the same dataset, or modified versions of it, is downloaded repeatedly (anywhere from daily to monthly) to continuously monitor the performance of the system under development. These repeated downloads can create performance bottlenecks, operational complexity, and high egress costs.
By transferring critical datasets once from AWS to local storage (such as NetApp’s on-premises storage solutions) and then reusing them across multiple simulation runs, users can potentially eliminate redundant downloads, reduce test cycle times and egress fees, and maintain the fidelity of their simulations. This approach can help reduce egress expenses by up to 75% annually while helping increase operational efficiency. NetApp storage offers high throughput and low latency with multi-protocol (e.g., NFS, SMB, S3-compatible) capabilities, making it well-suited for demanding HiL testing workloads in on-premises environments.
Introduction – Challenges in HiL Testing
In day-to-day operations, HiL testing relies on large sensor datasets to simulate real-world driving conditions. Traditionally, these datasets are egressed from AWS to on-premises HiL rigs multiple times per month. Each data transfer can introduce operational overhead, increase test cycle times, and drive up cloud egress costs. By implementing a process that avoids redundant downloads, teams can potentially achieve significant time and cost savings and improve overall operational efficiency.
NetApp Intelligent Infrastructure – Introduction
NetApp provides the enterprise storage foundation that helps power data-intensive workloads. NetApp’s on-premises storage appliances deliver high-performance file, block, and object storage with built-in data management services such as snapshots, replication, and automated tiering. This helps ensure that customers’ large simulation datasets are stored efficiently, accessed with low latency, and protected without adding operational overhead. Customers can access NetApp storage through a NetApp subscription (OpEx) or directly purchase it (CapEx). Using such a setup for HiL testing, engineers can more reliably run latency-sensitive simulations on-premises, while more seamlessly connecting to Amazon S3 for cloud elasticity, analytics, and long-term retention. NetApp’s hybrid cloud/on-premises architecture helps ensure that test data is in the right place, at the right cost, and available across the full development cycle.
The AWS and NetApp HiL Solution – Hybrid Approach
The solution adopts a hybrid cloud model that integrates Amazon S3 with NetApp’s robust and high-performance on-premises enterprise storage. The key elements of the hybrid approach are:
1. More Cost-Effective Data Management: Instead of transferring data for every simulation run, the relevant data is pushed from AWS to the local NetApp storage once. Once the datasets are stored locally, they are available for all subsequent simulation runs, helping eliminate the need for repeated egress charges.
2. Faster Test Cycles: With data locally available, teams can run back-to-back simulations without waiting for large transfers to complete, helping enable more frequent and reliable regression testing. If multiple engineers want to run simulations using the same datasets, they can use a NetApp ONTAP feature called FlexClone to create a separate, space-efficient clone of that dataset for each engineer. Doing so helps provide each engineer with traceability of their work. Because NetApp ONTAP is a pointer-based file system, a clone initially shares its data blocks with the source dataset and consumes additional capacity only for data that is subsequently changed.
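As a rough illustration of the per-engineer cloning idea, the sketch below only builds the JSON request bodies that could be sent to the ONTAP REST API volumes endpoint (`POST /api/storage/volumes`) to create one FlexClone volume per engineer. The volume name `golden_dataset`, the SVM name `svm_hil`, the engineer names, and the clone naming scheme are all hypothetical, and the field layout should be checked against the ONTAP REST API reference for your ONTAP version before use. No request is actually sent.

```python
import json


def flexclone_request(parent_volume: str, svm: str, engineer: str) -> dict:
    """Build a request body describing one FlexClone of `parent_volume`.

    ASSUMPTION: field names follow the ONTAP REST API volume schema
    (clone.is_flexclone, clone.parent_volume.name); verify before use.
    """
    return {
        "name": f"{parent_volume}_clone_{engineer}",  # hypothetical naming scheme
        "svm": {"name": svm},
        "clone": {
            "is_flexclone": True,
            "parent_volume": {"name": parent_volume},
        },
    }


# Hypothetical engineers sharing the same golden dataset volume.
engineers = ["alice", "bob", "carol"]
clone_requests = [
    flexclone_request("golden_dataset", "svm_hil", name) for name in engineers
]

for body in clone_requests:
    # Each body would be sent in its own POST to /api/storage/volumes.
    print(json.dumps(body))
```

Keeping the payload construction separate from the HTTP call makes it easy to review the clone names and parent volume before anything is created on the storage system.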
Architecture and Data Flow
Figure 1: High-level Architecture Showing Data Flow
The architecture begins with a robust AV/ADAS data lake hosted on Amazon S3, chosen for its cost efficiency and high performance in managing large volumes of sensor data. Storing data on Amazon S3 helps enable scaling of SiL testing and helps provide long-term archiving capabilities. Golden datasets for HiL testing are then pushed to an on-premises NetApp storage appliance, which can be deployed as either a multi-protocol NetApp ONTAP platform or as S3-compatible StorageGRID object storage, depending on specific customer requirements. Each option offers distinct advantages.
NetApp ONTAP delivers multi-protocol storage with mature data management features such as snapshots and clones. StorageGRID provides S3-compatible object storage designed for massive scale, high density, high availability, and even geo-distribution of data. In both cases, the NetApp on-premises solution helps ensure reliable storage and the necessary throughput to feed data directly to HiL stations, helping streamline data availability and minimize repetitive egress from the cloud.
Use Case Scenario
To illustrate, consider the following scenario as a testing environment:
Scene Size: 50GB, Number of Scenes: 1,000, Replay Count: 32, HiL Rigs: 8
Month 0: Establishing the Local Cache
Data Transfer: A one-time push of golden data for 8,000 simulation runs (1,000 scenes × 8 rigs) from AWS to the NetApp system, totaling roughly 400TB (8,000 × 50GB).
Outcome: This initial push creates a complete local cache of the data needed to run the tests.
Month 1: Incremental Updates
Data Transfer: Only 150TB of new or updated data is transferred from AWS.
Data Reuse: Approximately 250TB of data from Month 0 is retained and continues to support simulation runs.
Outcome: HiL rigs operate using a combination of new and previously cached data, helping reduce the need for full-scale transfers.
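One way to decide which data belongs in such an incremental update is to compare a manifest of the cloud dataset against a manifest of the local cache and transfer only objects that are new or whose checksum has changed. The following is a minimal sketch of that idea, not the tooling used in the solution; the manifest format, object keys, and checksums are hypothetical.

```python
from typing import Dict, List, Tuple

# A manifest maps an object key to (checksum, size_in_bytes).
Manifest = Dict[str, Tuple[str, int]]

GB = 10**9


def plan_transfer(cloud: Manifest, cache: Manifest) -> Tuple[List[str], int, int]:
    """Return (keys to transfer, bytes to transfer, bytes reused from cache)."""
    to_transfer: List[str] = []
    transfer_bytes = 0
    reused_bytes = 0
    for key, (checksum, size) in sorted(cloud.items()):
        cached = cache.get(key)
        if cached is not None and cached[0] == checksum:
            reused_bytes += size      # identical copy already on-premises
        else:
            to_transfer.append(key)   # new or updated object
            transfer_bytes += size
    return to_transfer, transfer_bytes, reused_bytes


# Hypothetical manifests with three cached scenes and four scenes in the cloud.
cache = {
    "scenes/0001.bag": ("aaa", 50 * GB),
    "scenes/0002.bag": ("bbb", 50 * GB),
    "scenes/0003.bag": ("ccc", 50 * GB),
}
cloud = {
    "scenes/0001.bag": ("aaa", 50 * GB),   # unchanged
    "scenes/0002.bag": ("bbb2", 50 * GB),  # updated in the cloud
    "scenes/0003.bag": ("ccc", 50 * GB),   # unchanged
    "scenes/0004.bag": ("ddd", 50 * GB),   # new scene
}

keys, moved, reused = plan_transfer(cloud, cache)
print("transfer:", keys)
print("egress GB:", moved // GB, "reused GB:", reused // GB)
```

In this toy example, only the updated and the new scene are transferred (100 GB), while the two unchanged scenes (100 GB) are served from the local cache.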
Month 2: Full Local Reuse
Data Transfer: No new egress is required as the local cache covers simulation needs.
Operation: All 8,000 simulation runs are executed using data already stored on-premises.
Outcome: Zero new egress.
Following this schedule throughout the year, this hybrid approach helps achieve egress cost reductions for HiL customers of up to 75% annually.
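The arithmetic behind the three months described above can be sketched as follows. The scenario parameters come from the text; the baseline, in which the full dataset is downloaded again every month, is an assumption made here for comparison and is not specified in the article.

```python
# Scenario parameters from the use case.
SCENE_SIZE_GB = 50
NUM_SCENES = 1_000
HIL_RIGS = 8

# Size of the one-time Month 0 push, in TB (decimal units).
full_dataset_tb = SCENE_SIZE_GB * NUM_SCENES * HIL_RIGS / 1_000

# Egress per month with the hybrid approach: Month 0, Month 1, Month 2.
hybrid_egress_tb = [full_dataset_tb, 150, 0]

# ASSUMPTION: without a local cache, the full dataset is re-downloaded monthly.
baseline_egress_tb = [full_dataset_tb] * len(hybrid_egress_tb)


def egress_reduction(baseline, hybrid):
    """Fraction of egress volume avoided relative to the baseline."""
    return 1 - sum(hybrid) / sum(baseline)


saving = egress_reduction(baseline_egress_tb, hybrid_egress_tb)
print(f"Month 0 push:    {full_dataset_tb:.0f} TB")
print(f"Hybrid egress:   {sum(hybrid_egress_tb):.0f} TB")
print(f"Baseline egress: {sum(baseline_egress_tb):.0f} TB")
print(f"Reduction:       {saving:.1%}")
```

Under this baseline, the three months shown amount to 550 TB of egress instead of 1,200 TB. The annual figure depends on how often the golden dataset changes over the remaining months and on how frequently it would otherwise be downloaded, so the lists above should be replaced with a project's own schedule.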
Flexible Consumption Models
NetApp’s deployment options help to further enhance the flexibility of this approach. Depending on the project’s purchasing strategy, customers can choose between:
- CapEx Model: Investing upfront in on-premises hardware; or
- OpEx Model: Subscribing through NetApp Keystone, which aligns costs with usage through a pay-as-you-go model.
This flexibility allows customers to tailor their infrastructure spend to their specific testing needs.
Conclusion
A hybrid cloud/on-premises storage strategy helps customers support development, SiL testing at scale, and archival on Amazon S3 while using on-premises storage for HiL testing.
By rethinking your data transfer strategy and using a hybrid cloud architecture, you can transform your HiL testing operations to become faster, simpler, and more cost-efficient. Transferring critical datasets from AWS to local NetApp storage once and then reusing that data across multiple simulation runs can help simplify your workflow and deliver significant cost savings. This approach offers a more practical and more cost-effective solution that helps keep focus on rigorous testing and rapid development cycles while helping reduce overall data and infrastructure costs.
We encourage you to reach out to us via email (sjc25-proto-lab-admins@amazon.com) if you would like to see this system in action at our automotive lab in Santa Clara, in Silicon Valley.
