AWS for Industries
How Schaeffler uses generative AI to accelerate automotive software testing
Introduction
The automotive industry continues to face challenges from the mounting complexity of software systems, with requirements encompassing functional needs such as electrification and autonomous driving alongside non-functional criteria such as performance and security. In our previous blog post, we explored how the framework of the Virtual Engineering Workbench (VEW), powered by AWS services and generative artificial intelligence (AI), helps transform automotive software development and testing by helping enable virtualization, improve developer efficiency, enhance collaboration, and automate test case creation.
Building on that foundation, this blog post describes how the VEW concept is being put into practice to help accelerate automotive software testing with Schaeffler, a global leader in motion technology and automotive innovation. AWS and Schaeffler are excited to showcase how their advancements in virtual testing are being applied in real-world scenarios. Schaeffler’s commitment to digitalization and AI-driven solutions aligns with the AWS mission of empowering organizations through cutting-edge cloud technologies. Together, we explore how Schaeffler uses the VEW and generative AI to help streamline its software development processes, enhance product quality, and accelerate innovation in the automotive sector. Schaeffler’s collaboration with AWS highlights the benefits of integrating AI into complex engineering workflows and shows how the stage has been set for further advancements in automotive software engineering.
As described in our previous blog post, Schaeffler uses the VEW to support innovation projects. Following the next steps outlined in that post's conclusion, we now provide an update on how Schaeffler has enhanced and expanded its use of the VEW. The enhancements include support for architectures based on the AUTOSAR Classic Platform, related virtual electronic control units, and bespoke AI-based workflows for additional software engineering use cases.
In a joint study by the AWS and Schaeffler teams to identify the most promising AI-supported workflows, we identified several generative AI use cases, as depicted in the following figure:
- Use case 1 focuses on text-heavy requirement translation and explanation.
- Use case 2 involves rolling out coding assistance for embedded developers.
- Use case 3 focuses on test case generation based on requirements.
- Use case 4 combines AI with a process overview to create a concept we refer to as “intelligent process guidance,” which connects all development steps and gives developers an end-to-end view.
AWS and Schaeffler determined that, of all the identified use cases, use cases 1 and 3 had the highest cost-saving potential and were the best candidates for initial exploration. Today’s automobiles have hundreds of thousands of requirements, and even a single complex electronic control unit may have tens of thousands. According to Schaeffler, the number of test cases needed to validate requirements is typically three to five times the number of requirements, as multiple test cases are needed to cover all scenarios for a given requirement.
In the specific case of Schaeffler, we discovered the following reality:
“Today, testers at Schaeffler face hundreds of thousands of requirements. Those requirements are often in formats that are not optimized for human readability. Overall, this consumes several hours every week for testers.”
—Test Manager at Schaeffler
Current process and challenges
The testing of automotive software is governed by industry standards like ISO 29148 (requirements engineering) and ISO 29119 (software testing). According to those standards, test case generation for automotive software must follow four steps. To understand each of those steps in turn, let’s consider the example of an automotive adaptive cruise control (ACC) system, which automatically adjusts a vehicle’s speed to help maintain safe distances from vehicles ahead:
1. Reviewing requirements – At the start of the process, the functional and non-functional requirements of the software are clearly outlined in accordance with industry standards such as ISO 29148.
a. In the case of the ACC, an example requirement is: “The ACC system shall maintain a minimum following distance of 2 seconds from the vehicle ahead and adjust speed accordingly.” The test engineer should verify that the requirement is clear, testable, and unambiguous; identify edge cases (such as sudden braking or encountering stationary objects); and check compliance with ISO 29148.
2. Structuring features and feature sets – Based on the requirements, relevant features and their groupings are determined so as to create a logical structure for testing.
a. In the case of the ACC, logical feature sets and features could be as follows:
i. Speed control (feature set):
1. Maintain set speed when no vehicle is ahead.
2. Decelerate when a slower vehicle is detected.
3. Return to set speed once the lane is clear.
ii. Distance management (feature set):
1. Maintain a following distance of at least 2 seconds.
2. Adjust the following distance based on traffic and road conditions.
iii. Emergency handling (feature set):
1. Apply automatic braking if a vehicle ahead suddenly slows down.
2. Disable the ACC when the driver applies brakes or accelerates manually.
3. Identifying test conditions – Specific conditions under which each feature should be tested are identified, covering normal operation, edge cases, and failure scenarios.
a. In the case of the ACC, a test engineer might define the following test conditions based on features that need to be tested:
i. Normal following distance maintenance: Given that the vehicle is moving at 100 km/h, when another car appears ahead at 90 km/h, the ACC should reduce speed to maintain a 2-second gap.
ii. Sudden deceleration of lead vehicle: Given that the ACC is engaged, when the lead vehicle suddenly decelerates from 100 km/h to 50 km/h, the ACC should apply appropriate braking to maintain a safe distance.
iii. Cut-in vehicle scenario: Given that the ACC is engaged, when another vehicle cuts in abruptly, the system should detect it and adjust the speed accordingly.
4. Deriving test cases – Finally, detailed test cases are created based on the defined conditions, specifying inputs, expected outputs, and execution steps. Steps 2, 3, and 4 must comply with standards defined in ISO 29119:
a. In the case of the ACC, a test case that specifies inputs, expected outputs, and pass/fail criteria based on test conditions could be as follows (an executable sketch appears after this list):
i. Test case ID: ACC_002
ii. Description: Test ACC response to sudden lead vehicle deceleration
iii. Test steps:
1. Set the ACC to 100 km/h.
2. Simulate a lead vehicle reducing speed to 50 km/h.
3. Observe system behavior.
iv. Expected outcome: The ACC should detect deceleration, gradually apply braking, and maintain a safe following distance.
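To make step 4 concrete, below is a minimal, self-contained sketch of how test case ACC_002 could be expressed as an executable pytest-style test. The AccSimulation class is a deliberately simplified, hypothetical stand-in for a real simulation or hardware-in-the-loop interface; only the test steps and expected outcome mirror the definition above.

# Hypothetical sketch: test case ACC_002 as an executable pytest-style test.
# AccSimulation is a toy stand-in for a real simulation or
# hardware-in-the-loop interface.

class AccSimulation:
    def __init__(self):
        self.ego_speed_kmh = 0.0      # speed of the ACC-equipped vehicle
        self.lead_speed_kmh = None    # speed of the vehicle ahead, if any
        self.gap_seconds = 3.0        # current time gap to the vehicle ahead

    def set_acc_speed(self, kmh):
        self.ego_speed_kmh = float(kmh)

    def set_lead_vehicle_speed(self, kmh):
        self.lead_speed_kmh = float(kmh)

    def step(self, seconds):
        # Extremely simplified controller: brake toward the lead vehicle's
        # speed at a limited rate while keeping at least a 2-second time gap.
        if self.lead_speed_kmh is not None and self.lead_speed_kmh < self.ego_speed_kmh:
            max_braking_kmh = 10.0 * seconds
            self.ego_speed_kmh = max(self.lead_speed_kmh, self.ego_speed_kmh - max_braking_kmh)
            self.gap_seconds = max(self.gap_seconds, 2.0)

def test_acc_002_lead_vehicle_sudden_deceleration():
    """ACC_002: Test ACC response to sudden lead vehicle deceleration."""
    sim = AccSimulation()
    sim.set_acc_speed(100)           # Step 1: set the ACC to 100 km/h
    sim.set_lead_vehicle_speed(50)   # Step 2: lead vehicle slows to 50 km/h
    sim.step(seconds=10.0)           # Step 3: observe system behavior

    # Expected outcome: braking brings the vehicle down to the lead vehicle's
    # speed while maintaining a safe (>= 2-second) following distance.
    assert sim.ego_speed_kmh <= 50.0
    assert sim.gap_seconds >= 2.0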
According to data supplied by Schaeffler, an experienced test engineer spends 820 hours preparing test cases for 837 system requirements (roughly one hour per requirement). When scaled to the more than 100,000 requirements of a software-defined vehicle (SDV), the overall effort becomes enormous, exceeding 100,000 hours. The process is therefore highly manual, time-consuming, and prone to human error, as it relies heavily on domain experts to analyze requirements, identify conditions, and derive test cases. Another challenge is mapping test cases back to requirements: misidentifying a test case can lead to a requirement not being validated correctly.
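As a quick plausibility check on that extrapolation, assuming the measured rate holds:

# Back-of-the-envelope scaling of the manual-effort figures above.
hours_measured = 820           # hours to prepare test cases for 837 requirements
requirements_measured = 837
hours_per_requirement = hours_measured / requirements_measured   # ~0.98 hours

sdv_requirements = 100_000     # order-of-magnitude figure for an SDV
total_hours = sdv_requirements * hours_per_requirement
print(f"~{total_hours:,.0f} hours")
# -> ~97,969 hours; with more than 100,000 requirements, over 100,000 hours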
The generative AI platform
In this section, we introduce the generative AI platform that AWS and Schaeffler developed and used to help with test case generation in the automotive software use case. Schaeffler wanted the option to access multiple state-of-the-art large language models (LLMs) while maintaining responsible use. To achieve this, AWS and Schaeffler used a platform that helps facilitate organization-wide generative AI use by providing authenticated access to a large number of models through Amazon Bedrock, a fully managed service that offers customers a choice of foundation models from leading AI companies, alongside third-party models, while helping monitor cost and usage.
The main component of the platform is a generative AI gateway built on the open-source framework LiteLLM. It is hosted on AWS and gives users access to multiple LLMs through application programming interface (API) keys. The platform generates API keys at the use-case level and tags usage with metadata such as project and cost-center identifiers. Workloads can be managed by setting rate and budget limits per API key, and the gateway also helps load-balance traffic across multiple models in case of timeouts or failures. The test-generation use case consumes models from the gateway through the generative AI platform’s user interface; we then used the API key generated by the platform to invoke the model of our choice with our payload.
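For example, the following is a minimal sketch of how a use-case-scoped key with budget and rate limits and cost-attribution metadata could be issued through LiteLLM’s key-management endpoint. The gateway URL, admin key, model name, limits, and metadata values are illustrative assumptions, not Schaeffler’s actual configuration.

import requests

# Minimal sketch of issuing a use-case-scoped key through LiteLLM's
# /key/generate endpoint. The URL, admin key, model name, limits, and
# metadata below are illustrative assumptions.
GATEWAY_URL = "https://genai-gateway.example.com"
ADMIN_KEY = "sk-admin-placeholder"

response = requests.post(
    f"{GATEWAY_URL}/key/generate",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    json={
        "models": ["anthropic.claude-3-5-sonnet"],      # allowed models
        "max_budget": 500.0,                            # budget limit (USD)
        "rpm_limit": 60,                                # rate limit (requests per minute)
        "metadata": {                                   # cost-attribution tags
            "project": "vew-test-generation",
            "cost_center": "placeholder",
        },
    },
    timeout=30,
)
response.raise_for_status()
api_key = response.json()["key"]   # use-case API key for the workload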
The code snippet below shows how AWS and Schaeffler used Anthropic’s Claude with the gateway’s chat endpoint to help generate features and feature sets, test conditions, and test cases while tracking cost and usage. The platform helps route these requests to workload accounts where the models are invoked to help manage model throughput limits and quotas. The platform also comes with additional controls such as requiring approval for API keys that are to be used in production. While the generation of API keys is a core capability of the platform, it can also use infrastructure-as-code templates to help deploy multiple generative AI blueprints such as test-generation for the automotive software use case.
Figure 2. Generative platform user interface showing details of the deployed API key and metadata identifiers for the test-generation use case
The code for invoking the AI gateway is as follows (headers contain the API key):
response = requests.post(self.__api_url.geturl(), headers=headers, json=request_body, stream=False, timeout=300)
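For readers who want to reproduce the call outside its original class, here is a self-contained version of the same request, assuming the gateway exposes a LiteLLM-style, OpenAI-compatible chat completions endpoint; the URL, model name, and prompt are placeholders.

import requests

# Self-contained sketch of the gateway call above, assuming a LiteLLM-style,
# OpenAI-compatible chat completions endpoint. The URL, model name, and
# prompt are placeholders.
API_URL = "https://genai-gateway.example.com/v1/chat/completions"
API_KEY = "sk-placeholder"   # use-case key issued by the platform

headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
request_body = {
    "model": "anthropic.claude-3-5-sonnet",
    "messages": [
        {"role": "system",
         "content": "You derive features, test conditions, and test cases from automotive requirements."},
        {"role": "user",
         "content": "Requirement: The ACC system shall maintain a minimum following distance of 2 seconds ..."},
    ],
}

response = requests.post(API_URL, headers=headers, json=request_body, stream=False, timeout=300)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])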
Figure 3. Screenshot of cost and usage
If you would like to build a gateway of your own, you can do so by following the AWS Solution Guidance for Multi-Provider Generative AI Gateway on AWS.
Figure 4. Generative AI proxy on AWS solutions architecture
Solution overview and architecture
Having simple and secure access to a generative AI chatbot helps empower Schaeffler employees and others in the field to uncover previously unknown uses for generative AI, which helps them optimize their work. That is also true for the test-generation process. However, once there is a clear use case and employees start using it regularly and at scale, asking the chatbot in natural language to load, process, and produce data in specific formats becomes cumbersome and error-prone. For example, in the test-generation workflow described above, users have to convert the input from a Microsoft Excel spreadsheet to JSON, upload the JSON to the chatbot, and write a free-form prompt for each of the process steps. Each output from the chatbot then needs to be checked individually for errors and fixed, which usually means copying the JSON output to an integrated development environment, processing it, and copying it back to the chatbot. Lastly, with a chatbot alone there is no simple way to pause the process and continue at a later time.
Figure 5. Workflow between AI and test developer
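To illustrate the kind of glue work this involves, a manual conversion step like the one just described might look as follows; the spreadsheet path and column names are hypothetical.

import json
import pandas as pd

# Hypothetical sketch of the manual Excel-to-JSON conversion step; the file
# path and column names are made up for illustration.
df = pd.read_excel("requirements.xlsx")
requirements = [
    {"id": row["Requirement ID"], "text": row["Requirement Text"]}
    for _, row in df.iterrows()
]
with open("requirements.json", "w", encoding="utf-8") as f:
    json.dump(requirements, f, indent=2)

Automating exactly this kind of repetitive conversion inside the VEW removes the copy-and-paste overhead described above.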
To overcome those limitations and improve the user experience, AWS and Schaeffler built a new feature for the VEW to help with test case generation. The feature relies on the same LLM as the chatbot that is available to Schaeffler employees.
Figure 6. Test case generation sessions screenshot
The VEW now preserves history and state for all test generation sessions. Users can return to, review, and continue the session at any time. Users can also collaborate with their colleagues to process a single session.
Figure 7. Test case generation session screenshot
Each session helps walk the user through the test case generation steps. At each step, users can review, update, and save the data that was generated by the AI model.
Below is an architecture diagram of the new test case generation feature:
Figure 8. The solution architecture
- At each step, the VEW web application uses representational state transfer (REST) API endpoints in the VEW backend to pull or push data in JSON format. The web application converts AI-generated JSON files to table format for easier review and editing, and then back to JSON when users save data or progress in the session.
- The REST API AWS Lambda function handler stores the uploaded JSON data in an Amazon Simple Storage Service (Amazon S3) bucket and saves the test case generation session metadata in Amazon DynamoDB.
- The REST API AWS Lambda function publishes a domain event to an Amazon EventBridge event bus when the user uploads or modifies the JSON data. Here, the API gateway returns a successful response and a pending session state to show the user the progress (a sketch of this handler follows the list).
- The asynchronous AWS Lambda function event handler reads the metadata from Amazon DynamoDB. It also loads a prompt template that corresponds to the current test-generation session state and the uploaded JSON data from an Amazon S3 bucket.
- An AWS Lambda function invokes the AI gateway with a suitable prompt, which is composed from a pre-defined prompt template and test-generation session data.
- The VEW web application loads the test-generation session data and displays a screen with controls to review and update the output from the AI gateway.
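To make the flow concrete, below is a minimal sketch of the REST handler described in the second and third bullets. The bucket, table, and event bus names are hypothetical, and error handling is omitted for brevity.

import json
import uuid
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
events = boto3.client("events")

# Hypothetical resource names for this sketch.
BUCKET = "vew-test-generation-data"
SESSIONS_TABLE = dynamodb.Table("vew-test-generation-sessions")
EVENT_BUS = "vew-events"

def handler(event, context):
    """Store uploaded JSON in Amazon S3, save session metadata in DynamoDB,
    and publish a domain event so the asynchronous AI step can run."""
    body = json.loads(event["body"])
    session_id = body.get("sessionId") or str(uuid.uuid4())
    input_key = f"sessions/{session_id}/input.json"

    # Store the uploaded JSON data in Amazon S3.
    s3.put_object(Bucket=BUCKET, Key=input_key, Body=json.dumps(body["data"]))

    # Save the test case generation session metadata in DynamoDB.
    SESSIONS_TABLE.put_item(
        Item={"sessionId": session_id, "state": "PENDING", "inputKey": input_key}
    )

    # Publish a domain event to EventBridge for the asynchronous event handler.
    events.put_events(Entries=[{
        "EventBusName": EVENT_BUS,
        "Source": "vew.test-generation",
        "DetailType": "SessionDataUploaded",
        "Detail": json.dumps({"sessionId": session_id, "inputKey": input_key}),
    }])

    # Return immediately with a pending session state so the UI can show progress.
    return {"statusCode": 202, "body": json.dumps({"sessionId": session_id, "state": "PENDING"})}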
Business outcomes
The implemented solution accelerated test case generation by up to 60 percent. An experienced test engineer can now spend 265 hours instead of 820 hours preparing test cases for 837 system requirements (a reduction from roughly one hour to ~0.32 hours per requirement). Another benefit of the solution is that, because it assists the user in a more structured way, helping with an assortment of tasks from reviewing requirements to generating final test cases, it is much less likely to introduce human error. This helps reduce the likelihood of system requirements remaining unvalidated.
Figure 9. Effort saved using AI assistance
As a next step, AWS and Schaeffler would like to integrate the VEW with the testing software used by Schaeffler to automate the export of requirements and the import of test cases, helping further enhance the workflow. Afterward, AWS and Schaeffler aim to improve end-to-end efficiency by introducing a generative AI test cockpit that supports Schaeffler test engineers at every step, from helping handle requirements to creating test scripts and helping validate each test case.