AWS for Industries
How BMW Group Enhances Cloud Optimization With Generative AI on AWS
The BMW Group, headquartered in Munich, Germany, employs 159,000 people across more than 30 production and assembly facilities in 15 countries. As an automotive innovation leader, BMW Group has been at the forefront of integrating Generative AI (GenAI) technology. Its GenAI applications already streamline daily processes and boost the productivity of thousands of employees.
The BMW Group manages a vast cloud environment, growing from 4,500 AWS accounts in 2023 to over 10,000 in 2025. To ensure optimal cloud usage and cost efficiency, BMW Group’s Financial Operations Center of Excellence (FinOps CoE) developed Cloud Efficiency Analytics (CLEA) – an expert tool providing a centralized view of AWS consumption for informed cost optimization.
To further enhance its cloud efficiency, BMW Group partnered with AWS Professional Services to develop CLEAI, a GenAI-powered assistant integrated into the CLEA expert tool. This solution enables users of all AWS skill levels to drive cost optimization of their cloud workloads.
Today, CLEAI supports more than five thousand BMW Group users in their cloud optimization efforts, accelerating analysis and implementation of cloud infrastructure optimization by up to 50%. Through natural language conversation, CLEAI provides actionable insights, identifies cost savings opportunities, and generates step-by-step implementation guidance and underlying code (CLI commands, Python scripts, Terraform) following BMW Group and AWS best practices.
This blog post provides (1) an overview of common use cases for CLEAI, demonstrating how this GenAI-powered assistant enhances cloud FinOps at BMW Group, and (2) a technical deep dive on CLEAI implementation. The solution is industry agnostic and applicable to any organization using AWS.
CLEAI in Action
1. Cloud Cost and Metadata Analysis
Navigating complex cloud environments to understand costs and resource usage can be a significant bottleneck. CLEAI transforms this experience by enabling users to retrieve granular cost information and metadata from AWS and other cloud providers through intuitive natural language conversations. This eliminates tedious manual navigation through dashboards, empowering users to quickly uncover critical insights.
Semantic search
When a cloud application spans multiple cloud accounts or development environments, understanding its holistic cost requires accurately identifying all relevant application environments. For instance, a user can ask CLEAI to “List all AWS accounts that belong to Cloud Efficiency Analytics application that start with name ‘CLEA’,” and the assistant intelligently identifies and presents the relevant accounts.
Figure 1. Example: semantic search in CLEAI
Pinpointing cost drivers
One of the major FinOps use cases is identifying which services are driving up cloud consumption costs. With CLEAI, a user can simply ask, “Give me the top 3 services in terms of cost for account clea prod, clea dev and clea int for the last 3 months per environment.” CLEAI then retrieves and processes data from multiple accounts related to the application, providing a clear breakdown of high-cost services for each environment. The system leverages CLEA’s APIs (discussed in the technical deep dive section) to access this information, while maintaining access controls based on the user’s permissions, ensuring data security and compliance.
Figure 2. Example: identification of the highest cost drivers in specified accounts
Deep dive into service usage
Once major cost drivers are identified, users often need to perform a deeper analysis of specific service usage over time. For example, a user can request CLEAI to “provide a cost analysis for Amazon Athena spend in CLEA Prod account over the last 6 months with monthly granularity,” receiving a detailed breakdown of service consumption patterns.
Figure 3. Example: cost analysis over a specific time period
Powerful comparative analysis
For central FinOps teams, CLEAI facilitates robust comparative spend analysis across organizational units, applications, and cloud environments over custom time periods. The example below shows CLEAI responding to “Compare the costs and used services for department fg-xx versus department fg-yy for month may 2025,” providing a comprehensive breakdown that highlights differences in service usage and potential optimization opportunities.
Figure 4. Example: cloud consumption and cost comparison analysis
2. Optimization: Cloud Best Practices and Recommendations
Beyond analysis, CLEAI actively guides users toward optimizing their cloud infrastructure. It provides actionable recommendations based on a rich knowledge base that combines official technical documentation from AWS and other cloud providers with BMW Group’s proprietary FinOps knowledge base. The latter contributes guidelines and best practices for operating cloud services, including resource usage optimization strategies, budget management, and reporting standards tailored to BMW Group’s financial operations in cloud environments.
Internal best practice recommendations
The example below demonstrates how CLEAI provides specific BMW Group examples and recommendations, ensuring compliance with internal policies and standards.
Figure 5. Example: best practice recommendation based on integrated data sources
3. Code Generation
A significant differentiator of CLEAI is its ability to translate optimization recommendations into executable code. It can automatically generate Terraform Infrastructure as Code (IaC) configurations as well as CLI commands and Python scripts for cloud resource management, extending the best-practice recommendations above by turning suggestions directly into deployable IaC. The generated code adheres to both AWS and BMW Group best practices, enabling rapid and consistent implementation of changes.
Terraform IaC generation
A user can request, “Generate a terraform script for Amazon DynamoDB on-demand capacity,” and CLEAI will produce a complete Terraform configuration, including appropriate variable definitions, resource declarations, and BMW Group-specific tagging. The generated code also contains comments explaining key configuration options and cost optimization best practices, making it easy for engineers to understand, review, and test it before deployment.
Figure 6. Example: Terraform Infrastructure as Code generation
Now let’s explore the technical aspects of the solution.
Technical Deep Dive
The capabilities outlined above are enabled through the integration of three BMW Group systems:
1. CLEA – a FinOps expert tool whose user interface (UI) provides access to the CLEAI assistant.
2. CLEA API – a modern REST API exposing multiple endpoints for unified cost and usage data, as well as metadata on cloud accounts and services.
3. In-Console Cloud Assistant (ICCA) – a service platform application that orchestrates GenAI agents and powers various GenAI use cases. It enables users to ask natural language questions, processes the data provided by the CLEA API, and generates responses that are displayed in the CLEAI assistant’s UI.
Previously we published a customer story on ICCA, which you can read in the following blog post.
CLEA API Features
CLEA API is one of the key components enabling the FinOps use cases. It provides the data and metadata required to process user requests and generate actionable insights. The API offers the following features:
Endpoints for Cost Data: provide daily costs per account and service, allowing users to track and manage their cloud expenses at a granular level and giving them visibility into cost distribution and usage patterns across accounts and services.
Cloud Account Metadata: retrieves metadata related to cloud accounts, including information such as BMW department, application ID, product, and other relevant details. This feature enables better organization and classification of cloud resources, aiding in cost allocation and resource management.
Flexible Filtering and Grouping: supports filters and grouping options to customize data queries, making it easier to generate specific reports and insights (see the request sketch after this list).
Contextual Access Control: technical users can access cost, usage, and metadata only for the specific applications, organizational units, and departments they have been granted access to. This ensures that users see only the data relevant to their permissions, enhancing security and data integrity.
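To make these features concrete, here is a minimal sketch of what a filtered, grouped cost query could look like. The CLEA API is internal to BMW Group, so the endpoint URL, payload fields, and response shape below are illustrative assumptions, not the actual interface.

```python
import requests

# Hypothetical endpoint and payload shape -- the real CLEA API is internal
# to BMW Group; all field names here are illustrative assumptions.
CLEA_API_URL = "https://clea.example.bmwgroup.net/api/v1/costs/query"
user_token = "<caller-scoped-token>"  # placeholder credential

payload = {
    # Cost data endpoints: daily costs per account and service
    "granularity": "DAILY",
    "timeRange": {"start": "2025-05-01", "end": "2025-05-31"},
    # Flexible filtering: restrict the query to specific accounts/services
    "filters": {
        "accountName": ["clea-prod", "clea-dev", "clea-int"],
        "service": ["Amazon Athena"],
    },
    # Flexible grouping: one result series per account and service
    "groupBy": ["accountName", "service"],
}

# Contextual access control: the bearer token scopes results to the
# applications, org units, and departments the caller may see.
headers = {"Authorization": f"Bearer {user_token}"}

response = requests.post(CLEA_API_URL, json=payload, headers=headers, timeout=30)
response.raise_for_status()
for row in response.json().get("results", []):
    print(row["accountName"], row["service"], row["cost"])
```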
LLM Chains and User Experience
To precisely address the diverse FinOps use cases and user intents, CLEAI leverages an architecture built upon several specialized LLM chains. These chains are the backbone of our solution, enabling it to interpret complex queries, retrieve vast amounts of data, offer tailored recommendations, and even generate executable code, all while maintaining a natural and intuitive user experience. Let’s delve into the core chains that power CLEAI:
API Chain
At the heart of CLEAI’s data interaction is the API chain, built upon LangGraph. This chain starts when our intent classifier identifies a user’s need for specific information related to cloud applications, accounts, or cost data. Its primary role is to construct and execute POST requests to the CLEA API, and then process the API’s responses. This includes everything from building the correct request URL and body to handling pagination and performing the data aggregations and calculations needed to present accurate, actionable insights.
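The post does not publish CLEAI’s graph definition, so the following is a minimal sketch of how such an API chain could be wired with LangGraph; the state fields, node names, and stubbed helpers are assumptions. Each node returns a partial state update, which LangGraph merges into the shared state before the next node runs.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class ApiChainState(TypedDict):
    question: str        # user's natural language request
    request_body: dict   # CLEA API query derived from the question
    raw_pages: list      # paginated API responses
    answer: str          # aggregated, user-facing insight


# The three node functions below are stubs standing in for CLEAI's logic.
def build_request(state: ApiChainState) -> dict:
    # An LLM would translate the question into a CLEA API request body.
    return {"request_body": {"filters": {}, "groupBy": ["service"]}}


def call_api(state: ApiChainState) -> dict:
    # Execute the POST request and follow pagination until exhausted.
    return {"raw_pages": [{"service": "Amazon Athena", "cost": 123.4}]}


def aggregate(state: ApiChainState) -> dict:
    # Perform the aggregations/calculations and format the result.
    total = sum(row["cost"] for row in state["raw_pages"])
    return {"answer": f"Total cost across services: ${total:.2f}"}


graph = StateGraph(ApiChainState)
graph.add_node("build_request", build_request)
graph.add_node("call_api", call_api)
graph.add_node("aggregate", aggregate)
graph.add_edge(START, "build_request")
graph.add_edge("build_request", "call_api")
graph.add_edge("call_api", "aggregate")
graph.add_edge("aggregate", END)

api_chain = graph.compile()
print(api_chain.invoke({"question": "Top 3 services by cost for clea prod"})["answer"])
```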
Conversational Chain
The conversational chain is designed to foster a fluid and efficient user experience. It activates when a user seeks clarification or additional details on information already present in the chat history. By leveraging previously retrieved data, this chain facilitates an organic dialogue, allowing users to drill down into existing insights without triggering unnecessary or redundant API calls, while maintaining natural conversation flow.
Clarify Chain
Ambiguity in user requests is a common challenge for any intelligent system. That’s where the clarify chain comes in. When our intent classifier detects an unclear or imprecise user query, this chain is triggered. It prompts the user for further information or clarification, ensuring that CLEAI accurately understands their intent before proceeding, thereby preventing misinterpretations and leading to more precise results.
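The classifier prompt itself is not shown in the post, so here is a hedged sketch of how intent classification could route a request between the chains and fall back to a clarifying question, using the Amazon Bedrock Converse API. The prompt wording, model ID, and chain registry are assumptions based on the chains described above.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# Example model ID; CLEAI's actual model choice is not published.
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

CLASSIFIER_PROMPT = """Classify the user's request into exactly one intent:
- api: needs fresh cost, account, or metadata from the CLEA API
- conversational: answerable from information already in the chat history
- retrieval: asks for best practices, guidance, or code
- clarify: too vague or ambiguous to route
Request: {question}
Reply with the intent name only."""


def classify_intent(question: str) -> str:
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{
            "role": "user",
            "content": [{"text": CLASSIFIER_PROMPT.format(question=question)}],
        }],
    )
    return response["output"]["message"]["content"][0]["text"].strip().lower()


def route(question: str, chains: dict) -> str:
    # `chains` is a hypothetical registry mapping intents to compiled
    # chains, e.g. {"api": api_chain} from the sketch above.
    intent = classify_intent(question)
    if intent == "clarify" or intent not in chains:
        # Clarify chain: ask the user to narrow the request down.
        return "Could you specify the account and time range you mean?"
    return chains[intent].invoke({"question": question})["answer"]
```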
Retrieval Chain
Finally, the retrieval chain is central to CLEAI’s ability to provide intelligent recommendations, guidance, and code. This chain orchestrates a Retrieval Augmented Generation (RAG) process, utilizing Amazon Bedrock Knowledge Bases with Amazon OpenSearch Serverless as the vector store. The knowledge base is populated with both general AWS best practices and BMW Group internal information. The retrieval chain leverages this extensive knowledge to generate highly relevant recommendations, expert guidance, and production-ready code that aligns with both AWS and BMW Group’s stringent best practices.
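As a rough illustration of this RAG step, the snippet below queries a Bedrock knowledge base with the RetrieveAndGenerate API. The knowledge base ID, model ARN, and question are placeholders rather than CLEAI’s actual configuration.

```python
import boto3

# Minimal RAG sketch against Amazon Bedrock Knowledge Bases.
agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What are cost optimization best practices for Amazon DynamoDB?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            # In CLEAI's case this would index both AWS best practices
            # and BMW Group internal FinOps documents.
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": (
                "arn:aws:bedrock:eu-central-1::foundation-model/"
                "anthropic.claude-3-5-sonnet-20240620-v1:0"
            ),
        },
    },
)

print(response["output"]["text"])  # grounded recommendation
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        # Each reference points back to a retrieved source document.
        print(ref["location"])
```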
CLEAI Logical Workflow
The architecture for the CLEAI use case builds on the developments made within BMW’s In-Console Cloud Assistant (ICCA). The diagram below shows the latest version of ICCA’s architecture, modified to support the FinOps use cases, and highlights the core backend components involved in executing the multiple LLM chains.
Figure 7. CLEAI solution architecture
Below we detail the workflow following the numbers in the diagram:
1. CLEA and ICCA integration: A user simply activates the CLEAI assistant directly within the CLEA expert tool UI. This action establishes a secure connection to the underlying ICCA platform, serving as the gateway for all subsequent interactions.
2. WebSocket connection: The WebSocket connection is managed by the websocket_connection_handler AWS Lambda function, which handles the crucial CONNECT and DISCONNECT connection events. Critically, it also synchronously triggers the “Prompt Factory” AWS Step Functions state machine, initiating the process of preparing the system for user input (a sketch of this handler follows the list).
3. Prompt orchestration: To improve response times, we cache essential account information in tailored prompts stored in Amazon DynamoDB at the beginning of each session. The Prompt Factory State Machine is the central orchestrator of this process. This workflow starts by executing the platform_detection Lambda function, which parses the platform name and makes an initial API call to CLEA to retrieve essential account information tailored to the user’s role. Next, the prompt_factory Lambdas corresponding to the detected platform work in parallel using a distributed map. This optimized approach, with one Lambda dedicated to each chain within the platform, ensures all necessary prompts are quickly stored in Amazon DynamoDB, ready for use.
4. Intent classification: To ensure user requests are routed to the most appropriate LLM chain, the system classifies user intent upon receiving their input. This process is managed by the incoming_message_handler Lambda function, which uses a classifier to identify the user’s intent. For example, if a user’s input asks for cloud application cost data, the classifier identifies this as an API call intent.
5. LLM chain selection: To fulfill a user’s request, the system selects the best available LLM chain through a many-to-one classification process. This step is initiated by a Lambda function that invokes Amazon Bedrock. After the user’s intent is classified, an LLM analyzes the request and determines which specialized chain is best suited to handle it. This ensures that the most effective and efficient processing path is chosen to fulfill the user’s query.
6. User context and chat history: To provide a truly conversational experience, user context and chat history are maintained. A secondary chat history is used to keep a record of previous user requests. This is particularly valuable for accurately classifying the intent behind follow-up questions, allowing for a seamless and coherent dialogue.
7. Chain execution: With the appropriate chain selected, the chain execution phase takes over. The request_handler Lambda is responsible for executing the chosen chain, ensuring that all necessary operations, from API calls to data processing and generation, are carried out according to the chain’s logic.
8. Response: Finally, once the chain has been correctly executed and the desired information or action completed, the response is generated. This response is then sent back to the user and displayed within the CLEAI UI.
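To ground steps 2 and 3, here is a hedged sketch of what the websocket_connection_handler could look like: a Lambda reacting to CONNECT/DISCONNECT events and synchronously starting the Prompt Factory state machine (assuming an Express workflow, which start_sync_execution requires). The environment variable name and payload fields are assumptions.

```python
import json
import os

import boto3

sfn = boto3.client("stepfunctions")

# Supplied via environment in a real deployment (placeholder name).
PROMPT_FACTORY_ARN = os.environ["PROMPT_FACTORY_STATE_MACHINE_ARN"]


def lambda_handler(event, context):
    """Sketch of websocket_connection_handler (steps 2-3 in the diagram)."""
    route_key = event["requestContext"]["routeKey"]      # "$connect" / "$disconnect"
    connection_id = event["requestContext"]["connectionId"]

    if route_key == "$connect":
        # Synchronously run the Prompt Factory so tailored prompts are
        # cached in Amazon DynamoDB before the first user message arrives.
        sfn.start_sync_execution(
            stateMachineArn=PROMPT_FACTORY_ARN,
            input=json.dumps({
                "connectionId": connection_id,
                # Hypothetical field letting platform_detection identify CLEA
                "platform": (event.get("queryStringParameters") or {}).get("platform"),
            }),
        )
    elif route_key == "$disconnect":
        pass  # connection cleanup would go here

    return {"statusCode": 200}
```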
Conclusion
The CLEAI assistant dramatically simplifies cloud financial operations, acting as an intelligent FinOps assistant for five thousand BMW Group users. Through natural language, it lets users quickly surface deep insights into cloud costs and resource metadata, eliminating manual navigation and investigation. Beyond analysis, CLEAI provides actionable optimization recommendations, intelligently blending official cloud best practices with BMW Group’s internal guidelines. Crucially, it automates the implementation of those recommendations by generating production-ready Infrastructure as Code that complies with AWS and BMW Group best practices, accelerating deployments. This combination of insightful analytics, tailored recommendations, and automated code generation empowers organizations to achieve significant cloud optimization, drive down costs, and maintain robust governance with ease and speed.
To learn more about the power of GenAI and how to use it for building differentiated experiences, boosting productivity, and innovating faster, visit the Generative AI on AWS page.