AWS for Industries
The Stellantis Data & Software Hub: Managing data and software products at scale
The need to manage data and software products at scale has become essential with the rise of generative artificial intelligence (see AWS Executive Insights, 2024), the disruptive transformation from hardware- to software-defined vehicles (see Boston Consulting Group, 2021), and the shift to service-based business models (see McKinsey & Company, 2021). This blog post introduces the Data & Software Hub (DASH), a technology-agnostic framework for securely storing, effectively managing, collaboratively developing, quickly analyzing, openly sharing, and easily deploying data and software products across a global enterprise. Examples from our collaboration with Stellantis (see Amazon press release, 2022) demonstrate how this new paradigm accelerates the digital transformation of Stellantis toward a more data-driven and software-focused mobility company (see AWS blog post, 2024).
Today, legacy organizations face several challenges with data and software assets because their technology and tool landscapes are fragmented, heterogeneous, and sometimes outdated. Some of the most common challenges are depicted in figure 1. Because assets are spread across multiple systems and teams, finding them often results in inefficient, non-scalable, case-by-case access processes. Organizational silos often obscure assets and limit visibility, discovery, and collaboration. This complexity can hamper innovation and make governance of data and software assets time-consuming and bespoke.
Figure 1. Data and software challenges (top row, blue) and solution features (bottom row, pink) used to implement DASH
With DASH, customers can better address those challenges and help minimize the risks associated with them while carefully balancing control and freedom in data and software management. DASH can act as a single source of truth (SSOT), providing a common definition and meaning of assets throughout the organization. At the same time, DASH also makes multiple versions of truth (MVOT) possible, because it allows for endowing data with business relevance and purpose, sometimes called information (see Harvard Business Review, 2017), to create data products (see Harvard Business Review, 2022). DASH promotes both SSOT and MVOT by allowing users to utilize the same environment for storing assets and contextualizing them for specific use and business cases.
Figure 2. Transformation from use case-driven (left) to capability-driven cloud solution development (right) with capabilities shown as examples. In contrast to use case-specific capabilities (dashed line, blue), shared capabilities (solid line, pink) can be shared between multiple use cases across an organization
Creating, storing, and sharing assets within an enterprise requires two different types of capabilities: 1) shared capabilities, and 2) use case-specific capabilities. Shared capabilities, such as ingestion, governance, and security, exhibit a low level of differentiation and are shared across a variety of use cases. In contrast, use case-specific capabilities, like virtualization (see AWS blog post, 2024) and anonymization (see AWS blog post, 2022), exhibit a high level of differentiation and cater to the unique business needs of users within an organization. DASH develops iteratively: as more use cases are added and commonalities are discovered over time, some use case-specific capabilities become shared capabilities. This approach is known as capability-driven solution development.
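The iterative promotion of capabilities described above can be illustrated with a minimal Python sketch. It is not how DASH is implemented: the `CapabilityRegistry` class, the `PROMOTION_THRESHOLD` value, and the use case names are all hypothetical and serve only to show the idea of a capability becoming shared once several use cases rely on it.

```python
from collections import defaultdict

# Hypothetical rule: promote a capability once this many use cases rely on it.
PROMOTION_THRESHOLD = 2


class CapabilityRegistry:
    """Tracks which use cases rely on which capabilities (illustrative only)."""

    def __init__(self, shared=()):
        self.shared = set(shared)
        self._users = defaultdict(set)  # capability -> use cases that rely on it

    def register(self, use_case, capabilities):
        """Record the capabilities a newly onboarded use case needs."""
        for capability in capabilities:
            self._users[capability].add(use_case)

    def promote(self):
        """Move capabilities with enough users into the shared set."""
        promoted = {
            capability
            for capability, users in self._users.items()
            if capability not in self.shared and len(users) >= PROMOTION_THRESHOLD
        }
        self.shared |= promoted
        return promoted


# Hypothetical use cases onboarded one after the other.
registry = CapabilityRegistry(shared={"ingestion", "governance", "security"})
registry.register("video-analytics", {"ingestion", "anonymization"})
registry.register("driver-assistance", {"ingestion", "anonymization", "virtualization"})
newly_shared = registry.promote()
```

In this sketch, anonymization is promoted because two use cases need it, while virtualization stays use case-specific until a second use case requires it.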
Figure 3. High-level DASH framework. DASH allows producers, consumers, and the platform team to access a unified enterprise portal with a seamless UI/UX. The solution provides various use case-specific and shared capabilities, catering to the needs of users across the enterprise
Beyond shared and use case-specific capabilities, DASH also comprises a unified portal that gives producers, consumers, and the platform team a consistent user interface/user experience (UI/UX). A control plane acts as the central management interface and backend of the portal and serves as the “connecting tissue” between components of the solution, including the cloud infrastructure, as shown in figure 3.
Solution features
DASH was developed based on five solution features that individually address the data and software challenges shown in figure 1, namely 1) asset catalog, 2) decentralized ownership, self-service exploration, and access request, 3) unified enterprise portal, 4) multi-account and multi-region strategy, and 5) centralized governance with automated workflows.
01 | Asset catalog
The asset catalog provides a unified inventory of assets stored in DASH, including application programming interfaces (APIs), events that can trigger certain workflows, software artifacts, and curated data products for specific use cases. This centralized view helps simplify asset discovery, fosters collaboration, and improves operational efficiency through reusability.
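As a rough illustration of such a unified inventory, the following Python sketch models a catalog that holds the four asset types named above and supports keyword search. The field names, the `AssetCatalog` class, and the example assets are assumptions made for illustration; they do not reflect the actual DASH data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class AssetType(Enum):
    API = "api"
    EVENT = "event"
    SOFTWARE_ARTIFACT = "software_artifact"
    DATA_PRODUCT = "data_product"


@dataclass(frozen=True)
class Asset:
    name: str
    asset_type: AssetType
    owner: str  # producing team (hypothetical field)
    description: str = ""
    tags: frozenset = field(default_factory=frozenset)


class AssetCatalog:
    """In-memory inventory with keyword search (illustrative only)."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        if asset.name in self._assets:
            raise ValueError(f"asset already registered: {asset.name}")
        self._assets[asset.name] = asset

    def search(self, keyword, asset_type=None):
        """Return assets whose name, description, or tags contain the keyword."""
        keyword = keyword.lower()
        hits = []
        for asset in self._assets.values():
            if asset_type is not None and asset.asset_type is not asset_type:
                continue
            haystack = " ".join([asset.name, asset.description, *asset.tags]).lower()
            if keyword in haystack:
                hits.append(asset)
        return sorted(hits, key=lambda a: a.name)


# Hypothetical example assets.
catalog = AssetCatalog()
catalog.register(Asset("odometer-api", AssetType.API, "vehicle-data",
                       "Odometer readings per vehicle", frozenset({"odometer"})))
catalog.register(Asset("odometer-daily", AssetType.DATA_PRODUCT, "vehicle-data",
                       "Curated daily odometer dataset"))
catalog.register(Asset("video-anonymizer", AssetType.SOFTWARE_ARTIFACT, "perception",
                       "Anonymizes video frames"))

all_hits = [a.name for a in catalog.search("odometer")]
api_hits = [a.name for a in catalog.search("odometer", AssetType.API)]
```

Keeping every asset type in one searchable structure is what lets a consumer discover an API and a related data product with a single query.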
02 | Decentralized ownership, self-service exploration, and access request
DASH is based on the producer-consumer model (see AWS ebook, 2024), which connects producers with consumers while standardizing and providing relevant services needed to create, share, and manage assets through a centralized cloud solution.
In the producer-consumer model depicted in figure 4, producers own and manage assets with domain expertise, while consumers leverage these assets to implement use cases and derive business insights. Producers and consumers are supported by the platform team, which provides the IT infrastructure, tools, and governance frameworks needed. The platform team facilitates a seamless flow of data while ensuring that data-as-a-product is scalable and secure (see Harvard Business Review, 2022).
DASH integrates with various tools from independent software vendors (ISVs), enabling more seamless and automated infrastructure provisioning for effective collaboration across a company’s network of suppliers and partners.
Figure 4. Schematic diagram of the producer-consumer model used to implement DASH
DASH extends the producer-consumer model with self-service capabilities, allowing users to easily discover, explore, and request access to assets and services. This approach helps users simplify sharing, boost collaboration, accelerate innovation, and improve productivity. Key features include guided registration, self-provisioning of approved infrastructure, automated access and permission management, data usage insights, and centralized security and compliance control.
03 | Unified enterprise portal
The unified enterprise portal allows users to interact with DASH and provides producers, consumers, and the platform team with a UI/UX that integrates more seamlessly with features of the solution through use case-specific micro-frontends (see figure 5). It serves as a web-based user interface and can consolidate assets from multiple domains and teams into a single, enterprise-wide view. This single pane of glass eliminates the need for case-by-case access processes, making asset discovery, sharing, and publishing across the organization more seamless and efficient.
Data producers can register themselves through this portal, helping with traceability, governance, and accountability. They can create new datasets by providing metadata and provisioning necessary resources, facilitating the organization and discoverability of assets. Producers are responsible for ingesting and managing assets within the DASH portal, where they can also monitor the usage and performance of their datasets, optimize their offerings, and maintain high data quality.
Consumers follow a similar journey: They can register themselves to use the available data for building applications and deriving insights. They can also discover relevant datasets using the search functionality and explore or preview data using the data explorer. Consumers can request access to needed datasets, with the solution handling access control and authorization. Each subscription request must be approved by the data steward before access is granted. Once permission is granted, consumers can access the data directly through APIs, downloads, or integrated analytics tools.
Figure 5. Screenshot examples from the Stellantis DASH visualize the easy-to-use and seamlessly integrated UI/UX: a) login screen, b) home screen, and c) security center
The enterprise portal offers users a comprehensive suite of functionalities tailored to their specific roles. Producers can register, create, manage, and monitor datasets, ensuring governance and accountability. Consumers can search, preview, and request dataset access, with approvals handled by data stewards to maintain compliance. FinOps users have cost reporting dashboards, operations teams monitor solution health with Grafana, and security users get a unified view of security metrics from AWS services like AWS Security Hub, Amazon GuardDuty, and Amazon Inspector, enhancing security oversight.
04 | Multi-account and multi-region strategy
DASH is built for scalability and flexibility, capable of supporting multiple accounts and regions to manage assets across an enterprise. Its multi-account strategy, influenced by functionality and activity realms, segments the solution’s control plane, storage, and security into separate accounts. This approach aligns with the AWS Well-Architected Framework, helping enhance operational excellence, security, reliability, and cost efficiency. Furthermore, it can help minimize data risks and support compliance with residency and privacy standards.
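To make the idea of separating functions into accounts and honoring data residency more concrete, here is a minimal Python sketch. The account aliases, the residency rules, and the `resolve_target` helper are hypothetical and only illustrate the principle; the actual DASH account layout is not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountTarget:
    account_alias: str  # hypothetical alias, not a real AWS account
    region: str


# Hypothetical layout: one account per function, as described above.
ACCOUNTS_BY_FUNCTION = {
    "control_plane": "dash-control-plane",
    "storage": "dash-storage",
    "security": "dash-security",
}

# Hypothetical residency rules: regions where data of a given origin may live.
ALLOWED_REGIONS_BY_RESIDENCY = {
    "EU": ("eu-west-1", "eu-central-1"),
    "US": ("us-east-1", "us-west-2"),
}


def resolve_target(function, residency, preferred_region=None):
    """Pick the account and region for a workload, honoring residency rules."""
    if function not in ACCOUNTS_BY_FUNCTION:
        raise KeyError(f"unknown function: {function}")
    allowed = ALLOWED_REGIONS_BY_RESIDENCY.get(residency)
    if not allowed:
        raise ValueError(f"no regions configured for residency: {residency}")
    if preferred_region is not None and preferred_region not in allowed:
        raise ValueError(f"{preferred_region} violates {residency} residency")
    # Fall back to the first allowed region when no preference is given.
    return AccountTarget(ACCOUNTS_BY_FUNCTION[function], preferred_region or allowed[0])


eu_storage = resolve_target("storage", "EU")
us_security = resolve_target("security", "US", preferred_region="us-west-2")
```

Rejecting a disallowed region at resolution time, rather than after data has been written, is one simple way to support residency requirements in a multi-region setup.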
05 | Centralized governance with automated workflows
Data governance refers to a framework or logical structure that guides the implementation of guidelines, protocols, processes, and rules for data in an enterprise (see Data Governance Institute, 2024). The DASH governance model provides a more streamlined, self-service data management framework centered on a producer-consumer model, with stewards ensuring data quality, security, and compliance. Governance within DASH is a collaborative effort involving the following three key personas:
- Producers: Create and manage assets, ensuring they meet governance standards.
- Consumers: Access and use assets, adhering to governance policies.
- Stewards: Oversee and enforce compliance, ensuring integrity and reliability.
Figure 6 depicts the workflow that governs the interaction between producers and consumers in DASH, with stewards overseeing the process. This workflow comprises two components, the producer workflow and the consumer workflow, which together help facilitate efficient, secure data sharing, with stewards maintaining governance throughout the asset lifecycle.
Figure 6. Access and approval workflow for producers (1–3, left) and consumers (A–D, right) on DASH
Producer Workflow
- Register: Producers register, providing asset metadata.
- Approve: Stewards review and approve producer registration.
- Publish: Approved producers create and publish assets for consumption.
Consumer Workflow
- Register: Consumers register.
- Approve: Stewards review and approve consumer registration.
- Subscribe: Consumers request access to data assets.
- Grant Access: Stewards approve subscriptions, enabling consumer access.
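The consumer workflow above can be read as a small state machine in which each step is only valid in a given state and for a given persona. The following Python sketch shows one way to express that; the state names, action names, and the `ConsumerWorkflow` class are assumptions for illustration and not the DASH implementation.

```python
from enum import Enum, auto


class State(Enum):
    REGISTERED = auto()
    APPROVED = auto()
    SUBSCRIPTION_REQUESTED = auto()
    ACCESS_GRANTED = auto()


# (current state, action) -> (persona allowed to act, next state)
CONSUMER_TRANSITIONS = {
    (State.REGISTERED, "approve_registration"): ("steward", State.APPROVED),
    (State.APPROVED, "subscribe"): ("consumer", State.SUBSCRIPTION_REQUESTED),
    (State.SUBSCRIPTION_REQUESTED, "grant_access"): ("steward", State.ACCESS_GRANTED),
}


class ConsumerWorkflow:
    """Tracks one consumer from registration to granted access."""

    def __init__(self):
        self.state = State.REGISTERED  # the consumer has registered
        self.history = [self.state]

    def apply(self, action, role):
        key = (self.state, action)
        if key not in CONSUMER_TRANSITIONS:
            raise ValueError(f"{action!r} is not allowed in state {self.state.name}")
        required_role, next_state = CONSUMER_TRANSITIONS[key]
        if role != required_role:
            raise PermissionError(f"{action!r} requires role {required_role!r}")
        self.state = next_state
        self.history.append(next_state)
        return next_state


workflow = ConsumerWorkflow()
workflow.apply("approve_registration", role="steward")
workflow.apply("subscribe", role="consumer")
workflow.apply("grant_access", role="steward")
```

Encoding the required persona in the transition table means a consumer cannot approve their own subscription, which mirrors the steward's oversight role. The producer workflow could be modeled the same way with a publish step in place of subscribe and grant access.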
Selected use cases and business outcomes
Stellantis has reported onboarding multiple use cases that have already started creating business value. One such use case focused on sharing odometer readings for cars sold in Belgium with a non-profit monitoring organization. Stellantis used DASH to create a data producer that generated the necessary datasets. By publishing the API and documentation through DASH, we provided secure access to the non-profit while also allowing this API to be used by additional future consumers. DASH’s capability to streamline data sharing and regulatory compliance allowed Stellantis to share vehicle data externally while maintaining internal data protection and monitoring controls. By addressing the challenge of data sharing, Stellantis contributed to transparency in the used vehicle market in Belgium.
DASH was also used to help integrate the video data ingestion and anonymization pipeline and to distribute the associated workloads across a cluster of graphics processing units (GPUs) called ATLAS. For this purpose, we developed API endpoints that act as integration points between DASH and ATLAS, allowing data providers and consumers to use the GPU capacity available from ATLAS. By combining DASH’s data sharing capabilities with ATLAS’s compute capacity and orchestration, anonymization algorithms can be reused for multiple use cases across Stellantis. Self-service governance helps streamline integration, maximize efficiency, and enable tenants to run simulations and GPU-bound tasks on ATLAS. Additionally, ATLAS has reserved GPU instances that tenants can use through this approach, which may provide significant cost savings, predictable pricing, and guaranteed capacity. This helps ensure that resources are available when needed and at a lower cost, enhancing the overall efficiency of the solution. Furthermore, the APIs are use case-agnostic, meaning they can be used for other GPU-bound workloads as well.
Conclusion
In this blog post, we have described how DASH addresses key challenges modern businesses face with data and software by allowing users to use the same solution for storing and sharing assets and for contextualizing them for specific use and business cases. For this purpose, DASH employs the producer-consumer model in combination with self-service capabilities to promote a culture of global collaboration, accountability, and regulatory compliance. This approach to data and software management allows for 1) centralized governance and decentralized ownership of assets to help maintain flexibility and speed, 2) convergence of disparate data sources and services in a common enterprise environment, 3) collaboration across distributed teams to prevent the formation of organizational silos, and 4) fast scaling and rollout of shared capabilities across the organization. To this end, DASH serves as an industry- and technology-agnostic framework for effectively and efficiently managing data and software products across a global organization at scale.