AWS Partner Network (APN) Blog

Aligning Business Intelligence and AI/ML with a Data Mesh Platform on AWS

By Dan Domkowski, Data Practice Lead – Nuvalence

Nuvalence-AWS-Partners-2023
Nuvalence
Connect with Nuvalence-1

Data mesh is emerging as a paradigm for generating data-driven value and is gaining real-world adoption within industries like financial services and automotive.

One of the principles behind data mesh is the notion of a self-serve data platform that abstracts high-level functions into a set of capabilities and experiences for discovering data and building data-driven products.

Consumers of these data products can be analytical applications, but analytical consumption is ultimately driven by personas like business intelligence (BI) analysts or data scientists. An analyst persona, for example, can align the organization around business domains, or a common set of objects and data that represent value to the business.

Too many organizations knowingly (or unknowingly) silo their analytic teams based on their approaches, methodologies, and toolsets. By creating experiences for analytic teams to aggregate and align data that is driven by the common domains they share, you can foster an environment of collaboration between analytic teams with complementary approaches to data discoveries.

In my experience as the leader of the Data-Driven Digital Platform delivery practice at Nuvalence, I work with data, analytical, and business leaders charged with answering business questions from their data assets. Some organizations ask questions about historical data to gauge past performance, while others (or sometimes the same organization) want to predict trends for their future planning. In either case, we strive to design and build an architecture that allows for organizations to continuously ask questions using a repeatable approach.

Nuvalence is an AWS Advanced Tier Services Partner and next-generation agency using a product-driven approach to help enterprises define and build digital and data mesh platforms.

This post walks through the user journeys of two types of data consumers in a mesh platform:

  • BI analysts who answer known questions and key performance indicators (KPIs) about the business.
  • Data scientists who use advanced analytic techniques such as artificial intelligence (AI) and machine learning (ML) to answer questions the business has yet to ask itself.

I will then talk about how BI and AI/ML overlap within a set of data domains, and how a platform architecture further enables the desired experiences within a data mesh.

Furthermore, by utilizing cloud technologies on Amazon Web Services (AWS), you can build a platform experience that allows analytic teams to autonomously consume data in an interoperable environment and complement their discoveries so the business benefits from all analytical insights.

BI and ML: Overlapping But Unique Approaches to Data Analysis

Data domains are ideally aligned to the objectives and KPIs of the business. In turn, data consumers commonly aggregate data from multiple domains to be able to answer questions about the business, such as “What are our projections?” and “What influences customer behaviors?”

A “known” method or formula for asking and answering these questions is something already defined by business stakeholders, like a KPI or business metric you routinely want to track and/or evolve. BI analysts use existing methods to determine trends, generate reports, and provide input to leaders in charge of defining questions to ask.

An “unknown” method is something you have yet to define, but are looking to develop through techniques like AI/ML. In fact, AI/ML is ideal for discovering far more precise and sometimes nonobvious methods over larger data sets than traditional analytic methodologies.

You can think of AI/ML as “automation towards advanced method creation” because it helps define more intelligent formulas for your data in a quicker amount of time than manual analysis. Once an AI/ML model is served in production, it becomes a “known” method while some data scientists and machine learning engineers work with analysts and the business to make continuous improvements to production models, and others work on the training and creation of new models.

The historical challenge has always revolved around getting these approaches to data analysis, BI, and AI/ML to work together to accelerate similar or complementary outcomes. While their approaches are unique, their lifecycles can inform one another, accelerating both of their results and ultimately positively impacting the business.

Applying Domain-Driven Design, popularized by Eric Evans in his book, aligns different analytic approaches around common business domains.

Domain Data and the Data Mesh Platform

In her book, Zhamak Dehghani applies domain thinking along with a unique set of principles to data that in turn introduces the paradigm of data mesh, which shows significant promise and creates collaborative environments out of disparate data sources.

The principles and architectures written about by Dehghani can seem insurmountable for many organizations, which is why it’s best to start with incremental adoption of a data mesh platform.

Rather than thinking of a platform as only an asset or only a product (because it is inclusive of those things), you can think of it as a framework that emphasizes and even enforces the desired behaviors (such as Domain-Driven Design and Autonomous Team Delivery) and capabilities (like open interfaces and aggregated data) of data mesh.

The platform exposes a common set of experiences for data discoverability, data lineage, interface management, and analytic tooling so that all work related to platform interaction is continuous and consistent.

By aligning data around domains (the context for applicability and value brought to the business) and making data shareable and discoverable, teams of data consumers can discover, analyze, and measure data-driven value based on a common context. In other words, they share the ecosystem of objects, data, and platform resources, and they share the same domain models, experiences, and ultimately share outcomes.

Shared Experiences and Outcomes

While BI analysts are working away building reports to answer business questions with “known” methods, data scientists are relying heavily on domain knowledge, data processing, and feature engineering that can typically be revealed by BI outputs and workflows.

Defining what is “known” may be the first step in the journey toward discovering the complementary “unknown.” Conversely, once data scientists serve a model into production for, let’s say, transactional predictions, and the business realizes its value, the model becomes a “known” method. The results from those predictions make their way into analytic reports and dashboards informed by business analysis.

The diagram below shows the interrelationship between BI and AI/ML within the context of common Domains A and B. Business dashboards and predictive models are both examples of analytical products that can inform one another even though the questions they answer (or ask) could be quite different. The important takeaways are that they share domains, and more importantly, share knowledge.

Nuvalence-AI-ML-Data-Mesh-1

Figure 1 – Feedback loop between BI and AI/ML.

You can imagine the scenario of an ecommerce website trying to understand more about its users and their online shopping behaviors. Domain A can represent ‘Customer’ and all the information the site knows about them (address, past orders, returns/exchanges). On the other hand, Domain B reflects a merchandise category such as ‘Clothing’ (shirts, jackets, pants).

By applying Domain-Driven Design to data domains, you can begin to share a common, ubiquitous language with all teams involved: BI analysts, data scientists, product managers, software developers, marketing, and more.

In this example, terminology and definitions around purchase history, customer behavior classifications, and seasonal sales projections can become common knowledge. As a result, data scientists conducting feature engineering can more easily extract features from existing dashboards that are visualizing purchasing trends to make predictions for future purchasing, accelerating both BI reporting and ML model training outcomes.

Once domain models are established, you can begin work on incrementally building an ecosystem of services and experiences (a data mesh platform) that reinforces and empowers teams to row in the same direction.

Incremental Adoption of a Data Mesh Platform with AWS

With the emphasis that data mesh puts on interoperable data products, a platform backed by integrated data-driven AWS services allows teams to connect and share data across domains. When it comes to delivering data products that make transformations from applications within a domain, AWS Glue, and AWS container orchestration services create mechanisms for loosely coupled, distributed solutions for serving analytical data to consumers.

To discover data sources, track data lineage, and learn more about data products in the mesh platform ecosystem, AWS Glue Data Catalog serves as a metadata repository and critical node for a mesh framework. You can even populate your Data Catalog by crawling data products through the integration with Amazon Athena.

Complementarily, Amazon API Gateway allows consumers to discover data product interfaces backed by APIs, and it gives mesh platform owners (those accountable for the delivery of the platform experiences to other stakeholders) a means for securing, monitoring, and measuring interfaces.

As cross-domain analytical data is aggregated, you can place events on a data stream through Amazon Kinesis, which allows you to analyze the events, process the data further, or develop additional products that read off the stream.

Once data consumers are prepared to merge data from disparate sources and multiple domains, BI analysts using Amazon QuickSight and data scientists training models with Amazon SageMaker can share integrated data sets stored in Amazon Redshift.

Analytic interoperability with AWS makes it easier for data scientists to extract features from “known” method types and promotes knowledge sharing and analytic method development, reinforcing the feedback loop between analytics teams.

Nuvalence-AI-ML-Data-Mesh-2

Figure 2 – Data Mesh Platform on AWS.

Conclusion

The promises of data mesh pose an exciting future for organizations looking to maximize their value from data.

A self-serve platform is not a single tool, but an ecosystem of capabilities and experiences that can be compiled together with complementary services provided by AWS. When orchestrated together, balancing team autonomy with interoperability, organizations can accelerate toward data-driven decisions with a data mesh.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

.
Nuvalence-APN-Blog-Connect-1
.


Nuvalence – AWS Partner Spotlight

Nuvalence is an AWS Advanced Tier Services Partner and next-generation agency using a product-driven approach to help enterprises define and build digital and data mesh platforms.

Contact Nuvalence | Partner Overview