AWS for Industries
How Data Mesh Technology Can Help Transform Risk, Finance, and Treasury Functions in Banks
In our last blog, How Cloud-based Data Mesh Technology Can Enable Financial Regulatory Data Collection, we set out an approach whereby financial institutions can share data with regulators in a manner that enables ongoing operational flexibility and the evolution of regulators’ data requirements. By building a regulatory data mesh from AWS Data Exchange nodes, participants avoid being forced into ongoing lock-step data schema consistency. Rather, each participant can adopt changes independently and incrementally; for banks and regulators, this means ongoing changes to reporting requirements.
Within a bank, the flow of accurate data (such as from trading desks, business units, or operating subsidiaries to risk, finance, and treasury) is critical to effective decision making, pricing, and the optimization of scarce resources such as capital, funding, and liquidity. This data not only provides the basis for internal business decision making, but also external regulatory reporting. Yet the efficient consolidation of accounting and business data from across trading, banking, and leasing asset classes remains an ongoing challenge.
Inefficient data flow can have a material impact on an organization’s cost base. A recent study highlighted that knowledge workers expend approximately 40% of their efforts on searching for and gathering data. The Bank of England has highlighted that 57% of regulatory reporting resources are associated with process flow, due to the highly manual processes that banks have implemented. McKinsey estimated that UK banks spend between GBP 2 billion and GBP 4.5 billion annually to meet these undifferentiated mandatory reporting requirements.
In this blog, we highlight how the same data mesh concepts are equally capable of improving the flow of data inside banks (and other financial institutions). The flows we focus on here are those between subsidiaries, business units, or individual trading desks on one side, and corporate control functions such as risk, finance, and treasury (RFT) on the other. Without robust, scalable, and flexible mechanisms to preserve context, consistency, quality, lineage, governance, and ownership, trust in data is all too frequently derived from those responsible for collecting and preparing it, rather than from the data itself. This drives unnecessary cost in RFT functions, caused by process complexity and often by a greater analytical focus on what the data is rather than on what insight can be derived from it.
The Importance of Data Boundaries
In addressing these issues, AWS customers have rediscovered the innate relationship between data boundaries and organizational structure.
In the most general sense, a “boundary” separates the internal functions and components of an entity from the environment within which the entity is embedded: for example, a cell wall provides the boundary between the cell’s internal structure and the external body. The concepts of “high cohesion” and “loose coupling” always accompany the concept of boundary. High cohesion is achieved when all components and mechanisms that the entity requires to function as a standalone unit are situated within the entity’s boundary. Loose coupling is achieved when the boundary shields external parties from the entity’s internal implementation, exposing only the services earmarked for external use.
Adhering to these concepts, AWS Well-Architected best practices suggest that business units should be mapped as distinct bounded entities to the underlying cloud infrastructure (for example, AWS landing zones or AWS accounts). See Figure 1.
Figure 1: Mapping Bank XYZ’s corporate structure to AWS
Taking banking as the example, Figure 1 shows the organizational structure that is made explicit when onboarding to AWS, with each business unit mapped to its own bounded entity. The benefits of doing this include:
- Security controls – Applications in different business units might have different security profiles, requiring different control policies and mechanisms around them.
- Isolation – An account is a unit of security protection. Potential risks and security threats should be contained within an account without affecting others. Different security needs, whether due to multiple teams or a different security profile, might require you to isolate one account from another.
- Data isolation – Isolating data stores to an account limits the number of people that can access and manage that data store.
- Many teams – Different teams have different responsibilities and resource needs. They should not overstep one another in the same account.
- Business process – Different business units or products might have different purposes and processes. Establish different accounts to serve business-specific needs.
For example, in its AWS re:Invent 2020 presentation “Nationwide’s journey to a governed data lake on AWS,” Nationwide shows data processing and cataloging aligned to its business units, with a centralized data discovery service providing discovery of these federated data sources.
Without a suitable mechanism, the data boundaries between the business units remain implicit and do not provide sufficient structure, on their own, to enhance trust in the data’s flow between its producers and consumers. The inclusion of AWS Data Exchange crystallizes these data boundaries, thereby creating an intra-organizational data mesh that helps to address this issue.
Leveraging AWS Data Exchange mechanisms, each business unit (data producer) may publish data when it deems the data ready, albeit within an agreed reporting schedule, maintaining a high level of internal cohesion. When notified of a published change by AWS Data Exchange, each data consumer may pull published data as required; this can be done without coordination with the data producers (they are said to be loosely coupled). As each published AWS Data Exchange dataset is self-describing (see AWS Data Exchange data templates), lock-step coordination of schema changes across the population of data producers and data consumers is not required. Rather, each data consumer’s ETL pipeline can determine the schema of each dataset consumed and adapt it to the consumer’s specific requirements. This process is further simplified via integration between AWS Data Exchange and AWS Glue DataBrew tooling.
AWS Data Exchange is fully integrated with AWS Identity and Access Management (IAM), so it has the governance and security tooling available to provide fine-grained controls over who has access to data (on both producer and consumer sides) and who can modify it. Automated audit trails generated by AWS CloudTrail may also be used to further improve process transparency. Also, as the data publishers are decoupled from each other and from the data consumers, they may consume, clean, and curate data using whatever processes and technologies they prefer; the only requirement is that they publish via AWS Data Exchange.
From a business perspective, the benefits of an intra-organizational data mesh can be summarized as:
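As a minimal sketch of the consumer-side adaptation described above, the Python below reads a self-describing CSV extract (its header row carries the schema) and projects it onto the fields a consumer needs. The column names, the FIELD_ALIASES mapping, and the adapt_rows helper are illustrative assumptions, not part of AWS Data Exchange or AWS Glue DataBrew; in practice the extract would first be exported from a published revision.

```python
import csv
import io

# Hypothetical mapping: the consumer's field name -> column names that
# different producers might use for the same fact.
FIELD_ALIASES = {
    "desk": ["desk", "trading_desk"],
    "currency": ["currency", "ccy"],
    "notional": ["notional", "notional_amount"],
}


def adapt_rows(csv_text):
    """Discover the producer's schema from the header row, then project
    each record onto the consumer's own field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    published = reader.fieldnames or []

    # Resolve each consumer field to whichever alias this producer published.
    resolved = {}
    for target, aliases in FIELD_ALIASES.items():
        match = next((name for name in aliases if name in published), None)
        if match is None:
            raise ValueError(f"no column found for required field '{target}'")
        resolved[target] = match

    # Extra producer columns (such as 'book' below) are simply ignored.
    return [{target: row[source] for target, source in resolved.items()}
            for row in reader]


# Two producers publish the same facts under different schemas.
desk_a = "trading_desk,ccy,notional_amount\nrates,GBP,1000000\n"
desk_b = "desk,currency,notional,book\nfx,USD,250000,B1\n"

rows = adapt_rows(desk_a) + adapt_rows(desk_b)
```

Neither producer had to agree a schema with the other or with the consumer; the consumer fails loudly only when a field it genuinely requires is absent.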
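To make the idea of fine-grained access concrete, the sketch below builds a read-only IAM policy document for one data consumer. It is illustrative only: the read_only_policy helper and the statement ID are hypothetical, the ARN is a placeholder supplied by the caller, and the action names should be checked against the current AWS Data Exchange authorization reference before use.

```python
import json


def read_only_policy(data_set_arn):
    """Return an IAM policy document (as a dict) that lets a consumer read
    one data set and its revisions, but not modify or publish anything."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ConsumerReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": [
                    "dataexchange:GetDataSet",
                    "dataexchange:ListDataSetRevisions",
                    "dataexchange:GetRevision",
                    "dataexchange:ListRevisionAssets",
                    "dataexchange:GetAsset",
                ],
                # The data set itself plus everything nested beneath it.
                "Resource": [data_set_arn, f"{data_set_arn}/*"],
            }
        ],
    }


# Placeholder ARN: replace with the ARN of the published data set.
EXAMPLE_ARN = "arn:aws:dataexchange:<region>:<account-id>:data-sets/<data-set-id>"

policy = read_only_policy(EXAMPLE_ARN)
policy_json = json.dumps(policy, indent=2)
```

A producer-side policy would follow the same shape with a different action list, which keeps the split between who can read and who can modify explicit.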
- Each business unit (operating unit, subsidiary, trading desk, finance, risk, treasury) is an independent autonomous data publisher and/or consumer.
- Each data publisher is responsible for the consistency and quality of the datasets they publish.
- The action of publishing a dataset is an explicit decision taken by the owner; and each published dataset has a version.
- If published datasets subsequently must be updated, then they are updated and re-published with a new version number.
- The published datasets have explicit owners, and data changes are explicitly communicated via the change in version: in other words, the data consumer can trust the data.
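The publishing model in the list above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the AWS Data Exchange API: the PublishedDataset, Revision, and Consumer classes and the sample figures are hypothetical, chosen only to show explicit ownership, versioned re-publication, and consumers pulling on change.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    version: int
    owner: str
    records: tuple  # stored as a tuple so a published revision is not edited


@dataclass
class PublishedDataset:
    name: str
    owner: str
    revisions: List[Revision] = field(default_factory=list)

    def publish(self, records) -> Revision:
        """Publishing is an explicit act by the owner; corrections are
        re-published as a new version rather than edited in place."""
        revision = Revision(len(self.revisions) + 1, self.owner, tuple(records))
        self.revisions.append(revision)
        return revision

    def latest(self) -> Revision:
        return self.revisions[-1]


class Consumer:
    def __init__(self):
        self.seen: Dict[str, int] = {}  # dataset name -> last version pulled

    def pull_if_changed(self, dataset: PublishedDataset) -> Optional[Revision]:
        """Return the latest revision only if its version is new to us."""
        latest = dataset.latest()
        if self.seen.get(dataset.name) == latest.version:
            return None
        self.seen[dataset.name] = latest.version
        return latest


liquidity = PublishedDataset("liquidity-positions", owner="treasury")
liquidity.publish([("GBP", 120)])

risk = Consumer()
first = risk.pull_if_changed(liquidity)   # version 1
again = risk.pull_if_changed(liquidity)   # nothing new, so None
liquidity.publish([("GBP", 125)])         # correction re-published as version 2
second = risk.pull_if_changed(liquidity)
```

The version number is the only thing the consumer needs to track, which is what allows producers and consumers to work to their own schedules.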
While no upfront coordination is required by business units participating in the intra-organizational data mesh, over time the data mesh will, via the removal of unnecessary data schema diversity, evolve towards an optimal set of data schemas for the organization. Crucially, though, at any point in time the data mesh can accommodate a new participant with divergent data, such as data systems from a newly acquired business.
Operationalizing Data, Delivering Business Benefits
Data mesh-based architectures simplify the flow of data between different data producers and consumers and provide the basis for more agile and adaptive organizations. The approach results in business data that:
- Has a clear context
- Has an identifiable owner
- Is consistent for all participating parties
- Has a clear data lineage at the structural level of the data mesh
The commonality of approach allows data producers to embed values, or metadata, at source, which enables data consumers to understand the data and transform it to meet their requirements using business rules. This can be as simple as providing the regulatory definitions and references linked to calculated ratios. It is these rules that can be adapted and changed to meet new business requirements, thereby operationalizing the data at source. Further value can be added to source data from the business by defining and embedding additional standards, such as BCBS 239 and SOX regulatory reporting requirements, into the schema, so that downstream compliance and reporting become a part of the data flow rather than an addition to it.
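One way to picture metadata embedded at source is a published record that carries its own definition and reference alongside the values, so a consumer-side business rule can be applied and changed without going back to the producer. The sketch below is hypothetical throughout: the field names, the ratio, the figures, and the minimum threshold are illustrative, and the reference is a placeholder rather than a real citation.

```python
# A published record: values plus metadata embedded by the producer.
published_record = {
    "values": {"liquid_assets": 150.0, "net_outflows": 120.0},
    "metadata": {
        "owner": "treasury",
        "ratio_name": "coverage_ratio",
        "definition": "liquid_assets / net_outflows",
        "reference": "<regulatory reference supplied by the producer>",
    },
}


def apply_rule(record, minimum):
    """Consumer-side business rule: compute the ratio the metadata defines
    and compare it with a threshold that the consumer controls."""
    values = record["values"]
    meta = record["metadata"]
    ratio = values["liquid_assets"] / values["net_outflows"]
    return {
        "ratio_name": meta["ratio_name"],
        "ratio": ratio,
        "meets_minimum": ratio >= minimum,
        # Context travels with the result, so lineage is not lost downstream.
        "definition": meta["definition"],
        "reference": meta["reference"],
        "owner": meta["owner"],
    }


# Changing the rule means changing one argument, not re-piping the data.
result = apply_rule(published_record, minimum=1.0)
stricter = apply_rule(published_record, minimum=1.5)
```

The same record serves both rules, which is the sense in which the rules, rather than the data, are what adapt to new requirements.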
An Incremental Approach that Reduces Implementation Risk
The multiple consolidation points across risk, finance, and treasury have traditionally led to significant effort spent on re-piping data from product and calculation systems into a single data warehouse or data lake. This frequently leads to difficult decisions about whether to replace or replicate existing product or line-of-business data warehouses, and frequently requires significant capital expenditure on infrastructure with extended implementation and payback horizons.
Building these capabilities as data meshes reduces the “all or nothing” implementation risk associated with traditional program delivery methods that focus on re-piping all data flows at once. Instead, cloud-based technology enables trusted data flows to be built on a use-case-by-use-case basis, with technology and business teams working in partnership to redefine the processes they support and iterating during delivery to refine and adapt to business needs as required. This flexible and more agile implementation approach enables financial institutions to more reliably retire legacy technology and deliver the investment business case that underpins a cloud migration.
The alignment of organizational boundaries with data boundaries and data flow, we suggest, minimizes organizational complexity and maximizes efficiency. This minimum complexity / maximum efficiency principle is directly related to Conway’s Law, which states “organizations design systems that mirror their own communication structure and hierarchy.”
Cloud-based technologies are helping banking customers look differently at many of the operational challenges that they face, and at the same time provide them with the tools to become more data-centric organizations. The business value comes from the application of the technology to the processes and activities they support.
Data meshes, in this example, can enable a change in the role that RFT professionals can play in their organizations. In the same way that AWS customers use the cloud to remove undifferentiated heavy lifting of technology, data mesh solutions can also enable highly experienced and qualified finance professionals to spend more time on analytical activities to understand and optimize cost, capital, liquidity, and funding usage. This in turn helps those organizations to do more with the limited resources that they have available.
To learn more about how the cloud can help to enable improvements in risk, finance, and treasury data flow, contact:
Financial Services Specialist, London
Principal Solutions Architect, London
Worldwide Banking Specialist, New York