Amazon DataZone for Chief Data Officers

Amazon DataZone Introduction

Chief data officers use Amazon DataZone to help their business users, central data governance teams, and IT staff participate in the data governance process. Amazon DataZone can simplify interactions between team members and tools.

Key features

Quickly share data with business teams by increasing visibility of and access to data. With Amazon DataZone, data users can go to the data portal, find the data that they need, quickly understand where the data originates, and subscribe to the data feed. All these activities happen in one place. With an approved subscription, data users can bring in the people and tools that they need to work with that data. This simplification reduces manual steps and can shorten the work from weeks to days.
Decentralize data ownership across your organization, which gives teams the autonomy to decide which data they want to share. Business users can enrich that data with additional context so that end users can better understand it. Consequently, business units can control which data is shared, without waiting for IT teams to complete their data projects.
Improve data discovery, data understanding, and data usage by enriching the business data catalog. You can reduce your time-to-insight with intuitive search and discovery, comprehensive business data descriptions and context, and recommended usage for common analytical use cases.

Use cases

Break down silos

Business teams need visibility into data across the business to use it effectively and improve business processes. When petabytes of data are spread across multiple departments, services, on-premises databases, and third-party sources (like partner solutions and public datasets), gaining visibility into all that data can be difficult. Before you can unlock the full value of this data, administrators and data stewards must make it accessible. However, you must maintain control and ensure that the data can be accessed only by the right person and in the right context. With Amazon DataZone, you can empower individual teams to build their own domains and business data catalogs. You can curate your data and use built-in generative artificial intelligence (AI) to help enrich the taxonomy in your business data catalog, which can make data more discoverable and understandable.

Make data-driven decisions

Employees across your company (data consumers) want to discover and analyze information from data producers to drive their decision-making. However, you must also control this access to help ensure that data remains secure. This tension makes it challenging to implement data governance policies that account for the various data, departments, and use cases. With Amazon DataZone, the data consumer finds the information that they need and requests access from the owner. Amazon DataZone can then seamlessly load the data into analytics services. As a result, decision-makers can get the information that they need in a timely manner and base their decisions on the freshest data.

Elevate data discovery and interpretation

Data consumers need detailed descriptions of the business context and documentation about recommended usage to quickly and easily identify the relevant data for their use cases. With AI-generated metadata, they can find more valuable datasets relevant to their use cases and spend less time going back and forth with data producers. When the data is enriched with this metadata, data consumers can understand the data and its relevance for their use case, and avoid using the data for a purpose it was not intended for.

Videos

AWS re:Invent 2023 - Modern data governance customer panel (53:46)
AWS re:Invent 2023 - Best practices for analytics and generative AI on AWS (50:13)
AWS re:Invent 2023 - Build an end-to-end data strategy for analytics and generative AI with Fannie Mae (56:21)

FAQ

How does Amazon DataZone establish a balance between business teams and infrastructure teams?

Amazon DataZone creates a usage flywheel driven by data producers (data engineers and data scientists). Data producers securely share data, along with its context, with others in the organization. Data consumers (analysts) then find answers to business questions in the data and share those answers with others in the organization. This workflow helps customers create a decentralized data ownership and federated governance model for data production and data consumption, where data producers publish, own, and govern their data assets. Data consumers can then access the data that they are interested in after completing the approval workflow with data owners. This helps teams self-serve, which reduces the chance of being bottlenecked by any particular team.
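
For technical readers, the publish, request, and approve workflow described in this answer can be illustrated with a small in-memory model. This is only a conceptual sketch, not the Amazon DataZone API: the names Asset, Catalog, publish, search, request_access, approve, and can_read are hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical data asset: owned and described by its producer."""
    name: str
    owner: str
    description: str  # business context added by the producer
    subscribers: set = field(default_factory=set)


@dataclass
class Catalog:
    """Hypothetical catalog modeling publish, discover, request, approve."""
    assets: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # (asset name, consumer) pairs

    def publish(self, asset):
        # A data producer publishes an asset together with its context.
        self.assets[asset.name] = asset

    def search(self, term):
        # A data consumer discovers assets by name or description.
        term = term.lower()
        return [
            a.name
            for a in self.assets.values()
            if term in a.name.lower() or term in a.description.lower()
        ]

    def request_access(self, asset_name, consumer):
        # A data consumer asks the data owner for a subscription.
        self.pending.append((asset_name, consumer))

    def approve(self, asset_name, consumer, approver):
        # Only the owner of the asset may approve a pending request.
        asset = self.assets[asset_name]
        if approver != asset.owner:
            raise PermissionError("only the data owner can approve access")
        self.pending.remove((asset_name, consumer))
        asset.subscribers.add(consumer)

    def can_read(self, asset_name, user):
        # Access is limited to the owner and approved subscribers.
        asset = self.assets[asset_name]
        return user == asset.owner or user in asset.subscribers
```

In this model, the producing team keeps ownership of its asset and is the only party that can grant access. A consumer can find the asset through search but cannot read it until the owner approves the request, and no central team sits between the two.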