Q: What are the main components of Amazon DataZone?
Amazon DataZone includes four main components:
- Organization-wide catalog: Make data visible with business context for everyone to find and understand data quickly. Catalog data across the organization so you can find and request access to data for analysis.
- Publish/subscribe workflow with access management: Use the automated workflow to help secure data between producers and consumers and ensure the right data is accessed by the right users for the right purpose. Streamline auditing of who is using which datasets for what business use case, with the publishing and subscribing workflow.
- Data projects: Simplify access to AWS analytics by creating business use-case-based groupings of users, data assets, and analytics tools. Amazon DataZone projects provide a shared space where project members can collaborate, exchange data, and share artifacts. Only users who are explicitly added to a project can access its data and analytics tools. Individual projects manage the ownership of data assets produced within the project, in accordance with policies applied by data stewards, thus decentralizing data ownership through federated governance.
- Portal (outside the AWS Management Console): The Amazon DataZone portal is an integrated data experience with a personalized homepage that helps users explore data and drive innovation. The portal is an out-of-console experience that facilitates cross-functional collaboration while working with data and analytics tools in a self-service fashion. It authenticates users with existing credentials from your identity provider.
Q: What kind of catalog is the Amazon DataZone catalog?
Amazon DataZone introduces a business metadata catalog. Business metadata provides information authored or used by business people and gives context to organizational data. This could include information such as:
- Ownership: Modern data-centric organizations employ a distributed data stewardship process where lines of business (LOBs) are responsible for managing their own data. A catalog tracks that ownership so interested parties can find and request access to data as part of their business tasks.
- Classification: Data discovery is a key task that business metadata can support. Data discovery uses centrally defined corporate ontologies and taxonomies to classify data sources and allows you to find relevant data objects.
- Relationships: You can use the Amazon DataZone data catalog to add relationship information as metadata. As with a technical dataset schema, the business metadata catalog shows relationships between objects in the catalog, such as those between databases, datasets, and their columns.
Q: What are Amazon DataZone domains?
With domains, you can more securely organize resources around business-driven units such as lines of business. A domain is a collection of Amazon DataZone objects, such as data assets, projects, associated AWS accounts, and data sources. Domains are a scalable container for you, your team, and related Amazon DataZone entities, including data assets and analytics tools such as the Amazon Athena and Amazon Redshift query editors. You can publish a data asset in the catalog to a particular domain that governs the data, and then control which associated AWS accounts, resources, and consumers can access that domain. Domains provide a mechanism to instill organizational discipline for teams that produce and catalog data in the business data catalog. A domain can have multiple business use-case-driven projects in which people collaborate.
Q: How does Amazon DataZone support and integrate with other AWS services?
Amazon DataZone supports three types of integrations with other AWS services:
- Producer data sources: You can publish data assets to the Amazon DataZone catalog from the data stored in AWS Glue Data Catalog and Amazon Redshift tables and views. You can also manually publish Amazon S3 paths and objects (for example, pictures and directories) to the Amazon DataZone catalog.
- Consumer tools: You can use Amazon Athena or Amazon Redshift Query Editor v2 to access and analyze your data assets.
- Access control and grants: Amazon DataZone supports granting access to AWS Lake Formation managed AWS Glue tables and Amazon Redshift tables and views. For all data assets, Amazon DataZone publishes standard events related to your actions (for example, approval given to a subscription request) to Amazon EventBridge. If Amazon DataZone does not support access management for a specific data asset, you can use these standard events to grant access (for example, IAM-managed Glue tables and Amazon S3 paths). You can also integrate with other AWS services or third-party solutions for custom integrations with these standard events.
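As a rough illustration of the custom-grant path described above, the following Python sketch shows how a handler invoked by an Amazon EventBridge rule might recognize a subscription approval event. The `source`, `detail-type`, and `detail` keys are part of the standard EventBridge event envelope. The specific values `aws.datazone` and `Subscription Request Accepted`, along with the function names, are assumptions made for this example, so confirm them against the events your account actually receives. The grant step itself is left as a placeholder.

```python
from typing import Any, Dict, Optional

# Assumed values for illustration; verify against real events in your account.
DATAZONE_SOURCE = "aws.datazone"
APPROVAL_DETAIL_TYPE = "Subscription Request Accepted"


def extract_approved_subscription(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the event detail if this is an approval event, else None."""
    if event.get("source") != DATAZONE_SOURCE:
        return None
    if event.get("detail-type") != APPROVAL_DETAIL_TYPE:
        return None
    return event.get("detail", {})


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Hypothetical AWS Lambda entry point targeted by an EventBridge rule."""
    detail = extract_approved_subscription(event)
    if detail is None:
        # Not an approval event: nothing to grant.
        return {"granted": False}
    # Placeholder: run your own grant logic here, for example updating
    # permissions on an IAM-managed Glue table or an Amazon S3 path.
    return {"granted": True, "detail": detail}
```

Keeping the event filter in a separate pure function makes it possible to test the matching logic locally with sample events before wiring it to any real grant mechanism.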
Q: What is the Amazon DataZone portal?
Amazon DataZone gives data analysts a unified data management portal to discover, access, prepare, analyze, and report on data across the organization. The portal allows them to easily collaborate with data engineers and IT admins to get insights from their data. Amazon DataZone allows users to consume data assets that are in the business metadata catalog from the Amazon Redshift and Athena query editors. You consume the data assets in a web-based application, which removes the need to log in to the AWS console for users who prefer an out-of-console experience.
Q: Which Regions are supported for preview?
For Amazon DataZone preview, the root domain can be provisioned in the AWS Regions of US East (N. Virginia), US West (Oregon), or Europe (Ireland). AWS IAM Identity Center, the successor to AWS Single Sign-On, must be configured in the same AWS Region as the root domain. You can publish data from any of these Regions to the Amazon DataZone catalog. Users can subscribe to the data and consume it within the same Region as data in AWS analytics services such as Amazon Redshift and Athena.
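The two Region constraints above can be expressed as a small pre-flight check. This is only a sketch: the function name `check_preview_setup` and its parameters are hypothetical, and the Region codes are the standard codes for US East (N. Virginia), US West (Oregon), and Europe (Ireland).

```python
from typing import List

# Region codes for the three preview Regions listed above.
PREVIEW_REGIONS = {"us-east-1", "us-west-2", "eu-west-1"}


def check_preview_setup(root_domain_region: str, identity_center_region: str) -> List[str]:
    """Return a list of problems; an empty list means both constraints hold."""
    problems = []
    if root_domain_region not in PREVIEW_REGIONS:
        problems.append(
            f"Root domain Region {root_domain_region} is not a preview Region."
        )
    if identity_center_region != root_domain_region:
        problems.append(
            "IAM Identity Center must be configured in the same Region as the root domain."
        )
    return problems
```

For example, `check_preview_setup("us-east-1", "us-east-1")` returns an empty list, while a mismatch between the two Regions returns one message describing the problem.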