AWS News Blog

New generative AI capabilities for Amazon DataZone further simplify data cataloging and discovery (preview)

Voiced by Polly

Today, we are announcing a preview of an automation feature backed by generative artificial intelligence (AI) for Amazon DataZone that will dramatically decrease the amount of time needed to provide context for organizational data. The new feature can automate the traditionally labor-intensive process of data cataloging. Powered by the large language models (LLMs) of Amazon Bedrock, it generates detailed descriptions of data assets and their schemas, and suggests analytical use cases. You can generate a comprehensive business context with a single click.

We heard from customers that data consumers such as data analysts, scientists, and engineers in organizations struggle to understand the data’s relevance with little metadata. As a result, they either spend more time interpreting the data, or they return to data producers with continued questions. So, data producers such as data owners, engineers, and analysts who own the data and make it available for consumers need to manually enter detailed context for higher-priority data to make data shareable and discoverable. This is time-consuming and the number one problem customers have when trying to collate their data in a system for self-service by consumers.

When we launched the general availability of Amazon DataZone in October 2023, we introduced the first feature that brings generative AI capabilities to automate the generation of the table name and column names of a business catalog asset. In the data portal of Amazon DataZone, the green brain icon indicates automatically generated metadata suggestions. You could accept, edit, or reject each suggestion recommended by Amazon DataZone.

What’s new with today’s preview announcement?
Now, in addition to column and table names, you can automatically generate more detailed descriptions of the table and schema, as well as suggested uses.

In the Business Metadata tab in the data portal, when you choose Generate summary, new content will be generated to explain the table and its metadata.

You can also accept, edit, and reject this recommendation.

When you choose the Schema tab, you can also see new Description recommendations as well as the Name. You can review generated metadata and choose to accept, edit, or reject the recommendation.

This new feature will enhance data discoverability and reduce on back-and-forth communications between data consumers and producers. You will have a richer search experience based on extensive data insights in the future.

Join the preview
The new metadata generation ability is now previewed in the AWS US East (N. Virginia) and US West (Oregon) Regions. With this new generative AI capability, you can reduce time-to-insight by accelerating data cataloging and boosting data discovery. To learn more, visit the Amazon DataZone: Automate Data Discovery.

Give it a try and send feedback to AWS re:Post for Amazon DataZone or through your usual AWS Support contacts.

Channy

Channy Yun

Channy Yun

Channy Yun is a Principal Developer Advocate for AWS, and passionate about helping developers to build modern applications on latest AWS services. A pragmatic developer and blogger at heart, he loves community-driven learning and sharing of technology, which has funneled developers to global AWS Usergroups. His main topics are open-source, container, storage, network & security, and IoT. Follow him on Twitter at @channyun.