AWS Big Data Blog
Enhanced data discovery in Amazon SageMaker Catalog with custom metadata forms and rich text documentation
Amazon SageMaker Catalog now supports custom metadata forms and rich text descriptions at the column level, extending existing curation capabilities for business names, descriptions, and glossary term classifications.
With these new features, data stewards can define and capture business-specific metadata directly in individual columns, and authors can use markdown-enabled rich text to provide detailed documentation and business context. Both form fields and formatted descriptions are indexed in real time, making them immediately discoverable through catalog search.
Column-level context is essential for understanding and trusting data. This release helps organizations improve data discoverability, collaboration, and governance by letting metadata stewards document columns using structured and formatted information that aligns with internal standards.
In this post, we show how to enhance data discovery in SageMaker Catalog with custom metadata forms and rich text documentation at the column level.
Key capabilities
SageMaker Catalog now offers the following key capabilities:
- Custom metadata forms – Data stewards can now use custom metadata forms to capture organization-specific metadata fields for columns, such as Business Owner, Regulatory Classification, Units of Measure, or Approved Use Case. Each field is stored as a key-value pair and indexed for search, enabling business-level queries like “find columns where sensitivity = confidential.”
- Rich text (markdown) descriptions – Each column supports a markdown-enabled description field. Authors can format text with headings, bullet lists, and hyperlinks to add deeper business or operational context, for example, logic definitions, sample values, or data lineage references.
- Real-time indexing for search – Custom form values and rich text content are indexed as soon as they are saved. Users can search using a metadata value, keyword, or glossary term across columns.
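To make the key-value model concrete, the following is a small conceptual sketch in Python. It is not how SageMaker Catalog is implemented; the class `ColumnIndex`, its methods, and the `region` column are hypothetical and exist only to illustrate how form values and description text could be indexed on save and then queried.

```python
from collections import defaultdict


class ColumnIndex:
    """Toy in-memory index of column metadata (illustrative only)."""

    def __init__(self):
        # (field name, value) -> set of "asset.column" identifiers
        self._by_field = defaultdict(set)
        # lowercase keyword -> set of "asset.column" identifiers
        self._by_word = defaultdict(set)

    def save(self, column, form_values, description=""):
        """Index form values and description text at save time."""
        for field, value in form_values.items():
            self._by_field[(field.lower(), value.lower())].add(column)
        for word in description.lower().split():
            self._by_word[word.strip(".,:;()#*-")].add(column)

    def find(self, field, value):
        """Exact-match lookup, such as Domain = RF (case-insensitive)."""
        return sorted(self._by_field.get((field.lower(), value.lower()), set()))

    def search(self, keyword):
        """Single-keyword lookup over description text."""
        return sorted(self._by_word.get(keyword.lower(), set()))


index = ColumnIndex()
index.save(
    "business_finance.revenue",
    {"Domain": "RF", "Business Owner": "Finance Office"},
    description="Recognized revenue in USD.",  # placeholder description
)
# Hypothetical second column, added only to show that filtering excludes it.
index.save("business_finance.region", {"Domain": "Sales"})

print(index.find("domain", "rf"))
print(index.search("revenue"))
```

Because values are indexed inside `save`, a lookup issued right after saving already sees the new metadata, which mirrors the behavior described above at a much smaller scale.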
Solution overview
For this post, we explore a financial services use case. Our example financial services organization defines a column metadata form that includes several fields, as illustrated in the following table.
| Field | Example Value |
| --- | --- |
| Approved Use Case | Financial revenue modeling |
| Business Owner | Finance Office |
| Domain | RF |
For a dataset column named revenue, the author adds the following markdown description:
When analysts search for Domain = RF, this column appears in results with complete business context.
In the following sections, we demonstrate how to use metadata forms for columns and add rich text descriptions that are searchable.
Prerequisites
To test this solution, you should have an Amazon SageMaker Unified Studio domain set up with domain owner or domain unit owner privileges. You should also have an existing project to publish assets and catalog assets. For instructions to create these assets, see the Getting started guide.
In this example, we created a project named financial_analysis and a test table. To create a similar table, see Get started with Amazon S3 Tables in Amazon SageMaker Unified Studio. To ingest the sample data to SageMaker Catalog and generate business metadata, see Create an Amazon SageMaker Unified Studio data source for Amazon Redshift in the project catalog.
Create new metadata form
Complete the following steps to create a new metadata form:
- In SageMaker Unified Studio, go to your project.
- Under Project catalog in the navigation pane, choose Metadata entities.
- Choose Create metadata form.

- Provide an optional display name, a technical name, and an optional description, then choose Create metadata form.

- Define the form fields. In this example, we add the fields Domain, Business Owner, and Approved Use Case.
- For Requirement Options, select the configuration for each field. For our use case, we select Always required.
- Choose Create field.

- Turn on Enabled so the form is visible and can be used for assets.
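If you prefer to script form creation, the following sketch builds a request for the Amazon DataZone `CreateFormType` operation. This rests on assumptions: that the metadata form in your domain can be created through this operation, and that a form created this way behaves like one created in the console for column-level use, which this post does not confirm. The form name `ColumnBusinessContext` and the identifiers are placeholders. The sketch only builds the request, and the actual call is left commented out.

```python
def smithy_model(form_name, fields):
    """Render a Smithy structure with one String member per field.

    `fields` maps a member name to True when the field is always required.
    """
    lines = [f"structure {form_name} {{"]
    for name, required in fields.items():
        if required:
            lines.append("    @required")
        lines.append(f"    {name}: String")
    lines.append("}")
    return "\n".join(lines)


def create_form_type_request(domain_id, project_id, form_name, fields):
    """Build keyword arguments for the DataZone CreateFormType operation."""
    return {
        "domainIdentifier": domain_id,
        "owningProjectIdentifier": project_id,
        "name": form_name,
        "model": {"smithy": smithy_model(form_name, fields)},
        "status": "ENABLED",  # corresponds to turning on Enabled in the console
    }


request = create_form_type_request(
    domain_id="dzd_example",            # placeholder domain ID
    project_id="project_example",       # placeholder project ID
    form_name="ColumnBusinessContext",  # hypothetical technical name
    fields={"Domain": True, "BusinessOwner": True, "ApprovedUseCase": True},
)
print(request["model"]["smithy"])

# To send the request (requires AWS credentials and permissions):
# import boto3
# boto3.client("datazone").create_form_type(**request)
```

Member names in the Smithy model cannot contain spaces, so the sketch uses `BusinessOwner` and `ApprovedUseCase` as technical names for the fields shown in the console.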

Attach metadata form to column
Complete the following steps to attach the metadata form to a column:
- Under Project catalog in the navigation pane, choose Assets.
- Search for and select your asset (for this example, we use the asset business_finance).

- On the Schema tab, choose View/Edit next to the revenue field.

- Choose Add metadata form.

- Choose the form you created and choose Add.

- Add details for the metadata form fields.
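When you prepare form values in bulk before entering them, a simple pre-check can catch fields marked Always required that are still empty. The following is a minimal sketch; the helper `missing_required` and the dictionary layout are our own illustration and not part of any SageMaker API.

```python
def missing_required(form_fields, values):
    """Return required field names that have no non-empty value."""
    return [
        name
        for name, required in form_fields.items()
        if required and not str(values.get(name) or "").strip()
    ]


# Field name -> whether it is configured as Always required.
FORM_FIELDS = {"Domain": True, "Business Owner": True, "Approved Use Case": True}

complete = {
    "Domain": "RF",
    "Business Owner": "Finance Office",
    "Approved Use Case": "Financial revenue modeling",
}
partial = {"Domain": "RF", "Business Owner": "  "}  # whitespace counts as empty

print(missing_required(FORM_FIELDS, complete))  # []
print(missing_required(FORM_FIELDS, partial))   # ['Business Owner', 'Approved Use Case']
```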

Add additional context as formatted text
Next, we enter a rich text description for each column using the markdown editor, including headings, bullet lists, links, and sample values. Complete the following steps:
- Choose Edit next to README for the revenue field where you added the metadata form.

- Enter details and choose Save.

- Choose Preview to view the formatted README at the column level.
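To keep column READMEs consistent across many columns, you could generate the markdown from structured inputs and paste it into the editor. The following sketch shows one possible layout with a heading, a bullet list, and links. The function `column_readme`, the section titles, the sample values, and the URL are all placeholders rather than content from the original example.

```python
def column_readme(title, definition, sample_values, links):
    """Assemble a markdown description with a heading, bullets, and links."""
    lines = [f"## {title}", "", definition, "", "### Sample values", ""]
    lines += [f"- `{value}`" for value in sample_values]
    if links:
        lines += ["", "### References", ""]
        lines += [f"- [{text}]({url})" for text, url in links]
    return "\n".join(lines)


readme = column_readme(
    title="revenue",
    definition="Placeholder business definition for the revenue column.",
    sample_values=["1250.00", "98000.50"],  # placeholder values
    links=[("Finance glossary", "https://example.com/glossary")],  # placeholder URL
)
print(readme)
```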

Publish and verify search
Now you’re ready to publish the asset. The metadata form values and markdown descriptions become part of the catalog record and are indexed for search. You can also see the history of revisions on the History tab. Other project users can see the metadata form and rich text description for the published assets and subscribe to the data asset. You can create more data products with these assets, and they will also have the column metadata form and README.

In the catalog search UI, data users can now filter on custom form fields (for example, “Domain = RF”) or search in natural language for text that matches the column description.

Best practices
Consider the following best practices when using this feature:
- Define metadata forms aligned with your business vocabulary (domains, owners, sensitivity levels) proactively before publishing assets at scale.
- Make column descriptions actionable—include business definitions, value ranges, logic, update cadence, and dependencies.
- Verify the catalog indexing is timely; publish changes proactively so search results reflect new metadata.
- Use governance controls. You can combine column-level metadata with existing asset-level templates and approval workflows to enforce publishing standards.
- Monitor search usage and metadata completeness; target high-value datasets for complete column-level documentation first.
- Do not store confidential or sensitive information in your metadata forms.
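To track metadata completeness, you could run a small report over column metadata that you have exported from the catalog. The following sketch assumes a hypothetical export format (a list of dictionaries with `name`, `description`, and `form` keys); the post does not describe an export mechanism, so adapt the input shape to however you retrieve the data.

```python
def incomplete_columns(columns, required_fields):
    """Return names of columns missing a description or any required form field."""
    incomplete = []
    for column in columns:
        has_description = bool(column.get("description", "").strip())
        form = column.get("form", {})
        has_fields = all(str(form.get(f) or "").strip() for f in required_fields)
        if not (has_description and has_fields):
            incomplete.append(column["name"])
    return incomplete


REQUIRED = ["Domain", "Business Owner", "Approved Use Case"]

# Hypothetical export of column metadata for one asset.
columns = [
    {
        "name": "revenue",
        "description": "## revenue\nPlaceholder business definition.",
        "form": {
            "Domain": "RF",
            "Business Owner": "Finance Office",
            "Approved Use Case": "Financial revenue modeling",
        },
    },
    {"name": "region", "description": "", "form": {"Domain": "RF"}},
]

todo = incomplete_columns(columns, REQUIRED)
coverage = 1 - len(todo) / len(columns)
print(f"{coverage:.0%} of columns fully documented; still to do: {todo}")
```

A report like this can help you prioritize high-value datasets first, in line with the best practices above.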
Conclusion
With column-level metadata forms and rich text descriptions, SageMaker Catalog helps organizations deliver higher-quality metadata, stronger governance, and better data discovery. These features make it straightforward for teams to capture complete business context and for analysts to quickly locate and understand the data they need.
Custom metadata forms and rich text descriptions at the column level are now available in AWS Regions where SageMaker is supported.
To learn more about SageMaker, see the Amazon SageMaker User Guide. To get started with this capability, refer to the user guide.