AWS Big Data Blog
Enforce business glossary classification rules in Amazon SageMaker Catalog
Organizations are scaling their data catalogs faster than ever. Maintaining consistent metadata standards across teams remains a challenge. Business glossaries define the language of the enterprise—terms like Customer Profile, Transaction, or Confidential Data—but assets are often published without these classifications, leading to inconsistent metadata and poor discoverability.
To address this, Amazon SageMaker Catalog now supports metadata enforcement rules for glossary terms classification (tagging) at the asset level. With this capability, administrators can require that assets include specific business terms or classifications. Data producers must apply required glossary terms or classifications before an asset can be published. This enforces metadata consistency across the catalog and makes sure assets carry the business context needed for effective discovery and governance.
This capability builds on existing metadata rule features for enforcing required metadata fields during asset publishing. The new addition extends those rules to cover glossary term validation, strengthening the link between business language and technical data assets.
In this post, we show how to enforce business glossary classification rules in SageMaker Catalog.
Why metadata enforcement matters
A common governance challenge is the lack of standardized tagging and classification for assets entering enterprise catalogs. Without enforcement, data producers might publish assets missing required business terms (such as data sensitivity level or product domain), resulting in inconsistent metadata that confuses business users, unreliable search and filtering results, and manual cleanup and downstream compliance risks.
By automatically validating metadata at publish time, SageMaker Catalog offers the following key benefits:
- Assets are classified with approved business terms before publication
- Validation supports compliance with internal glossary and classification standards
- Consistent tagging enhances search accuracy and reduces noise
- Incomplete or incorrectly tagged assets don’t reach consumers
How metadata enforcement works
On the Amazon SageMaker Unified Studio console, administrators navigate to Catalog, Governance, Rules and create metadata rules targeting the asset publishing workflow. Rules can specify required glossary terms or classification fields (for example, Business Unit, PII Category, or Data Sensitivity). Rules can apply organization-wide or within specific domains or projects.
When a producer attempts to publish an asset, SageMaker Catalog checks that the asset includes the required glossary terms or classifications. If any required metadata is missing, the publish action fails with a clear error message. After the metadata is added, the asset can be published successfully.
Enforced tagging makes sure published assets can be searched and filtered using consistent business terminology, improving catalog usability for analysts and business users.
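The publish-time check described above can be sketched in a few lines of Python. This is a simplified model for illustration, not the service's implementation: the `GlossaryRule` class, the `validate_publish` function, and the error type are hypothetical names, and the scoping logic (an empty project set meaning "all projects") is an assumption.

```python
from dataclasses import dataclass


class MissingGlossaryTermsError(Exception):
    """Raised when an asset lacks terms that a publishing rule requires."""


@dataclass(frozen=True)
class GlossaryRule:
    # Hypothetical model of a rule: required terms plus the projects it covers.
    name: str
    required_terms: frozenset
    projects: frozenset = frozenset()  # assumption: empty means every project

    def applies_to(self, project: str) -> bool:
        return not self.projects or project in self.projects


def validate_publish(project, asset_terms, rules):
    """Return normally if publishing is allowed; raise if terms are missing."""
    missing = set()
    for rule in rules:
        if rule.applies_to(project):
            # Collect every required term the asset does not carry yet.
            missing |= rule.required_terms - set(asset_terms)
    if missing:
        raise MissingGlossaryTermsError(
            "Cannot publish: missing required glossary terms: "
            + ", ".join(sorted(missing))
        )


# Example rule mirroring this post: assets from one project need the Finance term.
rule = GlossaryRule(
    name="finance-tag",
    required_terms=frozenset({"Finance"}),
    projects=frozenset({"financial_analysis"}),
)
```

Calling `validate_publish("financial_analysis", [], [rule])` raises the error, and the same call with `["Finance"]` returns without error, which mirrors the fail-then-succeed flow a producer sees.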
Solution overview
For this post, we explore a financial services use case. In our example, a financial services company defines a rule requiring all datasets published from the project to have the Finance glossary term associated:
- A data producer attempting to publish a new dataset without this tag receives a validation error
- After applying the correct classification, the dataset publishes successfully
- Analysts can now filter the catalog to find only Finance datasets or join assets consistently tagged with the same glossary term
In the following sections, we walk through the steps to configure this solution. We create a rule requiring that all assets published from a specific project carry a business unit tag called Finance.
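To illustrate why consistent tagging helps analysts, the following minimal sketch filters an in-memory list of assets by glossary term. The asset names and the `glossary_terms` key are made-up sample data for illustration and do not reflect the catalog's actual schema or search API.

```python
def filter_by_term(assets, term):
    """Return the assets whose glossary terms include the given term."""
    return [a for a in assets if term in a.get("glossary_terms", ())]


# Hypothetical sample catalog; in practice these would be published assets.
catalog = [
    {"name": "q3_ledger", "glossary_terms": ["Finance"]},
    {"name": "web_clicks", "glossary_terms": ["Marketing"]},
    {"name": "loan_book", "glossary_terms": ["Finance", "Confidential Data"]},
]

finance_assets = filter_by_term(catalog, "Finance")
print([a["name"] for a in finance_assets])  # ['q3_ledger', 'loan_book']
```

Because the rule guarantees every published asset from the project carries the Finance term, a filter like this one returns every asset from that project.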
Prerequisites
To test this solution, you should have a SageMaker Unified Studio domain set up with domain owner or domain unit owner privileges. You should also have an existing project for publishing assets, along with catalog assets to publish. For instructions to create these resources, see the Getting started guide.
In this example, we created a project named financial_analysis and a test table. For instructions to create a table, see Get started with Amazon S3 Tables in Amazon SageMaker Unified Studio. To ingest the sample data to SageMaker Catalog and generate business metadata, see Create an Amazon SageMaker Unified Studio data source for Amazon Redshift in the project catalog.
Create glossary and add terms
Complete the following steps to create a new glossary and add terms:
- In SageMaker Unified Studio, on the Discover menu, choose Glossaries.

- Choose Create glossary.

- Provide details for your glossary, including name, owning project, and optional description.
- For Glossary restriction, turn on Enabled.
- Choose Create.

- Create the term Finance in the Business Unit Details glossary.
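If you prefer to script these steps, the following sketch shows one way to do it, assuming your domain is reachable through the Amazon DataZone client in the AWS SDK for Python (Boto3). The helper name `create_glossary_with_term` and the placeholder identifiers are ours. The sketch does not set the Glossary restriction option from the console steps, so verify how to configure it in the API reference before relying on this approach.

```python
def create_glossary_with_term(client, domain_id, project_id, glossary_name, term_name):
    """Create an enabled glossary and one enabled term.

    `client` is expected to be a Boto3 DataZone client. Returns a tuple of
    (glossary_id, term_id).
    """
    glossary = client.create_glossary(
        domainIdentifier=domain_id,
        owningProjectIdentifier=project_id,
        name=glossary_name,
        status="ENABLED",
    )
    term = client.create_glossary_term(
        domainIdentifier=domain_id,
        glossaryIdentifier=glossary["id"],
        name=term_name,
        status="ENABLED",
    )
    return glossary["id"], term["id"]


# Example usage (requires AWS credentials; the identifiers are placeholders):
# import boto3
# client = boto3.client("datazone")
# glossary_id, term_id = create_glossary_with_term(
#     client, "<domain-id>", "<project-id>", "Business Unit Details", "Finance"
# )
```

Passing the client in as an argument keeps the helper easy to test with a stub and leaves Region and credential configuration to the caller.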

Create rule to enforce glossary terms
Complete the following steps to create a rule that enforces glossary terms:
- On the Govern menu, choose Domain units.

- On the Rules tab, choose Add.

- Add a publishing rule for the Finance project to have the Finance tag for all assets published to the catalog.
- Choose Add rule.

The following screenshot shows the configuration details for your new rule.
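The same rule could in principle be created programmatically. The following sketch only builds a request dictionary intended for the `create_rule` operation of the Boto3 DataZone client. The function name is hypothetical, and the `glossaryTermEnforcementDetail` and `requiredGlossaryTermIds` keys under `detail` are assumptions on our part. Check the CreateRule API reference for the exact request shape, including the `scope` and `target` structures, before using it.

```python
def build_glossary_rule_request(domain_id, domain_unit_id, project_id, term_ids, name):
    """Build keyword arguments for a publishing rule that requires glossary terms.

    ASSUMPTION: the keys under "detail" are not confirmed by this post; verify
    the whole payload against the CreateRule API reference.
    """
    return {
        "domainIdentifier": domain_id,
        "name": name,
        # Apply the rule when an asset is published to the catalog.
        "action": "CREATE_LISTING_CHANGE_SET",
        "target": {
            "domainUnitTarget": {
                "domainUnitId": domain_unit_id,
                "includeChildDomainUnits": True,
            }
        },
        "scope": {
            "assetType": {"selectionMode": "ALL"},
            "project": {
                "selectionMode": "SPECIFIC",
                "specificProjects": [project_id],
            },
        },
        # Assumed shape for glossary term enforcement.
        "detail": {
            "glossaryTermEnforcementDetail": {
                "requiredGlossaryTermIds": list(term_ids),
            }
        },
    }


# Example usage (placeholders; requires AWS credentials):
# import boto3
# request = build_glossary_rule_request(
#     "<domain-id>", "<domain-unit-id>", "<project-id>", ["<term-id>"], "require-finance"
# )
# boto3.client("datazone").create_rule(**request)
```

Building the request separately from sending it lets you review the payload before any rule is created.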

Publish asset with enforced rules
Complete the following steps to publish your asset with the enforced rules:
- On the financial_analysis project page, go to your asset.
- In the Glossary terms section, choose Add terms.

If you choose Publish without adding the needed term, you get an error stating the Finance term should be assigned.

- Choose Finance to add the required term.

- Choose Publish asset.

The following screenshot shows the published asset and the required terms in the glossary.
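For automated publishing pipelines, you might want to check the required terms yourself before requesting publication, rather than waiting for the rule to reject the request. The following sketch assumes a Boto3 DataZone client. The helper name `publish_if_tagged` is ours, and the comparison uses glossary term IDs, not display names.

```python
def publish_if_tagged(client, domain_id, asset_id, required_term_ids):
    """Publish an asset only if it already carries every required glossary term.

    `client` is expected to be a Boto3 DataZone client. Raises ValueError when
    terms are missing, so the caller can add them and retry.
    """
    asset = client.get_asset(domainIdentifier=domain_id, identifier=asset_id)
    missing = set(required_term_ids) - set(asset.get("glossaryTerms", []))
    if missing:
        raise ValueError(
            f"Asset {asset_id} is missing required glossary terms: {sorted(missing)}"
        )
    # Request publication of the asset to the catalog.
    return client.create_listing_change_set(
        domainIdentifier=domain_id,
        entityIdentifier=asset_id,
        entityType="ASSET",
        action="PUBLISH",
    )


# Example usage (placeholders; requires AWS credentials):
# import boto3
# publish_if_tagged(boto3.client("datazone"), "<domain-id>", "<asset-id>", ["<term-id>"])
```

This local check is a convenience only. The enforcement rule in SageMaker Catalog still decides whether the asset can be published.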

Conclusion
With metadata enforcement rules for glossary terms, SageMaker Catalog brings stronger control and consistency to how organizations publish and manage their data assets. By requiring approved business classifications before publication, teams can make sure assets adhere to enterprise metadata standards, improving governance, discoverability, and trust in shared catalogs. This capability helps organizations scale their catalog governance without adding manual overhead—embedding compliance and quality directly into the publishing workflow.
Metadata enforcement rules for glossary terms are available in AWS Regions where SageMaker Catalog operates. To get started with this capability, refer to the user guide.