Posted On: Apr 19, 2023

AWS Lake Formation and the AWS Glue Data Catalog now extend data cataloging, data sharing, and fine-grained access control to customers who use a self-managed Apache Hive Metastore (HMS) as their data catalog. Previously, customers had to replicate their metadata into the AWS Glue Data Catalog in order to use Lake Formation permissions and data sharing capabilities. Now, customers can integrate their HMS metadata within AWS, allowing them to discover data alongside native tables in the AWS Glue Data Catalog, manage permissions and sharing from Lake Formation, and query data using AWS analytics services.

To get started, customers connect their HMS databases and tables as federation objects in their AWS Glue Data Catalog. Customers can then grant Lake Formation column, tag, and data filter permissions on those tables as if they were native AWS Glue Data Catalog tables. These permissions are applied whenever the tables are queried by AWS services that Lake Formation supports, which unifies data access control in one place. Finally, customers can audit access and permissions on their HMS resources using the AWS CloudTrail logs generated for all data and metadata access events.
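As an illustration of the column-level grant described above, the following is a minimal sketch using the Lake Formation GrantPermissions API through boto3. It assumes the HMS database and table are already connected to the AWS Glue Data Catalog. The role ARN, database name, table name, and column names are hypothetical placeholders, and the helper functions are not part of any AWS library. A federated setup may require additional parameters (such as a catalog ID) that are not shown here.

```python
from typing import Any, Dict, List


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant.

    Hypothetical helper: it only assembles the request and makes no AWS call.
    """
    if not columns:
        raise ValueError("at least one column name is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def grant_column_select(request: Dict[str, Any]) -> None:
    """Send the grant to Lake Formation (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request can be built without it

    boto3.client("lakeformation").grant_permissions(**request)


# All identifiers below are hypothetical placeholders.
request = build_column_grant(
    principal_arn="arn:aws:iam::111122223333:role/AnalystRole",
    database="hms_sales_db",
    table="orders",
    columns=["order_id", "order_date"],
)
# grant_column_select(request)  # uncomment to apply the grant in a real account
```

Building the request separately from sending it makes it possible to review or log the exact grant before it is applied.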

To help customers get started querying their HMS resources, AWS Lake Formation provides an open-source AWS Serverless Application Model (AWS SAM) application that provisions the required resources.

For additional details, please refer to "Managing permissions on datasets that use external metastores." This feature is available in all AWS Regions where AWS Lake Formation is available.