Posted On: Oct 16, 2018
AWS Glue Data Catalog is a managed metadata repository integrated with Amazon EMR, Amazon Athena, Amazon Redshift Spectrum, and AWS Glue ETL. The Data Catalog simplifies metadata management and provides automatic schema discovery and schema version history. With Amazon EMR, you can use the Data Catalog as the default metastore for Spark, Presto, and Hive instead of an on-cluster or self-managed Hive Metastore. With the recent release of resource-based policies and resource-level permissions for the Data Catalog, you can allow or restrict EMR access to individual catalog objects such as databases and tables. The release also enables cross-account access, so EMR clusters in different AWS accounts can share a single Data Catalog. Amazon S3 policies continue to govern access to the data stored in Amazon S3, and Data Catalog policies add another layer of protection for the metadata.
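As a rough illustration of the cross-account setup described above, the following Python sketch builds a resource-based policy that grants another account read access to one database and table, along with EMR configuration entries that point Hive and Spark at that catalog. The helper names, account IDs, region, and database and table names are hypothetical placeholders, and the action list is one plausible read-only selection rather than a recommended minimum.

```python
import json


def build_catalog_policy(region, catalog_account_id, emr_account_id, database, table):
    """Build a Data Catalog resource policy as a plain dict.

    All arguments are illustrative placeholders supplied by the caller.
    """
    base = f"arn:aws:glue:{region}:{catalog_account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Principal: the account that owns the EMR cluster (assumed).
                "Principal": {"AWS": f"arn:aws:iam::{emr_account_id}:root"},
                # Read-only metadata actions; adjust to what the cluster needs.
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:GetPartitions",
                ],
                # Resource-level scoping: the catalog, one database, one table.
                "Resource": [
                    f"{base}:catalog",
                    f"{base}:database/{database}",
                    f"{base}:table/{database}/{table}",
                ],
            }
        ],
    }


def emr_glue_configurations(catalog_account_id):
    """Return EMR configuration entries that make Hive and Spark use the
    Data Catalog owned by catalog_account_id as their metastore."""
    properties = {
        "hive.metastore.client.factory.class": (
            "com.amazonaws.glue.catalog.metastore."
            "AWSGlueDataCatalogHiveClientFactory"
        ),
        # Only needed when the catalog lives in a different account.
        "hive.metastore.glue.catalogid": catalog_account_id,
    }
    return [
        {"Classification": name, "Properties": dict(properties)}
        for name in ("hive-site", "spark-hive-site")
    ]


if __name__ == "__main__":
    # Placeholder values, not real accounts or resources.
    policy = build_catalog_policy(
        "us-east-1", "111122223333", "444455556666", "sales_db", "orders"
    )
    print(json.dumps(policy, indent=2))
    print(json.dumps(emr_glue_configurations("111122223333"), indent=2))
```

The serialized policy could then be attached by the catalog owner, for example with the boto3 Glue client's `put_resource_policy(PolicyInJson=...)`, and the configuration list supplied when the cluster is created. The cluster's own IAM role would still need matching Glue permissions, and S3 policies still control access to the underlying data.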