Amazon EMR now supports Apache Spark SQL to insert data into and update Apache Hive metadata tables when Apache Ranger integration is enabled

Posted on: Oct 6, 2021

We are announcing support for using Apache Spark SQL to insert data into and update Apache Hive metadata tables when Amazon EMR integration with Apache Ranger is enabled.

In January 2021, we launched Amazon EMR integration with Apache Ranger, a feature that allows you to define and enforce database, table, and column-level permissions when Apache Spark users access data in Amazon S3 through the Hive Metastore. Previously, when Apache Ranger was enabled, Spark SQL was limited to read-only statements such as SHOW DATABASES and DESCRIBE TABLE. Now, you can also insert data into or update Apache Hive metadata tables with these statements: INSERT INTO, INSERT OVERWRITE, and ALTER TABLE.
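As a rough illustration of the three statement types named above, the following PySpark sketch builds one example of each and runs them on a Hive-enabled Spark session. The database, table, staging table, column values, and application name are all hypothetical and do not come from this announcement. Whether a given statement succeeds depends on the Apache Ranger policies defined for the user, and the specific ALTER TABLE variants the plugin accepts are not listed here, so treat the ALTER TABLE line as an assumption.

```python
from typing import List

# Hypothetical names for illustration only; none appear in the announcement.
DB = "sales_db"
TABLE = "orders"


def build_statements(db: str = DB, table: str = TABLE) -> List[str]:
    """Return one example of each newly supported statement type."""
    target = f"{db}.{table}"
    return [
        # Append a row to an existing Hive metastore table.
        f"INSERT INTO {target} VALUES (1, 'widget', 9.99)",
        # Replace the table contents from a (hypothetical) staging table.
        f"INSERT OVERWRITE TABLE {target} SELECT * FROM {target}_staging",
        # Update table metadata; assumes this ALTER TABLE form is permitted.
        f"ALTER TABLE {target} SET TBLPROPERTIES ('comment' = 'updated via Spark SQL')",
    ]


def run(statements: List[str]) -> None:
    """Execute the statements on a Spark session with Hive support."""
    # Imported lazily so the module loads without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("ranger-write-example")  # hypothetical application name
        .enableHiveSupport()
        .getOrCreate()
    )
    for stmt in statements:
        spark.sql(stmt)
```

On a cluster with the integration enabled, calling `run(build_statements())` would submit the three statements in order; a user without the matching Ranger permissions should expect an authorization error rather than a write.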

This feature is enabled on Amazon EMR 6.4 in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Milan), Europe (Stockholm), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Hong Kong), Asia Pacific (Tokyo), Asia Pacific (Sydney), South America (São Paulo), Middle East (Bahrain), and Africa (Cape Town).

To get started, see the following resources:

AWS Big Data Blog post:

Amazon EMR Management Guide: Using Apache Spark SQL with Apache Ranger plugin
