Posted On: Jan 11, 2022

Amazon EMR Studio is an integrated development environment (IDE) that makes it easy for data scientists and data engineers to develop, visualize, and debug big data and analytics applications written in R, Python, Scala, and PySpark. Today, we are excited to introduce SQL Explorer, a feature in your EMR Studio Workspace that allows you to browse the data catalog and run SQL queries on EMR clusters from EMR Studio. This release of SQL Explorer in EMR Studio supports running SQL queries on Amazon EMR on EC2 clusters running Presto version 0.254.1 or higher. 

Presto is a fast SQL query engine designed for interactive analytic queries over large datasets from multiple sources. In SQL Explorer, you can connect to Amazon EMR on EC2 clusters that have Presto installed and browse the data catalog. Supported data catalogs are the AWS Glue Data Catalog and a self-hosted Hive Metastore, version 3.1.2 or higher. SQL Explorer also provides an editor where you can run SQL queries, view the results in a table, and download them in CSV format. You can run multiple SQL statements in separate editor tabs. SQL Explorer is supported on Amazon EMR versions 6.4.0 and later.
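The version requirements above (Amazon EMR 6.4.0, Presto 0.254.1, and, for a self-hosted catalog, Hive Metastore 3.1.2) can be checked with a short script before you attach a cluster. The following is a minimal sketch, not part of EMR Studio: the names `MINIMUM_VERSIONS`, `parse_version`, and `unmet_requirements` are hypothetical, and it assumes you already know your component versions as plain dotted strings.

```python
from __future__ import annotations

# Minimum versions stated in this announcement. The dictionary keys are
# illustrative labels chosen for this sketch, not AWS identifiers.
MINIMUM_VERSIONS = {
    "emr": "6.4.0",
    "presto": "0.254.1",
    "hive_metastore": "3.1.2",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '0.254.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def unmet_requirements(installed: dict[str, str]) -> list[str]:
    """Return the components in `installed` that are below the minimum.

    Components missing from `installed` are skipped, so a cluster that uses
    the AWS Glue Data Catalog can simply omit 'hive_metastore'.
    """
    return [
        name
        for name, minimum in MINIMUM_VERSIONS.items()
        if name in installed
        and parse_version(installed[name]) < parse_version(minimum)
    ]


if __name__ == "__main__":
    cluster = {"emr": "6.4.0", "presto": "0.254.1"}
    problems = unmet_requirements(cluster)
    print("OK" if not problems else f"Below minimum: {problems}")
```

Comparing versions as integer tuples rather than as strings matters here: as strings, "6.10.0" would sort before "6.4.0" and a newer release would be rejected by mistake.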

EMR Studio is available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Canada (Central), Europe (Ireland), Europe (Frankfurt), Europe (London), Europe (Stockholm), Europe (Paris), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and South America (São Paulo) Regions.

To learn more about SQL Explorer in EMR Studio, see our documentation here. To see the feature in action, watch the demo video here.