Posted On: Jun 1, 2022

We are happy to announce the general availability of Amazon EMR Serverless, a new serverless deployment option in Amazon EMR that makes it easy and cost-effective for data engineers and analysts to run petabyte-scale data analytics in the cloud. Amazon EMR is a big data solution that you can use to run large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications built on open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto. With EMR Serverless, you can run your Spark and Hive applications without having to configure, optimize, tune, or manage clusters.

EMR Serverless offers fine-grained automatic scaling, which provisions and quickly scales the compute and memory resources required by the application. For example, if a Spark job needs 2 executors for the first 5 minutes, 10 executors for the next 10 minutes, and 5 executors for the last 20 minutes, EMR Serverless automatically provides the resources as needed, and you pay for only the resources used. EMR Serverless also includes the performance-optimized EMR runtime so your jobs run quickly. Additionally, EMR Serverless integrates with EMR Studio to provide you with comprehensive tooling to check the status of running jobs, review job history, and use familiar open-source tools to debug jobs.
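
To make the scaling example concrete, here is a small back-of-the-envelope sketch in Python. It adds up the executor-minutes for the three phases described above and compares them with a hypothetical fixed pool sized for the 10-executor peak. The fixed-pool baseline, the function names, and the use of executor-minutes as a stand-in for cost are assumptions made for illustration; they are not part of EMR Serverless or its actual billing model.

```python
# Executor demand from the example: (executors, minutes) per phase.
PHASES = [(2, 5), (10, 10), (5, 20)]


def executor_minutes(phases):
    """Executor-minutes consumed when capacity follows demand exactly."""
    return sum(executors * minutes for executors, minutes in phases)


def peak_provisioned_minutes(phases):
    """Executor-minutes for a hypothetical fixed pool sized for the peak
    and kept running for the whole job (illustrative baseline only)."""
    peak = max(executors for executors, _ in phases)
    total_minutes = sum(minutes for _, minutes in phases)
    return peak * total_minutes


used = executor_minutes(PHASES)
fixed = peak_provisioned_minutes(PHASES)
print(f"scaled with demand: {used} executor-minutes")
print(f"fixed at peak:      {fixed} executor-minutes")
```

Under these assumptions the job uses 210 executor-minutes when capacity tracks demand, compared with 350 for a pool held at peak size for the full 35 minutes.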

Amazon EMR Serverless is generally available in four Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).

Read the EMR Serverless blog post, and refer to the EMR Serverless documentation for more details.