Amazon EMR Serverless

Run big data applications using open-source frameworks without managing clusters and servers

Why EMR Serverless?

Amazon EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, or scaling clusters or servers. You get all the features and benefits of Amazon EMR without needing experts to plan and manage clusters.

Benefits

Select the open-source framework you want to run for your application, such as Apache Spark or Apache Hive, and EMR Serverless automatically provisions and manages the underlying compute and memory resources.
Run analytics workloads at any scale with automatic on-demand scaling that resizes resources in seconds to meet changing data volumes and processing requirements.
EMR Serverless automatically scales resources up and down to provide just the right amount of capacity for your application. You pay only for what you use, and you can minimize concerns about over- or under-provisioning.

How it works

1. Create your application

Choose the open-source framework and version you want to use.

2. Submit jobs

Submit jobs to your application through APIs or EMR Studio. You can also submit jobs using workflow orchestration services like Apache Airflow or Amazon Managed Workflows for Apache Airflow.

3. Debug jobs

Use familiar open-source tools such as Spark UI and Tez UI to monitor and debug jobs.
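
The create and submit steps above can also be driven from code. The following is a minimal sketch in Python, assuming the boto3 `emr-serverless` client and its `create_application` and `start_job_run` operations. The helper names are illustrative, and the application name, release label, role ARN, and S3 script path shown in the usage comment are hypothetical placeholders.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_create_application_request(
    name: str, framework: str, release_label: str
) -> Dict[str, Any]:
    """Build keyword arguments for create_application.

    framework is assumed to be "SPARK" or "HIVE".
    """
    return {"name": name, "type": framework, "releaseLabel": release_label}


def build_spark_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for start_job_run with a Spark job driver."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "entryPointArguments": list(arguments or []),
            }
        },
    }


def create_and_submit(
    client: Any,
    name: str,
    release_label: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: Optional[List[str]] = None,
) -> Tuple[str, str]:
    """Create a Spark application, submit one job, and return both IDs."""
    app = client.create_application(
        **build_create_application_request(name, "SPARK", release_label)
    )
    application_id = app["applicationId"]
    run = client.start_job_run(
        **build_spark_job_request(
            application_id, execution_role_arn, entry_point, arguments
        )
    )
    return application_id, run["jobRunId"]


# Hypothetical usage (needs boto3 and AWS credentials; every value is a placeholder):
#   import boto3
#   client = boto3.client("emr-serverless")
#   app_id, run_id = create_and_submit(
#       client,
#       "my-spark-app",
#       "emr-6.6.0",
#       "arn:aws:iam::123456789012:role/MyJobRole",
#       "s3://my-bucket/scripts/job.py",
#   )
```

Keeping the request builders separate from the client call makes the payloads easy to inspect or unit test without touching AWS.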

Use cases

As workload demands change, scale application resources seamlessly, without having to preconfigure how much compute power and memory you need.
Pre-initialize application resources to get response times in seconds for SLA-sensitive data pipelines.
Spin up a development and test environment quickly and easily, automatically scale with unpredictable usage, and get products to market faster.
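
To illustrate the pre-initialization option mentioned above, here is a minimal sketch in Python that builds an `initialCapacity` setting for a Spark application, assuming the shape accepted by the boto3 `emr-serverless` `create_application` operation. The helper names, worker counts, and worker sizes are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def build_initial_capacity(
    driver_count: int,
    executor_count: int,
    cpu: str = "4vCPU",
    memory: str = "16GB",
) -> Dict[str, Any]:
    """Describe workers to keep pre-initialized for a Spark application.

    The sizes are illustrative defaults, not recommendations.
    """
    if driver_count < 1 or executor_count < 1:
        raise ValueError("worker counts must be at least 1")
    return {
        "DRIVER": {
            "workerCount": driver_count,
            "workerConfiguration": {"cpu": cpu, "memory": memory},
        },
        "EXECUTOR": {
            "workerCount": executor_count,
            "workerConfiguration": {"cpu": cpu, "memory": memory},
        },
    }


def with_pre_initialized_capacity(
    create_request: Dict[str, Any], initial_capacity: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of a create_application request with initialCapacity set."""
    request = dict(create_request)
    request["initialCapacity"] = initial_capacity
    return request


# Hypothetical usage: the result would be passed as keyword arguments to
# create_application on a boto3 "emr-serverless" client.
request = with_pre_initialized_capacity(
    {"name": "my-spark-app", "type": "SPARK", "releaseLabel": "emr-6.6.0"},
    build_initial_capacity(driver_count=1, executor_count=4),
)
```

Pre-initialized workers stay ready between jobs, so sizing them is a trade-off between start-up latency and the capacity you keep provisioned.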