Amazon EMR

Amazon EMR on Amazon EKS

Why EMR on EKS?

Amazon EMR on Amazon EKS enables you to submit Apache Spark jobs on demand on Amazon Elastic Kubernetes Service (EKS) without provisioning clusters. With EMR on EKS, you can consolidate analytical workloads with your other Kubernetes-based applications on the same Amazon EKS cluster to improve resource utilization and simplify infrastructure management.

Until now, you had to choose between using EMR to manage Apache Spark on EC2 or self-managing Apache Spark on Amazon EKS. When you use EMR on EC2, the EC2 instances are dedicated to EMR. When you self-manage Apache Spark on EKS, you need to manually install, manage, and optimize Apache Spark to run on Kubernetes.

With Amazon EMR on Amazon EKS, you can share compute and memory resources across all of your applications and use a single set of Kubernetes tools to centrally monitor and manage your infrastructure. You can also use a single EKS cluster to run applications that require different Apache Spark versions and configurations, and take advantage of automated provisioning, scaling, faster runtimes, and development and debugging tools that EMR provides.
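As a rough illustration of running different Apache Spark versions on one EKS cluster, the sketch below builds two job-run requests that differ only in their EMR release label, then submits them through the boto3 `emr-containers` client. It is a minimal sketch under stated assumptions, not a definitive implementation: the virtual cluster ID, IAM role ARN, S3 paths, and region are hypothetical placeholders, and the release labels are only examples, so check which labels are available in your account before using them.

```python
from typing import Any, Dict, List, Optional


def build_job_run_request(
    name: str,
    virtual_cluster_id: str,
    execution_role_arn: str,
    release_label: str,
    entry_point: str,
    entry_point_arguments: Optional[List[str]] = None,
    spark_submit_parameters: str = "",
) -> Dict[str, Any]:
    """Build keyword arguments for the emr-containers StartJobRun call."""
    driver: Dict[str, Any] = {"entryPoint": entry_point}
    # Optional fields are only included when they carry a value.
    if entry_point_arguments:
        driver["entryPointArguments"] = list(entry_point_arguments)
    if spark_submit_parameters:
        driver["sparkSubmitParameters"] = spark_submit_parameters
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": driver},
    }


def submit(request: Dict[str, Any], region_name: str) -> str:
    """Submit a prepared request and return the job run ID."""
    # boto3 is imported lazily so the request builder works without it.
    import boto3

    client = boto3.client("emr-containers", region_name=region_name)
    response = client.start_job_run(**request)
    return response["id"]


# Hypothetical placeholders: replace with values from your own account.
VIRTUAL_CLUSTER_ID = "example-virtual-cluster-id"
EXECUTION_ROLE_ARN = "arn:aws:iam::111122223333:role/example-job-role"

# Two jobs for the same virtual cluster, each pinned to a different release.
# The release labels are illustrative assumptions.
requests = [
    build_job_run_request(
        name=f"wordcount-{label}",
        virtual_cluster_id=VIRTUAL_CLUSTER_ID,
        execution_role_arn=EXECUTION_ROLE_ARN,
        release_label=label,
        entry_point="s3://example-bucket/scripts/wordcount.py",
        entry_point_arguments=["s3://example-bucket/output/"],
        spark_submit_parameters="--conf spark.executor.instances=2",
    )
    for label in ("emr-5.32.0-latest", "emr-6.2.0-latest")
]
```

To run the jobs, you could call `submit(request, "us-east-1")` for each entry in `requests`, assuming AWS credentials are configured and the virtual cluster and role already exist. Keeping request construction separate from submission makes it possible to review or log the payload before anything is sent.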

Benefits

Simplify management

You get the same EMR benefits for Apache Spark on EKS that you get on EC2 today. These include fully managed versions of Apache Spark 2.4 and 3.0, automatic provisioning, scaling, a performance-optimized runtime, and tools such as EMR Studio for authoring jobs and an Apache Spark UI for debugging.

Reduce costs

With EMR on EKS, your compute resources can be shared between your Apache Spark applications and your other Kubernetes applications. Resources are allocated and removed on demand to eliminate over-provisioning or under-utilization of these resources, enabling you to lower costs as you only pay for the resources you use.

Optimize performance

By running analytics applications on EKS, you can reuse existing EC2 instances in your shared Kubernetes cluster and avoid the startup time of creating a new cluster of EC2 instances dedicated to analytics. You can also get 3x faster performance running the performance-optimized Spark runtime with EMR on EKS compared to standard Apache Spark on EKS.

Use cases

Centralize resource management

With EMR on EKS, you can automate the provisioning, management, and scaling of Apache Spark, and use a single set of tools to centrally manage and monitor your infrastructure.

Co-location of workloads

You can run multiple EMR workloads that require different frameworks, versions, and configurations on the same EKS cluster as your other application workloads.

Rapid adoption of new EMR versions

EMR on EKS provides a managed experience for developing, troubleshooting, and optimizing your analytics. You can deploy configurations and start jobs in seconds to test new EMR versions on the same EKS cluster without allocating dedicated resources.

Resources

Video

AWS Online Tech Talk

Run Spark on Kubernetes with Amazon EMR on Amazon EKS.

Watch the Video

Blog

Orchestrate an Amazon EMR on Amazon EKS Spark job with AWS Step Functions.

Read the blog

Templates

Ready-to-deploy templates

These templates include recommended Kubernetes add-ons and best practices for running production-grade EMR on EKS workloads. You can use these templates to minimize the time needed to set up your production stacks or proofs of concept.

View data-on-eks templates

Get started with Amazon EMR on Amazon EKS

Pricing

Learn more about Amazon EMR pricing

Visit the pricing page

Console

Ready to build?

Get started with Amazon EMR
