Posted On: Dec 15, 2023

We are excited to announce that high-availability EMR on EC2 clusters are now also available with the instance fleets configuration. Your high-availability instance fleet EMR cluster will have three On-Demand primary nodes and support Hadoop applications like YARN ResourceManager, HDFS NameNode, and Spark. If a primary node fails, or a critical process such as YARN ResourceManager or HDFS NameNode crashes, EMR fails over to one of the remaining primary nodes in the cluster.

Amazon EMR is a cloud big data platform for data processing, interactive analysis, and machine learning using open-source frameworks such as Apache Spark, Presto and Trino, and Apache Flink. With support for multiple primary nodes, the primary node is no longer a single point of failure for your cluster. This improves fault tolerance and lets the cluster keep running when a primary node is lost. With this launch, you get the improved instance diversity of the instance fleets configuration with your high-availability EMR clusters.

Customers can now launch high-availability instance fleet clusters with Amazon EMR versions 5.36.1, 6.8.1, 6.9.1, 6.10.1, 6.11.1, 6.12, and higher. This capability is available in all regions where Amazon EMR on EC2 is available. To learn more about the supported applications and their failover process, see our documentation. To launch a high-availability EMR on EC2 cluster, visit the Plan and configure primary nodes page.
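
As a rough illustration of what such a launch request might look like, the sketch below builds the keyword arguments for the boto3 EMR `run_job_flow` call with a primary fleet of three On-Demand nodes. The helper name `build_ha_fleet_request`, the fleet names, the instance types, the core fleet size, the subnet ID, and the use of the default EMR role names are all assumptions made for the example, not values from this announcement; check the documentation for the exact requirements before relying on it.

```python
def build_ha_fleet_request(name, release_label, subnet_id,
                           primary_types, core_types, core_capacity=2):
    """Build run_job_flow keyword arguments for an instance fleet cluster
    with three On-Demand primary nodes (hypothetical helper)."""
    # Primary fleet: three On-Demand nodes, no Spot capacity.
    primary_fleet = {
        "Name": "primary-fleet",            # illustrative name
        "InstanceFleetType": "MASTER",      # API value for the primary fleet
        "TargetOnDemandCapacity": 3,
        "TargetSpotCapacity": 0,
        "InstanceTypeConfigs": [
            {"InstanceType": t, "WeightedCapacity": 1} for t in primary_types
        ],
    }
    # Core fleet: size and instance types are assumptions for this sketch.
    core_fleet = {
        "Name": "core-fleet",               # illustrative name
        "InstanceFleetType": "CORE",
        "TargetOnDemandCapacity": core_capacity,
        "InstanceTypeConfigs": [
            {"InstanceType": t, "WeightedCapacity": 1} for t in core_types
        ],
    }
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceFleets": [primary_fleet, core_fleet],
            "Ec2SubnetIds": [subnet_id],
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default role names; assumed to already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


request = build_ha_fleet_request(
    name="ha-fleet-example",
    release_label="emr-6.12.0",
    subnet_id="subnet-EXAMPLE",             # placeholder, not a real subnet
    primary_types=["m5.xlarge", "m5a.xlarge"],
    core_types=["m5.xlarge", "m5.2xlarge"],
)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("emr").run_job_flow(**request)`. Building the request separately from the call keeps the example runnable without an AWS account and makes the three-node primary fleet easy to inspect.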