Posted On: Jun 7, 2022
Amazon EMR release 6.6 now supports Apache Spark 3.2, Apache Spark RAPIDS 22.02, CUDA 11, Apache Hudi 0.10.1, Apache Iceberg 0.13, Trino 0.367, and PrestoDB 0.267. You can use the performance-optimized version of Apache Spark 3.2 on EMR on EC2, EMR on EKS, and the recently released EMR Serverless. In addition, Apache Hudi 0.10.1 and Apache Iceberg 0.13 are available on EMR on EC2, EMR on EKS, and EMR Serverless. Apache Hive 3.1.2 is available on EMR on EC2 and EMR Serverless. Trino 0.367 and PrestoDB 0.267 are available only on EMR on EC2.
Each Amazon EMR release version uses a default Amazon Linux 2 (AL2) Amazon Machine Image (AMI) for Amazon EMR. Prior to Amazon EMR 6.6, the default AMI was based on the most recent Amazon Linux AMI available at the time of the Amazon EMR release, so each Amazon EMR release version was "locked" to its assigned AL2 AMI. As a result, new AL2 updates were not applied automatically unless you moved to the next Amazon EMR release or installed them manually. With Amazon EMR 6.6 and subsequent releases, every time you launch an EMR on EC2 cluster, Amazon EMR automatically uses the latest AL2 release. See our documentation to learn more.
With Amazon EMR release 6.6 and later, applications that use Log4j 1.x and Log4j 2.x are upgraded to Log4j 1.2.17 (or higher) and Log4j 2.17.1 (or higher), respectively, and no longer require the previously provided bootstrap actions to mitigate the CVE issues.
Further, with Amazon EMR release 6.6, cluster startup time on EMR on EC2 is 80 seconds shorter on average for clusters that use the Amazon EMR default AMI option and install common applications such as Apache Hadoop, Apache Spark, and Apache Hive.
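As a rough illustration of the default AMI option described above, the following Python sketch builds a request for the EMR RunJobFlow API using the AWS SDK for Python (boto3). It is not an official sample. The cluster name, instance types, instance counts, Region, and the helper names build_emr_66_request and launch_cluster are assumptions chosen for illustration, and the release label emr-6.6.0 and the default role names should be checked against your account. Because the request sets no CustomAmiId, the cluster would use the Amazon EMR default AMI.

```python
def build_emr_66_request(name="emr-6-6-demo"):
    """Build keyword arguments for the EMR RunJobFlow API call.

    All concrete values below (name, instance types, counts) are
    illustrative assumptions, not values from the announcement.
    """
    return {
        "Name": name,
        # Assumed release label for Amazon EMR release 6.6.
        "ReleaseLabel": "emr-6.6.0",
        # The common applications named in the announcement.
        "Applications": [
            {"Name": "Hadoop"},
            {"Name": "Spark"},
            {"Name": "Hive"},
        ],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 2,
                },
            ],
            # With no steps submitted, the cluster terminates once it is idle.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        # Default EMR roles; these must already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        # No "CustomAmiId" key: the cluster uses the EMR default AMI.
    }


def launch_cluster(region="us-east-1"):
    """Submit the request. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder works without boto3 installed

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_emr_66_request())
    return response["JobFlowId"]
```

Calling launch_cluster() creates billable resources, so review the request returned by build_emr_66_request() before submitting it.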