Posted On: Dec 21, 2018

You can now use Apache Spark 2.4.0 and Hue 4.3.0 on Amazon EMR release 5.20.0. Spark 2.4.0 adds several new features and updates, including a new scheduling model called barrier execution mode that integrates better with deep learning workloads, several new built-in SQL functions that simplify working with complex data types such as arrays and maps, and native support for reading and writing Avro data. Hue 4.3.0 includes improvements to SQL exploration, improvements to job scheduling and monitoring, better dashboard layouts, and several bug fixes.
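As a rough illustration of the new built-in SQL functions for arrays and maps, the following PySpark sketch runs a few of the functions introduced in Spark 2.4.0. The specific queries, the SPARK_2_4_QUERIES dictionary, and the helper names run_queries and main are assumptions chosen for this example, not part of the release itself, and the sketch assumes it runs where PySpark is available, such as on an EMR 5.20.0 cluster with Spark installed.

```python
from typing import Dict, List

# Illustrative queries only; each uses a function added in Spark 2.4.0.
SPARK_2_4_QUERIES: Dict[str, str] = {
    # Remove duplicate elements from an array.
    "array_distinct": "SELECT array_distinct(array(1, 2, 2, 3)) AS distinct_items",
    # Collapse an array of arrays into a single array.
    "flatten": "SELECT flatten(array(array(1, 2), array(3))) AS flat_items",
    # Build a map from an array of keys and an array of values.
    "map_from_arrays": "SELECT map_from_arrays(array('a', 'b'), array(1, 2)) AS lookup",
    # Higher-order function: apply a lambda to every array element.
    "transform": "SELECT transform(array(1, 2, 3), x -> x * 10) AS scaled",
}


def run_queries(spark, queries: Dict[str, str] = SPARK_2_4_QUERIES) -> Dict[str, List]:
    """Run each query on the given SparkSession and collect its rows."""
    return {name: spark.sql(sql).collect() for name, sql in queries.items()}


def main() -> None:
    # Imported here so the module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("spark-2-4-functions").getOrCreate()
    try:
        for name, rows in run_queries(spark).items():
            print(name, rows)
    finally:
        spark.stop()
```

To try it, call main() from a PySpark session or submit the file with spark-submit after adding a call to main() at the end.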

Additionally, this release includes upgraded versions of several applications: Apache Hive 2.3.4, Apache Flink 1.6.2, Apache HBase 1.4.8, Apache MXNet 1.3.1, Apache Tez 0.9.1, TensorFlow 1.12.0, and Presto 0.214.

You can create an Amazon EMR cluster with release 5.20.0 by choosing the release label “emr-5.20.0” from the AWS Management Console, AWS CLI, or SDK. When you launch your EMR cluster, you can choose to install Spark, Hue, Hive, Flink, HBase, MXNet, Tez, TensorFlow, and Presto. Please visit the Amazon EMR documentation for more information about EMR release 5.20.0, Spark 2.4.0, Hue 4.3.0, Hive 2.3.4, Flink 1.6.2, HBase 1.4.8, MXNet 1.3.1, Tez 0.9.1, and Presto 0.214.
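For the SDK route, a minimal sketch using the AWS SDK for Python (boto3) might look like the following. The release label and application names come from this announcement. Everything else is an assumption made for illustration: the helper names build_cluster_request and launch_cluster, the m4.large instance type, the instance count of 3, the us-east-1 region, and the default EMR roles (EMR_DefaultRole and EMR_EC2_DefaultRole), which must already exist in your account.

```python
from typing import Any, Dict, List

# Application names as listed in the announcement.
EMR_5_20_0_APPLICATIONS: List[str] = [
    "Spark", "Hue", "Hive", "Flink", "HBase",
    "MXNet", "Tez", "TensorFlow", "Presto",
]


def build_cluster_request(
    name: str,
    applications: List[str],
    release_label: str = "emr-5.20.0",
    instance_type: str = "m4.large",  # assumption: pick a type suited to your workload
    instance_count: int = 3,          # assumption: 1 master node + 2 core nodes
) -> Dict[str, Any]:
    """Build the keyword arguments for the EMR RunJobFlow API call."""
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # Keep the cluster running after launch instead of terminating
            # as soon as there are no steps to run.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumption: the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(request: Dict[str, Any], region_name: str = "us-east-1") -> str:
    """Submit the request and return the new cluster (job flow) ID."""
    # Imported here so the request can be built and inspected without boto3.
    import boto3

    emr = boto3.client("emr", region_name=region_name)
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]
```

Calling launch_cluster(build_cluster_request("my-cluster", ["Spark", "Hue"])) would start a billable cluster, so it is worth inspecting the request dictionary first.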

Amazon EMR release 5.20.0 is now available in all regions where Amazon EMR is supported.

You can stay up to date on EMR releases by subscribing to the RSS feed for EMR release notes. Use the RSS icon at the top of the EMR Release Guide to link the feed URL directly to your favorite feed reader.