Amazon EMR now includes Spark 1.5, Presto, Hue, intelligent resize, and HDFS encryption

Posted on: Sep 30, 2015

You can now deploy new applications on your Amazon EMR cluster and take advantage of intelligent cluster resizing. Amazon EMR release 4.1.0 offers an upgraded version of Apache Spark (1.5.0), Hue 3.7.1 as a GUI for creating and running Hive and Pig workloads, and the Hadoop Key Management Server (KMS) component for transparent encryption in the Hadoop Distributed File System (HDFS). Additionally, you can now easily install and use Presto, Apache Zeppelin, and Apache Oozie on your clusters. We have introduced them as Sandbox Applications in this release, providing early access to applications which are still in development for a full General Availability (GA) release.

We have also introduced intelligent resize functionality that allows you to reduce the number of nodes in your cluster with minimal impact on running jobs. Additionally, when you add instances to your cluster, Amazon EMR can now start using the provisioned capacity as soon as it becomes available.
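As a rough illustration of how a resize might be requested through the SDK, the sketch below builds the arguments for the boto3 EMR client's `modify_instance_groups` call. The instance group ID, target count, and region are hypothetical placeholders, and the helper names are ours, not part of any AWS API. The resize behavior itself happens on the service side; the request only states the new target size.

```python
def build_resize_request(instance_group_id, target_count):
    """Build keyword arguments for the boto3 EMR modify_instance_groups call.

    instance_group_id and target_count are supplied by the caller; the
    values used in the test below are placeholders, not real resources.
    """
    if target_count < 0:
        raise ValueError("target_count must be non-negative")
    return {
        "InstanceGroups": [
            {
                "InstanceGroupId": instance_group_id,
                "InstanceCount": target_count,
            }
        ]
    }


def resize_instance_group(request, region="us-east-1"):
    """Send the resize request. Requires boto3 and AWS credentials.

    The region default is an assumption made for this sketch.
    """
    import boto3  # imported here so building a request needs no AWS setup

    emr = boto3.client("emr", region_name=region)
    emr.modify_instance_groups(**request)
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything is sent to the service.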

You can create an Amazon EMR cluster with release 4.1.0 by choosing release label “emr-4.1.0” from the AWS Management Console, AWS CLI, or SDK. You can specify Spark, Hue, Presto-Sandbox, Zeppelin-Sandbox, and Oozie-Sandbox to install these applications on your cluster. The Hadoop KMS component is automatically included when installing Hadoop. Please visit the Amazon EMR documentation for more information about Sandbox Applications, Spark 1.5.0, HDFS transparent encryption with Hadoop KMS, Hue 3.7.1, and the new resize functionality.
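For SDK users, a minimal sketch of such a request with the AWS SDK for Python (boto3) might look like the following. The release label and application names come from this announcement; the cluster name, instance type, instance count, region, and the default role names are assumptions chosen for illustration, and the helper function names are ours.

```python
def build_cluster_request(name="emr-4.1.0-demo",
                          instance_type="m3.xlarge",
                          instance_count=3):
    """Build keyword arguments for the boto3 EMR run_job_flow call.

    The name, instance type, and count are illustrative assumptions.
    """
    # Application names as listed in the announcement.
    applications = [
        "Spark",
        "Hue",
        "Presto-Sandbox",
        "Zeppelin-Sandbox",
        "Oozie-Sandbox",
    ]
    return {
        "Name": name,
        "ReleaseLabel": "emr-4.1.0",
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumes the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch_cluster(request, region="us-east-1"):
    """Create the cluster and return its ID. Requires boto3 and credentials.

    The region default is an assumption made for this sketch.
    """
    import boto3  # imported here so building a request needs no AWS setup

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]
```

Calling `launch_cluster(build_cluster_request())` would start a billable cluster, so review the request dictionary first.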