Apache Spark 1.6.1, new versions of Apache Hadoop and Presto, and support for Amazon S3 SSE-KMS now available on Amazon EMR

Posted on: Apr 4, 2016

You can now use upgraded versions of Apache Spark (1.6.1) and Apache Hadoop (2.7.2), along with an upgraded sandbox release of Presto (0.140), on Amazon EMR release 4.5.0. Spark 1.6.1 was released by the community on March 9, 2016, and it contains several bug fixes and updates to the Dataset API. Additionally, the EMR File System (EMRFS) can now read and write Amazon S3 objects that use S3 server-side encryption with AWS Key Management Service keys (SSE-KMS). Previously, EMRFS supported only S3 server-side encryption with S3-managed keys (SSE-S3) and S3 client-side encryption with AWS KMS keys or custom keys.

You can create an Amazon EMR cluster with release 4.5.0 by choosing release label “emr-4.5.0” from the AWS Management Console, AWS CLI, or SDK. Specify Spark, Hadoop, or Presto-Sandbox to install these applications on your cluster. You can enable Amazon S3 SSE-KMS in EMRFS in the “Create Cluster - Advanced Options” section of the AWS Management Console or directly in the EMRFS configuration. See the Amazon EMR documentation for more information about release 4.5.0, Spark 1.6.1, Hadoop 2.7.2, Presto 0.140, and support for Amazon S3 SSE-KMS in EMRFS.
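
As an illustration of the SDK route, the following Python sketch builds the parameters for a cluster that uses release label emr-4.5.0, installs the three applications, and turns on SSE-KMS through an EMRFS configuration. It only assembles and prints the request, so it runs without AWS credentials. Everything beyond the release label and application names is an assumption for illustration: the helper name build_emr_request, the cluster name, the instance types and count, the default role names, and the placeholder key ID are hypothetical, and the emrfs-site property names should be checked against the Amazon EMR documentation before use.

```python
import json


def build_emr_request(kms_key_id, name="emr-4.5.0-sse-kms-example"):
    """Return keyword arguments shaped for the boto3 EMR run_job_flow call.

    Hypothetical helper for illustration; adjust instance settings and
    roles to match your account.
    """
    return {
        "Name": name,
        # Release label named in the announcement.
        "ReleaseLabel": "emr-4.5.0",
        # Applications named in the announcement.
        "Applications": [
            {"Name": "Spark"},
            {"Name": "Hadoop"},
            {"Name": "Presto-Sandbox"},
        ],
        # EMRFS settings for SSE-KMS (assumed property names; verify
        # against the Amazon EMR documentation).
        "Configurations": [
            {
                "Classification": "emrfs-site",
                "Properties": {
                    "fs.s3.enableServerSideEncryption": "true",
                    "fs.s3.serverSideEncryption.kms.keyId": kms_key_id,
                },
            }
        ],
        # Illustrative cluster sizing, not a recommendation.
        "Instances": {
            "MasterInstanceType": "m3.xlarge",
            "SlaveInstanceType": "m3.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default EMR role names; assumes they already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# "your-kms-key-id" is a placeholder, not a real key.
request = build_emr_request("your-kms-key-id")
print(json.dumps(request, indent=2))

# To actually launch the cluster (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

Keeping the request as a plain dictionary makes it easy to review or store in version control before passing it to the SDK.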