Amazon EMR Release 4.0.0 With New Versions of Apache Hadoop, Hive, and Spark Now Available

Posted on: Jul 24, 2015

You can now deploy Amazon EMR release 4.0.0 on your Amazon EMR cluster. Amazon EMR 4.0.0 includes Apache Hadoop 2.6.0, Apache Hive 1.0, Apache Pig 0.14, and Apache Spark 1.4.1. These applications can also leverage core Amazon EMR features such as the EMR File System (EMRFS) with support for Amazon S3 server-side and client-side encryption and consistent view, the DynamoDB Storage Handler for Hive, and the Amazon EMR connector for Amazon Kinesis.

With Amazon EMR release 4.0.0, we have introduced a new packaging system and are now using standard Hadoop ports and paths, allowing us to update versions and include new Hadoop ecosystem projects at an even faster rate. Additionally, we’ve streamlined application configuration so you can now directly edit settings for Hadoop applications when creating Amazon EMR clusters instead of using the configure-hadoop bootstrap action. You can also use the new Quick Cluster Configuration experience in the AWS Management Console to further streamline the Amazon EMR cluster creation process.
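As a rough illustration of the streamlined configuration, settings are supplied at cluster creation as a list of configuration objects, each naming a classification (the configuration file to change) and the properties to set in it. The Python sketch below only builds and prints such a list. The helper name `classification` is hypothetical, and the property values are placeholders chosen for illustration, not recommendations.

```python
import json


def classification(name, properties):
    """Build one configuration object (hypothetical helper, not an AWS API).

    `name` is the classification, e.g. "core-site" or "mapred-site".
    `properties` maps setting names to string values.
    """
    return {"Classification": name, "Properties": dict(properties)}


# Placeholder values for illustration only; property values are strings.
configurations = [
    classification("core-site", {"io.file.buffer.size": "65536"}),
    classification("mapred-site", {"mapreduce.map.memory.mb": "2048"}),
]

# Print the JSON form of the list built above.
print(json.dumps(configurations, indent=2))
```

A list in this shape takes the place of the arguments previously passed to the configure-hadoop bootstrap action.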

You can create an Amazon EMR cluster with release 4.0.0 from the AWS Management Console, the AWS CLI, or an AWS SDK by choosing the release label “emr-4.0.0”. To learn about the new configuration experience for Hadoop applications, other improvements, and the migration path from AMI version 2.x or 3.x to Amazon EMR release 4.0.0, please visit the Amazon EMR Release Guide, which also has more information about Hadoop 2.6.0, Hive 1.0, Pig 0.14, and Spark 1.4.1.
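For the SDK route, a minimal Python sketch might look like the following, assuming the boto3 library and its EMR `run_job_flow` call. The function names `build_cluster_request` and `launch`, the cluster name, the instance types and count, the region, and the use of the default EMR roles are all assumptions made for illustration; adjust them to your account.

```python
def build_cluster_request(name, configurations=None):
    """Assemble run_job_flow arguments for an emr-4.0.0 cluster (illustrative)."""
    request = {
        "Name": name,
        # The release label selects the release.
        "ReleaseLabel": "emr-4.0.0",
        "Applications": [{"Name": app} for app in ("Hadoop", "Hive", "Pig", "Spark")],
        "Instances": {
            # Assumed instance types and count; pick what suits your workload.
            "MasterInstanceType": "m3.xlarge",
            "SlaveInstanceType": "m3.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumes the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
    if configurations:
        # Optional list of {"Classification": ..., "Properties": {...}} objects.
        request["Configurations"] = configurations
    return request


def launch(request, region="us-east-1"):
    """Submit the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building a request does not require boto3

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]


request = build_cluster_request("emr-4-0-0-demo")
print(request["ReleaseLabel"])
```

Calling `launch(request)` would start a real cluster and incur charges, so the sketch stops at building the request.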