Posted On: Feb 16, 2016

Amazon EMR is a service that lets you process data with distributed data processing frameworks such as Apache Hadoop, Apache Spark and Presto. You can now customize the storage on the Amazon Elastic Compute Cloud (EC2) instances in your Amazon EMR cluster by attaching Amazon Elastic Block Store (EBS) volumes to them. Adding EBS volumes to an instance is beneficial if your processing requirements need more Hadoop Distributed File System (HDFS) or local storage than is available by default on an instance; if you want to take advantage of the latest-generation EC2 families such as M4, C4 and R3, but are constrained by the storage available on these instance types; or if you want to optimize the ratio of storage to compute on an Amazon EMR cluster.

Amazon EMR supports the Amazon EBS General Purpose SSD (gp2), Magnetic (standard) and Provisioned IOPS (io1) volume types. The added EBS volumes are tied to the lifecycle of the associated instances and augment any existing storage on these instances. EBS volumes used with Amazon EMR are charged at regular EBS rates. When you terminate a cluster, its EBS volumes are automatically deleted and you stop paying for them. Visit the documentation to learn more.
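As a rough illustration of attaching EBS volumes to cluster instances, the sketch below builds an instance group definition in Python. The field names (EbsConfiguration, EbsBlockDeviceConfigs, VolumeSpecification, VolumesPerInstance) follow the Amazon EMR RunJobFlow API as we understand it and should be checked against the current documentation. The helper name ebs_instance_group, the group name, the instance type and count, and the volume sizes are hypothetical values chosen for the example.

```python
from typing import Any, Dict, Optional

# Volume types named in this announcement: General Purpose SSD,
# Provisioned IOPS, and Magnetic.
SUPPORTED_VOLUME_TYPES = {"gp2", "io1", "standard"}


def ebs_instance_group(
    name: str,
    role: str,
    instance_type: str,
    instance_count: int,
    volume_type: str,
    size_gb: int,
    volumes_per_instance: int = 1,
    iops: Optional[int] = None,
) -> Dict[str, Any]:
    """Build one instance group entry with EBS volumes attached.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    if volume_type not in SUPPORTED_VOLUME_TYPES:
        raise ValueError(f"unsupported volume type: {volume_type}")
    # Assumption: an IOPS value accompanies io1 volumes and no other type.
    if (volume_type == "io1") != (iops is not None):
        raise ValueError("iops must be set for io1 volumes and only for io1")

    volume_spec: Dict[str, Any] = {"VolumeType": volume_type, "SizeInGB": size_gb}
    if iops is not None:
        volume_spec["Iops"] = iops

    return {
        "Name": name,
        "InstanceRole": role,
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "EbsConfiguration": {
            "EbsBlockDeviceConfigs": [
                {
                    "VolumeSpecification": volume_spec,
                    "VolumesPerInstance": volumes_per_instance,
                }
            ]
        },
    }


# Example: two core nodes, each with two 100 GB gp2 volumes (illustrative values).
core_group = ebs_instance_group(
    "Core nodes", "CORE", "m4.xlarge", 2, "gp2", 100, volumes_per_instance=2
)
```

A dictionary like core_group would typically be placed in the list of instance groups passed when creating a cluster, for example through an AWS SDK. The rest of the cluster definition (release, roles, applications) is outside the scope of this sketch.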

You can now also launch Amazon EMR clusters with the next-generation, EBS-only M4 and C4 EC2 instance families. The M4 and C4 instance families are available in US East (Northern Virginia), US West (Oregon and Northern California), EU (Ireland and Frankfurt), and Asia Pacific (Tokyo, Seoul, Singapore and Sydney). The C4 EC2 instance family is also available in the China (Beijing) region. These instances are designed to deliver the highest level of processor performance on EC2. They also offer Enhanced Networking, which delivers up to 4 times the packet rate of instances without Enhanced Networking while maintaining consistent latency, even under high network I/O. Both the M4 and C4 instances are EBS-Optimized by default, with additional, dedicated network capacity for I/O operations.

Please see the Amazon EMR pricing page for more details on the M4 and C4 EC2 instance types.