AWS News Blog

Category: Amazon EMR

New – Amazon EMR Instance Fleets

Today we’re excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets give you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and spot bid prices (including spot blocks)! EMR will automatically provision On-Demand […]

Human Longevity, Inc. – Changing Medicine Through Genomics Research

Human Longevity, Inc. (HLI) is at the forefront of genomics research and wants to build the world’s largest database of human genomes along with related phenotype and clinical data, all in support of preventive healthcare. In today’s guest post, Yaron Turpaz, Bryan Coon, and Ashley Van Zeeland talk about how they are using AWS to […]

Additional At-Rest and In-Transit Encryption Options for Amazon EMR

Our customers use Amazon EMR (including Apache Hadoop and the full range of tools that make up the Apache Spark ecosystem) to handle many types of mission-critical big data use cases. For example: Yelp processes over a terabyte of log files and photos every day. Expedia processes streams of clickstream, user interaction, and supply data. FINRA analyzes […]

Amazon EMR 5.0.0 – Major App Updates, UI Improvements, Better Debugging, and More

The Amazon EMR team has been cranking out new releases at a fast and furious pace! Here’s a quick recap of this year’s launches: EMR 4.7.0 – Updates to Apache Tez, Apache Phoenix, Presto, HBase, and Mahout (June). EMR 4.6.0 – HBase for realtime access to massive datasets (April). EMR 4.5.0 – Updates to Hadoop, Presto; addition […]

Amazon EMR Update – Support for EBS Volumes, and M4 & C4 Instance Types

My colleague Abhishek Sinha wrote the guest post below to tell you about the latest additions to Amazon EMR. Amazon EMR is a service that allows you to use distributed data processing frameworks such as Apache Hadoop, Apache Spark, and Presto to process data on a managed cluster of EC2 instances. Newer versions of EMR (3.10 […]

EMR 4.3.0 – New & Updated Applications + Command Line Export

My colleague Jon Fritz wrote the blog post below to introduce you to some new features of Amazon EMR. — Jeff; Today we are announcing Amazon EMR release 4.3.0, which adds support for Apache Hadoop 2.7.1, Apache Spark 1.6.0, Ganglia 3.7.2, and a new sandbox release for Presto (0.130). We have also enhanced our maximizeResourceAllocation setting […]

New – Launch Amazon EMR Clusters in Private Subnets

My colleague Jon Fritz wrote the guest post below to introduce you to an important new feature for Amazon EMR. — Jeff; Today we are announcing that Amazon EMR now supports launching clusters in Amazon Virtual Private Cloud (VPC) private subnets, allowing you to quickly, cost-effectively, and securely create fully configured clusters with Hadoop ecosystem applications, […]