AWS News Blog

Category: Amazon EMR

Customize and Package Dependencies With Your Apache Spark Applications on Amazon EMR on Amazon EKS

Last AWS re:Invent, we announced the general availability of Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS), a new deployment option for Amazon EMR that allows customers to automate the provisioning and management of Apache Spark on Amazon EKS. With Amazon EMR on EKS, customers can deploy EMR applications on the same Amazon EKS […]

New – Amazon EMR on Amazon Elastic Kubernetes Service (EKS)

Tens of thousands of customers use Amazon EMR to run big data analytics applications on frameworks such as Apache Spark, Hive, HBase, Flink, Hudi, and Presto at scale. EMR automates the provisioning and scaling of these frameworks and optimizes performance with a wide range of EC2 instance types to meet price and performance requirements. Customer […]

New – Using Step Functions to Orchestrate Amazon EMR Workloads

AWS Step Functions allows you to add serverless workflow automation to your applications. The steps of your workflow can run anywhere, including in AWS Lambda functions, on Amazon Elastic Compute Cloud (Amazon EC2), or on-premises. To simplify building workflows, Step Functions is directly integrated with multiple AWS Services: Amazon Elastic Container Service (Amazon ECS), AWS […]

New – Insert, Update, Delete Data on S3 with Amazon EMR and Apache Hudi

Storing your data in Amazon S3 provides lots of benefits in terms of scale, reliability, and cost effectiveness. On top of that, you can leverage Amazon EMR to process and analyze your data using open source tools like Apache Spark, Hive, and Presto. As powerful as these tools are, it can still be challenging to deal with use cases where […]

New – Amazon EMR Instance Fleets

Today we’re excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets give you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and spot bid prices (including spot blocks)! EMR will automatically provision […]

Human Longevity, Inc. – Changing Medicine Through Genomics Research

Human Longevity, Inc. (HLI) is at the forefront of genomics research and wants to build the world’s largest database of human genomes along with related phenotype and clinical data, all in support of preventive healthcare. In today’s guest post, Yaron Turpaz, Bryan Coon, and Ashley Van Zeeland talk about how they are using AWS to […]

Additional At-Rest and In-Transit Encryption Options for Amazon EMR

Our customers use Amazon EMR (including Apache Hadoop and the full range of tools that make up the Apache Spark ecosystem) to handle many types of mission-critical big data use cases. For example: Yelp processes over a terabyte of log files and photos every day. Expedia processes streams of clickstream, user interaction, and supply data. […]
