AWS News Blog

Category: Amazon EMR

AWS Week In Review - June 6, 2022

AWS Week In Review – June 6, 2022

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS! I’ve just come back from a long (extended) holiday weekend here in the US and I’m still catching up on all the AWS launches that happened this past week. I’m […]

Amazon EMR Serverless Now Generally Available – Run Big Data Applications without Managing Servers

At AWS re:Invent 2021, we introduced three new serverless options for our data analytics services – Amazon EMR Serverless, Amazon Redshift Serverless, and Amazon MSK Serverless – that make it easier to analyze data at any scale without having to configure, scale, or manage the underlying infrastructure. Today we announce the general availability of Amazon […]

New – Create and Manage EMR Clusters and Spark Jobs with Amazon SageMaker Studio

Today, we’re very excited to offer three new enhancements to our Amazon SageMaker Studio service. As of now, users of SageMaker Studio can create, terminate, manage, discover, and connect to Amazon EMR clusters running within a single AWS account and in shared accounts across an organization—all directly from SageMaker Studio. Furthermore, SageMaker Studio Notebook users […]

Customize and Package Dependencies With Your Apache Spark Applications on Amazon EMR on Amazon EKS

Last AWS re:Invent, we announced the general availability of Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS), a new deployment option for Amazon EMR that allows customers to automate the provisioning and management of Apache Spark on Amazon EKS. With Amazon EMR on EKS, customers can deploy EMR applications on the same Amazon EKS […]

New – Amazon EMR on Amazon Elastic Kubernetes Service (EKS)

Tens of thousands of customers use Amazon EMR to run big data analytics applications on frameworks such as Apache Spark, Hive, HBase, Flink, Hudi, and Presto at scale. EMR automates the provisioning and scaling of these frameworks and optimizes performance with a wide range of EC2 instance types to meet price and performance requirements. Customer […]

New – Using Step Functions to Orchestrate Amazon EMR Workloads

AWS Step Functions allows you to add serverless workflow automation to your applications. The steps of your workflow can run anywhere, including in AWS Lambda functions, on Amazon Elastic Compute Cloud (Amazon EC2), or on-premises. To simplify building workflows, Step Functions is directly integrated with multiple AWS Services: Amazon Elastic Container Service (Amazon ECS), AWS […]

New – Insert, Update, Delete Data on S3 with Amazon EMR and Apache Hudi

Storing your data in Amazon S3 provides lots of benefits in terms of scale, reliability, and cost effectiveness. On top of that, you can leverage Amazon EMR to process and analyze your data using open source tools like Apache Spark, Hive, and Presto. As powerful as these tools are, it can still be challenging to deal with use cases where […]

New – Amazon EMR Instance Fleets

Today we’re excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets gives you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and spot bid prices (including spot blocks)! EMR will automatically provision […]