AWS Big Data Blog

Respond to State Changes on Amazon EMR Clusters with Amazon CloudWatch Events

Jonathan Fritz is a Senior Product Manager for Amazon EMR

Customers can take advantage of the Amazon EMR API to create and terminate EMR clusters, scale clusters using Auto Scaling or manual resizing, and submit and run Apache Spark, Apache Hive, or Apache Pig workloads. These decisions are often triggered from cluster state-related information.

Previously, you could use the “describe” and “list” set of API operations to find the relevant information about your EMR clusters and associated instance groups, steps, and Auto Scaling policies. However, programmatic applications that check resource state changes and post notifications or take actions are forced to poll these API operations, which provides a slower end-to-end reaction time and additional management overhead than if you were able to use an event-driven architecture.

With new support for Amazon EMR in Amazon CloudWatch Events, you can be notified quickly and programmatically respond to state changes in your EMR clusters. Additionally, these events are also displayed in the Amazon EMR console, on the Cluster Details page in the Events section.

There are four new EMR event types:

  • Cluster State Change
  • Instance Group State Change
  • Step State Change
  • Auto Scaling State Change

CloudWatch Events allows you to create filters and rules to match these events and route them to Amazon SNS topics, AWS Lambda functions, Amazon SQS queues, streams in Amazon Kinesis Streams, or built-in targets. You then have the ability to programmatically act on these events, including sending emails and SMS messages, running retry logic in Lambda, or tracking the state of running steps. For more information about the sample events generated for each event type, see the CloudWatch Events documentation.

The following is an example using the CloudWatch Events console to route EMR step failure events to Lambda for automated retry logic and to SNS to push a notification to an email alias:

Cloudwatch_1

You can create rules for EMR event types in the CloudWatch Events console, AWS CLI, or the AWS SDKs using the CloudWatch Events API.

If you have any questions or would like to share an interesting use case about events and notifications with EMR, please leave a comment below.


Related

Dynamically Scale Applications on Amazon EMR with Auto Scaling

AutoScaling_SocialMedia