How can I configure automatic scaling in Amazon EMR?
Last updated: 2020-09-18
I want to use Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling on an Amazon EMR cluster.
- Amazon EMR versions 5.30.0, 6.1.0, and later: Use EMR managed scaling. Or, use automatic scaling with a custom policy for instance groups.
- Amazon EMR versions 4.0.0-5.29.0 and 6.0.0: Use automatic scaling with a custom policy for instance groups.
Amazon EMR versions 5.30.0, 6.1.0, and later
If you're using Amazon EMR 5.30.0, 6.1.0, or a later version, you have two options for automatic scaling:
- Enable EMR managed scaling to automatically increase or decrease the number of instances or units in your cluster based on workload.
- Use automatic scaling with a custom policy for instance groups, as explained in the following section.
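As a rough illustration of the managed scaling option, the following Python sketch builds a managed scaling policy and shows how it might be applied with the AWS SDK for Python (Boto3). The helper names, the capacity limits of 2 and 10, the Region, and the cluster ID are assumptions for illustration only; adjust them for your cluster.

```python
def build_managed_scaling_policy(min_units, max_units, unit_type="Instances"):
    """Return a ManagedScalingPolicy structure for PutManagedScalingPolicy.

    unit_type is one of "Instances", "InstanceFleetUnits", or "VCPU".
    """
    if min_units > max_units:
        raise ValueError("min_units must not exceed max_units")
    return {
        "ComputeLimits": {
            "UnitType": unit_type,
            "MinimumCapacityUnits": min_units,
            "MaximumCapacityUnits": max_units,
        }
    }


def apply_managed_scaling(cluster_id, policy, region_name="us-east-1"):
    """Attach the policy to a running cluster (requires AWS credentials)."""
    import boto3  # imported here so building the policy doesn't need the SDK

    emr = boto3.client("emr", region_name=region_name)
    return emr.put_managed_scaling_policy(
        ClusterId=cluster_id,
        ManagedScalingPolicy=policy,
    )


# Hypothetical limits: keep the cluster between 2 and 10 instances.
policy = build_managed_scaling_policy(min_units=2, max_units=10)
print(policy)

# Example call with a placeholder cluster ID (not executed here):
# apply_managed_scaling("j-XXXXXXXXXXXXX", policy)
```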
Amazon EMR versions 4.0.0 and later
- Follow the steps at Using automatic scaling with a custom policy for instance groups. For information about the Amazon CloudWatch metrics that you can use for automatic scaling in Amazon EMR, see Monitor metrics with CloudWatch. The following are two commonly used metrics for automatic scaling:
YARNMemoryAvailablePercentage: This is the percentage of remaining memory that's available for YARN.
ContainerPendingRatio: This is the ratio of pending containers to allocated containers. You can use this metric to scale a cluster based on container-allocation behavior for varied loads. This is useful for performance tuning.
- To confirm that the scaling policy is attached to the instance group, choose Events from the navigation pane.
- Check for automatic scaling policy events.
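The steps above can also be sketched with the AWS SDK for Python (Boto3). The following example builds a custom automatic scaling policy from the two metrics described earlier, attaches it to an instance group, and then reads back the policy state. The helper names, rule names, thresholds, capacity limits, Region, and IDs are all assumptions chosen for illustration, not recommended values. The sketch also assumes that the cluster was created with an automatic scaling role.

```python
def build_scaling_rule(name, metric, operator, threshold, adjustment,
                       unit="PERCENT", cooldown=300):
    """Return one scaling rule that adds or removes `adjustment` instances."""
    return {
        "Name": name,
        "Description": f"{metric} {operator} {threshold}",
        "Action": {
            "SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": adjustment,
                "CoolDown": cooldown,
            }
        },
        "Trigger": {
            "CloudWatchAlarmDefinition": {
                "ComparisonOperator": operator,
                "EvaluationPeriods": 1,
                "MetricName": metric,
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,
                "Statistic": "AVERAGE",
                "Threshold": threshold,
                "Unit": unit,
                # Amazon EMR substitutes the cluster ID for this placeholder.
                "Dimensions": [{"Key": "JobFlowId", "Value": "${emr.clusterId}"}],
            }
        },
    }


# Hypothetical thresholds: scale out when memory is scarce or containers
# are queuing, and scale in when most YARN memory is free.
policy = {
    "Constraints": {"MinCapacity": 2, "MaxCapacity": 10},
    "Rules": [
        build_scaling_rule("ScaleOutOnLowMemory",
                           "YARNMemoryAvailablePercentage", "LESS_THAN", 15.0, 1),
        build_scaling_rule("ScaleOutOnPendingContainers",
                           "ContainerPendingRatio", "GREATER_THAN", 0.75, 1,
                           unit="COUNT"),
        build_scaling_rule("ScaleInOnHighMemory",
                           "YARNMemoryAvailablePercentage", "GREATER_THAN", 75.0, -1),
    ],
}


def attach_and_check(cluster_id, instance_group_id, policy, region_name="us-east-1"):
    """Attach the policy, then return its state (for example, "ATTACHED")."""
    import boto3  # imported here so building the policy doesn't need the SDK

    emr = boto3.client("emr", region_name=region_name)
    emr.put_auto_scaling_policy(
        ClusterId=cluster_id,
        InstanceGroupId=instance_group_id,
        AutoScalingPolicy=policy,
    )
    groups = emr.list_instance_groups(ClusterId=cluster_id)["InstanceGroups"]
    for group in groups:
        if group["Id"] == instance_group_id:
            return group.get("AutoScalingPolicy", {}).get("Status", {}).get("State")
    return None


# Example call with placeholder IDs (not executed here):
# attach_and_check("j-XXXXXXXXXXXXX", "ig-XXXXXXXXXXXXX", policy)
```

The state might not read ATTACHED immediately after the call, so the events in the console remain the more direct confirmation.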