Why aren't my EMR Spot Instances being provisioned during a cluster resize?


My Amazon EMR Spot Instances aren't being provisioned during a resize of my EMR cluster.

Resolution

Amazon Elastic Compute Cloud (Amazon EC2) might interrupt your Spot Instance at any time for the following reasons:

  • Lack of Spot capacity.
  • The request constraints can't be met.
  • The Spot price is higher than your designated maximum price.
  • Your Spot account quota is exhausted. If this is the case, then you can request an increase.

For more information, see Why did Amazon EC2 interrupt my Spot Instance?
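The reasons above can be matched against Spot request status codes. The following is a minimal sketch, assuming the response has the shape returned by the EC2 DescribeSpotInstanceRequests API (for example, from boto3's describe_spot_instance_requests). The helper name explain_spot_requests and the code-to-reason mapping are hypothetical and cover only three status codes; an exhausted quota is not covered.

```python
# Illustrative mapping from Spot request status codes to the reasons
# listed in this article. The mapping is an assumption, not an official table.
REASONS = {
    "capacity-not-available": "Lack of Spot capacity",
    "constraint-not-fulfillable": "The request constraints can't be met",
    "price-too-low": "The Spot price is higher than the designated maximum price",
}


def explain_spot_requests(response):
    """Return (request_id, status_code, reason) for each request whose
    status code appears in REASONS. Other requests are skipped."""
    results = []
    for request in response.get("SpotInstanceRequests", []):
        code = request.get("Status", {}).get("Code", "")
        if code in REASONS:
            request_id = request.get("SpotInstanceRequestId")
            results.append((request_id, code, REASONS[code]))
    return results
```

Requests with any other status code, such as fulfilled, are left out of the result, so an empty list doesn't rule out the other causes.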

Note: It's a best practice to use Spot Instances for workloads that are stateless, fault-tolerant, and flexible enough to withstand interruptions.

Also, both Spot Instances and On-Demand Instances might fail to provision during a resize because the bootstrap scripts were modified or contain errors.

Check the bootstrap action logs on the node at /emr/instance-controller/log/bootstrap-actions or in Amazon S3 at s3://cluster_id/node-failed/bootstrap-actions/stderr.gz. When a bootstrap action fails, the logs show the error STARTUP_SCRIPT_FAILED_RET_CODE.
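After you download a stderr.gz log, you can scan it for likely failure lines. The following is a minimal sketch: the function name find_bootstrap_errors and the match pattern are assumptions for illustration, so adjust the pattern to the messages that your own bootstrap scripts produce.

```python
import gzip
import re

# Assumed pattern: lines that start with "Error", "ERROR", or "Requires",
# which is how yum reports a failed package install. Not an exhaustive list.
ERROR_PATTERN = re.compile(r"^\s*(Error|ERROR|Requires)\b")


def find_bootstrap_errors(gz_bytes):
    """Decompress a gzipped bootstrap stderr log and return the
    stripped lines that match ERROR_PATTERN, in their original order."""
    text = gzip.decompress(gz_bytes).decode("utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if ERROR_PATTERN.search(line)]
```

For a local copy of the log, you could pass in the file contents, for example open("stderr.gz", "rb").read().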

For example, the following bootstrap action log shows that bootstrap action 1 (emr_bootstrap_actions.sh) failed:

Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory : 125 M RSS (444 MB VSZ)
    Started: Tue Jul 19 05:36:36 2022 - 00:03 ago
    State  : Running, pid: 7914
Error: Package: falcon-sensor-4.18.0-6403.amzn2.x86_64 (/falcon-sensor-4.18.0-6403.amzn2.x86_64)
           Requires: systemd

If you see the preceding error, then the following actions happen:

  • All of the new replacement nodes terminate.
  • Amazon EMR stops provisioning new replacement instances.
  • The core node instance group goes into arrested mode, as shown in the following example:

"state": "ARRESTED",
"message": "Instance group ig-2JN5xxxxxxxx in Amazon EMR cluster j-37H4xxxxxxx (emr-xxxxx-spark-cluster) was arrested at  for the following reason: Error provisioning instances."
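To confirm which instance groups are arrested, you can inspect the cluster's instance group list. The following is a minimal sketch, assuming the response has the shape returned by the Amazon EMR ListInstanceGroups API (for example, from boto3's list_instance_groups). The helper name find_arrested_groups is hypothetical.

```python
def find_arrested_groups(response):
    """Return (group_id, group_type, message) for every instance group
    whose state is ARRESTED in a ListInstanceGroups-style response."""
    arrested = []
    for group in response.get("InstanceGroups", []):
        status = group.get("Status", {})
        if status.get("State") == "ARRESTED":
            # StateChangeReason carries the human-readable explanation.
            message = status.get("StateChangeReason", {}).get("Message", "")
            arrested.append((group.get("Id"), group.get("InstanceGroupType"), message))
    return arrested
```

An empty result means that no group in the response is in the ARRESTED state.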

Related information

Spot Instance interruptions

Spot request status

Spot Instance best practices

Why is my Spot Instance terminating even though the maximum price is higher than the Spot price?

AWS OFFICIAL
Updated a year ago