AWS Blog

New – Auto Scaling for EC2 Spot Fleets

The EC2 Spot Fleet model allows you to create a fleet of EC2 instances with a single request (see Amazon EC2 Spot Fleet API – Manage Thousands of Spot Instances with one Request for more information). You simply specify the fleet’s target capacity, enter a bid price per hour, and choose the instance types that you would like to have as part of your fleet.

Behind the scenes, AWS will maintain the desired target capacity (expressed in terms of instances or a vCPU count) by launching Spot instances that result in the best prices for you. Over time, as instances in the fleet are terminated due to rising prices, replacement instances will be launched using the specifications that result in the lowest price at that point in time.

New Auto Scaling
Today we are enhancing the Spot Fleet model with the addition of Auto Scaling. You can now arrange to scale your fleet up and down based on an Amazon CloudWatch metric. The metric can originate from an AWS service such as EC2, Amazon EC2 Container Service, or Amazon Simple Queue Service (SQS). Alternatively, your application can publish a custom metric and you can use it to drive the automated scaling. Either way, using these metrics to control the size of your fleet gives you fine-grained control over application availability, performance, and cost, even as conditions and loads change. Here are some ideas to get you started:

  • Containers – Scale container-based applications running on Amazon ECS using CPU or memory usage metrics.
  • Batch Jobs – Scale queue-driven batch jobs based on the number of messages in an SQS queue.
  • Spot Fleets – Scale a fleet based on Spot Fleet metrics such as MaxPercentCapacityAllocation.
  • Web Service – Scale web services based on measured response time and average requests per second.
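If your application publishes its own metric, the call involved is CloudWatch's PutMetricData. Here is a minimal sketch using boto3; the namespace `MyApp` and the metric name `PendingJobs` are hypothetical and chosen only for illustration. The request is built by a plain function so that it can be inspected without AWS credentials, and boto3 is imported only when the metric is actually sent.

```python
from typing import Any, Dict


def build_custom_metric(namespace: str, name: str, value: float) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricData call."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": name,
                "Value": float(value),
                "Unit": "Count",
            }
        ],
    }


def publish(kwargs: Dict[str, Any]) -> None:
    """Send the metric to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works offline

    boto3.client("cloudwatch").put_metric_data(**kwargs)


# Hypothetical namespace and metric name, for illustration only.
params = build_custom_metric("MyApp", "PendingJobs", 12)
```

Calling `publish(params)` from the application at a regular interval would give CloudWatch a metric that an alarm can then watch.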

You can set up Auto Scaling using the Spot Fleet Console, the AWS Command Line Interface (CLI), AWS CloudFormation, or by making API calls using one of the AWS SDKs.

I started by launching a fleet. I used the request type "Request and maintain" in order to be able to scale the fleet up and down.
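Outside the console, a "Request and maintain" fleet corresponds to `Type` set to `maintain` in the EC2 RequestSpotFleet API. The sketch below builds such a request with boto3 in mind. The AMI ID, role ARN, bid price, and instance types are placeholders rather than values from this post, and a real launch specification would usually carry more fields, such as a key pair and security groups.

```python
from typing import Any, Dict, List


def build_fleet_request(
    target_capacity: int,
    spot_price: str,
    instance_types: List[str],
    ami_id: str,
    fleet_role_arn: str,
) -> Dict[str, Any]:
    """Build keyword arguments for the EC2 RequestSpotFleet call."""
    return {
        "SpotFleetRequestConfig": {
            "IamFleetRole": fleet_role_arn,
            "TargetCapacity": target_capacity,
            "SpotPrice": spot_price,  # the API expects the bid as a string
            "Type": "maintain",  # "Request and maintain" in the console
            "LaunchSpecifications": [
                {"ImageId": ami_id, "InstanceType": itype}
                for itype in instance_types
            ],
        }
    }


def launch(kwargs: Dict[str, Any]) -> str:
    """Submit the request and return the Spot Fleet request ID."""
    import boto3  # imported lazily; needs AWS credentials to run

    response = boto3.client("ec2").request_spot_fleet(**kwargs)
    return response["SpotFleetRequestId"]


# Placeholder values, for illustration only.
request = build_fleet_request(
    target_capacity=2,
    spot_price="0.05",
    instance_types=["m3.medium", "m4.large"],
    ami_id="ami-12345678",
    fleet_role_arn="arn:aws:iam::123456789012:role/my-spot-fleet-role",
)
```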

My fleet was up and running within a minute or so.

Then (for illustrative purposes) I created an SQS queue, put some messages in it, and defined a CloudWatch alarm (AppQueueBackingUp) that would fire if there were 10 or more messages visible in the queue.

I also defined an alarm (AppQueueNearlyEmpty) that would fire if the queue was just about empty (2 messages or fewer).
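The two alarms can also be expressed as CloudWatch PutMetricAlarm requests. This sketch assumes the alarms watch the SQS metric `ApproximateNumberOfMessagesVisible`; the queue name `app-queue`, the `Average` statistic, and the five-minute period are assumptions, since the post does not state them. Only the alarm names and thresholds come from the text above.

```python
from typing import Any, Dict


def build_queue_alarm(
    alarm_name: str, queue_name: str, comparison: str, threshold: float
) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricAlarm call."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Average",  # assumption: not stated in the post
        "Period": 300,  # assumption: evaluate over five minutes
        "EvaluationPeriods": 1,
        "ComparisonOperator": comparison,
        "Threshold": float(threshold),
    }


def create_alarm(kwargs: Dict[str, Any]) -> None:
    """Create or update the alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works offline

    boto3.client("cloudwatch").put_metric_alarm(**kwargs)


# Fires at 10 or more visible messages.
backing_up = build_queue_alarm(
    "AppQueueBackingUp", "app-queue", "GreaterThanOrEqualToThreshold", 10
)
# Fires at 2 or fewer visible messages.
nearly_empty = build_queue_alarm(
    "AppQueueNearlyEmpty", "app-queue", "LessThanOrEqualToThreshold", 2
)
```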

Finally, I attached the alarms to the ScaleUp and ScaleDown policies for my fleet.
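Programmatically, Spot Fleet scaling goes through the Application Auto Scaling API: the fleet is registered as a scalable target, and each policy is created with PutScalingPolicy. The sketch below builds those requests. The fleet ID `sfr-EXAMPLE`, the capacity bounds, the cooldown, and the step size of 3 are assumptions made for illustration; the post does not give the console settings.

```python
from typing import Any, Dict

NAMESPACE = "ec2"
DIMENSION = "ec2:spot-fleet-request:TargetCapacity"


def build_scalable_target(fleet_id: str, min_cap: int, max_cap: int) -> Dict[str, Any]:
    """Build keyword arguments for RegisterScalableTarget."""
    return {
        "ServiceNamespace": NAMESPACE,
        "ResourceId": f"spot-fleet-request/{fleet_id}",
        "ScalableDimension": DIMENSION,
        "MinCapacity": min_cap,
        "MaxCapacity": max_cap,
    }


def build_step_policy(name: str, fleet_id: str, adjustment: int) -> Dict[str, Any]:
    """Build keyword arguments for PutScalingPolicy (step scaling)."""
    # A scale-up alarm breaches above its threshold, so the step covers
    # the interval from the threshold upward; scale-down is the mirror image.
    bound = "MetricIntervalLowerBound" if adjustment > 0 else "MetricIntervalUpperBound"
    return {
        "PolicyName": name,
        "ServiceNamespace": NAMESPACE,
        "ResourceId": f"spot-fleet-request/{fleet_id}",
        "ScalableDimension": DIMENSION,
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            "Cooldown": 300,  # assumption: five-minute cooldown
            "StepAdjustments": [{bound: 0.0, "ScalingAdjustment": adjustment}],
        },
    }


def apply(target: Dict[str, Any], policy: Dict[str, Any]) -> str:
    """Register the target, create the policy, and return the policy ARN."""
    import boto3  # imported lazily; needs AWS credentials to run

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    return client.put_scaling_policy(**policy)["PolicyARN"]


# Hypothetical fleet ID, bounds, and step sizes, for illustration only.
target = build_scalable_target("sfr-EXAMPLE", min_cap=2, max_cap=10)
scale_up = build_step_policy("ScaleUp", "sfr-EXAMPLE", 3)
scale_down = build_step_policy("ScaleDown", "sfr-EXAMPLE", -3)
```

The ARN returned by `apply` is what ties an alarm to a policy: passing it in the `AlarmActions` list of a PutMetricAlarm request makes the alarm trigger that policy.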

Before I started writing this post, I put 5 messages into the SQS queue. With the fleet launched and the scaling policies in place, I added 5 more, and then waited for the alarm to fire.

Then I checked in on my fleet, and saw that the capacity had been increased as expected. This was visible in the History tab (“New targetCapacity: 5”).

To wrap things up, I purged all of the messages from my queue, watered my plants, and returned to find that my fleet had been scaled down as expected (“New targetCapacity: 2”).
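The whole walkthrough can be replayed as a toy model. The alarm thresholds (10 or more, 2 or fewer) and the observed capacities (5, then 2) come from the post; the starting capacity of 2, the step size of 3, and the capacity bounds are assumptions that happen to reproduce those numbers. This is a simplified sketch of the decision logic, not how the service is implemented, and it ignores cooldowns and evaluation periods.

```python
from typing import List, Optional


def alarm_for(visible_messages: int) -> Optional[str]:
    """Return which alarm (if any) would be in the ALARM state."""
    if visible_messages >= 10:
        return "AppQueueBackingUp"
    if visible_messages <= 2:
        return "AppQueueNearlyEmpty"
    return None


def next_capacity(
    current: int, alarm: Optional[str], step: int = 3, low: int = 2, high: int = 10
) -> int:
    """Apply the ScaleUp or ScaleDown policy, clamped to the capacity bounds."""
    if alarm == "AppQueueBackingUp":
        return min(current + step, high)
    if alarm == "AppQueueNearlyEmpty":
        return max(current - step, low)
    return current


def replay(queue_depths: List[int], start: int = 2) -> List[int]:
    """Return the target capacity after each queue-depth observation."""
    capacity = start
    history = []
    for depth in queue_depths:
        capacity = next_capacity(capacity, alarm_for(depth))
        history.append(capacity)
    return history


# 5 messages, then 10 after adding 5 more, then 0 after the purge.
capacities = replay([5, 10, 0])
print(capacities)  # [2, 5, 2]
```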

Available Now
This new feature is available now and you can start using it today in all regions where Spot instances are supported.

Jeff;