AWS Compute Blog

Powering your Amazon ECS Cluster with Amazon EC2 Spot Instances

This post was graciously contributed by:

Chad Schmutzer
Solutions Architect
Shawn O’Connor
Solutions Architect

Today we are excited to announce that Amazon EC2 Container Service (Amazon ECS) now supports the ability to launch your ECS cluster on Amazon EC2 Spot Instances directly from the ECS console.

Spot Instances allow you to bid on spare Amazon EC2 compute capacity. Spot Instances typically cost 50-90% less than On-Demand Instances. Powering your ECS cluster with Spot Instances lets you reduce the cost of running your existing containerized workloads, or increase your compute capacity by two to ten times while keeping the same budget. Or you could do a combination of both!
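The "two to ten times" figure follows from the discount range: at a fixed budget, a discount of d percent buys 100 / (100 - d) times the capacity. The sketch below works through that arithmetic; the function name `capacity_multiplier` and the sample discounts are illustrative assumptions, not part of any AWS tool.

```shell
#!/bin/bash
# Capacity multiplier at a fixed budget for a given Spot discount (in percent).
# multiplier = 100 / (100 - discount)
capacity_multiplier() {
  awk -v d="$1" 'BEGIN { printf "%.1f\n", 100 / (100 - d) }'
}

# Sample discounts chosen for illustration only.
for discount in 50 75 90; do
  echo "${discount}% discount -> $(capacity_multiplier "$discount")x capacity"
done
```

A 50% discount doubles capacity and a 90% discount yields ten times the capacity, which matches the range quoted above.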

Using Spot Instances, you specify the price you are willing to pay per instance-hour. Your Spot Instance runs whenever your bid exceeds the current Spot price. If your instance is reclaimed due to an increase in the Spot price, you are not charged for the partial hour that your instance has run.

The ECS console uses Spot Fleet to deploy your Spot Instances. Spot Fleet attempts to deploy the target capacity you request (expressed in terms of instances or a vCPU count) for your containerized application by launching Spot Instances that result in the best prices for you. If your Spot Instances are reclaimed due to a change in Spot prices or available capacity, Spot Fleet also attempts to maintain its target capacity.

Containers are a natural fit for the diverse pool of resources that Spot Fleet thrives on. Spot Fleet enables you to provision capacity across multiple Spot Instance pools (combinations of instance types and Availability Zones), which helps improve your application’s availability and reduce operating costs of the fleet over time. Combining the extensible and flexible container placement system provided by ECS with Spot Fleet allows you to efficiently deploy containerized workloads and easily manage clusters at any scale for a fraction of the cost.

Previously, deploying your ECS cluster on Spot Instances was a manual process. In this post, we show you how to achieve high availability, scalability, and cost savings for your container workloads by using the new Spot Fleet integration in the ECS console. We also show you how to build your own ECS cluster on Spot Instances using AWS CloudFormation.

Creating an ECS cluster running on Spot Instances

You can create an ECS cluster using the AWS Management Console.

  1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
  2. In the navigation pane, choose Clusters.
  3. On the Clusters page, choose Create Cluster.
  4. For Cluster name, enter a name.
  5. In Instance configuration, for Provisioning model, choose Spot.

ECS Create Cluster - Spot Fleet

Choosing an allocation strategy

The two available Spot Fleet allocation strategies are Diversified and Lowest price.

ECS Spot Allocation Strategies

The allocation strategy you choose for your Spot Fleet determines how it fulfills your Spot Fleet request from the possible Spot Instance pools. When you use the diversified strategy, the Spot Instances are distributed across all pools. When you use the lowest price strategy, the Spot Instances come from the lowest-priced pool among those specified in your request.

Remember that each instance type (the instance size within each instance family, e.g., c4.4xlarge), in each Availability Zone, in every region, is a separate pool of capacity, and therefore a separate Spot market. By diversifying across as many different instance types and Availability Zones as possible, you can improve the availability of your fleet. You also make your fleet less sensitive to increases in the Spot price in any one pool over time.
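To make the idea of a pool concrete, here is a minimal sketch that enumerates the pools for one hypothetical request. The instance types and Availability Zone names are assumptions chosen for illustration, and a real region may not offer every type in every zone.

```shell
#!/bin/bash
# Hypothetical inputs: six instance types and three Availability Zones.
INSTANCE_TYPES="m3.xlarge m4.xlarge c3.xlarge c4.xlarge r3.xlarge r4.xlarge"
AVAILABILITY_ZONES="us-east-1a us-east-1b us-east-1c"

# Each (instance type, Availability Zone) pair is its own Spot pool.
list_pools() {
  for type in $INSTANCE_TYPES; do
    for az in $AVAILABILITY_ZONES; do
      echo "$type/$az"
    done
  done
}

list_pools
echo "Total pools: $(list_pools | wc -l | tr -d ' ')"
```

Six instance types across three zones give 18 separate pools, so a price increase or capacity shortage in any one of them affects only a small share of a diversified fleet.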

Spot Fleet Market

You can select up to six EC2 instance types to use for your Spot Fleet. In this example, we’ve selected m3, m4, c3, c4, r3, and r4 instance types of size xlarge.

Spot Instance Selection

You need to enter a bid for your instances. Typically, bidding at or near the On-Demand Instance price is a good starting point. Your bid is the maximum price that you are willing to pay for instance types in that Spot pool. While the Spot price is at or below your bid, you pay the Spot price. Bidding lower ensures that you have lower costs, while bidding higher reduces the probability of interruption.
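The bidding rule above can be sketched as a small function: while the Spot price is at or below the bid, the instance runs and you pay the Spot price; otherwise it is interrupted. The function name `spot_outcome` and the prices are made up for illustration and are not real market data.

```shell
#!/bin/bash
# Prints the hourly price paid, or "interrupted" if the Spot price exceeds the bid.
# Arguments: bid price, current Spot price (both hypothetical, in USD per hour).
spot_outcome() {
  awk -v bid="$1" -v spot="$2" 'BEGIN {
    # Adding 0 forces a numeric rather than a string comparison.
    if (spot + 0 <= bid + 0) printf "running, paying %.4f\n", spot
    else print "interrupted"
  }'
}

spot_outcome 0.20 0.06   # Spot price below the bid
spot_outcome 0.20 0.25   # Spot price above the bid
```

Note that in the first case the charge is the Spot price, not the bid, which is why bidding near the On-Demand price does not by itself raise your costs.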

Configure the number of instances to have in your cluster. Spot Fleet attempts to launch the number of Spot Instances that are required to meet the target capacity specified in your request. The Spot Fleet also attempts to maintain its target capacity if your Spot Instances are reclaimed due to a change in Spot prices or available capacity.

The latest ECS-optimized AMI is used for the instances when they are launched.

Configure your storage and network settings. To enable diversification and high availability, be sure to select subnets in multiple Availability Zones. You can’t select multiple subnets in the same Availability Zone in a single Spot Fleet.
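For readers who prefer the command line, the settings described so far map onto a Spot Fleet request configuration. The sketch below only prints such a configuration; it is not what the ECS console generates. The role ARN, AMI ID, subnet IDs, bid, and capacity are all placeholders, and the two subnets are assumed to sit in different Availability Zones.

```shell
#!/bin/bash
# All identifiers below are placeholders, not values from this post.
FLEET_ROLE_ARN="arn:aws:iam::123456789012:role/example-spot-fleet-role"
AMI_ID="ami-EXAMPLE"
SPOT_PRICE="0.20"
TARGET_CAPACITY=4

# Emit one launch specification for an (instance type, subnet) pair.
launch_spec() {
  printf '    {"ImageId": "%s", "InstanceType": "%s", "SubnetId": "%s"}' \
    "$AMI_ID" "$1" "$2"
}

# Print a Spot Fleet request configuration as JSON.
fleet_config() {
  printf '{\n  "IamFleetRole": "%s",\n' "$FLEET_ROLE_ARN"
  printf '  "AllocationStrategy": "diversified",\n'
  printf '  "TargetCapacity": %s,\n  "SpotPrice": "%s",\n' "$TARGET_CAPACITY" "$SPOT_PRICE"
  printf '  "LaunchSpecifications": [\n'
  launch_spec m4.xlarge subnet-EXAMPLE1; printf ',\n'
  launch_spec c4.xlarge subnet-EXAMPLE2; printf '\n'
  printf '  ]\n}\n'
}

fleet_config

# Submitting the request needs AWS credentials, so it is left commented out:
# fleet_config > spot-fleet-config.json
# aws ec2 request-spot-fleet --spot-fleet-request-config file://spot-fleet-config.json
```

A real launch specification would also carry an instance profile, security groups, and user data that joins the instance to the cluster; those are omitted here to keep the sketch short.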

The ECS container agent makes calls to the ECS API actions on your behalf. Container instances that run the agent require the ecsInstanceRole IAM policy and role for the service to know that the agent belongs to you. If you don’t have the ecsInstanceRole already, you can create one using the ECS console.
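If you would rather create the role from the command line than from the console, the sketch below shows one way to do it with the AWS CLI. It is an alternative to the console flow described above, not a description of what the console runs. It assumes the AWS managed policy AmazonEC2ContainerServiceforEC2Role, and it runs in dry-run mode by default, so it only prints the commands; the `run` helper and the `DRY_RUN` variable are illustrative names.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
ROLE_NAME="ecsInstanceRole"
POLICY_ARN="arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
# Trust policy that lets EC2 instances assume the role.
TRUST_POLICY='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_instance_role() {
  run aws iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document "$TRUST_POLICY"
  run aws iam attach-role-policy --role-name "$ROLE_NAME" --policy-arn "$POLICY_ARN"
  # EC2 instances receive the role through an instance profile.
  run aws iam create-instance-profile --instance-profile-name "$ROLE_NAME"
  run aws iam add-role-to-instance-profile \
    --instance-profile-name "$ROLE_NAME" --role-name "$ROLE_NAME"
}

create_instance_role
```

Review the printed commands first, then rerun with DRY_RUN=0 using credentials that are allowed to manage IAM.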

If you create a cluster that uses Spot Fleet, you must also create a role that grants the Spot Fleet permission to bid on, launch, and terminate instances on your behalf. You can also create this role using the ECS console.

That’s it! In the ECS console, choose Create to spin up your new ECS cluster running on Spot Instances.

Using AWS CloudFormation to deploy your ECS cluster on Spot Instances

We have also published a reference architecture AWS CloudFormation template that demonstrates how to easily launch a CloudFormation stack and deploy your ECS cluster on Spot Instances.

The CloudFormation template includes the Spot Instance termination notice script described in the Handling termination section, as well as some additional logging and other example features to get you started quickly. You can find the CloudFormation template in the Amazon EC2 Spot Instances GitHub repo.

Give it a try and customize it as needed for your environment!

Spot Fleet Architecture

Handling termination

With Spot Instances, you never pay more than the price you specified. If the Spot price for a given instance exceeds your bid price, that instance is terminated automatically.

The best way to protect against Spot Instance interruption is to architect your containerized application to be fault-tolerant. In addition, you can take advantage of a feature called Spot Instance termination notices, which provides a two-minute warning before EC2 must terminate your Spot Instance.

This warning is made available to the applications on your Spot Instance using an item in the instance metadata. When you deploy your ECS cluster on Spot Instances using the console, AWS installs a script that checks every 5 seconds for the Spot Instance termination notice. If the notice is detected, the script immediately updates the container instance state to DRAINING.

A simplified version of the Spot Instance termination notice script is as follows:

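#!/bin/bash

# Instance metadata item that exists only after a termination notice is issued.
NOTICE_URL=http://169.254.169.254/latest/meta-data/spot/termination-time
# Introspection endpoint of the ECS container agent on this instance.
AGENT_URL=http://localhost:51678/v1/metadata

while sleep 5; do
  # curl -f fails while the item is absent (HTTP 404), so the body runs only on a notice.
  if curl -sf -o /dev/null "$NOTICE_URL"; then
    # Ask the local agent which cluster and container instance this is.
    ECS_CLUSTER=$(curl -s "$AGENT_URL" | jq -r .Cluster)
    CONTAINER_INSTANCE=$(curl -s "$AGENT_URL" | jq -r .ContainerInstanceArn)
    aws ecs update-container-instances-state --cluster "$ECS_CLUSTER" \
      --container-instances "$CONTAINER_INSTANCE" --status DRAINING
  fi
done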

When you set a container instance to DRAINING, ECS prevents new tasks from being scheduled for placement on the container instance. If the resources are available, replacement service tasks are started on other container instances in the cluster. Container instance draining enables you to remove a container instance from a cluster without impacting tasks in your cluster. Service tasks on the container instance that are in the PENDING state are stopped immediately.

Service tasks on the container instance that are in the RUNNING state are stopped and replaced according to the service’s deployment configuration parameters, minimumHealthyPercent and maximumPercent.
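To see how these two parameters bound a drain, the sketch below computes the task-count limits for a service. It assumes that the lower limit (minimumHealthyPercent of the desired count) rounds up and the upper limit (maximumPercent of the desired count) rounds down; the function name `service_bounds` and the sample values are illustrative.

```shell
#!/bin/bash
# Task-count limits a service stays within while tasks are stopped and replaced.
# Arguments: desired count, minimumHealthyPercent, maximumPercent.
# Assumption: the lower limit rounds up and the upper limit rounds down.
service_bounds() {
  desired=$1; min_pct=$2; max_pct=$3
  lower=$(( (desired * min_pct + 99) / 100 ))   # integer ceiling
  upper=$(( desired * max_pct / 100 ))          # integer floor
  echo "min running: $lower, max running: $upper"
}

service_bounds 10 50 200
service_bounds 3 50 100
```

With 10 desired tasks, 50 and 200 allow the service to dip to 5 running tasks or grow to 20 during replacement. With 3 desired tasks, 50 and 100 leave room to stop only one task at a time and no room to start replacements before a task stops, so a two-minute termination notice is easier to absorb when maximumPercent is above 100 and the cluster has spare capacity.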

ECS on Spot Instances in action

Want to see how customers are already powering their ECS clusters on Spot Instances? Our friends at Mapbox are doing just that.

Mapbox is a platform for designing and publishing custom maps. The company uses ECS to power its entire batch processing architecture, which collects and processes over 100 million miles of sensor data per day for its maps. It also optimizes that batch processing architecture on ECS by using Spot Instances.

The Mapbox platform powers over 5,000 apps and reaches more than 200 million users each month. Its backend runs on ECS, allowing it to serve more than 1.3 billion requests per day. To learn more about their migration to ECS, read their blog post, We Switched to Amazon ECS, and You Won’t Believe What Happened Next. Then, in their follow-up post, Caches to Cash, learn how they run their entire platform on Spot Instances, saving 50–90% on their EC2 costs.

Conclusion

We hope that you are as excited as we are about running your containerized applications at scale and cost effectively using Spot Instances. For more information, see the following pages:

If you have comments or suggestions, please comment below.