AWS Blog

Building Price-Aware Applications Using EC2 Spot Instances

by Jeff Barr | on | in Amazon EC2, EC2 Spot Instances | | Comments

Last month I began writing what I hope to be a continuing series of posts of EC2 Spot Instances by talking about some Spot Instance Best Practices. Today I spoke to two senior members of the EC2 Spot Team to learn how to build price-aware applications using Spot Instances. I met with Dmitry Pushkarev (Head of Tech Development) and Joshua Burgin (General Manager) and would like to recap our conversation in interview form!

Me: What does price really mean in the Spot world?

Joshua: Price and price history are important considerations when building Spot applications. Using price as a signal about availability helps our customers to deploy applications in the most available capacity pools, reduces the chance of interruption and improves the overall price-performance of the application.

Prices for instances on the Spot Market are determined by supply and demand. A low price means that there is a more capacity in the pool than demand. Consistently low prices and low price variance means that pool is consistently underutilized. This is often the case for older generations of instances such as m1.small, c1.xlarge, and cc2.8xlarge.

Me: How do our customers build applications that are at home in this environment?

Dmitry: It is important to architect your application for fault tolerance and to make use of historical price information. There are probably as many placement strategies as there are customers, but generally we see two very successful use patterns: one is choosing capacity pools (instance type and availability zone) with low price variance and the other is to distribute capacity across multiple capacity pools.

There is a good analogy with the stock market – you can either search for a “best performing” capacity pool and periodically revisit your choice or to diversify your capacity across multiple uncorrelated pools and greatly reduce your exposure to risk of interruption.

Me: Tell me a bit more about these placement strategies.

Joshua:  The idea here is to analyze the recent Spot price history in order to find pools with consistently low price variance. One way to do this is by ordering capacity pools by duration of time that elapsed since the last time Spot price exceeded your preferred bid – which is the maximum amount you’re willing to pay per hour. Even though past performance certainly doesn’t guarantee future results it is a good starting point.  This strategy can be used to make bids on instances that can be used for dev environments and long running analysis jobs. It is also good for adding supplemental capacity to Amazon EMR clusters. We also recommend that our customers revisit their choices over time in order to ensure that they continue to use the pools that provide them with the most benefit.

Me: How can our customers access this price history?

Dmitry: It’s available through the console as well as programmatically through SDKs and  the AWS Command Line Interface (CLI).

We’ve also created a new web-based Spot Bid Advisor that can be accessed from the Spot page. This tool presents the relevant statistics averaged across multiple availability zones making it easy to find instance types with low price volatility. You can choose the region, operating system, and bid price (25%, 50%, or 100% of On-Demand) and then view historical frequency of being outbid for last week or a month.

Another example can be found in the aws-spot-labs repo on GitHub. The script demonstrates how spot price information can be obtained programmatically and used to order instance types and availability zones based on the duration since the price last exceeded your preferred bid price.

Me: Ok, and then I pick one of the top instance pools and periodically revisit my choice?

Dmitry: Yes, that’s a great way to get started. As you get more comfortable with Spot typically next step is to start using multiple pools at the same time and distribute capacity equally among them. Because capacity pools are physically separate, prices often do not correlate among them, and it’s very rare that more than one capacity pool will experience a price increase within a short period of time.

This will reduce the impact of interruptions and give you plenty of time to restore the desired level of capacity.

Joshua: Distributing capacity this way also improves long-term price/performance: if capacity is distributed evenly across multiple instance types and/or availability zones then the hourly price is averaged across multiple pools which results in really good overall price performance.

Me: Ok, sounds great.  Now let’s talk about the second step, bidding strategies.

Joshua: It is important to place a reasonable bid at a price that you are willing to pay. It’s better to achieve higher availability by carefully selecting multiple capacity pools and distributing your application across the instances therein than by placing unreasonably high spot bids. When you see increasing prices within a capacity pool, this is a sign that demand is increasing. You should start migrating your workload to less expensive pools or shut down idle instances with high prices in order to avoid getting interrupted.

Me: Do you often see our customers use more sophisticated bidding tactics?

Dmitry: For many of our customers the ability to leverage Spot is an important competitive advantage and some of them run their entire production stacks on it – which certainly requires additional engineering to hit their SLA. One interesting way to think about Spot is to view is it as a significant reward for engineering applications that are “cloud friendly.”  By that I mean fault tolerant by design, flexible, and price aware. Being price aware allows the application to deploy itself to the pools with the most spare capacity available. Startups in particular often get very creative with how they use Spot which allows them to scale faster and spend less on compute infrastructure.

Joshua: Tools like Auto Scaling, Spot fleet, and Elastic MapReduce offer Spot integration and allow our customers to use multiple capacity pools simultaneously without adding significant development effort.

Stay tuned for even more information about Spot Instances! In the meantime, please feel free to leave your own tips (and questions) in the comments.