AWS Official Blog

Amazon EC2 Cluster Instances Available on Spot Market

by Jeff Barr | in Amazon EC2

Today we are coupling two popular aspects of Amazon EC2: Cluster computing and Spot Instances!

More and more of our customers are finding innovative ways to use EC2 Spot Instances to save up to two-thirds off the On-Demand price. Batch processing, media rendering and transcoding, grid computing, testing, web crawling, and Hadoop-based processing are just a handful of the use cases that are running on Spot today.

For example, researchers at the University of Melbourne and the University of Barcelona are doing vast amounts of data processing for their Belle particle physics experiments on EC2 Spot Instances and realizing cost savings of 56% (compared to the price of On-Demand Instances) in the process. Each job starts out small (15-20 EC2 instances) and then scales up to between 20 and 250 instances in the space of four hours. Read more in our new case study.

Scribd has also made very good use of EC2 Spot Instances. As described in the case study, they were able to save 63% (or $10,500) on a large-scale data conversion (from Flash to HTML5) running on over 2,000 EC2 instances at a time. They converted every one of the millions of documents that have been uploaded to the site to HTML5 using a scalable grid composed of a single master node and multiple slave nodes.
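To put the Scribd numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes the 63% and the $10,500 describe the same job, and that both figures are rounded, so the results are only approximate. The function name is made up for illustration.

```python
def implied_costs(savings_fraction, savings_dollars):
    """Return (on_demand_cost, spot_cost) implied by a savings
    percentage and the matching dollar amount saved."""
    # If savings = fraction * on_demand, then on_demand = savings / fraction.
    on_demand = savings_dollars / savings_fraction
    # Whatever was not saved is what was actually paid on Spot.
    spot = on_demand - savings_dollars
    return on_demand, spot


# Figures quoted above: 63% savings, or $10,500.
on_demand, spot = implied_costs(0.63, 10_500)
print(f"Implied On-Demand cost: ${on_demand:,.0f}")
print(f"Implied Spot cost:      ${spot:,.0f}")
```

In other words, a job that would have cost somewhere around $16,700 at On-Demand prices appears to have run for a little over $6,000 on Spot.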

At the same time, our customers have been making really good use of our Cluster Compute and Cluster GPU instances. We’ve seen interesting use cases in a number of fields including molecular dynamics, fluid dynamics, bioinformatics, batch data processing, MapReduce, machine learning, and media rendering. The applications use a variety of coordination strategies and coupling models, ranging from fairly loose to very tight.

The folks at Cycle Computing documented their cluster-building experience in a very informative blog post. They used Cluster GPU instances to create a 32-node, 64-GPU cluster that also includes 8 TB of shared storage. The entire cluster costs less than $82 per hour to operate. They have found that the GPU accelerates overall application performance by a factor of 50 to 60 and note that their success rate in moving internal applications to the GPU is 100%.

Bioproximity provides proteomic analytical services (in plain English, they study proteins at the structural and functional level) on a contract basis. To do this, they need lots of compute power and storage space. Lacking the funds to set up their own compute cluster, they found the AWS pay-as-you-go model to be a perfect fit for their business. They run a large-scale MPI cluster on EC2 with a web-based front end for job submission. Read more in the Bioproximity case study.

On the rendering side, our friends at Animoto have used the Cluster GPU instances to accelerate their video rendering process. The increased throughput allows them to deliver videos more quickly (seconds instead of minutes) and also gives them the ability to support full-on HD video. This article has more information about Animoto and their use of EC2 to generate professional-quality video.

And, as noted above, our customers are finding innovative ways to use EC2 Spot Instances to get work done economically.

Effective immediately, you can use these two features together: you can now submit Spot requests for Cluster Compute and Cluster GPU Instances. These instances are currently available in a pair of Availability Zones in the US East (Northern Virginia) Region. You can choose between SUSE Linux Enterprise Server and Amazon Linux AMIs, both of which are now available in HVM form.

You can request the instances using the EC2 Command Line tools, the EC2 APIs, or the AWS Management Console.
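For the API route, here is a minimal sketch of what a Spot request for a Cluster Compute instance might look like in Python. Several things here are assumptions rather than details from this post: it uses boto3 (a Python SDK that postdates this announcement), the instance type name cc1.4xlarge, a placeholder AMI ID, an arbitrary maximum price, and a hypothetical placement group name. The helper functions are made up for illustration. Treat it as a starting point and substitute your own values.

```python
def build_spot_request(instance_type, image_id, max_price, count,
                       availability_zone, placement_group=None):
    """Build keyword arguments for the EC2 RequestSpotInstances call."""
    placement = {"AvailabilityZone": availability_zone}
    if placement_group:
        # Cluster instances are typically launched into a placement group.
        placement["GroupName"] = placement_group
    return {
        "SpotPrice": f"{max_price:.2f}",  # the API expects the price as a string
        "InstanceCount": count,
        "Type": "one-time",
        "LaunchSpecification": {
            "ImageId": image_id,
            "InstanceType": instance_type,
            "Placement": placement,
        },
    }


def submit_spot_request(request, region="us-east-1"):
    """Send the request to EC2. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without it
    ec2 = boto3.client("ec2", region_name=region)
    return ec2.request_spot_instances(**request)


# All values below are placeholders (assumptions), not recommendations.
request = build_spot_request(
    instance_type="cc1.4xlarge",       # assumed Cluster Compute type name
    image_id="ami-xxxxxxxx",           # placeholder: use an HVM AMI ID
    max_price=0.60,                    # arbitrary maximum price in USD/hour
    count=4,
    availability_zone="us-east-1a",    # assumed zone in US East
    placement_group="my-cluster-group",  # hypothetical group name
)
print(request)
# To actually submit it: submit_spot_request(request)
```

Keeping the request-building step separate from the submit step makes it easy to inspect (or log) exactly what you are about to bid before any instances are requested.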

We’re looking forward to seeing the new and interesting ways that our customers will use Spot pricing and Cluster Compute Instances, alone or (preferably!) together. Here are some of the application areas that should be a good fit:

  • Batch and background processing.
  • Web and data crawling.
  • Financial modeling and analytics.
  • MapReduce and Grid computing.
  • Video processing, especially transcoding.

What can you do with this new combination of features?

— Jeff;