Amazon EMR + EC2 Spot

Amazon EC2 Spot Instances are unused Amazon EC2 capacity; the price you pay is determined by the supply and demand for Spot Instances. Spot Instances can cost up to 90% less than On-Demand Instances. With Spot, you specify the maximum price you are willing to pay per instance-hour. While the Spot price is at or below your maximum price, you pay the Spot price. If your instance is reclaimed because the Spot price rises above your maximum price, you are not charged for the partial hour that your instance has run. With Spot, you can significantly reduce the cost of running your Hadoop or Spark clusters, increase your compute capacity and throughput without increasing your budget, or both.
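
The pricing rule described above can be sketched in a few lines of Python. This is a simplified model for illustration only: it assumes the Spot price is sampled once per instance-hour, and the function names and all prices are hypothetical, not real AWS values or APIs.

```python
from typing import Iterable


def run_cost(hourly_spot_prices: Iterable[float], max_price: float) -> float:
    """Total charge for one instance under the rule described above.

    Assumption: one Spot price per instance-hour. While the Spot price is
    at or below max_price, that hour is charged at the Spot price. Once the
    Spot price exceeds max_price, the instance is reclaimed and the partial
    hour is not charged.
    """
    total = 0.0
    for spot_price in hourly_spot_prices:
        if spot_price > max_price:
            break  # reclaimed: the interrupted hour is free
        total += spot_price
    return total


def savings_vs_on_demand(spot_price: float, on_demand_price: float) -> float:
    """Fractional saving of a Spot price relative to an On-Demand price."""
    return 1.0 - spot_price / on_demand_price


# Hypothetical prices in USD per instance-hour.
cost = run_cost([0.03, 0.04, 0.12, 0.03], max_price=0.10)
saving = savings_vs_on_demand(spot_price=0.03, on_demand_price=0.30)
```

In this sketch the instance is charged for the first two hours only, because the third hour's price exceeds the maximum price and the instance is reclaimed.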

Introduction to EC2 Spot Instances

Running your Amazon EMR clusters on Amazon EC2 Spot Instances is now even easier.
With Amazon EMR Instance Fleets you can provide a list of up to 5 instance types with corresponding weighted capacities and EC2 Spot bid prices. EMR will automatically provision On-Demand and Spot capacity across these instance types when creating your cluster. This can make it easier and more cost effective to quickly obtain and maintain your desired capacity for your clusters while leveraging Spot prices.
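
As a rough sketch of what such a list might look like, the Python helper below builds an instance fleet description as a plain dictionary. The key names follow the shape of the `InstanceFleets` entries accepted by the EMR `RunJobFlow` API as exposed through boto3, to the best of our understanding; check them against the current EMR documentation before relying on them. The helper name, fleet name, instance types, weights, and bid prices are all hypothetical.

```python
from typing import Dict, List, Tuple

MAX_INSTANCE_TYPES = 5  # limit stated in the text above


def build_task_fleet(
    type_configs: List[Tuple[str, int, float]],
    target_spot_capacity: int,
    target_on_demand_capacity: int = 0,
) -> Dict:
    """Build a task fleet description (hypothetical helper).

    type_configs holds (instance_type, weighted_capacity, bid_price_usd)
    tuples, one per instance type the fleet may use.
    """
    if not 1 <= len(type_configs) <= MAX_INSTANCE_TYPES:
        raise ValueError(
            f"expected 1 to {MAX_INSTANCE_TYPES} instance types, "
            f"got {len(type_configs)}"
        )
    return {
        "Name": "task-fleet",  # hypothetical name
        "InstanceFleetType": "TASK",
        "TargetOnDemandCapacity": target_on_demand_capacity,
        "TargetSpotCapacity": target_spot_capacity,
        "InstanceTypeConfigs": [
            {
                "InstanceType": instance_type,
                "WeightedCapacity": weight,
                # The bid price is passed as a string of USD per hour.
                "BidPrice": f"{bid:.3f}",
            }
            for instance_type, weight, bid in type_configs
        ],
    }


# Hypothetical instance types, weights, and bid prices.
fleet = build_task_fleet(
    [("m4.xlarge", 4, 0.10), ("m4.2xlarge", 8, 0.20), ("r4.xlarge", 4, 0.12)],
    target_spot_capacity=32,
)
```

A dictionary like `fleet` would typically be one entry in the `InstanceFleets` list passed when creating the cluster. The helper only builds and validates the structure; it does not call AWS.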

TellApart Saves 75% on Hadoop clusters with EC2 Spot

Read about TellApart

TellApart’s big data platform enables retailers to unlock the power of their customer data. Learn how they use Amazon EMR to bring up Hadoop clusters to batch process log data, and have reduced costs by 75% by using Spot Instances.

Guide: Use EMR and Spot to save up to 90% off Big Data Workloads

Read EMR Spot Documentation

Read how easy it is to use Amazon EMR and EC2 Spot to save money on big data workloads like Hadoop and Spark.

Hands on Lab: Getting Started with EC2 Spot

Start Free Lab

Gain a basic understanding of using Amazon EC2 Spot with this free, guided hands-on lab covering the steps needed to create and connect to an Amazon EC2 Spot instance.

Gett runs its website and mobile app on several hundred Amazon EC2 instances, scaling its EC2 capacity up or down automatically based on user demand. Gett chose to reduce costs by taking advantage of Amazon EC2 Spot Instances, including use via EMR to help process huge amounts of data at a fraction of the cost.

Krux uses a combination of Apache Hadoop on Amazon EMR and Apache Spark to run machine learning jobs and extract/transform/load (ETL) workloads, with Amazon S3 as its core distributed storage system. Krux implemented the EMR infrastructure using Amazon EC2 Spot instances to gain access to compute functionality at reduced costs.

BloomReach has built a personalized discovery platform with applications for organic search, site search, content marketing, and merchandising. In a blog post, BloomReach describes its use of EMR and Spot and shares some best practices for maximum cost efficiency.

AWS re:Invent 2016: Learn How FINRA Aligns Billions of Ordered Events with Spark on EC2

FINRA is a leader in the financial services industry that sought to move toward real-time insights into billions of time-ordered market events by migrating from on-premises SQL batch processes to Apache Spark in the cloud. By using Apache Spark on Amazon EMR, FINRA can now test on realistic data from market downturns, enhancing its ability to provide investor protection and promote market integrity. By using EC2 Spot Instances, FINRA has saved up to 50% compared with its on-premises solution, increased elasticity and scalability, and accelerated reprocessing requests (from months to days).

Get started with EC2 Spot

It's easy to get started. Follow our Getting Started best practices to create your first Spot Instance request with just a few clicks.