AWS Compute Blog

An EC2 Spot Architecture for Web Applications

Tipu Qureshi, AWS Senior Cloud Support Engineer

This blog post describes a reference architecture that uses Spot instances to help you reduce the cost of your stateless web tier while maintaining high availability. We recommend tailoring and testing it for your application before deploying it to a production environment.

Spot instances enable you to name your own price for Amazon EC2 computing capacity by simply bidding on unused Amazon EC2 instances, which can often lower your Amazon EC2 costs significantly, depending on your application. For example, Auto Scaling groups running on-demand instances can be placed together with Auto Scaling groups running Spot instances using different Spot bid prices behind the same Elastic Load Balancer to provide more flexibility and to meet changing traffic demands. Please see Launch Spot instances in Your Auto Scaling Group and Load Balance Your Auto Scaling Group for more details about using Spot instances and Auto Scaling groups with an Elastic Load Balancer.

Session state can be stored outside the web tier in a DynamoDB table. DynamoDB is a regional service, meaning that the data is automatically replicated across availability zones for fault tolerance. You can also choose other databases to maintain state in your architecture.

The availability of Spot instances varies with how many unused Amazon EC2 instances are available. Because real-time supply and demand determines how much Spot capacity you can obtain, you should architect your application to be resilient to instance termination. This includes the case where the Spot price exceeds the price you named (i.e., the bid price), at which point the instance receives a two-minute warning that it will be terminated. You can manage this by creating IAM roles for the Spot instances, which the instances use to de-register themselves from the ELB once they receive the termination notice. For details, see Creating an IAM Role Using the AWS CLI and Spot Instance Termination Notices. The following diagram depicts what this would look like:


Spot Architecture for Web Apps
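
As a rough illustration of the IAM role mentioned above, the sketch below prints a minimal policy document that allows an instance to remove itself from a Classic Load Balancer, and attaches it to a role with `aws iam put-role-policy`. The function names, the role and policy names passed in, and the wildcard `Resource` are assumptions made for this example; a production policy would normally be scoped to the specific load balancer and would also need whatever DynamoDB permissions your session store requires.

```shell
# Print a minimal IAM policy document that lets an instance deregister
# itself from a Classic Load Balancer.
# Assumption: "Resource": "*" is used for brevity; narrow it in production.
deregister_policy_document() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach the policy inline to an existing role.
# $1 = role name, $2 = policy name (both hypothetical, chosen by you).
attach_deregister_policy() {
  aws iam put-role-policy \
    --role-name "$1" \
    --policy-name "$2" \
    --policy-document "$(deregister_policy_document)"
}
```

For example, `attach_deregister_policy spot-web-role spot-deregister` would attach the policy to a role named `spot-web-role`, assuming that role already exists and is associated with the instance profile used by the Spot launch configuration.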

A script like the following can be run in a loop that is started at boot (e.g., via systemd or rc.local) to detect Spot instance termination, place any session information into DynamoDB, and de-register the instance from the ELB so that it receives no more requests. We recommend that interested applications poll for the termination notice at five-second intervals.

$ if curl -s | \
grep -q .*T.*Z; then instance_id=$(curl -s; \
aws elb deregister-instances-from-load-balancer \
  --load-balancer-name my-load-balancer \
  --instances $instance_id; /env/bin/; fi
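
The one-liner above can be expanded into a polling loop along the following lines. This is a sketch under stated assumptions, not the exact script from the post: it assumes the standard instance metadata endpoint at `http://169.254.169.254/latest/meta-data` is reachable without a session token, the function names are hypothetical, and the step that saves session data is left as a comment because it is application-specific.

```shell
# Base URL of the EC2 instance metadata service.
# Assumption: token-less (IMDSv1-style) access is allowed on the instance.
METADATA_BASE="http://169.254.169.254/latest/meta-data"

# Succeed only if the argument looks like a UTC timestamp such as
# 2015-01-05T18:02:00Z. The termination-time endpoint returns a value of
# this form once termination is scheduled; an error page will not match.
is_termination_time() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}

# Deregister this instance from the load balancer named in $1.
deregister_self() {
  instance_id=$(curl -s --max-time 2 "$METADATA_BASE/instance-id")
  aws elb deregister-instances-from-load-balancer \
    --load-balancer-name "$1" \
    --instances "$instance_id"
}

# Poll every $2 seconds (default 5) until a termination notice appears,
# then deregister and return.
watch_for_termination() {
  lb_name="$1"
  interval="${2:-5}"
  while true; do
    notice=$(curl -s --max-time 2 "$METADATA_BASE/spot/termination-time")
    if is_termination_time "$notice"; then
      deregister_self "$lb_name"
      # Application-specific: persist session data (e.g. to DynamoDB) here.
      return 0
    fi
    sleep "$interval"
  done
}
```

Calling `watch_for_termination my-load-balancer 5` from a systemd unit or rc.local would poll at the recommended five-second interval. Matching the full timestamp shape is slightly stricter than the `.*T.*Z` pattern in the one-liner, which reduces the chance of treating an error page as a termination notice.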

Running an Auto Scaling group of on-demand instances in tandem with Auto Scaling groups of Spot instances behind the same Elastic Load Balancer helps ensure your application’s availability when the Spot market price or Spot instance capacity changes. If the Spot instances in an Auto Scaling group terminate because the Spot price rises above the bid price, the Auto Scaling group running on-demand instances scales out according to your defined scaling policy to serve the requests. For Auto Scaling to scale according to your application's needs, you must define how you want to scale in response to changing conditions. You can assign more aggressive scaling policies to Auto Scaling groups that run Spot instances (e.g., scale up when instances reach 75% CPU utilization and scale down when they reach 25% CPU utilization, with a large capacity range), and assign more conservative scaling policies to Auto Scaling groups that run on-demand instances. For information about using Amazon CloudWatch metrics to scale automatically, see Dynamic Scaling.
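
To make the threshold logic concrete, the following sketch models how a simple step policy would move a group's desired capacity. It is only an illustration of the decision rule; in practice Auto Scaling and CloudWatch alarms evaluate this for you. The function name, the step size, and the capacity bounds are hypothetical, while the 75% and 25% thresholds are the ones suggested above for the Spot groups.

```shell
# next_capacity CURRENT CPU HIGH LOW STEP MIN MAX
# Print the desired capacity after one evaluation of a simple step policy:
# add STEP instances when CPU >= HIGH, remove STEP when CPU <= LOW,
# and keep the result within the group's MIN..MAX capacity range.
# All values are integers; CPU, HIGH and LOW are percentages.
next_capacity() {
  current="$1"; cpu="$2"; high="$3"; low="$4"
  step="$5"; min="$6"; max="$7"

  if [ "$cpu" -ge "$high" ]; then
    target=$((current + step))
  elif [ "$cpu" -le "$low" ]; then
    target=$((current - step))
  else
    target="$current"
  fi

  # Clamp to the capacity range of the Auto Scaling group.
  if [ "$target" -lt "$min" ]; then target="$min"; fi
  if [ "$target" -gt "$max" ]; then target="$max"; fi
  echo "$target"
}
```

With a Spot group of 4 instances, a range of 2 to 20, and a step of 2, `next_capacity 4 80 75 25 2 2 20` prints `6` and `next_capacity 4 20 75 25 2 2 20` prints `2`. A more conservative on-demand group could use the same rule with a smaller step or a narrower capacity range.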

Please note that Elastic Load Balancers use the least outstanding requests routing algorithm for HTTP/HTTPS connections, which favors back-end instances with the fewest outstanding requests. Since we are working with multiple Auto Scaling groups spanning multiple availability zones, we highly recommend enabling cross-zone load balancing for the load balancer. Cross-zone load balancing allows each load balancer node to route requests to instances in any enabled availability zone, so that traffic is spread evenly across all registered instances even when the zones hold different numbers of instances. To allow in-flight requests to complete when de-registering Spot instances that are about to be terminated, connection draining can be enabled on the load balancer with a timeout of 90 seconds. Connection draining causes the load balancer to stop sending new requests to a de-registering or unhealthy instance while keeping the existing connections open. For more details, please see Enable or Disable Connection Draining for Your Load Balancer.
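
The load balancer settings described above can be applied with the AWS CLI roughly as follows. This is a sketch that assumes a Classic Load Balancer; the function names are hypothetical. It also works out how much of the two-minute warning remains after one polling interval and the draining timeout, ignoring the latency of the metadata and API calls.

```shell
# Print the attribute JSON enabling cross-zone load balancing and
# connection draining with the timeout (in seconds) given in $1.
lb_attributes_json() {
  printf '{"CrossZoneLoadBalancing":{"Enabled":true},"ConnectionDraining":{"Enabled":true,"Timeout":%d}}\n' "$1"
}

# Apply the attributes to the Classic Load Balancer named in $1,
# using the draining timeout in $2 (defaults to 90 seconds).
configure_load_balancer() {
  aws elb modify-load-balancer-attributes \
    --load-balancer-name "$1" \
    --load-balancer-attributes "$(lb_attributes_json "${2:-90}")"
}

# draining_margin NOTICE POLL DRAIN
# Seconds left before termination if the notice is detected one full
# polling interval late and draining then runs for its entire timeout.
draining_margin() {
  echo $(( $1 - $2 - $3 ))
}
```

With the values used in this post, `draining_margin 120 5 90` prints `25`: a 90-second draining timeout still fits inside the two-minute warning when the notice is polled every five seconds. Running `configure_load_balancer my-load-balancer` would apply both settings to the example load balancer.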