Amazon EC2 Auto Scaling with EC2 Spot Instances
In this tutorial you will learn how to create a stateless, fault tolerant workload using Amazon EC2 Auto Scaling with launch templates to request Amazon EC2 Spot Instances. Amazon EC2 Auto Scaling allows creation of collections of instances known as Amazon EC2 Auto Scaling groups (ASGs) and will maintain, scale-up, or scale-down the number of instances needed based on scaling policies that are defined based on the demands of that particular application. Amazon EC2 Spot Instances are spare capacity available at steep discounts compared to the EC2 On-Demand price, typically between 70% and 90% off. EC2 Spot Instances can be interrupted with a two-minute notification whenever spare capacity is no longer available. It is important that applications using EC2 Spot Instances be stateless, fault-tolerant, and loosely coupled. Instances should be ephemeral so they can be easily removed and replaced as needed. This will be accomplished by creating an AWS Application Load Balancer and using it in concert with Amazon EC2 Auto Scaling to add and remove instances as needed. The tutorial will also use launch templates and the capacity-optimized allocation strategy which will automatically provision EC2 Spot Instances from the most-available Spot Instance pools by analyzing capacity metrics.
This tutorial makes use of the Default VPC of your AWS account. If your account doesn’t have a Default VPC, you can use your own VPC. If you use your own VPC, the AWS Application Load Balancer that will be created must be configured to use public subnets, and the EC2 instances must be either in the same public subnets with public IP addresses auto-assigned, or in private subnets with a NAT Gateway configured to permit outbound internet access. It is also recommended that your VPC has a subnet mapped to each Availability Zone of the AWS region you are working in to help maximize the number of Spot capacity pools available to Amazon EC2 Auto Scaling.
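If you would like to confirm which subnets and Availability Zones your Default VPC offers before starting, the AWS CLI can list them. The following is a minimal sketch, assuming the AWS CLI is installed and configured for your Region. The function name list_default_subnets is illustrative and not part of the tutorial.

```shell
# Sketch: list the default subnet of each Availability Zone in the current Region.
# Assumes the AWS CLI is installed and credentials/Region are configured.
# Output columns: Availability Zone, subnet ID, whether public IPs are auto-assigned.
list_default_subnets() {
  aws ec2 describe-subnets \
    --filters "Name=default-for-az,Values=true" \
    --query 'Subnets[].[AvailabilityZone,SubnetId,MapPublicIpOnLaunch]' \
    --output text
}
```

Running list_default_subnets should print one line per Availability Zone that has a default subnet.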
1.1 — Log into the AWS Console and open the AWS Application Load Balancers console. We will create an AWS Application Load Balancer along with a new Security Group that our sample workload will make use of for the remainder of the tutorial. Click Create Load Balancer.
Load balancing will distribute HTTP requests to the fleet of Amazon EC2 Spot Instances that you will create later in the tutorial.
1.2 — Select the Create button from Application Load Balancer.
1.3 — Name your load balancer, leave the scheme set to internet-facing and the address type as ipv4. The listener should be left as HTTP on port 80. Choose the default VPC or another that meets the requirements for this tutorial. Choose every possible Availability Zone from the list and ensure that only public subnets are selected. The number of subnets will vary depending on the Region and VPC configuration.
Click Next: Configure Security Settings
1.4 — We will not be transmitting any sensitive data during this tutorial. While the suggestion about improving security with HTTPS should be taken seriously with any production workload, we can move forward with HTTP only by clicking Next: Configure Security Groups.
1.5 — Select Create a new security group and accept the pre-populated rule and the suggested name. This makes the Application Load Balancer reachable on TCP port 80 (HTTP) from any internet IP address. Make note of the security group name, as it will be used in an upcoming step.
Click Next: Configure Routing
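For readers who prefer the command line, the security group from step 1.5 could be approximated with the AWS CLI. This is a sketch under assumptions, not the method the console wizard uses internally: the function name, the description text, and the argument order are hypothetical.

```shell
# Sketch: create a security group and allow inbound HTTP (TCP 80) from anywhere,
# similar to the rule the console wizard pre-populates.
# Arguments (hypothetical): $1 = group name, $2 = VPC ID.
create_alb_security_group() {
  sg_id=$(aws ec2 create-security-group \
    --group-name "$1" \
    --description "Allow HTTP to the tutorial load balancer" \
    --vpc-id "$2" \
    --query 'GroupId' --output text) || return 1
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" \
    --protocol tcp --port 80 --cidr 0.0.0.0/0 >/dev/null || return 1
  # Print the new group ID so it can be reused in later commands.
  echo "$sg_id"
}
```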
1.6 — Select New target group and give it a name. Leave the target type as Instance, the protocol as HTTP on port 80.
For Health Checks, leave the protocol as HTTP and the path as /
Expand the Advanced health check settings and change the Healthy threshold to 2 and the Interval to 10. Leave the other settings as they are. This will help speed up our testing later on.
Click Next: Register Targets
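The target group settings from step 1.6 map onto a single AWS CLI call. The sketch below assumes the AWS CLI is configured; the function name and arguments are illustrative only.

```shell
# Sketch: create an instance target group with the health check values from step 1.6
# (healthy threshold 2, interval 10 seconds, HTTP on path /).
# Arguments (hypothetical): $1 = target group name, $2 = VPC ID.
create_target_group() {
  aws elbv2 create-target-group \
    --name "$1" \
    --vpc-id "$2" \
    --protocol HTTP --port 80 \
    --target-type instance \
    --health-check-protocol HTTP \
    --health-check-path / \
    --healthy-threshold-count 2 \
    --health-check-interval-seconds 10 \
    --query 'TargetGroups[0].TargetGroupArn' --output text
}
```

The function prints the target group ARN, which other elbv2 and autoscaling commands accept as input.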
1.7 — No changes are needed on Step 5: Register Targets.
Click Next: Review
1.8 — After looking over Step 6: Review, click Create.
The Load Balancer will be created.
The browser tab can now be closed.
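As a rough command-line equivalent of Step 1, the load balancer and its HTTP listener could be created as follows. This is a sketch: it assumes a security group and a target group already exist, and the function name and argument order are hypothetical.

```shell
# Sketch: create an internet-facing Application Load Balancer with an HTTP:80
# listener that forwards to an existing target group.
# Arguments (hypothetical): $1 = name, $2 = security group ID,
# $3 = target group ARN, remaining arguments = public subnet IDs.
create_alb_with_listener() {
  name=$1; sg_id=$2; tg_arn=$3
  shift 3
  lb_arn=$(aws elbv2 create-load-balancer \
    --name "$name" \
    --type application \
    --scheme internet-facing \
    --ip-address-type ipv4 \
    --security-groups "$sg_id" \
    --subnets "$@" \
    --query 'LoadBalancers[0].LoadBalancerArn' --output text) || return 1
  aws elbv2 create-listener \
    --load-balancer-arn "$lb_arn" \
    --protocol HTTP --port 80 \
    --default-actions "Type=forward,TargetGroupArn=$tg_arn" >/dev/null || return 1
  echo "$lb_arn"
}
```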
2.1 — Open the Amazon EC2 Auto Scaling console. Please look carefully at any banners and switch to the new EC2 Console using the provided banner links if you are still using the old console. Once you are in the new console, click Create Auto Scaling Group.
2.2 — A launch template includes parameters required to launch an Amazon EC2 instance such as the ID of the Amazon Machine Image (AMI) and an instance type. The Amazon EC2 Auto Scaling group requires either a launch template or launch configuration in order to be created. We will use a launch template because it offers improvements over using a launch configuration. To better understand the benefits launch templates offer you can read more here. Click on the Create Launch Template link near the bottom of the page. A new tab will open in your browser.
2.3 — Enter a Launch Template Name and Template Version Description for this template. You can provide your own name and description or use the naming in the example. Also note that the checkbox for Auto Scaling Guidance is checked and should remain so.
2.4 — Several items are required for this launch template. The first is the Amazon Machine Image (AMI). For this tutorial, select Amazon Linux 2 AMI (HVM), SSD Volume Type (ensure you choose the x86 version, not ARM).
Scroll down to Network Settings and select the security group that was created for the Application Load Balancer during Step 1. It will most likely be called load-balancer-wizard-1 unless it was given a different name when it was created.
Take some time to review all of the options offered in a launch template. For this tutorial, all of the remaining items can be left with the default selection of Don’t include in launch template except the last one – Advanced details.
2.5 — Expand the Advanced details section and scroll to the bottom and locate the User data section. Do not select the Request Spot Instances checkbox – we will select and enable Spot Instances when setting up Auto Scaling in a later step.
Copy and paste the following text into the User data field:
#!/bin/bash
yum install httpd -y
systemctl start httpd
systemctl stop firewalld
cd /var/www/html
echo "this is my test site and the instance-id is " > index.html
curl http://169.254.169.254/latest/meta-data/instance-id >> index.html
The code above will create a simple web server that will show the instance-id when queried. This will help later as you are testing your Auto Scaling group.
Click Create launch template.
A confirmation screen will be shown with several options, including the option to Create an Auto Scaling group from your template. Do not select this option as an older version of the Auto Scaling console may be launched. This tutorial uses the latest version of the console. Close this browser tab, and continue to Step 3.
Important Note: The instances must be in a subnet that provides outbound internet access. This can be a public subnet that auto-assigns a public IPv4 address to the instance, or it can be a private subnet with a NAT Gateway configured. Without this, the web server cannot be downloaded and installed, which will cause a 502 Bad Gateway error from your load balancer as you test.
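If you create the launch template from the command line instead, the user data must be passed as base64 text inside the launch template data. The sketch below shows one way to do that. The function names, the version description, and the argument order are assumptions for illustration, and the AMI ID and security group ID are values you must look up yourself.

```shell
# Sketch: base64-encode a user data script and build the JSON for --launch-template-data.
# Arguments (hypothetical): $1 = AMI ID, $2 = security group ID, $3 = path to the script.
build_launch_template_data() {
  # Strip newlines so the encoded text fits on one JSON line.
  user_data=$(base64 < "$3" | tr -d '\n')
  printf '{"ImageId":"%s","SecurityGroupIds":["%s"],"UserData":"%s"}\n' \
    "$1" "$2" "$user_data"
}

# Arguments (hypothetical): $1 = template name, then the same three arguments as above.
# No instance type is set here; the tutorial picks instance types in the Auto Scaling group.
create_launch_template() {
  lt_name=$1; shift
  aws ec2 create-launch-template \
    --launch-template-name "$lt_name" \
    --version-description "Tutorial web server" \
    --launch-template-data "$(build_launch_template_data "$@")"
}
```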
3.1 — Return to the original Create Auto Scaling group browser tab. Enter a name for the Auto Scaling group. Feel free to use your own name or enter the example name provided.
3.2 — Click the refresh button in the Launch template box and select the template created in Step 2. Once selected, additional detail about the template will be visible. Change the Version from Default (1) to Latest (1). This change means that whenever we update this launch template and a new version is created in the future, our Auto Scaling group will automatically begin to use that latest version on the very next instance it needs to launch. This is a powerful capability that makes tasks such as updating the Amazon Machine Image very easy. Once that is completed, click Next.
3.3 — With the launch template selected, the specific instances to be used can be selected. Under Instances Distribution, both On-Demand Instances and Spot Instances will be mixed in the same Auto Scaling group – this is also known as a MixedInstancesPolicy. Further information is available here. Because the application is stateless, fault tolerant and can handle an instance being interrupted, mostly Spot Instances will be used, with a small number of On-Demand Instances to balance things out. Set the % On Demand to 10%. As this is done, the % Spot will automatically adjust to 90%. This can be adjusted as desired as long as a mix of both is maintained.
Notice that there is also a section about the Spot allocation strategy per Availability Zone. The Capacity optimized allocation strategy has been selected automatically and is what is recommended for nearly all applications. Read more about this feature here.
3.4 — Selecting instance types is an important part of following best practices when using EC2 Spot Instances with Auto Scaling groups. Because both On-Demand and Spot Instances will be in the same group, the order in which instance types are listed matters, but only for On-Demand. This is known as a prioritized list: On-Demand Instances are launched in the order listed. If the first choice cannot be launched in a particular Availability Zone, Auto Scaling tries the next choice, and so on until it fulfills the On-Demand portion of the request. The order of the list does not matter for Spot, however. The capacity-optimized strategy looks at each instance type and determines, per Availability Zone, what will be launched.
Select c5.large as the primary instance type. Additional instance types are automatically selected and default to Family and generation flexible. This means the latest generation of instances that have similar characteristics to the primary instance can be used – in this case 2 vCPUs and at least 4 GiB of RAM. For space reasons the entire list of 20 different instance types that are a good fit for our workload is not shown.
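The choices made in steps 3.2 through 3.4 correspond to the MixedInstancesPolicy structure of the Auto Scaling API. The sketch below prints such a policy as JSON. The function name is hypothetical, and the instance types passed in are whatever you choose; only c5.large is named by this tutorial.

```shell
# Sketch: print a MixedInstancesPolicy with 10% On-Demand, capacity-optimized Spot,
# and the latest launch template version.
# Arguments (hypothetical): $1 = launch template name,
# remaining arguments = instance types in On-Demand priority order.
mixed_instances_policy() {
  lt_name=$1; shift
  overrides=""
  for itype in "$@"; do
    # Append one override per instance type, comma-separated.
    overrides="${overrides:+$overrides,}{\"InstanceType\":\"$itype\"}"
  done
  cat <<EOF
{
  "LaunchTemplate": {
    "LaunchTemplateSpecification": {
      "LaunchTemplateName": "$lt_name",
      "Version": "\$Latest"
    },
    "Overrides": [$overrides]
  },
  "InstancesDistribution": {
    "OnDemandAllocationStrategy": "prioritized",
    "OnDemandPercentageAboveBaseCapacity": 10,
    "SpotAllocationStrategy": "capacity-optimized"
  }
}
EOF
}
```

The output can be saved to a file and passed to the AWS CLI with the --mixed-instances-policy option of aws autoscaling create-auto-scaling-group.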
3.5 — Scroll down to the Network section.
Select the same VPC that the Load Balancer was created in, and choose either public subnets, or private subnets that have a NAT Gateway configured. Select as many subnets as possible for maximum instance diversity.
In this example a total of six Availability Zones (subnets) have been chosen which will provide access to 120 Spot Pools. This will reduce the impact of Spot interruptions because each pool is completely independent of any other pool.
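The figure of 120 follows from treating each combination of Availability Zone and instance type as one Spot pool. A small sketch of that arithmetic, with a hypothetical helper name:

```shell
# Sketch: number of Spot pools = Availability Zones x instance types.
# Arguments: $1 = number of Availability Zones, $2 = number of instance types.
spot_pool_count() {
  echo $(( $1 * $2 ))
}

spot_pool_count 6 20   # prints 120
```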
Click Next to move on to Load Balancing
4.1 — Select the Enable Load Balancing checkbox. Leave Application Load Balancer or Network Load Balancer selected. Choose the target group created for the load balancer in Step 1.
Under the Health Checks section, lower the Health Check Grace Period to 120 seconds – this will help speed up testing. Each launched instance will automatically install and start a simple web server and typically boot up quickly.
4.2 — Set the Desired Capacity to 12, the Minimum Capacity to 6, and the Maximum Capacity to 12. These settings set guard rails and keep our Auto Scaling group within them. The Desired Capacity is how many instances we will start with, but you will see this scale down to the 6 instance Minimum capacity because of the Scaling policy that will be set next.
Hint: Set the Minimum Capacity to be equal to the number of Availability Zones you selected in order to have Auto Scaling maintain at least one instance in each Availability Zone.
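For reference, the capacity, load balancing, and grace period settings from steps 4.1 and 4.2 could be expressed in one AWS CLI call. This is a sketch under assumptions: the function name and argument order are hypothetical, and it expects a MixedInstancesPolicy JSON file that you have prepared separately.

```shell
# Sketch: create the Auto Scaling group with min 6, max 12, desired 12,
# a 120-second health check grace period, and an attached target group.
# Arguments (hypothetical): $1 = group name, $2 = path to a MixedInstancesPolicy JSON file,
# $3 = target group ARN, $4 = comma-separated subnet IDs.
create_tutorial_asg() {
  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --mixed-instances-policy "file://$2" \
    --target-group-arns "$3" \
    --vpc-zone-identifier "$4" \
    --health-check-grace-period 120 \
    --min-size 6 --max-size 12 --desired-capacity 12
}
```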
4.3 — Choose Target tracking scaling policy and leave the default settings of 50% Average CPU Utilization. This policy scales out with more instances when the average CPU utilization across all instances goes above 50% and scales in when it falls back below that value. You should see this happen as the group scales in from the Desired Capacity of 12 down to the Minimum Capacity of 6 once Auto Scaling determines that CPU utilization is well below 50%.
As we won’t be setting any Instance scale-in protection, notifications, or tagging for this tutorial, you can click Skip to review.
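The target tracking policy from step 4.3 can also be attached with the AWS CLI. The following is a minimal sketch; the function name and the policy name cpu50-target-tracking are made up for illustration.

```shell
# Sketch: attach a target tracking policy that keeps average CPU utilization near 50%.
# Argument (hypothetical): $1 = Auto Scaling group name.
attach_cpu_target_tracking() {
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name "$1" \
    --policy-name "cpu50-target-tracking" \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'
}
```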
4.4 — A full summary of everything you just configured will be shown. Take a moment to review it, both to check for accuracy and to get a sense of the overall configuration. This is the final step before EC2 instances are launched. Once you are satisfied, click Create Auto Scaling group.
4.5 — The Auto Scaling group will be created and a status screen will be shown with real-time updates.
5.1 — Click on the Name of your group in the Auto Scaling groups Console.
5.2 — Click on the Instance management tab – take a look at what was launched – the Instance type and Availability Zone. You should see an even balance across each Availability Zone. You may see a lot of the same instance type or you may see a wide variety of different instance types – this is capacity-optimized at work and the result of the decisions it has made in terms of which instance types to launch in each Availability Zone.
5.3 — Go back to the Load Balancers console. Find the DNS name for the Load Balancer created in Step 1. Click on the copy icon next to the A record to load it into the clipboard and then paste it into a web browser.
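The same distribution can be inspected from the command line. The sketch below lists each instance's Availability Zone and instance type and counts how often each combination appears. Both function names are hypothetical.

```shell
# Sketch: print "Availability Zone <tab> instance type" for every instance in the group.
# Argument (hypothetical): $1 = Auto Scaling group name.
list_asg_instances() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].[AvailabilityZone,InstanceType]' \
    --output text
}

# Count identical lines read from standard input.
tally_instances() {
  sort | uniq -c
}

# Example usage (not executed here):
#   list_asg_instances my-asg | tally_instances
```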
5.4 — A simple website will be shown in the browser. Keep clicking refresh – each time you do this, the load balancer is sending your request to a different instance and you should see the instance-id changing with each request. This is also known as round-robin load balancing and is the default setting.
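Instead of refreshing the browser by hand, you could request the page repeatedly with curl and count how many times each instance answered. This is a sketch that assumes the page format produced by the user data script in this tutorial; the function name is hypothetical.

```shell
# Sketch: request the site several times and count responses per instance ID.
# Arguments (hypothetical): $1 = load balancer DNS name, $2 = number of requests (default 10).
sample_instance_ids() {
  i=0
  while [ "$i" -lt "${2:-10}" ]; do
    # Extract the instance ID (i- followed by hex digits) from the page body.
    curl -s "http://$1/" | grep -o 'i-[0-9a-f][0-9a-f]*'
    i=$((i + 1))
  done | sort | uniq -c
}
```

Each output line shows a request count followed by an instance ID, so a healthy group should show several different IDs.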
5.5 — Spend some time playing around with your Auto Scaling group. You can simulate Spot Interruptions by simply selecting some running instances from the EC2 Instances Console and terminating them. Your website will continue to work as long as you have some instances running. EC2 Auto Scaling will detect that the instances have failed or been terminated and will replace them with fresh instances and automatically add them back to the load balancer within a short period of time, while maintaining an even spread across all of the Availability Zones.
What happens if you terminate all but one instance? How about if you terminate all of them? Give it a try and observe how Auto Scaling responds. This is meant to illustrate how Auto Scaling works to keep your application running whenever instances are no longer available for any reason (such as human error) – not just an EC2 Spot interruption. Because we are using multiple Availability Zones and EC2 Spot pools and following diversification best practices, having all our EC2 Spot Instances interrupted simultaneously isn’t a concern.
Hint: Click on the gear settings icon on the upper right – scroll down through the Instance Attributes and choose Lifecycle. Click Close. Note you can now see which instances are Spot easily by scrolling all the way to the right.
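The same experiment can be run from the command line. The sketch below lists the running Spot Instances of a group and terminates the instance IDs you pass in. Note that a manual termination only loosely mimics a real Spot interruption, since no two-minute notice is delivered. The function names are hypothetical.

```shell
# Sketch: list running Spot Instances that belong to an Auto Scaling group.
# Argument (hypothetical): $1 = Auto Scaling group name.
list_spot_instances() {
  aws ec2 describe-instances \
    --filters "Name=instance-lifecycle,Values=spot" \
              "Name=tag:aws:autoscaling:groupName,Values=$1" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' --output text
}

# Terminate the given instance IDs so Auto Scaling has to replace them.
terminate_instances() {
  aws ec2 terminate-instances --instance-ids "$@"
}
```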
5.6 — Go back to the Auto Scaling console and review the Activity history to see all of the actions that have been taken, and why.
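The activity history is also available through the AWS CLI. A minimal sketch, with a hypothetical function name:

```shell
# Sketch: print start time, status, description, and cause of each scaling activity.
# Argument (hypothetical): $1 = Auto Scaling group name.
show_scaling_activities() {
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" \
    --query 'Activities[].[StartTime,StatusCode,Description,Cause]' \
    --output text
}
```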
6.1 — Although Amazon EC2 Spot Instances typically cost 70% to 90% less than On-Demand EC2 instances, it is still a good idea to clean up and delete resources once you have completed the tutorial. This is a quick and easy process.
6.2 — Delete the Auto Scaling group you created – this will automatically terminate any instances that it currently manages. Type delete when asked to confirm. Within a few moments, it will be fully deleted and will disappear from the console. Double check the Instances console to be sure that nothing is still running.
6.3 — From the Load Balancers console, delete the Load Balancer you created. It will disappear almost immediately once you click Yes, Delete.
6.4 — Navigate to the Target Groups console (right below the Load Balancer console) and delete the Target group you created. It will disappear immediately after you confirm with Yes, delete.
6.5 — Finally, open the Launch Templates console, located below the Instances console, and delete your launch template. Once you type Delete to confirm, the template will disappear.
6.6 — Congratulations – you have finished cleaning up and should no longer be incurring any costs.
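The cleanup steps above could be scripted in the same order with the AWS CLI. This is a sketch under assumptions: the function name and argument order are hypothetical, and you need to supply the names and ARNs of the resources you created.

```shell
# Sketch: delete the tutorial resources in the order used in Step 6.
# Arguments (hypothetical): $1 = Auto Scaling group name, $2 = load balancer ARN,
# $3 = target group ARN, $4 = launch template name.
cleanup_tutorial() {
  # --force-delete also terminates the instances the group manages.
  aws autoscaling delete-auto-scaling-group \
    --auto-scaling-group-name "$1" --force-delete || return 1
  aws elbv2 delete-load-balancer --load-balancer-arn "$2" || return 1
  # Wait until the load balancer is gone before removing its target group.
  aws elbv2 wait load-balancers-deleted --load-balancer-arns "$2" || return 1
  aws elbv2 delete-target-group --target-group-arn "$3" || return 1
  aws ec2 delete-launch-template --launch-template-name "$4"
}
```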
Congratulations!! You just learned how to create an Auto Scaling group using EC2 Spot Instances with launch templates. Auto Scaling with load balancing will keep your stateless, fault-tolerant workload such as a web application running even after some instances are interrupted or experience other issues. You also learned how to see the actions taken by Auto Scaling.
Recommended next steps
Running web applications on Amazon EC2 Spot Instances
To learn more about using Spot Instances to run web applications, review this blog
Spot Instances workshops
Want to learn more about Spot Instances? Check out the self-guided Spot Instance workshops to learn more about additional use-cases for Spot Instances.
Explore Amazon EC2 Spot Instances
If you want to learn more about Amazon EC2 Spot Instances, visit the Amazon EC2 Spot Instances product page to explore documentation, videos, blogs, and more.