AWS Compute Blog

Fleet Management Made Easy with Auto Scaling

If your application runs on Amazon EC2 instances, then you have what’s referred to as a ‘fleet’. This is true even if your fleet is just a single instance. Automating how your fleet is managed can have big pay-offs, both for operational efficiency and for maintaining the availability of the application that it serves. You can automate the management of your fleet with Auto Scaling, and the best part is how easy it is to set up!

There are three main functions that Auto Scaling performs to automate fleet management for EC2 instances:

  • Monitoring the health of running instances
  • Automatically replacing impaired instances
  • Balancing capacity across Availability Zones

In this post, we describe how Auto Scaling performs each of these functions, provide an example of how easy it is to get started, and outline how to learn more about Auto Scaling.

Monitoring the health of running instances

Auto Scaling monitors the health of all instances that are placed within an Auto Scaling group. Auto Scaling performs EC2 health checks at regular intervals, and if the instance is connected to an Elastic Load Balancing load balancer, it can also perform ELB health checks. Auto Scaling ensures that your application is able to receive traffic and that the instances themselves are working properly. When Auto Scaling detects a failed health check, it can replace the instance automatically.

Automatically replacing impaired instances

When an impaired instance fails a health check, Auto Scaling automatically terminates it and replaces it with a new one. If you’re using an Elastic Load Balancing load balancer, Auto Scaling gracefully detaches the impaired instance from the load balancer before provisioning a new one and attaches it back to the load balancer. This is all done automatically, so you don’t need to respond manually when an instance needs replacing.

Balancing capacity across Availability Zones

Balancing resources across Availability Zones is a best practice for well-architected applications, as this greatly increases aggregate system availability. Auto Scaling automatically balances EC2 instances across zones when you configure multiple zones in your Auto Scaling group settings. Auto Scaling always launches new instances such that they are balanced between zones as evenly as possible across the entire fleet. What’s more, Auto Scaling only launches into Availability Zones in which there is available capacity for the requested instance type.

Getting started is easy!

The easiest way to get started with Auto Scaling is to build a fleet from existing instances. The AWS Management Console provides a simple workflow to do this: right-click on a running instance and choose Instance Settings, Attach to Auto Scaling Group.

You can then opt to attach the instance to a new Auto Scaling group. Your instance is now being automatically monitored for health and will be replaced if it becomes impaired. If you configure additional zones and add more instances, they will be spread evenly across Availability Zones to make your fleet more resilient to unexpected failures.

Diving deeper

While this example is a good starting point, you may want to dive deeper into how Auto Scaling can automate the management of your EC2 instances.

The first thing to explore is how to automate software deployments. AWS Elastic Beanstalk is a popular and easy-to-use solution that works well for web applications. AWS CodeDeploy is a good solution for fine-grained control over the deployment process. If your application is based on containers, then Amazon EC2 Container Service (Amazon ECS) is something to consider. You may also want to look into AWS Partner solutions such as Ansible and Puppet. One common strategy for deploying software across a production fleet without incurring downtime is blue/green deployments, to which Auto Scaling is particularly well-suited.

These solutions are all enhanced by the core fleet management capabilities in Auto Scaling. You can also use the API or CLI to roll your own automation solution based on Auto Scaling. The following learning path will help you to explore the service in more detail.

  • Launch configurations
  • Lifecycle hooks
  • Fleet size
  • Automatic process control
  • Scheduled scaling

Launch configurations

Launch configurations are the key to how Auto Scaling launches instances. Whenever an Auto Scaling group launches a new instance, it uses the currently associated launch configuration as a template for the launch. In the example above, Auto Scaling automatically created a launch configuration by deriving it from the attached instance. In many cases, however, you create your own launch configuration. For example, if your software environment is baked into an Amazon Machine Image (AMI), then your launch configuration points to the version that you want Auto Scaling to deploy onto new instances.

Lifecycle hooks

Lifecycle hooks let you take action before an instance goes into service or before it gets terminated. This can be especially useful if you are not baking your software environment into an AMI. For example, launch hooks can perform software configuration on an instance to ensure that it’s fully prepared to handle traffic before Auto Scaling proceeds to connect it to your load balancer. One way to do this is by connecting the launch hook to an AWS Lambda function that invokes RunCommand on the instance.

Terminate hooks can be useful for collecting important data from an instance before it goes away. For example, you could use a terminate hook to preserve your fleet’s log files by copying them to an Amazon S3 bucket when instances go out of service.

Fleet size

You control the size of your fleet using the minimum, desired, and maximum capacity attributes of an Auto Scaling group. Auto Scaling automatically launches or terminates instances to keep the group at the desired capacity. As mentioned before, Auto Scaling uses the launch configuration as a template for launching new instances in order to meet the desired capacity, doing so such that they are balanced across configured Availability Zones.

Automatic process control

You can control the behavior of Auto Scaling’s automatic processes such as health checks, launches, and terminations. You may find the AZRebalance process of particular interest. By default, Auto Scaling automatically terminates instances from one zone and re-launches them into another if the instances in the fleet are not spread out in a balanced manner.
You may want to disable this behavior under certain conditions. For example, if you’re attaching existing instances to an Auto Scaling group, you may not want them terminated and re-launched right away if that is required to re-balance your zones. Note that Auto Scaling always replaces impaired instances with launches that are balanced across zones, regardless of this setting. You can also control how Auto Scaling performs health checks, launches, terminations, and more.

Scheduled scaling

Scheduled scaling is a simple tool for adjusting the size of your fleet on a schedule. For example, you can add more or fewer instances to your fleet at different times of the day to handle changing customer traffic patterns. A more advanced tool is dynamic scaling, which adjusts the size of your fleet based on Amazon CloudWatch metrics.


Auto Scaling can bestow important benefits to cloud applications by automating the management of fleets of EC2 instances. Auto Scaling makes it easy to monitor instance health, automatically replace impaired instances, and spread capacity across multiple Availability Zones.

If you already have a fleet of EC2 instances, then it’s easy to get started with Auto Scaling in just a few clicks. After your first Auto Scaling group is working to safeguard your existing fleet, you can follow the suggested learning path in this post. Over time, you can explore more features of Auto Scaling and further automate your software deployments and application scaling.

If you have questions or suggestions, please comment below.