Q: What is Auto Scaling?
Auto Scaling is a fully managed service designed to launch or terminate Amazon EC2 instances automatically to help ensure you have the correct number of Amazon EC2 instances available to handle the load for your application. Auto Scaling helps you maintain application availability through fleet management for EC2 instances, which detects and replaces unhealthy instances, and by scaling your Amazon EC2 capacity up or down automatically according to conditions you define. You can use Auto Scaling to automatically increase the number of Amazon EC2 instances during demand spikes to maintain performance and decrease capacity during lulls to reduce costs.
Q: What are the benefits of using Auto Scaling?
Auto Scaling helps to maintain your Amazon EC2 instance availability. Whether you are running one Amazon EC2 instance or thousands, you can use Auto Scaling to detect impaired Amazon EC2 instances, and replace the instances without intervention. This ensures that your application has the compute capacity that you expect.
You can use Auto Scaling to automatically scale your Amazon EC2 fleet by following the demand curve for your applications, reducing the need to manually provision Amazon EC2 capacity in advance. For example, you can set a condition to add new Amazon EC2 instances in increments to the Auto Scaling group when the average CPU utilization of your Amazon EC2 fleet is high; similarly, you can set a condition to remove instances in increments when CPU utilization is low. You can also use Amazon CloudWatch to send alarms to trigger scaling activities and Elastic Load Balancing (ELB) to distribute traffic to your instances within Auto Scaling groups. If you have predictable load changes, you can set a schedule through Auto Scaling to plan your scaling activities. Auto Scaling enables you to run your Amazon EC2 fleet at optimal utilization.
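The incremental scale-out and scale-in conditions described above can be pictured with a small sketch. It is not how the service is implemented; the thresholds, step size, and group limits are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical thresholds; none of these values come from the FAQ."""
    high_cpu: float = 70.0  # scale out above this average CPU (%)
    low_cpu: float = 30.0   # scale in below this average CPU (%)
    step: int = 2           # instances added or removed per adjustment
    min_size: int = 1
    max_size: int = 10


def next_capacity(current: int, avg_cpu: float, rule: ScalingRule) -> int:
    """Return the new desired capacity for one evaluation of the rule."""
    if avg_cpu > rule.high_cpu:
        target = current + rule.step
    elif avg_cpu < rule.low_cpu:
        target = current - rule.step
    else:
        target = current
    # Keep the result inside the group's size limits.
    return max(rule.min_size, min(rule.max_size, target))
```

For example, next_capacity(4, 85.0, ScalingRule()) asks for 6 instances, while a quiet fleet at 10% CPU would shrink to 2.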
Q: What is fleet management and how is it different from dynamic scaling?
If your application runs on Amazon EC2 instances, then you have what’s referred to as a ‘fleet’. Fleet management refers to the functionality that automatically replaces unhealthy instances and maintains your fleet at the desired capacity. Auto Scaling fleet management ensures that your application is able to receive traffic and that the instances themselves are working properly. When Auto Scaling detects a failed health check, it can replace the instance automatically.
The dynamic scaling capabilities of Auto Scaling refer to the functionality that automatically increases or decreases capacity based on load or other metrics. For example, if your CPU utilization spikes above 80% (and you have an alarm set up), Auto Scaling can add a new instance dynamically.
Q: What is an Auto Scaling group?
An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of fleet management and dynamic scaling. For example, if a single application operates across multiple instances, you might want to increase the number of instances in that group to improve the performance of the application, or decrease the number of instances to reduce costs when demand is low. Auto Scaling will automatically adjust the number of instances in the group, either to maintain a fixed number of instances even if an instance becomes unhealthy, or based on criteria that you specify. You can find more information about Auto Scaling groups in the Auto Scaling user guide.
Q: How do I know when Auto Scaling is launching or terminating the EC2 instances in an Auto Scaling group?
When you use Auto Scaling to scale your applications automatically, it is useful to know when Auto Scaling is launching or terminating the EC2 instances in your Auto Scaling group. Amazon SNS coordinates and manages the delivery or sending of notifications to subscribing clients or endpoints. You can configure Auto Scaling to send an SNS notification whenever your Auto Scaling group scales.
Amazon SNS can deliver notifications as HTTP or HTTPS POST, email (SMTP, either plain-text or in JSON format), or as a message posted to an Amazon SQS queue. For example, if you configure your Auto Scaling group to use the autoscaling:EC2_INSTANCE_TERMINATE notification type, and your Auto Scaling group terminates an instance, it sends an email notification. This email contains the details of the terminated instance, such as the instance ID and the reason that the instance was terminated.
For more information, visit Getting SNS Notifications When Your Auto Scaling Group Scales.
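If you deliver the notification in JSON format, a subscriber can pull out the instance ID and termination reason programmatically. The following is a minimal sketch; the sample payload is made up for illustration, and the field names (Event, AutoScalingGroupName, EC2InstanceId, Cause) are assumed from the notification format rather than stated in this FAQ.

```python
import json

# Illustrative payload only; the values are invented for this sketch.
SAMPLE_MESSAGE = json.dumps({
    "Event": "autoscaling:EC2_INSTANCE_TERMINATE",
    "AutoScalingGroupName": "my-asg",
    "EC2InstanceId": "i-0123456789abcdef0",
    "Cause": "An instance was taken out of service after a failed health check.",
})


def summarize_notification(message: str) -> dict:
    """Extract the fields a subscriber usually cares about."""
    body = json.loads(message)
    return {
        "terminated": body.get("Event") == "autoscaling:EC2_INSTANCE_TERMINATE",
        "group": body.get("AutoScalingGroupName"),
        "instance_id": body.get("EC2InstanceId"),
        "reason": body.get("Cause"),
    }
```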
Q: What is a Launch Configuration?
A launch configuration is a template that an Auto Scaling group uses to launch EC2 instances. When you create a launch configuration, you specify information for the instances such as the ID of the Amazon Machine Image (AMI), the instance type, a key pair, one or more security groups, and a block device mapping. If you've launched an EC2 instance before, you specified the same information in order to launch the instance.
When you create an Auto Scaling group, you must specify a launch configuration. You can specify your launch configuration with multiple Auto Scaling groups. However, you can only specify one launch configuration for an Auto Scaling group at a time, and you can't modify a launch configuration after you've created it. Therefore, if you want to change the launch configuration for your Auto Scaling group, you must create a launch configuration and then update your Auto Scaling group with the new launch configuration. When you change the launch configuration for your Auto Scaling group, any new instances are launched using the new configuration parameters, but existing instances are not affected. You can visit launch configurations in the Auto Scaling user guide for more details.
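The rule that a launch configuration cannot be modified, only replaced, can be modeled with an immutable record. This is a conceptual sketch, not the service's API; the class names, fields, and values are hypothetical.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class LaunchConfig:
    """Immutable template; frozen=True rejects any attribute assignment."""
    name: str
    image_id: str
    instance_type: str


@dataclass
class Group:
    launch_config: LaunchConfig
    # Each entry records the instance type a launched instance received.
    instances: list = field(default_factory=list)

    def launch(self) -> None:
        self.instances.append(self.launch_config.instance_type)


def switch_instance_type(group: Group, new_name: str, new_type: str) -> None:
    """Create a new configuration and point the group at it."""
    group.launch_config = replace(
        group.launch_config, name=new_name, instance_type=new_type
    )
```

After switch_instance_type is called, only instances launched afterwards use the new type; entries already in the group keep the type they were launched with.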
Q: How many instances can an Auto Scaling group have?
You can have as many instances in your Auto Scaling group as your EC2 quota allows.
Q: Can Auto Scaling groups span multiple AWS regions?
Auto Scaling groups are regional constructs. They can span Availability Zones, but not AWS regions.
Q: Can I launch different types of EC2 instances in the same Auto Scaling group?
Auto Scaling groups optimize for the case when all your instance types are the same. You can use the AttachInstances API to attach instances of different types to an Auto Scaling group, and you can also update your launch configuration so that any new instances in the group will be launched with a different instance type. (However, this will not affect any of the existing instances.)
Q: How can I implement changes across multiple instances in an Auto Scaling group?
You can use AWS CodeDeploy or CloudFormation to orchestrate code changes to multiple instances in your Auto Scaling group.
Q: If I have data installed in an EC2 Auto Scaling group, and a new instance is dynamically created later, is the data copied over to the new instances?
Data is not automatically copied from existing instances to new instances. You can use lifecycle hooks to copy the data, or keep the data in an Amazon RDS database (including replicas) that every instance can reach.
Q: When I create an Auto Scaling group from an existing instance, does it create a new AMI (Amazon Machine Image)?
When you create an Auto Scaling group from an existing instance, it does not create a new AMI. For more information see Creating an Auto Scaling Group Using an EC2 Instance.
Q: How does Auto Scaling balance capacity?
Balancing resources across Availability Zones is a best practice for well-architected applications, as this greatly increases aggregate system availability. Auto Scaling automatically balances EC2 instances across zones when you configure multiple zones in your Auto Scaling group settings. Auto Scaling always launches new instances such that they are balanced between zones as evenly as possible across the entire fleet. What’s more, Auto Scaling only launches into Availability Zones in which there is available capacity for the requested instance type.
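One way to picture the balancing behavior is a function that places each new instance in the least-populated zone that still has capacity. This is a simplified sketch under stated assumptions: the has_capacity callback and the alphabetical tie-break are hypothetical, and the real placement logic is not described in this FAQ.

```python
from collections import Counter


def pick_zone(zones, instance_zones, has_capacity=lambda zone: True):
    """Choose the zone for the next launch.

    zones: zones configured on the group.
    instance_zones: the zone of each instance currently in the group.
    has_capacity: hypothetical check for available capacity in a zone.
    """
    counts = Counter(instance_zones)  # missing zones count as 0
    candidates = [z for z in zones if has_capacity(z)]
    if not candidates:
        raise RuntimeError("no configured zone has available capacity")
    # Fewest instances first; ties are broken by zone name (arbitrary choice).
    return min(candidates, key=lambda z: (counts[z], z))
```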
Q: What are Lifecycle Hooks?
Lifecycle hooks let you take action before an instance goes into service or before it gets terminated. This can be especially useful if you are not baking your software environment into an Amazon Machine Image (AMI). For example, launch hooks can perform software configuration on an instance to ensure that it’s fully prepared to handle traffic before Auto Scaling proceeds to connect it to your load balancer. One way to do this is by connecting the launch hook to an AWS Lambda function that invokes RunCommand on the instance.
Terminate hooks can be useful for collecting important data from an instance before it goes away. For example, you could use a terminate hook to preserve your fleet’s log files by copying them to an Amazon S3 bucket when instances go out of service.
Visit lifecycle hooks in our Auto Scaling user guide for more information.
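The ordering that lifecycle hooks provide can be sketched as follows. The state names are taken from the Auto Scaling lifecycle documentation rather than from this FAQ, the configure and collect callbacks are hypothetical, and the sketch only models the case where the hook completes successfully.

```python
def bring_into_service(instance_id, configure):
    """Run a launch hook action before the instance goes into service."""
    states = ["Pending", "Pending:Wait"]
    configure(instance_id)  # for example, install and configure software
    states += ["Pending:Proceed", "InService"]
    return states


def take_out_of_service(instance_id, collect):
    """Run a terminate hook action before the instance goes away."""
    states = ["Terminating", "Terminating:Wait"]
    collect(instance_id)  # for example, copy log files to durable storage
    states += ["Terminating:Proceed", "Terminated"]
    return states
```

The point of the wait states is that the callback finishes before the instance moves on, so traffic never reaches a half-configured instance and logs are saved before termination.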
Q: What are the characteristics of an “unhealthy” instance?
An unhealthy instance is one where the hardware has become impaired for some reason (bad disk, etc.), or it is not passing a user-configured ELB health check. Auto Scaling performs health checks on each individual EC2 instance at regular intervals, and if the instance is connected to an Elastic Load Balancing load balancer, it can also perform ELB health checks.
Q: Can I customize a health check?
Yes, there is an API called SetInstanceHealth that allows you to change an instance's state to UNHEALTHY, which will then result in a termination and replacement.
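A custom health check usually amounts to running your own probe and reporting the result through SetInstanceHealth. The sketch below assumes a client object shaped like the boto3 Auto Scaling client (its set_instance_health method); the probe function and the RecordingClient stand-in are hypothetical and exist only so the example runs without AWS access.

```python
class RecordingClient:
    """Stand-in for a real client; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def set_instance_health(self, **kwargs):
        self.calls.append(kwargs)


def report_health(client, instance_id, probe):
    """Run a custom probe and report the outcome via SetInstanceHealth."""
    status = "Healthy" if probe(instance_id) else "Unhealthy"
    client.set_instance_health(
        InstanceId=instance_id,
        HealthStatus=status,
        ShouldRespectGracePeriod=True,
    )
    return status
```

In real use you would pass an actual Auto Scaling client instead of RecordingClient, and the probe would check whatever your application considers healthy.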
Q: Can I suspend health checks (for example, to evaluate unhealthy instances)?
Yes, you can temporarily suspend Auto Scaling health checks by using the SuspendProcesses API. You can use the ResumeProcesses API to resume automatic health checks.
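If you suspend health checks to investigate an instance, it is easy to forget to resume them. One possible pattern is a context manager that always resumes on exit. This sketch assumes a client shaped like the boto3 Auto Scaling client (suspend_processes and resume_processes) and the HealthCheck process name; the RecordingClient stand-in is hypothetical and only records calls.

```python
from contextlib import contextmanager


class RecordingClient:
    """Stand-in for a real client; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def suspend_processes(self, **kwargs):
        self.calls.append(("suspend_processes", kwargs))

    def resume_processes(self, **kwargs):
        self.calls.append(("resume_processes", kwargs))


@contextmanager
def health_checks_suspended(client, group_name):
    """Suspend health checks for the duration of a with-block."""
    request = {"AutoScalingGroupName": group_name, "ScalingProcesses": ["HealthCheck"]}
    client.suspend_processes(**request)
    try:
        yield
    finally:
        # Resume even if the investigation raises an exception.
        client.resume_processes(**request)
```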
Q: Which health check type should I select?
If you are using Elastic Load Balancing (ELB) with your group, you should select an ELB health check. If you’re not using ELB with your group, you should select the EC2 health check.
Q: Can I use Auto Scaling for health checks and to replace unhealthy instances if I’m not using Elastic Load Balancing (ELB)?
You don't have to use ELB to use Auto Scaling. You can use the EC2 health check to identify and replace unhealthy instances.
Q: Do the Elastic Load Balancing (ELB) health checks work with application load balancers? Will an instance be marked as unhealthy if any target group associated with it becomes unhealthy?
Yes, Auto Scaling works with Application Load Balancers including its health check feature.
Q: Is there any way to use Auto Scaling to only add a volume without adding an instance?
A volume is attached to a new instance when it is added. Auto Scaling doesn't automatically add a volume when the existing one is approaching capacity. You can use the EC2 API to add a volume to an existing instance.
Q: What does the term “stateful instances” refer to?
When we refer to a stateful instance, we mean an instance that has data on it, which exists only on that instance. In general, terminating a stateful instance means that the data (or state information) on the instance is lost. You may want to consider using lifecycle hooks to copy the data off of a stateful instance before it’s terminated, or enable instance protection to prevent Auto Scaling from terminating it.
Q: My EC2 instances are created with Ansible scripts. How do I use Ansible with Auto Scaling?
You can find out details about using Ansible with Auto Scaling on the Ansible website.
Q: How does Auto Scaling replace an impaired instance?
When an impaired instance fails a health check, Auto Scaling automatically terminates it and replaces it with a new one. If you’re using an Elastic Load Balancing load balancer, Auto Scaling gracefully detaches the impaired instance from the load balancer before provisioning a new one and attaching it to the load balancer. This is all done automatically, so you don’t need to respond manually when an instance needs replacing.
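Conceptually, replacement is a reconciliation step: drop the instances that fail the health check, then launch new ones until the group is back at its desired capacity. The following sketch illustrates that idea only; the is_healthy and launch callbacks are hypothetical, and load balancer detachment is not modeled.

```python
def replace_unhealthy(instances, desired, is_healthy, launch):
    """Return (new fleet, terminated instances) after one reconciliation pass."""
    kept = [i for i in instances if is_healthy(i)]
    terminated = [i for i in instances if not is_healthy(i)]
    launched = []
    # Launch replacements until the fleet is back at the desired capacity.
    while len(kept) + len(launched) < desired:
        launched.append(launch())
    return kept + launched, terminated
```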
Q: How do I control which instances Auto Scaling terminates when scaling in, and how do I protect data on an instance?
With each Auto Scaling group, you control when Auto Scaling adds instances (referred to as scaling out) or removes instances (referred to as scaling in) from your group. You can scale the size of your group manually by attaching and detaching instances, or you can automate the process through the use of a scaling policy. When you have Auto Scaling automatically scale in, you must decide which instances Auto Scaling should terminate first. You can configure this through the use of a termination policy. You can also use instance protection to prevent Auto Scaling from selecting specific instances for termination when scaling in.
If you have data on an instance, and you need that data to be persistent even if your instance is scaled in, then you can use a service like S3, RDS, or DynamoDB, to make sure that it is stored off the instance.
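The interaction between a termination policy and instance protection can be sketched as a selection function. The example assumes an oldest-instance-first policy purely for illustration; the Instance fields are hypothetical and other policies would use a different sort key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    instance_id: str
    launch_time: int         # seconds since the epoch (hypothetical field)
    protected: bool = False  # True when instance protection is enabled


def choose_for_termination(instances) -> Optional[Instance]:
    """Pick the oldest unprotected instance, or None if all are protected."""
    candidates = [i for i in instances if not i.protected]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i.launch_time)
```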
Q: How long is the turnaround time for Auto Scaling to bring a new instance to the InService state after detecting an unhealthy server?
The turnaround time is within minutes. The majority of replacements happen in less than 5 minutes, and the average is significantly lower. The exact time depends on a variety of factors, including how long it takes to boot up the AMI of your instance.
Q: If Elastic Load Balancing (ELB) determines that an instance is unhealthy and moves it offline, will the previous requests sent to the failed instance be queued and rerouted to other instances within the group?
When ELB notices that the instance is unhealthy, it will stop routing requests to it. However, prior to discovering that the instance is unhealthy, some requests to that instance will fail.
Q: If you don’t use Elastic Load Balancing (ELB) how would users be directed to the other servers in a group if there was a failure?
You can integrate with Amazon Route 53 (which Auto Scaling does not currently support out of the box, but which many customers use). You can also use your own reverse proxy or, for internal microservices, a service discovery solution.
Q: What is Application Auto Scaling?
With Application Auto Scaling, you can automatically scale your AWS resources. The experience is similar to that of Auto Scaling. You can use Application Auto Scaling to accomplish the following tasks:
- Define scaling policies to automatically scale your AWS resources
- Scale your resources in response to CloudWatch alarms
- View the history of your scaling events
Application Auto Scaling can scale the following AWS resources:
- Amazon ECS services. For more information, see Service Auto Scaling in the Amazon EC2 Container Service Developer Guide.
- Amazon EC2 Spot fleets. For more information, see Automatic Scaling for Spot Fleet in the Amazon EC2 User Guide.
- Amazon EMR clusters. For more information, see Using Automatic Scaling in Amazon EMR in the Amazon EMR Management Guide.
- AppStream 2.0 fleets. For more information, see Autoscaling Amazon AppStream 2.0 Resources in the Amazon AppStream 2.0 Developer Guide.
For a list of supported regions, see AWS Regions and Endpoints: Application Auto Scaling in the AWS General Reference.
Q: How do I get started with Auto Scaling?
The easiest way to get started with Auto Scaling is to build a fleet from existing instances. The AWS Management Console provides a simple workflow to do this: right-click on a running instance and choose Instance Settings, Attach to Auto Scaling Group. You can then opt to attach the instance to a new Auto Scaling group. Your instance is now being automatically monitored for health and will be replaced if it becomes impaired. If you configure additional zones and add more instances, they will be spread evenly across Availability Zones to make your fleet more resilient to unexpected failures.
Q: How do I create an Auto Scaling group?
You can find a tutorial for getting started with creating an Auto Scaling group in the Auto Scaling user guide.
Q: How do I control access to Auto Scaling resources?
Auto Scaling integrates with AWS Identity and Access Management (IAM), a service that enables you to do the following:
- Create users and groups under your organization's AWS account
- Assign unique security credentials to each user under your AWS account
- Control each user's permissions to perform tasks using AWS resources
- Allow the users in another AWS account to share your AWS resources
- Create roles for your AWS account and define the users or services that can assume them
- Use existing identities for your enterprise to grant permissions to perform tasks using AWS resources
For example, you could create an IAM policy that grants the Managers group permission to use only the DescribeAutoScalingGroups, DescribeLaunchConfigurations, DescribeScalingActivities, and DescribePolicies API operations. Users in the Managers group could then use those operations with any Auto Scaling groups and launch configurations. Note that you can't restrict access to a particular Auto Scaling group or launch configuration.
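The Managers example above could be written as a policy document along these lines. This is a sketch, not a policy taken from the AWS documentation; the function name is hypothetical, and the resource is a wildcard because, as noted, access cannot be restricted to a particular group or launch configuration.

```python
import json


def managers_policy() -> dict:
    """Build a read-only policy covering the four Describe operations."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "autoscaling:DescribeAutoScalingGroups",
                    "autoscaling:DescribeLaunchConfigurations",
                    "autoscaling:DescribeScalingActivities",
                    "autoscaling:DescribePolicies",
                ],
                # Applies to all Auto Scaling groups and launch configurations.
                "Resource": "*",
            }
        ],
    }


def managers_policy_json() -> str:
    """Serialize the policy so it can be attached to the Managers group."""
    return json.dumps(managers_policy(), indent=2)
```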
Q: Can you define a default admin password on Windows instances with Auto Scaling?
You can use the KeyName parameter of CreateLaunchConfiguration to associate a key pair with your instance. You can then use the GetPasswordData API in EC2 to retrieve the administrator password. This is also possible through the AWS Management Console.
Q: Are CloudWatch agents automatically installed on EC2 instances when you create an Auto Scaling group?
If your AMI contains a CloudWatch agent, it’s automatically installed on EC2 instances when you create an Auto Scaling group. With the stock Amazon Linux AMI, you need to install it yourself (installing it via yum is recommended).
Q: What are the costs for using Auto Scaling?
Auto Scaling fleet management for EC2 instances carries no additional fees. The dynamic scaling capabilities of Auto Scaling are enabled by Amazon CloudWatch and also carry no additional fees. Amazon EC2 and Amazon CloudWatch service fees apply and are billed separately. Partial hours are billed as full hours.
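The rule that partial hours are billed as full hours is a round-up to the next whole hour. A minimal sketch of that arithmetic, assuming you track an instance's running time in seconds (the function name is hypothetical):

```python
def billable_hours(runtime_seconds: int) -> int:
    """Round a positive running time up to whole hours."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-runtime_seconds // 3600)
```

Under this rule, an instance that runs for 90 minutes counts as 2 instance-hours.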