Auto Scaling helps you maintain application availability and allows you to scale your Amazon EC2 capacity up or down automatically according to conditions you define. You can use Auto Scaling for Fleet Management of EC2 instances to help maintain the health and availability of your fleet and ensure that you are running your desired number of Amazon EC2 instances. You can also use Auto Scaling for Dynamic Scaling of EC2 instances in order to automatically increase the number of Amazon EC2 instances during demand spikes to maintain performance and decrease capacity during lulls to reduce costs. Auto Scaling is well suited both to applications that have stable demand patterns and to applications that experience hourly, daily, or weekly variability in usage. Beyond Auto Scaling for Amazon EC2, you can use Application Auto Scaling to automatically scale resources for other AWS services, including Amazon ECS, Amazon EC2 Spot Fleets, Amazon EMR Clusters, AppStream 2.0 fleets, and Amazon DynamoDB.

Whether you are running one Amazon EC2 instance or thousands, you can use Auto Scaling to detect impaired Amazon EC2 instances and unhealthy applications, and replace the instances without your intervention. This ensures that your application is getting the compute capacity that you expect. To automate fleet management for EC2 instances, Auto Scaling performs three main functions, described here and in our blog, Fleet Management Made Easy with Auto Scaling.

  • Monitoring the health of running instances
    Auto Scaling ensures that your application is able to receive traffic and that the instances themselves are working properly. When Auto Scaling detects a failed health check, it can replace the instance automatically.
  • Automatically replacing impaired instances
    When an impaired instance fails a health check, Auto Scaling automatically terminates it and replaces it with a new one. That means that you don’t need to respond manually when an instance needs replacing.
  • Balancing capacity across Availability Zones
    Auto Scaling automatically balances EC2 instances across zones when multiple zones are configured, and always launches new instances so that your entire fleet stays as evenly balanced across zones as possible.
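
The three functions above can be illustrated with a small simulation. This is a minimal sketch, not the service's actual algorithm: the instance records, the `pick_zone` and `replace_unhealthy` helpers, and the tie-breaking rule are all assumptions made up for illustration.

```python
from collections import Counter

def pick_zone(zones, instances):
    """Return the configured zone that currently holds the fewest instances."""
    counts = Counter(inst["zone"] for inst in instances)
    # min() keeps the first zone on ties, so the order of `zones` breaks ties.
    return min(zones, key=lambda z: counts[z])

def replace_unhealthy(zones, instances, next_id):
    """Drop instances that failed a health check and launch replacements.

    Each replacement is placed in the least-populated zone so the fleet
    stays as evenly balanced as possible (hypothetical placement rule).
    """
    healthy = [inst for inst in instances if inst["healthy"]]
    missing = len(instances) - len(healthy)
    for n in range(missing):
        zone = pick_zone(zones, healthy)
        healthy.append({"id": f"i-{next_id + n}", "zone": zone, "healthy": True})
    return healthy

zones = ["us-east-1a", "us-east-1b"]
fleet = [
    {"id": "i-1", "zone": "us-east-1a", "healthy": True},
    {"id": "i-2", "zone": "us-east-1a", "healthy": False},
    {"id": "i-3", "zone": "us-east-1b", "healthy": True},
    {"id": "i-4", "zone": "us-east-1b", "healthy": False},
]
new_fleet = replace_unhealthy(zones, fleet, next_id=5)
zone_counts = Counter(inst["zone"] for inst in new_fleet)
print(dict(zone_counts))
```

In this example the two failed instances are replaced one at a time, and each replacement goes to whichever zone is behind at that moment, so the fleet returns to its original size with two instances per zone.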

Auto Scaling enables you to follow the demand curve for your applications closely, reducing the need to manually provision Amazon EC2 capacity in advance. For example, you can use Target Tracking scaling policies to select a load metric for your application, such as CPU utilization. Or, you could set a target value using the new “Request Count Per Target” metric from Application Load Balancer, a load balancing option for the Elastic Load Balancing service. Auto Scaling will then automatically adjust the number of EC2 instances as needed to maintain your target. You can also use simple scaling policies to set a condition to add new Amazon EC2 instances in increments when the average CPU utilization of your Amazon EC2 fleet is high, and similarly, you can set a condition to remove instances in the same increments when CPU utilization is low. If you have predictable load changes, you can also set a schedule through Auto Scaling to plan your scaling activities. Auto Scaling can also be used with Amazon CloudWatch, which can send alarms to trigger scaling activities, and with Elastic Load Balancing to help distribute traffic to your instances within Auto Scaling groups.
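
To make the difference between the two policy styles concrete, here is a minimal sketch. It assumes a simple proportional rule for target tracking and a fixed step for simple scaling; the function names, thresholds, and the formula itself are illustrative assumptions, not the algorithm the service actually uses.

```python
import math

def target_tracking_desired(current, metric, target, min_size, max_size):
    """Proportional estimate of the capacity that brings `metric` back to `target`.

    Assumes load spreads evenly, so per-instance load scales with 1 / capacity.
    """
    desired = math.ceil(current * metric / target)
    # Never leave the group's configured size range.
    return max(min_size, min(max_size, desired))

def simple_scaling_desired(current, cpu, high, low, step, min_size, max_size):
    """Add `step` instances when CPU is above `high`, remove `step` below `low`."""
    if cpu > high:
        current += step
    elif cpu < low:
        current -= step
    return max(min_size, min(max_size, current))

# Four instances at 80% average CPU with a 50% target: 4 * 80 / 50 = 6.4, rounded up.
print(target_tracking_desired(4, 80.0, 50.0, min_size=1, max_size=10))  # 7

# Simple scaling with a step of 2: scale out above 70% CPU, scale in below 30%.
print(simple_scaling_desired(4, 85.0, high=70.0, low=30.0, step=2, min_size=1, max_size=10))  # 6
print(simple_scaling_desired(4, 20.0, high=70.0, low=30.0, step=2, min_size=1, max_size=10))  # 2
```

The target tracking sketch needs only a target value, while the simple scaling sketch needs explicit thresholds and increments, which mirrors the configuration effort described above.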

With Application Auto Scaling, you can automatically scale resources for other AWS services beyond Amazon EC2. The experience is similar to that of Auto Scaling. You can use Application Auto Scaling to define scaling policies that automatically scale your AWS resources, to scale your resources in response to CloudWatch alarms, and to view the history of your scaling events.

Application Auto Scaling can scale the following AWS resources:

  • Amazon ECS services: Your Amazon ECS service can optionally be configured to use Service Auto Scaling to adjust its desired count up or down in response to CloudWatch alarms. For more information, read our documentation.
  • Amazon EC2 Spot fleets: A Spot fleet can either launch instances (scale out) or terminate instances (scale in), within the range that you choose, in response to one or more scaling policies. For more details, see the documentation.
  • Amazon EMR clusters: Auto Scaling in Amazon EMR allows you to programmatically scale out and scale in core nodes and task nodes in a cluster based on rules that you specify in a scaling policy. For more, read our documentation.
  • AppStream 2.0 fleets: You can define scaling policies that adjust the size of your fleet automatically based on a variety of utilization metrics, and optimize the number of running instances to match user demand. You can also choose to turn off automatic scaling and make the fleet run at a fixed size. To learn more, see the documentation.
  • Amazon DynamoDB: You can dynamically adjust provisioned throughput capacity in response to actual traffic patterns. This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic without throttling. When the workload decreases, Application Auto Scaling decreases the throughput so that you don't pay for unused provisioned capacity. For more details, refer to the documentation. You can also read our blog, Auto Scaling for Amazon DynamoDB.
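
As an illustration of the DynamoDB case, the sketch below builds the two requests that Application Auto Scaling expects: one registering a table's read capacity as a scalable target, and one attaching a target tracking policy to it. The helper name, the table name `orders`, the capacity range, the policy name, and the 70% target are hypothetical values chosen for the example. The block only builds the request parameters, and the boto3 calls that would send them are shown in comments.

```python
def dynamodb_read_scaling_requests(table_name, min_rcu, max_rcu, target_utilization):
    """Build request parameters for scaling a table's provisioned read capacity."""
    resource_id = f"table/{table_name}"
    dimension = "dynamodb:table:ReadCapacityUnits"
    # Parameters for register_scalable_target: the capacity range to stay within.
    target = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_rcu,
        "MaxCapacity": max_rcu,
    }
    # Parameters for put_scaling_policy: keep consumed/provisioned reads near the target.
    policy = {
        "PolicyName": f"{table_name}-read-target-tracking",  # hypothetical name
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    }
    return target, policy

target, policy = dynamodb_read_scaling_requests("orders", 5, 100, 70.0)

# With boto3 installed and AWS credentials configured, the requests could be sent as:
#   import boto3
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**target)
#   client.put_scaling_policy(**policy)
```

Write capacity and global secondary indexes would follow the same pattern with a different scalable dimension and predefined metric.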