Posted On: Nov 24, 2021

With Amazon EC2 Auto Scaling’s new predictive scaling policy, you can now use custom metrics to predict the EC2 instance capacity needed by an Auto Scaling group. Predictive scaling proactively increases the capacity of an Auto Scaling group to meet predicted demand. For workloads that experience recurring, steep demand changes, predictive scaling can help improve your application’s responsiveness without overprovisioning capacity, which can lower your EC2 costs. Custom metrics are useful when the predefined metrics (CPU Utilization, Network I/O, and ALB Request Count) are not sufficient to capture the load on your application. Previously, you could use custom metrics only with step scaling and target tracking; you can now use them with predictive scaling as well.

For example, predictive scaling can now be configured to scale based on an Amazon CloudWatch metric from another AWS service that represents your application’s load, such as the number of messages in an Amazon Simple Queue Service (SQS) queue, or based on a custom CloudWatch metric specific to your application, such as the number of user sessions served. Predictive scaling now also supports CloudWatch metric math expressions, which let you create custom metrics from existing ones. For example, if the Auto Scaling group processes tasks from multiple SQS queues, you can create a custom metric that represents the total messages across queues by using a simple SUM expression, saving you the effort and cost of creating another CloudWatch metric. You can also use metric math expressions to aggregate metrics across Auto Scaling groups, for example in blue/green deployment scenarios.
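
As a rough illustration of the multi-queue case, the sketch below builds a load metric definition that sums the visible messages across several SQS queues with a metric math SUM expression. The helper name `build_queue_load_metric`, the queue names, and the choice of the `ApproximateNumberOfMessagesVisible` metric are assumptions made for this example, not details from the announcement. The code only constructs the data structure and does not call AWS.

```python
import json


def build_queue_load_metric(queue_names):
    """Build a customized load metric that totals messages across SQS queues.

    Hypothetical helper for illustration. The metric choice
    (ApproximateNumberOfMessagesVisible, statistic Sum) is an assumption.
    """
    if not queue_names:
        raise ValueError("at least one queue name is required")

    queries = []
    ids = []
    for index, name in enumerate(queue_names):
        query_id = f"q{index}"  # query IDs must start with a lowercase letter
        ids.append(query_id)
        queries.append({
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/SQS",
                    "MetricName": "ApproximateNumberOfMessagesVisible",
                    "Dimensions": [{"Name": "QueueName", "Value": name}],
                },
                "Stat": "Sum",
            },
            # Per-queue series are inputs only; they are not returned.
            "ReturnData": False,
        })

    # One metric math expression combines the per-queue series into a total.
    queries.append({
        "Id": "total_messages",
        "Expression": f"SUM([{', '.join(ids)}])",
        "Label": "Total messages across queues",
        "ReturnData": True,
    })
    return {"MetricDataQueries": queries}


if __name__ == "__main__":
    # Example queue names are placeholders.
    spec = build_queue_load_metric(["orders-queue", "returns-queue"])
    print(json.dumps(spec, indent=2))
```

A structure like this is meant to serve as the customized load metric specification inside a predictive scaling policy's metric specifications. A complete policy also needs a scaling metric and a target value, which are left out of this sketch.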

Predictive scaling is available through the AWS Command Line Interface (CLI), the EC2 Auto Scaling Management Console, and the AWS SDKs in all public AWS Regions. To learn more, visit the predictive scaling page in the EC2 Auto Scaling documentation.