Amazon ECS Cluster Auto Scaling is Now Generally Available

Today, we have launched Amazon ECS Cluster Auto Scaling. This new capability improves your cluster scaling experience by increasing the speed and reliability of cluster scale-out, giving you control over the amount of spare capacity maintained in your cluster, and automatically managing instance termination on cluster scale-in.

To enable ECS Cluster Auto Scaling, you will need to create a new ECS resource type called a Capacity Provider. A Capacity Provider can be associated with an EC2 Auto Scaling Group (ASG). When you associate an ECS Capacity Provider with an ASG and add the Capacity Provider to an ECS cluster, the cluster can now scale your ASG automatically by using two new features of ECS:

Managed scaling, with an automatically-created scaling policy on your ASG, and a new scaling metric (Capacity Provider Reservation) that the scaling policy uses; and
Managed instance termination protection, which enables container-aware termination of instances in the ASG when scale-in happens.

These new features will give customers greater control of when and how Amazon Elastic Container Service (Amazon ECS) clusters scale-in and scale-out.

Capacity Provider Reservation
The new metric, called capacity provider reservation, measures the total percentage of cluster resources needed by all Amazon ECS workloads in the cluster, including existing workloads, new workloads, and changes in workload size. This metric enables the scaling policy to scale out quicker and more reliably than it could when using CPU or memory reservation metrics. Customers can also use this metric to reserve spare capacity in their clusters. Reserving spare capacity allows customers to run more containers immediately if needed, without waiting for new instances to start.

Managed Instance Termination Protection
With instance termination protection, Amazon ECS controls which instances the scaling policy is allowed to terminate on scale-in, to minimize disruptions of running containers. These improvements help customers achieve lower operational costs and higher availability of their container workloads running on Amazon ECS.

How This Help Customers
Customers running scalable container workloads on Amazon ECS often use metric-based scaling policies to automatically scale their Amazon ECS clusters. These scaling policies use generic metrics such as average cluster CPU and memory reservation percentages to determine when the policy should add or remove cluster instances.

Clusters running a single workload, or workloads that scale-out slowly, often work well with such policies. However, customers running multiple workloads in the same cluster, or workloads that scale-out rapidly, are more likely to experience problems with cluster scaling. Ideally, increases in workload size that cannot be accommodated by the current cluster should trigger the policy to scale the cluster out to a larger size.

Because the existing metrics are not container-specific and account only for resources already in use, this may happen slowly or be unreliable. Furthermore, because the scaling policy does not know where containers are running in the cluster, it can unnecessarily terminate containers when scaling in. These issues can reduce the availability of container workloads. Mitigations such as over-provisioning, custom tooling, or manual intervention often impose high operational costs.

Enough Talk, Let’s Scale
To understand these new features more clearly, I think it’s helpful to work through an example.

Amazon ECS Cluster Auto Scaling can be set up and configured using the AWS Management Console, AWS CLI, or Amazon ECS API. I’m going to open up my terminal and create a cluster.

Firstly, I create two files. The first file is called demo-launchconfig.json and defines the instance configuration for the Amazon Elastic Compute Cloud (Amazon EC2) instances that will make up my auto scaling group.

{
    "LaunchConfigurationName": "demo-launchconfig",
    "ImageId": "ami-01f07b3fa86406c96",
    "SecurityGroups": [
        "sg-0fa5be8c3749f3aa0"
    ],
    "InstanceType": "t2.micro",
    "BlockDeviceMappings": [
        {
            "DeviceName": "/dev/xvdcz",
            "Ebs": {
                "VolumeSize": 22,
                "VolumeType": "gp2",
                "DeleteOnTermination": true,
                "Encrypted": true
                }
        }
    ],
    "InstanceMonitoring": {
        "Enabled": false
    },
    "IamInstanceProfile": "arn:aws:iam::365489315573:role/ecsInstanceRole",
    "AssociatePublicIpAddress": true
}

The second file is demo-userdata.txt, and it contains the user data that will be added to each EC2 instance. The ECS_CLUSTER name included in the file must be the same as the name of the cluster we are going to create. In my case, the name is demo-news-blog-scale.

#!/bin/bash
echo ECS_CLUSTER=demo-news-blog-scale >> /etc/ecs/ecs.config

Using the create-launch-configuration command, I pass the two files I created as inputs, this will create the launch configuration that I will use in my auto scaling group.

aws autoscaling create-launch-configuration --cli-input-json file://demo-launchconfig.json --user-data file://demo-userdata.txt

Next, I create a file called demo-asgconfig.json and define my requirements.

{
    "LaunchConfigurationName": "demo-launchconfig", 
    "MinSize": 0,
    "MaxSize": 100,
    "DesiredCapacity": 0,
    "DefaultCooldown": 300,
    "AvailabilityZones": [ 
        "ap-southeast-1c" ], 
    "HealthCheckType": "EC2", 
    "HealthCheckGracePeriod": 300, 
    "VPCZoneIdentifier": "subnet-abcd1234", 
    "TerminationPolicies": [ 
        "DEFAULT" 
    ],
    "NewInstancesProtectedFromScaleIn": true, 
    "ServiceLinkedRoleARN": "arn:aws:iam::111122223333:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling"
}

I then use the create-auto-scaling-group command to create an auto scaling group called demo-asg using the above file as an input.

aws autoscaling create-auto-scaling-group --auto-scaling-group-name demo-asg --cli-input-json file://demo-asgconfig.json

I am now ready to create a capacity provider. I create a file called demo-capacityprovider.json, importantly, I set the managedTerminationProtection property to ENABLED.

{
    "name": "demo-capacityprovider", "autoScalingGroupProvider": {
    "autoScalingGroupArn": "arn:aws:autoscaling:ap-southeast-1:365489315573:autoScalingGroup:e9c2f0c4-9a4c-428e-b81e-b22411a52954:autoScalingGroupName/demo-ASG",
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": 100,
                "minimumScalingStepSize": 1,
                "maximumScalingStepSize": 100
            },
            "managedTerminationProtection": "ENABLED"
    }
}

I then use the new create-capacity-provider command to create a provider using the file as an input.

aws ecs create-capacity-provider --cli-input-json file://demo-capacityprovider.json

Now all the components have been created, I can finally create a cluster. I add the capacity provider and set the default capacity provider for the cluster as demo-capacityprovider.

aws ecs create-cluster --cluster-name demo-news-blog-scale --capacity-providers demo-capacityprovider --default-capacity-provider-strategy capacityProvider=demo-capacityprovider,weight=1

I now need to wait until the cluster has moved into the active state. I use the following command to get details about the cluster.

aws ecs describe-clusters --clusters demo-news-blog-scale --include ATTACHMENTS

Now that my cluster is set up, I can register some tasks. Firstly I will need to create a task definition. Below is a file I. have created called demo-sleep-taskdef.json. All this definition does is define a container that sleeps for infinity.

{
    "family": "demo-sleep-taskdef",
    "containerDefinitions": [
        {
            "name": "sleep",
            "image": "amazonlinux:2",
            "memory": 20,
            "essential": true,
            "command": [
                "sh",
                "-c",
                "sleep infinity"] 
        }],
    "requiresCompatibilities": [
        "EC2"] 
}

I then register the task definition using the register-task-definition command.

aws ecs register-task-definition --cli-input-json file://demo-sleep-taskdef.json

Finally, I can create my tasks. In this case, I have created 5 tasks based on the demo-sleep-taskdef:1 definition that I just registered.

aws ecs run-task --cluster demo-news-blog-scale --count 5 --task-definition demo-sleep-taskdef:1

Now because instances are not yet available to run the tasks, the tasks go into a provisioning state, which means they are waiting for capacity to become available. The capacity provider I configured will now scale-out the auto scaling group so that instances start up and join the cluster – at which point the tasks get placed on the instances. This gives a true “scale from zero” capability, which did not previously exist.

Things To Know
Amazon ECS Cluster Auto Scaling is now available in all regions where Amazon Elastic Container Service (Amazon ECS) and AWS Auto Scaling are available – check the region table for the latest list.

Happy Scaling!

— Martin

AWS News Blog

Amazon ECS Cluster Auto Scaling is Now Generally Available

Resources

Follow

Learn

Resources

Developers

Help