Containers

Scale your Amazon ECS service using different AWS native services!

Containers accelerate application development and enhance deployment consistency across environments, thus enabling organizations to improve productivity and agility. AWS container services such as Amazon Elastic Container Service (Amazon ECS) make it easier to manage your application so you can focus on innovation and your business needs.

Customer experience is the most important yardstick by which organizations measure application performance. Maintaining a reliable and consistent end-user experience in the face of varying request patterns is a challenge that every organization faces. As a best practice, you can use automatic scaling to increase (also known as scale-out) or decrease (also known as scale-in) the desired count of tasks in your Amazon ECS service automatically. This eliminates the need to respond manually in real time to traffic spikes. Automatic scaling also optimizes cost efficiency when consuming AWS services, so that you only pay for the resources you actually need.

AWS provides several features that can be leveraged for automatic scaling of your Amazon ECS service. Using the right option improves the overall reliability of your application, reduces operational costs and complexity, and enhances the end-user experience. In this post, you learn about AWS Application Auto Scaling, with which you can configure automatic scaling of your Amazon ECS service. You also learn about Amazon ECS Service Connect and AWS Distro for OpenTelemetry (ADOT), which can supply the metrics used in application auto scaling.

Application Auto Scaling

Service auto scaling enables you to increase or decrease the desired count of tasks in your Amazon ECS service automatically. Amazon ECS leverages the Application Auto Scaling service to provide this functionality. By default, Amazon ECS publishes CPU and memory usage to Amazon CloudWatch. You can use these or other custom CloudWatch metrics to scale your service. Amazon ECS Service Auto Scaling supports the following types of automatic scaling:

  • Target tracking scaling policies – You can increase or decrease the number of tasks that your service runs based on a target value for a specific metric. For example, you can define a scaling policy that adds tasks to the service when CPU utilization exceeds a target value of 75%, and removes tasks from the service when CPU utilization falls below that target. In some cases, you want to scale based on application-specific metrics, such as the number of requests served, or based on metrics published by other AWS services. In this case, you can use arithmetic operations and mathematical functions to customize the metrics used with target tracking policies. Refer to this post to learn more about auto scaling based on custom metrics with Application Auto Scaling. The following screenshot illustrates a sample target tracking scaling policy.

Sample target tracking scaling policy
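A target tracking policy can also be created programmatically. As a rough sketch, the following builds the configuration that Application Auto Scaling's `put_scaling_policy` API expects for target tracking; the cluster and service names in the comments (`demo`, `web`) and the cooldown values are illustrative assumptions, not prescriptions.

```python
def target_tracking_config(target_cpu_percent: float = 75.0) -> dict:
    """Configuration to keep the service's average CPU near the target value."""
    return {
        "TargetValue": target_cpu_percent,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,    # seconds before another scale-out can start
        "ScaleInCooldown": 120,    # scale in more conservatively (assumed values)
    }

# With boto3 (requires AWS credentials; shown for illustration only):
#   client = boto3.client("application-autoscaling")
#   client.put_scaling_policy(
#       PolicyName="cpu75-target-tracking",
#       ServiceNamespace="ecs",
#       ResourceId="service/demo/web",              # cluster "demo", service "web"
#       ScalableDimension="ecs:service:DesiredCount",
#       PolicyType="TargetTrackingScaling",
#       TargetTrackingScalingPolicyConfiguration=target_tracking_config(),
#   )
```

The `ResourceId` format `service/<cluster>/<service>` and the scalable dimension `ecs:service:DesiredCount` are what Application Auto Scaling uses to identify an ECS service.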

  • Step scaling policies – You can increase or decrease the number of tasks that your service runs based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach. With step scaling, you can choose the metric thresholds and the amount of resources to add or remove. For example, you can define a scaling policy that scales out your service by two tasks if CPU utilization is between 50% and 75%, and scales in your service by two tasks if CPU utilization drops below 25%. The following screenshot illustrates a sample step scaling policy.

Sample step scaling policy
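The step adjustments described above map directly onto the `StepScalingPolicyConfiguration` of the `put_scaling_policy` API. A minimal sketch, assuming a CloudWatch alarm with a 50% CPU threshold (the bounds below are offsets relative to that threshold, and all values are illustrative):

```python
def step_scaling_config() -> dict:
    """Step scaling: add tasks in increasing amounts as the breach grows."""
    return {
        "AdjustmentType": "ChangeInCapacity",  # add/remove a fixed task count
        "Cooldown": 60,                        # assumed cooldown in seconds
        "MetricAggregationType": "Average",
        "StepAdjustments": [
            # With an alarm threshold of 50% CPU:
            # CPU in [50%, 75%) -> add 2 tasks
            {"MetricIntervalLowerBound": 0.0,
             "MetricIntervalUpperBound": 25.0,
             "ScalingAdjustment": 2},
            # CPU >= 75% -> add 4 tasks
            {"MetricIntervalLowerBound": 25.0,
             "ScalingAdjustment": 4},
        ],
    }

# Applied via boto3 with PolicyType="StepScaling" and
# StepScalingPolicyConfiguration=step_scaling_config(); a second policy
# attached to a low-CPU alarm would handle the scale-in side.
```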

  • Scheduled scaling – You can increase or decrease the number of tasks that your service runs based on the date and time. For example, you can set a schedule to increase the number of tasks running in the service during peak traffic hours and decrease it during off-peak hours. You can configure scheduled scaling using the AWS Command Line Interface (AWS CLI). You can use scheduled scaling and scaling policies together to get the benefits of both proactive and reactive approaches to scaling. After a scheduled scaling action runs, the scaling policy can continue to make decisions about whether to further scale capacity. This helps you make sure that you have sufficient capacity to handle the load on your application.
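A scheduled action takes the same parameters whether you use the AWS CLI or an SDK. The following sketch builds the parameter set for Application Auto Scaling's `put_scheduled_action` API; the action name, cluster/service (`demo`/`web`), cron schedule, and capacities are all illustrative assumptions.

```python
def scheduled_action_params() -> dict:
    """Scheduled scaling: raise min/max capacity ahead of peak hours."""
    return {
        "ServiceNamespace": "ecs",
        "ScheduledActionName": "scale-out-peak-hours",   # assumed name
        "ResourceId": "service/demo/web",                # assumed cluster/service
        "ScalableDimension": "ecs:service:DesiredCount",
        "Schedule": "cron(0 8 * * ? *)",                 # every day at 08:00 UTC
        "ScalableTargetAction": {"MinCapacity": 5, "MaxCapacity": 10},
    }

# With boto3 (illustrative; requires AWS credentials):
#   boto3.client("application-autoscaling").put_scheduled_action(
#       **scheduled_action_params())
# A second action with a later cron expression and lower capacities
# would scale the service back in for off-peak hours.
```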

Service auto scaling is the easiest way to scale your Amazon ECS service, and you can configure the scaling using the AWS Management Console, AWS CLI, or AWS SDKs.

The scaling policies support a cooldown period, which is the number of seconds to wait after a scaling activity takes effect. During a scale-out event, tasks are increased continuously, whereas during a scale-in event, tasks are decreased conservatively. To protect the availability of your application, scale-in activities are blocked until the cooldown period expires.

The following diagram illustrates a sample target tracking scaling policy.

Target tracking scaling policy

 

Pricing for Application Auto Scaling

There is no additional charge to use this feature. You pay for AWS resources that you create to store and run your application. You only pay for what you use, as you use it, and there are no minimum fees.

Amazon ECS Service Connect

Amazon ECS Service Connect provides seamless service-to-service connectivity and rich traffic telemetry out of the box, with no changes to your application code. It does this by building both service discovery and a service mesh into Amazon ECS. When you use a Service Connect configuration, Amazon ECS automatically injects a sidecar proxy container into each new task as it starts. Amazon ECS manages this configuration for you.

If you have low latency requirements and need to monitor metrics such as the number of new clients, inbound requests, error rate, or the number of failed TLS connections, then we recommend configuring ECS Service Connect. ECS Service Connect uses a centralized service mesh architecture for traffic routing and management. Traffic observability metrics created by Amazon ECS Service Connect can be viewed directly from the Amazon ECS console and these metrics are sent to CloudWatch as well. Once these metrics are made available in CloudWatch, you can use them to configure auto scaling of your Amazon ECS service tasks.

The following are some of the most commonly used metrics made available through Amazon ECS Service Connect; refer to the documentation for the complete list.

  • ActiveConnectionCount – The total number of concurrent connections active from clients
  • NewConnectionCount – The total number of new connections established from clients
  • TargetResponseTime – The latency of application request processing
  • ClientTLSNegotiationErrorCount – The total number of times a TLS connection failed
  • HTTPCode_Target_5XX_Count – The number of HTTP response codes from 500 to 599
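Once a Service Connect metric is in CloudWatch, you can scale on it through a customized metric specification instead of the predefined CPU/memory metrics. A minimal sketch of such a target tracking configuration follows; the namespace, dimension names, cluster/service values, and target value are assumptions to verify against the actual metrics published in your account.

```python
def connection_tracking_config() -> dict:
    """Target tracking on a Service Connect traffic metric (sketch).

    Namespace and dimensions below are assumptions; inspect the metric
    in the CloudWatch console to confirm the exact names in your account.
    """
    return {
        "TargetValue": 200.0,  # assumed target: active connections per task
        "CustomizedMetricSpecification": {
            "MetricName": "ActiveConnectionCount",
            "Namespace": "AWS/ECS",                       # assumption
            "Dimensions": [
                {"Name": "ClusterName", "Value": "demo"},  # assumed cluster
                {"Name": "ServiceName", "Value": "web"},   # assumed service
            ],
            "Statistic": "Average",
        },
    }

# This dict is passed as TargetTrackingScalingPolicyConfiguration to
# put_scaling_policy, exactly like a CPU-based target tracking policy.
```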

With ECS Service Connect, you can refer to and connect to your services by logical names using a namespace provided by AWS Cloud Map, and automatically distribute traffic between Amazon ECS tasks without deploying and configuring load balancers. Additionally, the Amazon ECS console provides easy-to-use dashboards with real-time network traffic metrics for operational convenience and simplified debugging. The following diagram illustrates the life of a request with ECS Service Connect and how metrics are published to CloudWatch.

ECS Service Connect – life of a request

The following screenshot illustrates the traffic health view, which provides you with a snapshot of the traffic measured by Service Connect. This view displays the number of active connections through Service Connect and the HTTP status of completed connections through Service Connect for all tasks within the service. ECS Service Connect is fully supported in AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), AWS Copilot, and AWS Proton for infrastructure provisioning, code deployments, and monitoring of your services.

ECS Service Connect traffic health

Pricing for Amazon ECS Service Connect

There is no additional charge for service discovery, connectivity features, or traffic telemetry generated by ECS Service Connect. You only pay for the resources consumed by the ECS Service Connect agent. Our recommendation is to add 256 CPU units and 64 MB of memory to your task for the Service Connect container. It is not something that you configure separately, but it should be accounted for when you size your task. In the idle state, we have observed that the Service Connect agent consumes about 100 CPU units and 40 MB of memory.

ADOT

The OpenTelemetry (OTEL) Collector includes components for exporting data to various data sources, such as CloudWatch. The AWS Distro for OpenTelemetry Collector (ADOT Collector) is an AWS-supported version of the upstream OTEL Collector, which is distributed and supported by AWS. This component enables you to send telemetry data to CloudWatch and other supported monitoring solutions.

Enabling observability in your Amazon ECS using ADOT consists of two steps:

  1. Instrumenting your applications once to send correlated logs, metrics, and traces to one or more monitoring solutions. You can use auto-instrumentation (by enabling the application with a programming language-based agent) without changing your code.
  2. Deploying the ADOT Collector using one of the following two methods:
    1. Sidecar pattern
    2. Amazon ECS Service pattern

The ADOT collector scrapes the ECS metadata endpoint and collects container metrics (such as CPU, memory, network, and disk). By integrating ADOT with CloudWatch, you can collect, analyze, and visualize these metrics directly from the CloudWatch console. The following diagram illustrates a high-level flow map of how the ADOT SDK is used to instrument your application, how metrics are scraped by the ADOT collector, and how the CloudWatch EMF exporter sends those metrics to CloudWatch.

ADOT - data flow diagram

The ADOT collector provides a default out-of-the-box configuration that enables the collection of metrics. The default configuration enables the AWS EMF exporter (awsemfexporter) in the metrics pipeline, which converts OTEL metrics to CloudWatch EMF batched logs before sending them to CloudWatch. From your CloudWatch console, you can then query the logs, visualize them through dashboards, and create metric alarms.

You can also customize the type of metrics that your ECS hosted application can send to CloudWatch. For example, you can collect the following metrics: ecs.task.memory.utilized, ecs.task.memory.reserved, ecs.task.cpu.utilized, ecs.task.cpu.reserved, ecs.task.network.rate.rx, ecs.task.network.rate.tx, ecs.task.storage.read.bytes, and ecs.task.storage.write.bytes.

Once these metrics are made available in CloudWatch, you can use them to configure auto scaling of your Amazon ECS service tasks.
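As with Service Connect metrics, scaling on an ADOT-published task metric uses a customized metric specification. The sketch below assumes the metric lands in CloudWatch under the namespace configured in your awsemfexporter; the namespace, dimension name, cluster value, and target value shown are placeholders to replace with what your collector actually publishes.

```python
def adot_metric_tracking_config() -> dict:
    """Target tracking on an ADOT-collected task metric (sketch).

    "ECS/ContainerInsights" stands in for whatever namespace your
    awsemfexporter is configured with; verify it in CloudWatch.
    """
    return {
        "TargetValue": 70.0,  # assumed target for task CPU utilization
        "CustomizedMetricSpecification": {
            "MetricName": "ecs.task.cpu.utilized",
            "Namespace": "ECS/ContainerInsights",          # assumption
            "Dimensions": [
                {"Name": "ClusterName", "Value": "demo"},  # assumed cluster
            ],
            "Statistic": "Average",
        },
    }

# Attach via put_scaling_policy with PolicyType="TargetTrackingScaling",
# exactly as with the predefined CPU/memory metrics.
```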

Pricing for ADOT

There is no cost for using AWS Distro for OpenTelemetry. You pay for traces, logs, and metrics sent to CloudWatch. With CloudWatch, there is no up-front commitment or minimum fee. You simply pay for what you use. Check the CloudWatch pricing page for further details.

Conclusion

In this post, you learned about native AWS scaling options that can be leveraged to scale your Amazon ECS service. Companies of various sizes can adopt this approach to scale Amazon ECS service as part of their strategy to maintain a reliable and consistent end-user experience. The right mechanism depends on considerations such as deployment agility, proactive/reactive/predictive scaling, and pricing.

For more information, refer to the AWS Well-Architected Framework, Best Practices for running your application with Amazon ECS, and Security in Amazon Elastic Container Service. We are here to help and if you need further assistance, reach out to AWS Support and your AWS account team.

Arun Chandapillai

Arun Chandapillai is a Senior Cloud Architect who is a diversity and inclusion champion. He is passionate about helping his Customers accelerate IT modernization through business-first Cloud adoption strategies and successfully build, deploy, and manage applications and infrastructure in the Cloud. Arun is an automotive enthusiast, an avid speaker, and a philanthropist who believes in ‘you get (back) what you give’.

Shak Kathir

Shak Kathirvel is Senior Cloud Application Architect with AWS ProServe. He enjoys working with customers and helping them with Application Modernization and Optimization efforts, guide their Enterprise Cloud management and Governance strategies and migrate their workloads to the cloud. He is passionate about Enterprise architecture, Serverless technologies and AWS cost and usage optimization. He loves his job for its challenges and the opportunity to work with inspiring customers and colleagues.