[SEO Subhead]
This Guidance shows how to use Elastic Load Balancing (ELB) to efficiently distribute traffic across heterogeneous Auto Scaling groups. When scaling is based on metrics such as CPU or memory utilization, hot spots can form: smaller instances reach full capacity while larger instances remain underutilized. With its existing routing algorithms, ELB cannot route requests proportionally to targets of different weights or sizes. This Guidance provides a balanced distribution method that accounts for the capacity of each target group. As a result, you can minimize hot spots and timeouts, raise scale-out threshold limits, and reduce compute costs and CO2 emissions.
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram description]
Step 1
Configure an AWS Lambda function with the Amazon Resource Name (ARN) of an Elastic Load Balancing (ELB) load balancer and with its listener ARNs. Each load balancer can have multiple listeners.
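A minimal sketch of how the function might receive this configuration, assuming the ARNs are supplied through Lambda environment variables (the variable names below are illustrative, not part of this Guidance):

```python
import os

# Illustrative environment variables; the names are assumptions, not prescribed by this Guidance.
LOAD_BALANCER_ARN = os.environ["LOAD_BALANCER_ARN"]
# A load balancer can have multiple listeners; pass them as a comma-separated list.
LISTENER_ARNS = os.environ["LISTENER_ARNS"].split(",")
```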
Step 2
The Lambda function updates the target group weights for each listener dynamically and periodically (default: 15 minutes).
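The following sketch shows how such an update could be made with the AWS SDK for Python (Boto3). The function name and weight values are hypothetical, and it assumes the listener's default action is a simple weighted forward action rather than rule-based routing:

```python
import boto3

elbv2 = boto3.client("elbv2")

def set_listener_weights(listener_arn, weights):
    """Set forwarding weights (target_group_arn -> integer weight, 0-999) on one listener."""
    target_groups = [
        {"TargetGroupArn": tg_arn, "Weight": weight}
        for tg_arn, weight in weights.items()
    ]
    # Replace the listener's default action with a weighted forward action.
    elbv2.modify_listener(
        ListenerArn=listener_arn,
        DefaultActions=[{
            "Type": "forward",
            "ForwardConfig": {"TargetGroups": target_groups},
        }],
    )
```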
Step 3
ELB dynamically routes traffic based on the weighted percentage of the target groups.
Step 4
Define multiple homogeneous Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling groups, placing similarly capable instances together in a single Auto Scaling group.
Step 5
Define up to five target groups and map each Auto Scaling group to its corresponding target group. For example, you could group instances by their number of virtual CPUs (vCPUs).
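As an illustration, each Auto Scaling group can be attached to its own target group with Boto3; the group names and target group ARNs below are placeholders:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Placeholder mapping of Auto Scaling group names to their target group ARNs.
asg_to_target_group = {
    "asg-2-vcpu": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/tg-2-vcpu/1111111111111111",
    "asg-4-vcpu": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/tg-4-vcpu/2222222222222222",
}

for asg_name, tg_arn in asg_to_target_group.items():
    # Register the group's instances with the corresponding target group.
    autoscaling.attach_load_balancer_target_groups(
        AutoScalingGroupName=asg_name,
        TargetGroupARNs=[tg_arn],
    )
```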
Step 6
The diagram shows example percentages for the forwarding weights that the ELB listeners assign to each target group.
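One illustrative way to derive such percentages (not the only possible policy) is to make each target group's weight proportional to its total vCPU capacity:

```python
def capacity_weights(group_vcpus):
    """Compute integer listener weights proportional to each group's total vCPU capacity."""
    total = sum(group_vcpus.values())
    # ELB listener weights are integers; scale to a 0-100 range for readability.
    return {name: round(100 * vcpus / total) for name, vcpus in group_vcpus.items()}

# Example: 10 x 2-vCPU instances vs. 5 x 8-vCPU instances -> weights of 33 and 67.
print(capacity_weights({"tg-small": 10 * 2, "tg-large": 5 * 8}))
```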
Step 7
Use attribute-based instance type selection to select similarly capable instances.
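A hedged sketch of creating one such Auto Scaling group with attribute-based instance type selection; the launch template name, subnet IDs, group sizes, and vCPU and memory requirements are illustrative assumptions:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Illustrative values: one group of "4 vCPU / at least 8 GiB" instances, favoring Spot capacity.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="asg-4-vcpu",
    MinSize=0,
    MaxSize=10,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # placeholder subnets
    MixedInstancesPolicy={
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "my-launch-template",  # placeholder
                "Version": "$Latest",
            },
            "Overrides": [{
                # Attribute-based selection: any instance type with exactly 4 vCPUs
                # and at least 8 GiB of memory qualifies for this group.
                "InstanceRequirements": {
                    "VCpuCount": {"Min": 4, "Max": 4},
                    "MemoryMiB": {"Min": 8192},
                },
            }],
        },
        "InstancesDistribution": {
            "OnDemandPercentageAboveBaseCapacity": 0,  # use Spot Instances above the base
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    },
)
```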
Step 8
Configure a Lambda function to update the ELB target group weights at a specified interval (for instance, every 5 minutes) or based on Amazon CloudWatch metrics or Amazon EventBridge events.
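For example, an EventBridge rule could invoke the function on a fixed schedule; the rule name and function ARN below are placeholders:

```python
import boto3

events = boto3.client("events")

# Placeholder ARN of the weight-updating Lambda function.
function_arn = "arn:aws:lambda:us-east-1:123456789012:function:update-elb-weights"

# Invoke the function every 5 minutes on an EventBridge schedule.
events.put_rule(Name="update-elb-weights-schedule", ScheduleExpression="rate(5 minutes)")
events.put_targets(
    Rule="update-elb-weights-schedule",
    Targets=[{"Id": "update-elb-weights", "Arn": function_arn}],
)
# Note: the function's resource policy must also allow events.amazonaws.com to invoke it
# (lambda add_permission), which is omitted here for brevity.
```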
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
EventBridge invokes the Lambda function at specified intervals so that you can automate the use of heterogeneous instances across Auto Scaling groups. CloudWatch monitors and logs events, such as interruptions to Amazon EC2 Spot Instances, which you can track in a Spot Instance interruption dashboard.
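As an illustration, an EventBridge rule can match the Spot Instance interruption warning so you can log or alert on it; the rule name is a placeholder, and the rule's target (for example, a Lambda function or log group) is omitted here:

```python
import json
import boto3

events = boto3.client("events")

# Match the two-minute interruption warning that Amazon EC2 emits for Spot Instances.
events.put_rule(
    Name="spot-interruption-warnings",  # placeholder rule name
    EventPattern=json.dumps({
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Spot Instance Interruption Warning"],
    }),
)
```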
-
Security
AWS Identity and Access Management (IAM) grants Lambda only the permissions that are necessary for it to modify listener weights on the specified ELB ARN and listener ARNs. By using the principle of least privilege, you can better control access to AWS resources.
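An illustrative least-privilege policy for such a function; the listener ARN is a placeholder, and the describe actions are left unscoped because ELB describe actions generally do not support resource-level restrictions:

```python
import json

# Placeholder listener ARNs, scoped to the single load balancer this function manages.
listener_arns = [
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/abc123/def456",
]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Allow weight changes only on the specified listeners.
            "Effect": "Allow",
            "Action": "elasticloadbalancing:ModifyListener",
            "Resource": listener_arns,
        },
        {
            # Read-only calls used to discover current weights and target groups.
            "Effect": "Allow",
            "Action": [
                "elasticloadbalancing:DescribeListeners",
                "elasticloadbalancing:DescribeTargetGroups",
            ],
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```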
-
Reliability
This Guidance is an enhanced optimization solution built on a reliable architecture. It uses ELB for traffic distribution so that you do not have a single point of failure. ELB uses synchronous loose coupling to avoid directing traffic to overloaded Amazon EC2 instances, reducing the chance of application failure and minimizing downtime errors. Additionally, this Guidance uses multiple weighted target groups backed by Auto Scaling groups. This adds another layer of reliability: even if one Auto Scaling group cannot obtain enough Amazon EC2 instances, the other Auto Scaling groups can absorb inbound traffic, helping you avoid application failures.
-
Performance Efficiency
This Guidance helps you avoid hot spots in heterogeneous clusters by balancing the use of larger and smaller Amazon EC2 instances. It is an AWS best practice to use a mix of instance sizes and types in a single Auto Scaling group and to use Spot Instances as a diversification tactic; however, this practice can result in unbalanced utilization. This Guidance therefore uses multiple Auto Scaling groups of similarly capable instances and adjusts the forwarding weights on the ELB listeners to maximize resource utilization.
-
Cost Optimization
Previously, to avoid timeouts and hot spots, you might have needed to artificially keep utilization below 40 percent. By using this Guidance, you can increase resource utilization by distributing traffic with dynamic weighted routing from ELB. Depending on your application and requirements, you might even be able to achieve utilization of 60 to 70 percent, leading to significant Amazon EC2 cost savings. Additionally, Spot Instances provide a discount of up to 90 percent compared to On-Demand prices. Finally, Auto Scaling groups scale out and in depending on traffic and usage patterns to optimize resource utilization, and the use of multiple Auto Scaling groups in a dynamic weighted routing solution provides another layer of resource optimization, ultimately resulting in cost savings.
-
Sustainability
This Guidance uses Auto Scaling groups that automatically scale in or terminate idle resources. Additionally, the use of multiple homogeneous Auto Scaling groups with similarly capable Amazon EC2 instances helps eliminate hot spots, increase utilization, and reduce waste, power consumption, and cooling needs. Using Spot Instances in an Auto Scaling group also contributes directly to sustainability, because these instances run on spare capacity that would otherwise sit idle while still consuming power.
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
[Title]
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.