AWS Compute Blog

Building Sustainable, Efficient, and Cost-Optimized Applications on AWS

This blog post is written by Isha Dua, Sr. Solutions Architect, AWS; Ananth Kommuri, Solutions Architect, AWS; Dr. Sam Mokhtari, Sr. Sustainability Lead SA WA for AWS; and Adam Boeglin, Principal Specialist, EC2 Sustainability.

Today, more than ever, sustainability and cost-savings are top of mind for nearly every organization. Research has shown that AWS’ infrastructure is 3.6 times more energy efficient than the median of U.S. enterprise data centers and up to five times more energy efficient than the average in Europe. That said, simply migrating to AWS isn’t enough to meet the Environmental, Social, Governance (ESG) and Cloud Financial Management (CFM) goals that today’s customers are setting. In order to make conscious use of our planet’s resources, applications running on the cloud must be built with efficiency in mind.

That’s because cloud sustainability is a shared responsibility. At AWS, we’re responsible for optimizing the sustainability of the cloud – building efficient infrastructure, offering enough options to meet every customer’s needs, and providing the tools to manage it all effectively. As an AWS customer, you’re responsible for sustainability in the cloud – building workloads in a way that minimizes total resource requirements and makes the most of what must be consumed.

Most AWS service charges are correlated with hardware usage, so reducing resource consumption also has the added benefit of reducing costs. In this blog post, we’ll highlight best practices for running efficient compute environments on AWS that maximize utilization and decrease waste, with both sustainability and cost-savings in mind.

Measure What Matters

Application optimization is a continuous process, but it has to start somewhere. The AWS Well-Architected Framework Sustainability Pillar includes an improvement process that helps customers map their journey and understand the impact of possible changes. As the saying goes, “you can’t improve what you don’t measure,” which is why it’s important to define and regularly track the metrics that matter to your business. Scope 2 carbon emissions, such as those provided by the AWS Customer Carbon Footprint Tool, are one metric that many organizations use to benchmark their sustainability initiatives, but they shouldn’t be the only one.

Amazon is on a path to reach 100% renewable energy by 2025 – five years ahead of its original 2030 commitment. Even so, it is important to maximize the utilization and minimize the total energy consumption of the resources that you use. That’s why many organizations use proxy metrics such as vCPU hours, storage usage, and data transfer to evaluate their hardware consumption and measure improvements made to infrastructure over time.

In addition to these metrics, it’s helpful to baseline utilization against the value delivered to your end users and customers. Tracking utilization alongside business metrics (orders shipped, page views, total API calls, etc.) allows you to normalize resource consumption against the value delivered to your organization. It also provides a simple way to track progress towards your goals over time. For example, if the number of orders on your ecommerce site remained constant over the last month, but your AWS infrastructure usage decreased by 20%, you can attribute the efficiency gains to your optimization efforts rather than to changes in customer behavior.
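As a rough sketch, this kind of normalization is simple to compute; the vCPU hours and order counts below are illustrative numbers, not real data:

```python
# Sketch: normalize compute consumption against a business metric.
# All figures here are made up for illustration.

def vcpu_hours_per_order(vcpu_hours: float, orders: int) -> float:
    """Proxy-metric efficiency: vCPU hours consumed per order shipped."""
    return vcpu_hours / orders

# Same order volume month over month, but 20% less infrastructure used.
last_month = vcpu_hours_per_order(10_000, 50_000)  # 0.2 vCPU hours/order
this_month = vcpu_hours_per_order(8_000, 50_000)   # 0.16 vCPU hours/order

improvement = 1 - this_month / last_month
print(f"Efficiency gain: {improvement:.0%}")  # Efficiency gain: 20%
```

Because order volume is held constant, the 20% drop in vCPU hours per order maps directly to the optimization work, not to demand changes.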

Choose efficient, purpose-built processors whenever possible

Choosing the right processor for your application is an important consideration for sustainability. That’s because faster, more efficient processors allow you to get the same amount of work done while using less energy. AWS has the broadest choice of processors, including Intel Xeon Scalable processors, AMD EPYC processors, GPUs, FPGAs, and custom ASICs for accelerated computing.

AWS Graviton3, AWS’s latest and most power-efficient processor, delivers 3x better CPU performance per watt than any other processor in AWS, provides up to 40% better price performance over comparable current-generation x86-based instances for various workloads, and can help customers reduce their carbon footprint. Consider transitioning your workload to Graviton-based instances to improve its performance efficiency (see AWS Graviton Fast Start and AWS Graviton2 for ISVs). Note the considerations when transitioning workloads to AWS Graviton-based Amazon EC2 instances.
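A price-performance comparison like the one above boils down to cost per unit of work. The sketch below shows the arithmetic; the hourly prices and throughput figures are illustrative assumptions, not published benchmarks or current AWS prices:

```python
# Sketch: comparing price performance between an x86 instance and a
# Graviton-based equivalent. Prices and throughput are illustrative only.

def cost_per_unit(hourly_price: float, throughput: float) -> float:
    """Cost to process one unit of work (e.g., one million requests/hour)."""
    return hourly_price / throughput

# Hypothetical figures for two comparable instance sizes.
x86      = cost_per_unit(hourly_price=0.17, throughput=100)
graviton = cost_per_unit(hourly_price=0.136, throughput=112)

savings = 1 - graviton / x86
print(f"Price-performance improvement: {savings:.0%}")
```

Running your own benchmark with real prices and your workload’s actual throughput is the only reliable way to fill in these numbers.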

For machine learning (ML) workloads, using Amazon EC2 instances based on purpose-built chips such as AWS Trainium and AWS Inferentia allows you to build and run ML models with less energy and higher performance per watt than comparable GPU-powered instances.

Optimize for hardware utilization

The goal of efficient environments is to use only as many resources as required in order to meet your needs, so it’s important to continually validate that you’re using no more hardware than required. Thankfully, this is easier on the cloud because of the variety of instance choices, the ability to scale dynamically, and the wide array of tools to help track and optimize your infrastructure.

Two of the most important tools for measuring and tracking utilization are Amazon CloudWatch and the AWS Cost & Usage Report (CUR). With CloudWatch, you get a unified view of your resource metrics and usage, and can analyze the impact of user load on capacity utilization over time. The CUR can help you understand which resources contribute the most to your AWS usage, allowing you to fine-tune your efficiency and save on costs.
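As a minimal sketch of the kind of analysis the CUR enables, the snippet below ranks services by usage from a handful of stand-in line items. The column names follow the CUR data dictionary (`line_item_product_code`, `line_item_usage_amount`), but verify them against your own report’s schema before relying on them:

```python
# Sketch: ranking services by total usage from CUR-style line items.
# The rows below stand in for an exported report; a real CUR is queried
# via Amazon Athena or loaded from S3.
from collections import defaultdict

cur_rows = [
    {"line_item_product_code": "AmazonEC2", "line_item_usage_amount": 720.0},
    {"line_item_product_code": "AmazonEC2", "line_item_usage_amount": 480.0},
    {"line_item_product_code": "AmazonS3",  "line_item_usage_amount": 55.0},
]

usage = defaultdict(float)
for row in cur_rows:
    usage[row["line_item_product_code"]] += row["line_item_usage_amount"]

# Print services in descending order of usage.
for service, total in sorted(usage.items(), key=lambda kv: -kv[1]):
    print(f"{service}: {total}")
```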

One tool powered by CUR data is the AWS Cost Intelligence Dashboard, which provides a detailed, granular, and recommendation-driven view of your AWS usage. With its prebuilt visualizations, it can help you identify which services and underlying resources contribute the most to your AWS usage, and see the potential savings you can realize by optimizing. It even provides right-sizing recommendations, including the appropriate EC2 instance family, to help you optimize your resources.

The Cost Intelligence Dashboard is also integrated with AWS Compute Optimizer, which makes instance type and size recommendations based on workload characteristics. For example, it can identify whether the workload is CPU-intensive, whether it exhibits a daily pattern, or whether local storage is accessed frequently. Compute Optimizer then infers how the workload would have performed on various hardware platforms (for example, Amazon EC2 instance types) or with different configurations (for example, Amazon EBS volume IOPS settings or AWS Lambda function memory sizes) to offer recommendations. For stable workloads, check AWS Compute Optimizer at regular intervals to identify right-sizing opportunities for instances. By right-sizing with Compute Optimizer, you can increase resource utilization and reduce costs by up to 25%.

CloudWatch metrics are used to power Amazon EC2 Auto Scaling, which can automatically choose the right instance to fit your needs with attribute-based instance selection and scale your entire instance fleet up and down based on demand in order to maintain high utilization, using scheduled, dynamic, and predictive scaling policies driven by metrics such as average CPU utilization or average network in or out. You can then integrate AWS Instance Scheduler and scheduled scaling for Amazon EC2 Auto Scaling to shut down and terminate resources that are only needed during business hours or on weekdays, further reducing your resource consumption and environmental footprint.
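A target tracking policy of the kind described above can be sketched as follows. The structure mirrors what boto3’s `put_scaling_policy` call for EC2 Auto Scaling accepts; the Auto Scaling group name and target value are placeholders you would replace with your own:

```python
# Sketch: a target tracking scaling policy that scales the group to hold
# average CPU utilization near 60%. Group name is a placeholder.

policy = {
    "AutoScalingGroupName": "my-asg",          # placeholder, not a real group
    "PolicyName": "keep-cpu-at-60",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,                   # desired average CPU, percent
    },
}

# With AWS credentials configured, this could be applied with:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
print(policy["PolicyName"])
```

Target tracking keeps utilization high without overloading instances: the group grows when the metric exceeds the target and shrinks when demand falls away.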

Utilize all of the available pricing models

Compute tasks form the foundation of many workloads, so compute infrastructure typically sees the biggest benefit from optimization. Amazon EC2 provides resizable compute across a wide variety of instance types, is well-suited to virtually every use case, and is available via a number of highly flexible pricing options.

EC2 Spot Instances are a great way to decrease cost and increase efficiency on AWS. Spot Instances make unused Amazon EC2 capacity available to customers at discounted prices. At AWS, one of our goals is to maximize utilization of our physical resources. By choosing EC2 Spot Instances, you’re running on hardware that would otherwise sit idle in our data centers. Using Spot Instances increases the overall efficiency of the cloud, because more of our physical infrastructure is used for meaningful work. Spot Instances use market-based pricing that changes automatically based on supply and demand. This means that the hardware with the most spare capacity sees the highest discounts, sometimes up to 90% off On-Demand prices, to encourage customers to choose that configuration.
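The effective discount of a Spot price against On-Demand is straightforward to estimate; the prices below are illustrative, not current quotes:

```python
# Sketch: estimating the effective Spot discount versus On-Demand.
# Both prices here are made-up examples, not real AWS pricing.

def spot_discount(on_demand_price: float, spot_price: float) -> float:
    """Fraction saved by running at the Spot price instead of On-Demand."""
    return 1 - spot_price / on_demand_price

discount = spot_discount(on_demand_price=0.17, spot_price=0.051)
print(f"Spot discount: {discount:.0%}")  # Spot discount: 70%
```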

Because every workload has different requirements, we recommend a combination of purchase options tailored to your specific needs. For steady-state workloads that can commit to a one- to three-year term, Compute Savings Plans help you save on costs while retaining the flexibility to move from one instance type to a newer, more energy-efficient alternative, or even between compute solutions (e.g., from EC2 instances to AWS Lambda functions or AWS Fargate).
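One way to reason about such a commitment is the break-even utilization at which the committed rate beats paying On-Demand. The sketch below shows the arithmetic; the rates are illustrative assumptions, not published Savings Plans pricing:

```python
# Sketch: break-even utilization for a committed hourly rate.
# You pay the committed rate for every hour of the term, so the
# commitment pays off once utilization exceeds the rate ratio.
# Rates below are made up for illustration.

def break_even_utilization(committed_rate: float, on_demand_rate: float) -> float:
    """Fraction of hours a resource must run for the commitment to win."""
    return committed_rate / on_demand_rate

# A rate ~28% below On-Demand breaks even at ~72% utilization.
u = break_even_utilization(committed_rate=0.1224, on_demand_rate=0.17)
print(f"Break-even utilization: {u:.0%}")  # Break-even utilization: 72%
```

Below the break-even point, On-Demand (or Spot, where the workload tolerates interruption) is the cheaper choice; above it, the commitment wins.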

Savings Plans are ideal for predictable, steady-state work. On-Demand is best suited for new, stateful, or spiky workloads that can’t be instance-, location-, or time-flexible. Finally, Spot Instances are a great way to supplement the other options for applications that are fault-tolerant and flexible. AWS recommends using a mix of pricing models based on your workload needs and ability to be flexible. By using these pricing models, you create signals for your future compute needs, which helps AWS better forecast resource demand, manage capacity, and run our infrastructure in a more sustainable way.

Design applications to minimize overhead and use fewer resources

Regardless of your workload or technology choice, using the latest-generation hardware with up-to-date libraries and Amazon Machine Images (AMIs) will typically provide the best price/performance and performance/watt within that family. Up-to-date software libraries are often required to take advantage of power-saving and performance features in modern processors and hardware, such as Intel’s DL Boost.

Taking advantage of managed services can help shift the responsibility for maintaining high resource utilization to AWS. Using managed services helps distribute the sustainability impact of the service across all of the customers using it, reducing each customer’s individual contribution. However, not all managed services are optimized by default. The following recommendations help reduce your environmental impact through automatically optimized capacity management for each managed service.

AWS managed service – Recommendation for sustainability improvement:

- Amazon Aurora: Amazon Aurora Serverless can automatically start up, shut down, and scale capacity up or down based on your application’s needs.
- Amazon Redshift: Amazon Redshift Serverless runs and scales data warehouse capacity automatically.
- AWS Lambda: You can migrate AWS Lambda functions to Arm-based AWS Graviton2 processors.
- Amazon ECS: You can run Amazon ECS on AWS Fargate to leverage the sustainability best practices AWS put in place for management of the control plane.
- Amazon EMR: Using EMR Serverless avoids over- or under-provisioning resources for your data processing jobs.
- AWS Glue: Enable auto scaling for AWS Glue to scale Glue compute resources up and down on demand.

Data transfer is another application design choice with a sustainability impact. Edge computing – storing and using data on or near the device that created it – reduces the amount of traffic sent to the cloud and, at scale, can limit energy utilization and carbon emissions. AWS Outposts, AWS Local Zones, and AWS Wavelength deliver data processing, analysis, and storage close to your endpoints, allowing you to deploy APIs and tools to locations outside AWS data centers. By processing data closer to the source, edge computing can also reduce latency, which means that less energy is required to keep devices and applications running smoothly.

Conclusion

Ongoing measurement, hardware utilization, processor choice, and application design are all critical to optimizing your AWS compute infrastructure for resource efficiency. Put simply, sustainable infrastructure on the cloud uses as few resources as possible to achieve your organization’s goals. In addition to your proxy metrics, you can track the carbon output of your optimization journey combined with the advancements AWS makes to our infrastructure using the Customer Carbon Footprint Tool.

Ready to dive deeper? Check out the Well-Architected Sustainability Pillar for more detailed guidance across all of these topics. You can also visit the AWS Sustainability page to learn more about our commitment to sustainability, current AWS progress on renewable energy usage, case studies on sustainability through the cloud, and more.