Improving application performance and reducing costs with Amazon EBS-Optimized Instance burst capability

Contributed by Sooraj Prasannan, Senior Product Manager, Amazon Elastic Block Store

In November 2017, Amazon EC2 introduced C5 compute-intensive instances and M5 general-purpose instances. In the first half of 2018, we released EC2 C5d instances and M5d instances by adding high-speed, ultra-low latency local NVMe storage to the EC2 C5 and M5 instance families. EC2 C5/C5d and M5/M5d instances are built on the Nitro system. This collection of AWS-built hardware and software components enables high performance, high availability, high security, and bare metal capabilities to reduce virtualization overhead.

During the design of the Nitro system, we analyzed real-world workloads and recognized the need for smaller instance sizes to drive higher performance from their Amazon EBS volumes. We found that the majority of application storage needs are bursty, with short, intense periods of high I/O and plenty of idle time between bursts. To improve the experience for these workloads, we developed burst capability for smaller instance sizes. Available on EC2 C5/C5d and M5/M5d instances, this feature enables large, xlarge, and 2xlarge instance sizes to drive the same performance as the 4xlarge instance for at least 30 minutes each day.

For applications with spiky Amazon EBS demand, you can right-size your instances based on your CPU and memory requirements and still meet your EBS-optimized instance performance requirements. This higher performance also enables you to speed up sections of your workflow dependent on EBS-optimized instance performance. Faster workflows result in quicker job completions and improved resource utilization. The burst capability ultimately enables you to reduce costs by right-sizing your instance and improving total resource usage.

With this performance increase, you will be able to handle unplanned spikes in demand without any impact to your application performance. You can now size your instances based on historical average trends. This burst capability gives you more performance to absorb spikes without affecting your customer experience.

Using Amazon CloudWatch metrics to monitor burst usage

For better visibility into your performance, instances based on the Nitro system provide Amazon CloudWatch metrics to help profile your usage. Based on the usage profile, you can decide if smaller instances meet your requirements.

These instances let you monitor your usage through instance-level CloudWatch metrics for operations (EBSReadOps and EBSWriteOps) and bytes transferred (EBSReadBytes and EBSWriteBytes). For more information on these metrics, see List of available CloudWatch metrics for your instances. These metrics support basic monitoring (five-minute frequency) by default, but you can enable detailed monitoring (one-minute frequency) for an additional cost. For more information, see Amazon CloudWatch pricing.
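As a rough illustration, the metrics above can be pulled with the AWS CLI. The sketch below wraps aws cloudwatch get-metric-statistics in a helper; the function name get_ebs_metric, the Sum statistic, and the five-minute period (matching basic monitoring) are illustrative choices, and the instance ID and timestamps in the usage comment are placeholders.

```shell
# Sketch: fetch an instance-level EBS metric for a time window.
# get_ebs_metric is a hypothetical helper name, not part of the AWS CLI.
# Arguments: instance ID, metric name, start time, end time (ISO 8601, UTC).
get_ebs_metric() {
  local instance_id="$1" metric="$2" start="$3" end="$4"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name "$metric" \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --start-time "$start" \
    --end-time "$end" \
    --period 300 \
    --statistics Sum
}

# Example call (placeholder instance ID and timestamps):
#   get_ebs_metric i-0123456789abcdef0 EBSReadOps 2018-07-01T00:00:00Z 2018-07-01T01:00:00Z
```

With detailed monitoring enabled, a period of 60 seconds would return one-minute data points instead.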

For large, xlarge, and 2xlarge instances, we also provide burst balance metrics. EBSIOBalance% monitors the instance I/O burst bucket, and EBSByteBalance% monitors the instance byte burst bucket. These metrics report the percentage of I/O or byte credits remaining in the respective burst buckets, where 100% means that the instance has accumulated the maximum number of credits. You can set up an alarm that triggers if the balance gets too low.
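One possible way to set up such an alarm is with aws cloudwatch put-metric-alarm. This is a minimal sketch: the helper name create_burst_alarm, the alarm naming scheme, the Minimum statistic, and the single five-minute evaluation period are all assumptions, and the threshold and SNS topic ARN are values you would choose for your own workload.

```shell
# Sketch: alarm when a burst balance metric falls below a threshold.
# create_burst_alarm is a hypothetical helper name, not part of the AWS CLI.
# Arguments: instance ID, metric name (EBSIOBalance% or EBSByteBalance%),
#            threshold in percent, SNS topic ARN to notify.
create_burst_alarm() {
  local instance_id="$1" metric="$2" threshold="$3" sns_topic_arn="$4"
  aws cloudwatch put-metric-alarm \
    --alarm-name "${instance_id}-${metric%\%}-low" \
    --namespace AWS/EC2 \
    --metric-name "$metric" \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Minimum \
    --period 300 \
    --evaluation-periods 1 \
    --threshold "$threshold" \
    --comparison-operator LessThanThreshold \
    --alarm-actions "$sns_topic_arn"
}

# Example call (placeholder instance ID and topic ARN):
#   create_burst_alarm i-0123456789abcdef0 'EBSIOBalance%' 20 \
#     arn:aws:sns:us-east-1:111122223333:example-topic
```

Quoting the metric name keeps the trailing % intact, and the helper strips it from the alarm name only.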

To demonstrate these metrics, we launched an m5.large instance. We then attached a 500 GB io1 Amazon EBS volume with 32,000 provisioned IOPS to the instance. Amazon EBS volumes attached to instances based on the Nitro system are exposed as NVMe devices.

First, we ran a large block (128 KiB) test using fio to /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 and monitored both EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 \
    --rw=randread --bs=128k --runtime=2400 --time_based=1 --iodepth=32 \
    --ioengine=libaio --direct=1 --name=large-block-test

Because this is a large block workload, it’s not driving enough IOPS to deplete EBSIOBalance%. It depletes EBSByteBalance% instead, as shown in the following image.
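A back-of-the-envelope calculation suggests why. The sketch below estimates the most IOPS a given bandwidth can carry at a fixed block size, using figures quoted elsewhere in this post (3.5 Gbps and 2.12 Gbps burst bandwidth, and the 3600 IOPS m5.large baseline). It assumes 1 Mbps equals 125,000 bytes/s and that fio's 128k means 128 KiB; the function name is hypothetical.

```shell
# Sketch: upper bound on IOPS that a bandwidth limit allows at a fixed block size.
# Assumes 1 Mbps = 125,000 bytes/s; max_iops_at_bandwidth is an illustrative name.
# Arguments: bandwidth in Mbps, block size in bytes.
max_iops_at_bandwidth() {
  local mbps="$1" block_bytes="$2"
  echo $(( mbps * 125000 / block_bytes ))
}

max_iops_at_bandwidth 3500 131072   # 3.5 Gbps, 128 KiB blocks: prints 3337
max_iops_at_bandwidth 2120 131072   # 2.12 Gbps, 128 KiB blocks: prints 2021
```

Both results are below the 3600 IOPS baseline, so under these assumptions a 128 KiB workload reaches the bandwidth limit before it can draw down the I/O bucket.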

Then we ran a small block test to understand how it affects EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 \
    --rw=randread --bs=16k --runtime=2400 --time_based=1 --iodepth=32 \
    --ioengine=libaio --direct=1 --name=small-block-test

Because this is a small block test, it consumes I/O credits faster than byte credits. Hence, EBSIOBalance% drops faster than EBSByteBalance%, as shown in the following image.

As long as EBSIOBalance% and EBSByteBalance% are above 0%, the instance can drive the burst performance. When the instance I/O activity is below the baseline rate, the burst buckets refill. After the tests finished, we paused all I/O from the instance. This period of inactivity allows the instance burst buckets to refill, as EBSIOBalance% and EBSByteBalance% show in the following image.

The refill rate for a burst bucket is the difference between the baseline rate and the instance I/O activity. For example, m5.large has a baseline throughput rate of 60 MB/s and a baseline IOPS rate of 3,600 IOPS. Suppose the instance I/O activity is 10 MB/s and 1,000 IOPS. The byte bucket fills at the rate of 50 MB/s (60 MB/s minus 10 MB/s). The IOPS bucket fills at the rate of 2,600 IOPS (3,600 IOPS minus 1,000 IOPS). For the baseline rates for the different instances, see Amazon EBS–optimized instances. In addition, we top off the burst buckets every 24 hours, which means that the instance has burst performance available for at least 30 minutes each day.
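The refill arithmetic above can be sketched as a small helper. The function name refill_rate is hypothetical, and clamping the result to zero when activity exceeds the baseline is an assumption made here to reflect that the bucket drains rather than refills in that case.

```shell
# Sketch: refill rate = baseline rate minus current activity.
# refill_rate is an illustrative name; both arguments use the same unit
# (MB/s for the byte bucket, IOPS for the I/O bucket).
refill_rate() {
  local baseline="$1" activity="$2"
  local rate=$(( baseline - activity ))
  # Above the baseline the bucket is draining, so report no refill (assumption).
  if [ "$rate" -lt 0 ]; then rate=0; fi
  echo "$rate"
}

refill_rate 60 10       # m5.large byte bucket: prints 50 (MB/s)
refill_rate 3600 1000   # m5.large I/O bucket: prints 2600 (IOPS)
```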

Performance enhancements

We have continued to make enhancements to the Nitro system. With the latest set of enhancements, we have increased the maximum burst bandwidth on the large, xlarge, and 2xlarge EC2 C5/C5d and M5/M5d instances to 3.5 Gbps, up from 2.25 Gbps for C5/C5d and 2.12 Gbps for M5/M5d. We have also increased the maximum burst IOPS to 20,000 IOPS for EC2 C5/C5d and to 18,750 IOPS for M5/M5d, up from 16,000 IOPS for both. All new large, xlarge, and 2xlarge EC2 C5/C5d and M5/M5d instances can take advantage of this performance increase at no additional cost.

For the latest list of instances based on the Nitro system that support this burst feature and their corresponding performance numbers, see Amazon EBS–optimized instances.
