Understanding Burst vs. Baseline Performance with Amazon RDS and GP2
When we think about database storage, the dimensions that matter are the size, latency, throughput, and IOPS of the volume. IOPS stands for input/output (operations) per second, and latency is a measure of the time it takes for a single I/O request to complete. As you can imagine, latency and IOPS are closely related and are key indicators of database performance. This post focuses on understanding how to work with Amazon RDS storage and how it relates to IOPS.
Amazon RDS volume types
Amazon RDS volumes are built using Amazon EBS volumes, except for Amazon Aurora, which uses an SSD-backed virtualized storage layer purpose-built for database workloads. RDS currently supports both magnetic and SSD-based storage volume types. However, magnetic volumes are slower and don’t perform consistently, so they are not recommended for performance-focused workloads. So if you’re reading this to learn how to get better performance out of your RDS database, you should avoid magnetic storage.
There are two supported Amazon EBS SSD-based storage types, Provisioned IOPS (called io1) and General Purpose (called gp2). With io1, it’s quite simple to predict IOPS because this is the value you provide when the volume is created. The Amazon EBS documentation states that io1 volumes deliver within 10 percent of the Provisioned IOPS performance 99.9 percent of the time over a given year. In other words, you can expect consistent performance with io1.
The gp2 storage type also has a base IOPS that is set when the volume is created. However, you don’t provide a value for the IOPS directly. Instead, IOPS is a function of the size of the volume. The baseline IOPS for a gp2 volume is 3 IOPS per GiB of volume size, with a minimum of 100 IOPS and a maximum of 10,000 IOPS. The gp2 volumes also have a characteristic called burst mode that is often misunderstood. Let’s delve into the performance characteristics of gp2 and understand burst versus baseline performance.
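The sizing rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this post; the function name gp2_baseline_iops and the constant names are made up for the example and are not part of any AWS SDK.

```python
# Constants taken from the rule described in this post.
IOPS_PER_GIB = 3
MIN_IOPS = 100
MAX_IOPS = 10_000

def gp2_baseline_iops(size_gib):
    """Baseline IOPS: 3 per GiB, clamped to the 100..10,000 range."""
    return int(min(max(IOPS_PER_GIB * size_gib, MIN_IOPS), MAX_IOPS))

for size in (20, 100, 1000, 2000, 5000):
    print(f"{size} GiB -> {gp2_baseline_iops(size)} IOPS")
```

Note that a 20 GiB volume lands on the 100 IOPS floor (3 x 20 is only 60), and a 5,000 GiB volume is capped at the maximum.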
Comparing performance: burst vs. baseline examples
To understand burst mode, you must be aware that every gp2 volume, regardless of size, starts with 5.4 million I/O credits, which is enough to sustain a burst of 3000 IOPS for 30 minutes. This means that even for very small volumes, you start with a high-performing volume. This is ideal for “bursty” workloads, such as daily reporting and recurring extract, transform, and load (ETL) jobs. It is also good for workloads that don’t require high sustained IOPS.
How does this work? Well, as stated earlier, the gp2 volumes start with I/O credit that, if fully used, works out to 3000 IOPS for 30 minutes. The burst credit is always being replenished at the rate of 3 I/O credits per GiB per second. Consider a daily ETL workload that uses a lot of I/O. For the daily job, gp2 can burst, and during downtime, burst credit can be replenished for the next day’s run. Now let’s consider a workload whose IOPS never exceed the burst rate. Such a workload will continue to see very good IOPS as long as, on average, credits are replenished faster than they are consumed.
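To make the credit arithmetic concrete, here is a rough sketch of the bucket model described above. It assumes one credit is spent per I/O and that credits are earned at a constant per-second rate; the function names burst_seconds and refill_seconds are hypothetical helpers for this post, not an AWS API.

```python
import math

BURST_IOPS = 3000          # burst ceiling from this post
MAX_CREDITS = 5_400_000    # initial (and maximum) I/O credit balance

def burst_seconds(refill_per_sec, demand_iops=BURST_IOPS, credits=MAX_CREDITS):
    """Seconds until the credit balance reaches zero under constant demand."""
    drain = min(demand_iops, BURST_IOPS) - refill_per_sec
    return math.inf if drain <= 0 else credits / drain

def refill_seconds(refill_per_sec, credits_needed=MAX_CREDITS):
    """Seconds of idle time needed to earn back credits_needed credits."""
    return credits_needed / refill_per_sec

print(burst_seconds(0) / 60)        # ignoring refill: 30.0 minutes
print(burst_seconds(300))           # 100 GiB volume (300 credits/s): 2000.0 seconds
print(refill_seconds(300) / 3600)   # 100 GiB volume, empty to full: 5.0 hours
```

Because credits keep arriving while the volume bursts, the actual burst lasts a little longer than the headline 30 minutes, and a workload that demands no more than the refill rate never drains the bucket at all.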
In the following example, I created an Amazon RDS instance with a 20 GiB gp2 volume. Such a volume bursts to 3000 IOPS. But once the burst credit is exhausted, it delivers only 100 IOPS, because 3 IOPS per GiB would give just 60 IOPS and 100 is the minimum. The point to stress here is that the small volume performs very well for this simulated nightly job. And then over the course of the next 12 hours, the burst credits accumulate in time for the next day’s nightly job.
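The nightly-job pattern can be approximated with a small simulation. This is a simplified model, not a measurement: it assumes the 20 GiB volume earns credits at its 100 IOPS baseline, spends one credit per I/O, and that the job demands a constant 3000 IOPS for 45 minutes. The 45-minute duration and the run_phase helper are invented for the illustration.

```python
MAX_CREDITS = 5_400_000
BURST_IOPS = 3000

def run_phase(credits, baseline_iops, demand_iops, seconds):
    """Apply one phase of constant demand.

    Returns (credits_left, seconds_served_at_full_demand).
    """
    demand = min(demand_iops, max(BURST_IOPS, baseline_iops))
    net = baseline_iops - demand          # credits gained (+) or lost (-) per second
    if net >= 0:
        return min(MAX_CREDITS, credits + net * seconds), seconds
    time_to_empty = credits / -net
    if seconds <= time_to_empty:
        return credits + net * seconds, seconds
    return 0.0, time_to_empty             # bucket ran dry; rest runs at baseline

# Hypothetical nightly job: 45 minutes at 3000 IOPS on a 20 GiB volume.
credits, burst_time = run_phase(MAX_CREDITS, 100, 3000, 45 * 60)
print(f"burst lasted {burst_time / 60:.1f} min, credits left: {credits:.0f}")

# Then 12 idle hours before the next run.
credits, _ = run_phase(credits, 100, 0, 12 * 3600)
print(f"credits after 12 h idle: {credits:.0f} ({credits / MAX_CREDITS:.0%} of full)")
```

Under these assumptions, the burst lasts about 31 minutes, and 12 idle hours restore roughly 80 percent of the bucket, so a job that fits within that budget sees burst performance again the next night. A completely empty bucket would need about 15 hours at 100 credits per second to refill fully.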
An important thing to note is that for any gp2 volume larger than 1 TiB, the baseline performance is greater than the 3000 IOPS burst performance. For such volumes, burst is irrelevant.
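The crossover point follows directly from the 3 IOPS per GiB rule. The short sketch below computes it; crossover_gib and burst_helps are hypothetical names used only for this illustration.

```python
import math

IOPS_PER_GIB = 3
BURST_IOPS = 3000

def crossover_gib():
    """Smallest volume size (GiB) whose baseline reaches the burst rate."""
    return math.ceil(BURST_IOPS / IOPS_PER_GIB)

def burst_helps(size_gib):
    """True when the 3 IOPS/GiB baseline is still below the burst rate."""
    return IOPS_PER_GIB * size_gib < BURST_IOPS

print(crossover_gib())                      # 1000
for size in (500, 1000, 1024, 2048):
    print(size, IOPS_PER_GIB * size, burst_helps(size))
```

Strictly speaking, the baseline reaches 3000 IOPS at 1,000 GiB, which is just under 1 TiB (1,024 GiB), so "1 TiB" is a convenient rounding of where burst stops mattering.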
Note: In the following diagram, > 1 TiB is where baseline performance exceeds burst IOPS.
In the following example, I run a workload that is designed to consume all available burst credits to compare the performance characteristics of different gp2 volume sizes. It’s an OLTP benchmark consisting of 10 concurrent threads that perform random updates against a 100 GiB table. This was run on five Amazon RDS MySQL databases, each with a different size of gp2 data volume. Each instance has a different baseline performance, but of course they all have the same 3000 IOPS burst performance.
The following is the graph showing baseline performance for each instance:
Note how initially all the instances performed at about the same level. This is because they all began with at least 3000 IOPS. However, for the smaller volumes, once the burst credit was exhausted, the performance dropped to a transaction rate that is relative to the baseline performance. The blue line represents the 1 TiB instance, where baseline is equal to burst performance. The 2 TiB instance (green line) with 6000 IOPS did better.
We hope that this post gives you a better understanding of burst versus baseline performance on gp2, and helps you choose the appropriate storage volume type for your application. If you don’t need the high level of consistent performance of io1 or a large amount of IOPS, gp2 is a good choice for many database workloads.
About the Author
Phil Intihar is a database engineer at Amazon Web Services.