Optimizing SAP HANA’s persistence layer with Amazon EBS gp3 volumes
SAP HANA is an in-memory, relational database that enterprises rely on to run their mission-critical Enterprise Resource Planning (ERP) systems and analytical applications. As SAP HANA is an in-memory database, you may wonder why the storage layer is relevant. A key point is that memory is volatile. When you write data to a HANA database, that data stays in memory, but it also needs to be committed to disk in order to persist across reboots of the server. The performance of the transaction log volume is critical to make sure that this process is executed quickly. The performance of the data volume is also important because it handles data loads and HANA Savepoints, which are capable of generating tens of thousands of IOPS in a busy OLTP (online transaction processing) system. The data volume also determines how fast data is loaded when HANA boots, which can be critical for disaster recovery (DR) scenarios.
Amazon customers running SAP workloads require a fast and reliable storage solution for their systems using Amazon Elastic Compute Cloud (Amazon EC2) instances. Amazon offers Amazon Elastic Block Store (Amazon EBS) volumes in various volume types, including gp2, io1, and io2, which are all SAP certified. In May 2021, we also certified gp3 volumes for production SAP workloads. In this blog, I discuss the benefits of using gp3 volumes for SAP HANA workloads and how they differ from gp2 volumes.
Before I begin with a gp2 and gp3 volume comparison, I’ll clarify the use cases of the previously mentioned EBS volume types. gp2 and gp3 are Amazon’s general purpose SSD-based block storage, with io1 and io2 volumes offered as high-performance options. The gp3 volume is the successor of the gp2 volume, and the io2 volume is the successor of the io1 volume.
Amazon EBS: gp2 and gp3 volumes comparison
Let’s dive into the differences between gp2 and gp3 volumes by starting with a feature called bursting. Bursting allows gp2 volumes smaller than 1 TB to reach 3000 IOPS for finite periods of time. When the burst credits for a gp2 volume run out, you receive the base level of performance of 3 IOPS/GB. For gp3, Amazon has done away with bursting, and a gp3 volume always provides a baseline of 3000 IOPS regardless of the volume’s size.
gp3 volumes deliver consistent performance – no burst credits required
gp2 bursting is fine for low-intensity workloads that only require peak performance for short intervals, for example a system that runs a data load or other heavy batch job for an hour every day but is relatively quiet aside from that. A problem with this feature is that if your SAP system is under a consistently heavy workload, you may run out of burst credits. This would impede application performance and customer satisfaction. If you don’t want to be exposed to this risk, you need to make your volumes large enough to achieve the required performance level based on the 3 IOPS/GB that gp2 offers. In the past, this usually meant over-provisioning your volumes by a large margin, leading to higher costs and under-utilized storage. gp3 volumes mitigate this issue because they can provide up to 500 IOPS per GB instead of 3, thereby doing away with over-provisioning for performance reasons. Consequently, the migration from gp2 volumes to gp3 volumes can result in cost savings as well as peace of mind, knowing that your volume has plenty of headroom to support sudden increases in workload.
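To make the over-provisioning effect concrete, here is a minimal sketch that uses only the ratios quoted above (3 IOPS/GB for gp2, a 3000 IOPS baseline and up to 500 IOPS/GB for gp3). The function names are illustrative, and the 1 GB lower bound for gp3 is an assumption of this sketch. The results say nothing about the capacity your data actually needs.

```python
import math

GP2_IOPS_PER_GB = 3         # gp2 baseline ratio quoted in the text
GP3_BASELINE_IOPS = 3000    # gp3 baseline quoted in the text
GP3_MAX_IOPS_PER_GB = 500   # gp3 maximum ratio quoted in the text


def gp2_size_for_iops(required_iops: int) -> int:
    """Smallest gp2 size (GB) whose baseline sustains required_iops without burst credits."""
    return math.ceil(required_iops / GP2_IOPS_PER_GB)


def gp3_min_size_for_iops(required_iops: int) -> int:
    """Smallest gp3 size (GB) that the 500 IOPS/GB ratio allows for required_iops."""
    if required_iops <= GP3_BASELINE_IOPS:
        return 1  # assumption: the baseline is available at the smallest size
    return math.ceil(required_iops / GP3_MAX_IOPS_PER_GB)


if __name__ == "__main__":
    for iops in (3000, 7000):
        print(iops, gp2_size_for_iops(iops), gp3_min_size_for_iops(iops))
```

For a sustained 7000 IOPS, the gp2 volume has to be sized at 2334 GB purely for performance, whereas the gp3 ratio is already satisfied at 14 GB.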
Where gp3 volumes can deliver 1000 MB/s of throughput per volume, gp2 volumes support only up to 250 MB/s per volume. gp2 volumes therefore require striping multiple EBS volumes with a Logical Volume Manager (LVM) in order to meet the minimum required key performance indicators (KPIs) for SAP HANA data and log volumes. If you need 500 MB/s, you need to stripe two volumes together, for 750 MB/s you would stripe three, and so on. Striping scales almost linearly, allowing you to reach the throughput limit of any EC2 instance with the exception of the U-series.
Striping can be very useful, but the loss of a single volume in the set can result in data loss. If you have 1000 gp2 EBS volumes, you should design your solution to account for 1 or 2 volume failures per year. As each volume has a statistically equal chance of failure, a filesystem consisting of multiple volumes has a bigger chance of failure than one that has only one volume. According to AWS’ Well Architected terminology, you can choose to make a trade-off to improve cost or performance at the expense of durability.
gp3 volumes improve performance and availability
gp3 volumes bring a massive improvement in throughput, supporting up to 1000 MB/s per volume. This allows you to use fewer gp3 volumes than gp2 volumes for SAP HANA workloads, which increases the durability of the storage layer. The increased throughput allows you to use a single gp3 volume to meet SAP HANA’s KPIs, thereby reducing the risk of data loss due to a single volume failure in a striped configuration. Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. This replication makes Amazon EBS volumes ten times more reliable than typical commodity disk drives. For more information, see the Amazon EBS features page.
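The stripe-count arithmetic can be sketched as follows. This assumes perfectly linear scaling, which slightly overstates what the text calls "almost linearly", and it ignores the instance-level throughput limit. The function name is illustrative.

```python
import math

GP2_MAX_MBPS = 250    # per-volume gp2 throughput limit quoted in the text
GP3_MAX_MBPS = 1000   # per-volume gp3 throughput limit quoted in the text


def volumes_needed(target_mbps: int, per_volume_mbps: int) -> int:
    """Number of striped volumes needed to reach target_mbps, assuming linear scaling."""
    return math.ceil(target_mbps / per_volume_mbps)


if __name__ == "__main__":
    for target in (500, 750, 1000):
        print(target,
              volumes_needed(target, GP2_MAX_MBPS),
              volumes_needed(target, GP3_MAX_MBPS))
```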
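As a rough illustration of why more volumes mean more risk, the sketch below treats "1 or 2 failures per 1000 volumes per year" as a per-volume annual failure probability of 0.1% to 0.2%. It also assumes that volumes fail independently, which is a simplification of this example and not a statement about how EBS behaves.

```python
def striped_set_failure_probability(volumes: int, per_volume_probability: float) -> float:
    """Probability that at least one volume in a striped set fails within a year.

    Assumes independent failures. Losing any one volume loses the whole set.
    """
    return 1.0 - (1.0 - per_volume_probability) ** volumes


if __name__ == "__main__":
    for count in (1, 2, 4):
        risk = striped_set_failure_probability(count, 0.002)
        print(f"{count} volume(s): {risk:.4%}")
```

Under these assumptions the risk grows almost in proportion to the number of volumes in the stripe, so a single volume carries roughly a quarter of the risk of a four-volume set.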
EBS gp2 vs. gp3 specification comparison
Use CloudWatch to help select the best volume for your workload
To make an informed decision on which EBS volume type to choose and how to configure it, you need metric data. One way to get this data is to monitor your EBS volumes with Amazon CloudWatch. It provides a number of metrics that give you insight into the utilization of your volumes. The following example shows the burst balance for a gp2 volume. If the burst balance reaches 0%, bursting stops and you get the volume’s base level of performance. To repeat, the gp3 volume does not use bursting and always provides the performance figures that are listed above.
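If you prefer to read the burst balance programmatically, a minimal sketch with boto3 could look like this. It queries the BurstBalance metric in the AWS/EBS namespace. The volume ID, the 24-hour window, and the 5-minute period are placeholders chosen for this example, and the code assumes that boto3 is installed and AWS credentials are configured.

```python
from datetime import datetime, timedelta, timezone


def burst_balance_request(volume_id: str, hours: int = 24, period_seconds: int = 300) -> dict:
    """Build the parameters for a CloudWatch GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Minimum"],
    }


def lowest_burst_balance(datapoints: list):
    """Return the lowest burst balance (percent) in the datapoints, or None if there are none."""
    values = [point["Minimum"] for point in datapoints]
    return min(values) if values else None


def fetch_lowest_burst_balance(volume_id: str):
    """Query CloudWatch for the lowest burst balance in the requested window."""
    import boto3  # imported here so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**burst_balance_request(volume_id))
    return lowest_burst_balance(response["Datapoints"])


if __name__ == "__main__":
    # "vol-0123456789abcdef0" is a placeholder, not a real volume.
    print(fetch_lowest_burst_balance("vol-0123456789abcdef0"))
```

A value that regularly approaches 0% suggests that the gp2 volume is running on its baseline for part of the day.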
CloudWatch dashboard displaying EBS metrics
Amazon EBS gp3 volumes allow you to configure a volume’s size and performance separately, and offer more IOPS per GB and increased throughput of up to 1000 MB/s per volume. Along with these improvements, we were able to reduce the volume sizes in our best practice recommendations. The result is a substantial cost reduction compared to gp2 volumes.
Having said that, you may have a use case that requires more or less than what we recommend in our best practices. I’ll show you what to consider when trying to accommodate this scenario. Let’s start by stating that extra IOPS and throughput on a gp3 volume are on-demand costs that increase the price of the volume. In the following table, I use an example where we increase a gp3 volume’s throughput to 500 MB/s and its IOPS to 9000. The costs associated with that performance increase are fixed per volume and therefore have a larger impact on the pricing of small volumes than on large ones. If you look at the 5000 GB volume in the table below, the storage costs are $400 (USD) per month, with $45.16 (USD) added for the extra performance, which is an additional 11%. For the 250 GB volume the picture is very different: the same $45.16 (USD) adds 226% to the $20 (USD) storage cost, more than tripling the price of the volume.
All prices listed in the table are for the us-east-1 Region (N. Virginia), per month, and in USD
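The percentages follow from simple arithmetic. The sketch below derives the storage price per GB from the 5000 GB example ($400 / 5000 GB) and treats the $45.16 as a fixed per-volume amount, as described above. Both figures come from this post and reflect us-east-1 pricing at the time of writing, so treat them as illustrative.

```python
STORAGE_USD_PER_GB = 400 / 5000   # derived from the 5000 GB example in the text
EXTRA_PERFORMANCE_USD = 45.16     # cost of 500 MB/s and 9000 IOPS from the text


def overhead_percent(size_gb: int) -> float:
    """Extra performance cost as a percentage of the storage cost of one volume."""
    return EXTRA_PERFORMANCE_USD / (size_gb * STORAGE_USD_PER_GB) * 100


if __name__ == "__main__":
    for size in (250, 500, 5000):
        print(f"{size} GB: +{overhead_percent(size):.0f}%")
```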
The best practices that we provide for SAP HANA production systems account for more than just price. They factor in performance and durability as well to offer the best mix of these important attributes. However, you may have different requirements. An example of this could be a 500 GB test environment.
For testing, you may prioritize performance by striping your volumes. You may also want to spend as little as possible to achieve this. Looking at line 4 in the preceding table, the 500 GB volume doubles in price due to the additional cost for increased throughput and IOPS. However, if you stripe four 125 GB standard gp3 volumes into a single filesystem, you also reach 500 MB/s, and even more IOPS at 12,000. This setup costs $40 (USD) instead of $85.16 (USD), which in a large environment with many systems can significantly impact your budget.
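The comparison between one tuned volume and four striped baseline volumes can be reproduced with a small sketch. The per-GB price is derived from the figures in this post, the baseline of 125 MB/s per volume is implied by the four-volume example, and the class and field names are made up for illustration.

```python
from dataclasses import dataclass

STORAGE_USD_PER_GB = 400 / 5000  # derived from the 5000 GB example in the text


@dataclass
class Gp3Layout:
    volumes: int
    size_gb_each: int
    iops_each: int
    mbps_each: int
    extra_usd_each: float = 0.0  # fixed cost of performance above the baseline

    def total_iops(self) -> int:
        return self.volumes * self.iops_each

    def total_mbps(self) -> int:
        return self.volumes * self.mbps_each

    def monthly_usd(self) -> float:
        per_volume = self.size_gb_each * STORAGE_USD_PER_GB + self.extra_usd_each
        return self.volumes * per_volume


# One 500 GB volume tuned to 9000 IOPS and 500 MB/s (extra cost from the text).
single = Gp3Layout(volumes=1, size_gb_each=500, iops_each=9000, mbps_each=500,
                   extra_usd_each=45.16)
# Four 125 GB volumes at baseline performance, striped into one filesystem.
striped = Gp3Layout(volumes=4, size_gb_each=125, iops_each=3000, mbps_each=125)

if __name__ == "__main__":
    for name, layout in (("single", single), ("striped", striped)):
        print(name, layout.total_iops(), layout.total_mbps(),
              f"${layout.monthly_usd():.2f}")
```

Keep in mind that the cheaper striped layout trades durability for cost, as discussed in the striping section.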
Using gp3 for SAP HANA
Let’s take a closer look at how to use gp3 for SAP HANA databases. We will use an r5.24xlarge Amazon EC2 instance for this example and compare the HANA storage best practices for gp2 volumes and gp3 volumes as listed on our website in the table shown below.
Prices are per month and in USD
As you can see, our gp3 volume best practice uses fewer volumes and fewer GB of storage, but still offers more base or guaranteed IOPS than the gp2 configuration did. The throughput is lower, but the listed values are still in accordance with SAP’s requirements for production systems. You can always add more afterwards, but AWS advises doing so only if your monitoring shows that you are hitting a volume’s limits or when your requirements change.
If you have studied the HANA storage best practices webpage, you may have noticed that the gp3 volume layout looks very similar to the layout for io1 and io2 volumes, and this benefits you. A good example is when you use io2 volumes on your production system and want to do a system copy to a test environment. If you create an Amazon Machine Image (AMI) of the production system, all you need to do is convert the disks to gp3 volumes upon launch, resulting in a reduced cost for your test system. The same logic applies to attaching snapshot copies of a database’s data and log volumes to a different EC2 instance.
The recommendation for HANA databases is to start with our published best practices and to use CloudWatch monitoring to determine afterwards whether you may need more performance or possibly less. When less performance is sufficient, be sure to adhere to SAP’s support requirements. Consequently, you should never configure less than the following SAP recommendations:
- HANA Data volume: 7000 IOPS and 425 MB/s of throughput
- HANA Log volume: 3000 IOPS and 275 MB/s of throughput
- Sizing as per SAP HANA TDI guidelines (link requires SAP credentials)
These guidelines apply to configurations using one volume with additional provisioned throughput/IOPS, and to striping multiple volumes with baseline performance. When using the latter, the sum of throughput and IOPS of all volumes should meet these limits.
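A quick way to check a planned layout against these minimums is to sum the IOPS and throughput of all volumes in the set. The sketch below does exactly that. The data structure and function name are illustrative, and the check does not cover the TDI sizing guidelines.

```python
# Minimum (IOPS, MB/s) per HANA volume role, taken from the list above.
SAP_MINIMUMS = {
    "data": (7000, 425),
    "log": (3000, 275),
}


def meets_sap_minimum(role: str, volumes: list) -> bool:
    """Check a single volume or a striped set against the SAP minimums.

    volumes is a list of (iops, mbps) tuples, one tuple per EBS volume.
    """
    total_iops = sum(iops for iops, _ in volumes)
    total_mbps = sum(mbps for _, mbps in volumes)
    min_iops, min_mbps = SAP_MINIMUMS[role]
    return total_iops >= min_iops and total_mbps >= min_mbps


if __name__ == "__main__":
    baseline = (3000, 125)  # baseline gp3 performance per volume
    print(meets_sap_minimum("data", [(7000, 425)]))   # one tuned volume
    print(meets_sap_minimum("data", [baseline] * 3))  # throughput too low
    print(meets_sap_minimum("data", [baseline] * 4))  # four striped volumes
```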
For SAP workloads other than HANA, there are no SAP-prescribed minimum requirements or best practices. If you have no sizing information, the recommendation for SAP NetWeaver or AnyDB workloads is to start sizing volumes based on the size of the dataset, and then use CloudWatch to monitor and check whether the volumes are hitting limits. Based on this information, adjust your volume sizing as necessary.
Common scenarios and solutions for HANA databases
There are situations where you may need a different level of performance, and I want to provide suggestions for the most common scenarios. For example, you may want to reduce SAP HANA’s startup time. To do this, increase the throughput of the HANA Data volume. If you need more performance for heavy SAP Business Warehouse data loading, increase the volume specs across the board for both the HANA Data and Log volumes. For a heavy OLTP workload, you can increase the IOPS on the HANA Data (Savepoints) and Log volumes. If you want to speed up streaming backups, increase the throughput of the HANA Data volume. If you experience degraded performance over time, this is likely due to disk fragmentation. You can either use a third-party defragmentation tool or create a new set of volumes and migrate the data to those.
In this post, I summarized the benefits of Amazon EBS gp3 volumes. You can configure a gp3 volume’s size and performance separately. gp3 volumes can produce more IOPS per GB than gp2, thereby reducing the need to over-provision for performance reasons. gp3 volumes have a higher maximum throughput than gp2, thereby reducing the need for volume striping. gp3 volumes always deliver their base level of performance and do away with the concept of bursting. gp3 volumes make it easier to copy a production SAP HANA system that uses io1 or io2 to a non-production system, as the volume layout does not have to be changed.
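Raising the IOPS or throughput of an existing gp3 volume, as suggested for several of these scenarios, can be done in place with the EC2 ModifyVolume API. The following is a minimal boto3 sketch. The volume ID and target values are placeholders, the helper names are made up for this example, and the 1000 MB/s ceiling is the per-volume figure quoted in this post. It assumes boto3 is installed and credentials are configured.

```python
GP3_MAX_MBPS = 1000  # per-volume throughput limit quoted in this post


def modify_volume_args(volume_id: str, iops: int = None, throughput_mbps: int = None) -> dict:
    """Build the parameters for an EC2 ModifyVolume call on a gp3 volume."""
    if throughput_mbps is not None and throughput_mbps > GP3_MAX_MBPS:
        raise ValueError(f"throughput above {GP3_MAX_MBPS} MB/s per volume")
    args = {"VolumeId": volume_id, "VolumeType": "gp3"}
    if iops is not None:
        args["Iops"] = iops
    if throughput_mbps is not None:
        args["Throughput"] = throughput_mbps
    return args


def apply_modification(volume_id: str, iops: int = None, throughput_mbps: int = None) -> dict:
    """Send the modification request and return the API response."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2")
    return ec2.modify_volume(**modify_volume_args(volume_id, iops, throughput_mbps))


if __name__ == "__main__":
    # Example: raise the throughput of a HANA Data volume to shorten startup time.
    # "vol-0123456789abcdef0" is a placeholder, not a real volume.
    print(modify_volume_args("vol-0123456789abcdef0", throughput_mbps=750))
```

Leaving out `iops` or `throughput_mbps` keeps that setting unchanged, which matches scenarios where only one of the two needs to grow.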