Improve native backup and restore performance in Amazon RDS for SQL Server
Amazon Relational Database Service (Amazon RDS) for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. As a fully managed database service, Amazon RDS for SQL Server takes automated backups of your DB instance during the backup window. If required, you can restore your DB instance to a specific point in time within your backup retention period.
In addition to automated snapshots, Amazon RDS for SQL Server also supports native backup and restore in SQL Server. You may need to use native backup and restore to migrate individual databases from one RDS instance to another, or to refresh your dev or QA environment by taking a backup of the production database and restoring it there.
Native backup and restore is a time-tested method for SQL Server migrations. You can use differential and log backups on the source side to shorten the database migration timeline. However, you may wonder whether you can rely on this strategy for large databases, such as those 4 TB in size. What methods can improve the performance of the backup and restore process and thereby reduce the downtime required for migration activities? Is one method better than the others?
In this post, we demonstrate how to improve native backup and restore performance in Amazon RDS for SQL Server by tuning configurations and testing different strategies, along with performance estimates for each.
Native backup and restore in Amazon RDS for SQL Server
For the context of this post, it’s important to understand how native backup and restore works in Amazon RDS for SQL Server. For an overview, see Support for native backup and restore in SQL Server.
To learn about the technical implementation of performing backup and restore in Amazon RDS for SQL Server, see Importing and exporting SQL Server databases. This documentation provides different configurations that you can use with Amazon RDS for SQL Server backup and restore, such as using compression or multi-file backups.
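As a quick illustration of the interface that documentation describes, the following T-SQL enables S3 backup compression and takes a single-file native backup and restore. The database and bucket names are placeholders, and the `SQLSERVER_BACKUP_RESTORE` option must already be added to your instance's option group; confirm the exact procedure parameters against the RDS documentation for your engine version.

```sql
-- Placeholder database and bucket names.
-- Enable compression for native backups to Amazon S3 (instance-level setting).
EXEC rdsadmin.dbo.rds_set_configuration 'S3 backup compression', 'true';

-- Take a native full backup of the database to S3.
EXEC msdb.dbo.rds_backup_database
    @source_db_name      = 'MyProdDB',
    @s3_arn_to_backup_to = 'arn:aws:s3:::my-backup-bucket/MyProdDB.bak',
    @type                = 'FULL';

-- Restore the backup into another RDS for SQL Server instance.
EXEC msdb.dbo.rds_restore_database
    @restore_db_name        = 'MyProdDB',
    @s3_arn_to_restore_from = 'arn:aws:s3:::my-backup-bucket/MyProdDB.bak';

-- Backup and restore tasks run asynchronously; check their progress.
EXEC msdb.dbo.rds_task_status @db_name = 'MyProdDB';
```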
Set up lab servers
The goal of our testing was to find out which configuration, including instance types and storage types, gives the best performance for backups and restores. We started with two lab servers:
- Server A – RDS SQL 2017 R5.8XLarge
- Server B – RDS SQL 2017 R5.24XLarge
Then we created a dummy database filled with random large tables. The database was grown to 1 TB and 4 TB in size for the tests.
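A minimal sketch of how such a dummy database can be grown (the table design and batch size here are illustrative, not the exact script we ran):

```sql
-- Illustrative only: grow a dummy database with random fixed-width rows.
CREATE TABLE dbo.FillerData (
    Id      BIGINT IDENTITY(1,1) PRIMARY KEY,
    Payload CHAR(8000) NOT NULL
);
GO

-- Each row occupies roughly one 8 KB page, so ~131,000 rows add about 1 GB.
-- Repeat batches like this until the data files reach the target size.
INSERT INTO dbo.FillerData (Payload)
SELECT TOP (131000) REPLICATE(CONVERT(CHAR(36), NEWID()), 222)
FROM sys.all_objects a CROSS JOIN sys.all_objects b;
GO
```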
No other active workloads were running on these lab machines; the only load on these servers was the backup and restore queries themselves. The backup and restore performance in your Amazon RDS environment depends on user or application queries and any other active workloads.
The following table summarizes the storage configuration that was tested with each lab server.
| Server | Instance Storage Size | GP2/IO1 | PIOPS Configuration |
|---|---|---|---|
| RDS SQL 2017 R5.8XLarge | 1.5 TB | GP2 | – |
| RDS SQL 2017 R5.8XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | GP2 | – |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | IO1 | 30,000 |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | GP2 | – |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | GP2 | – |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | IO1 | 35,000 |
In your environment, you should also consider testing with the newer R5b instance types. For more information, see New – Amazon EC2 R5b Instances Provide 3x Higher EBS Performance.
Set up test cases
We tested the following configurations for performing backup and restore for 1 TB and 4 TB databases:
- Backup and restore with GP2 storage
- Backup and restore with compression enabled and GP2 storage
- Backup and restore with compression enabled, multiple files (four), and GP2 storage (the backup was split into four files)
- Backup and restore with compression enabled, multiple files (eight), and GP2 storage (the backup was split into eight files)
- Backup and restore with multiple files (four) and GP2 storage
- Backup and restore with multiple files (eight) and GP2 storage
- Backup and restore with io1 storage
- Backup and restore with compression enabled and io1 storage
- Backup and restore with compression enabled, multiple files (four), and io1 storage
- Backup and restore with compression enabled, multiple files (eight), and io1 storage
- Backup and restore with multiple files (four) and io1 storage
- Backup and restore with multiple files (eight) and io1 storage
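For the multi-file test cases, the backup is split across files with the `@number_of_files` parameter; RDS expands the asterisk in the S3 ARN into a file number. A sketch with placeholder names:

```sql
-- Placeholder names. Split the backup into four files; RDS replaces the
-- asterisk in the ARN with a file number (MyProdDB1.bak ... MyProdDB4.bak).
EXEC msdb.dbo.rds_backup_database
    @source_db_name      = 'MyProdDB',
    @s3_arn_to_backup_to = 'arn:aws:s3:::my-backup-bucket/MyProdDB*.bak',
    @type                = 'FULL',
    @number_of_files     = 4;

-- Restore from all four files by using the same wildcard pattern.
EXEC msdb.dbo.rds_restore_database
    @restore_db_name        = 'MyProdDB',
    @s3_arn_to_restore_from = 'arn:aws:s3:::my-backup-bucket/MyProdDB*.bak';
```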
The following table summarizes the results we found for the 1 TB database. For each operation, the two time columns correspond to the two lab servers (Server A and Server B), and the speedup is relative to the plain GP2 baseline with no compression or multiple files.
| Configuration (all times in minutes) | Backup (Server A) | Backup (Server B) | Backup Speedup | Restore (Server A) | Restore (Server B) | Restore Speedup |
|---|---|---|---|---|---|---|
| GP2+Compression | 90 | 89 | ~2x faster | 138 | 127 | ~2x faster |
| GP2+Compression+Multiple Files (4) | 66 | 66 | ~2.7x faster | 79 | 79 | ~3x faster |
| GP2+Compression+Multiple Files (8) | 66 | 66 | ~2.7x faster | 82 | 79 | ~3x faster |
| GP2+Multiple Files (4) | 65 | 65 | ~2.7x faster | 73 | 74 | ~3x faster |
| GP2+Multiple Files (8) | 65 | 71 | ~2.7x faster | 78 | 75 | ~3x faster |
| IOPS (9,000) | 172 | 177 | ~same performance | 175 | 383 | – |
| IOPS+Compression | 85 | 87 | ~2.1x faster | 82 | 124 | ~3x faster |
| IOPS+Compression+Multiple Files (4) | 33 | 34 | ~5.4x faster | 37 | 40 | ~6.5x faster |
| IOPS+Compression+Multiple Files (8) | 33 | 33 | ~5.4x faster | 36 | 37 | ~6.5x faster |
| IOPS+Multiple Files (4) | 55 | 57 | ~3.2x faster | 50 | 76 | ~4x faster |
| IOPS+Multiple Files (8) | 49 | 53 | ~3.5x faster | 48 | 68 | ~4x faster |
The following table summarizes the results we captured for the 4 TB database, with the same column layout as the previous table. A dash indicates a value that wasn't captured.
| Configuration (all times in minutes) | Backup (Server A) | Backup (Server B) | Backup Speedup | Restore (Server A) | Restore (Server B) | Restore Speedup |
|---|---|---|---|---|---|---|
| GP2+Compression | 223 | 224 | ~2.4x faster | 281 | 280 | ~2.6x faster |
| GP2+Compression+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | 281 | ~2.6x faster |
| GP2+Compression+Multiple Files (8) | 204 | 202 | ~2.5x faster | 281 | – | ~2.6x faster |
| GP2+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | – | ~2.6x faster |
| GP2+Multiple Files (8) | 208 | 208 | ~2.5x faster | – | – | – |
| IOPS (30,000) | 540 | – | ~same performance | 826 | 757 | – |
| IOPS (30,000)+Compression | 164 | – | ~3x faster | 194 | 210 | ~3.6x faster |
| IOPS (30,000)+Compression+Multiple Files (4) | 102 | 102 | ~5.3x faster | 140 | 141 | ~5.2x faster |
| IOPS (30,000)+Compression+Multiple Files (8) | 101 | 101 | ~5.3x faster | 138 | 140 | ~5x faster |
| IOPS (30,000)+Multiple Files (4) | 173 | – | ~3.1x faster | 168 | 175 | ~4x faster |
| IOPS (30,000)+Multiple Files (8) | 134 | – | ~4x faster | 141 | 141 | ~5.2x faster |
| IOPS (35,000)+Compression+Multiple Files (4) | 94 | 93 | ~5.7x faster | 129 | 129 | ~5.7x faster |
| IOPS (35,000)+Compression+Multiple Files (8) | 93 | 93 | ~5.7x faster | 129 | 130 | ~5.7x faster |
Based on our results, we can make the following observations about backup and restore performance:
- IOPS with compression and multiple files gives the best performance. This configuration could reduce the time taken for a backup and restore by more than five times. However, there are trade-offs for this performance:
- IOPS costs more than GP2.
- Compression consumes CPU; it might affect other concurrent transactions running on SQL Server and degrade overall performance.
- You can use multi-file backup to improve backup performance by approximately 2.5 times and restore performance by 2.6 times. By using this functionality, you don't have to change any configuration on your existing Amazon RDS for SQL Server instance, such as provisioned IOPS or enabling compression.
- The reason we didn’t see a difference between R5.8XLarge and R5.24XLarge is that we didn’t have any active workloads running on these machines. Backup and restore were the only processes running on SQL Server, so there was no contention for resources. In a real-world scenario, when the backup and restore process competes with other database transactional processes, a bigger machine (with more resources) gives better performance.
- IOPS configuration affects backup and restore performance. When we increased the IOPS from 30,000 to 35,000, performance improved. Consider temporarily increasing provisioned IOPS during the migration window to improve performance.
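Raising provisioned IOPS ahead of a migration window can be scripted with the AWS CLI (the instance identifier below is a placeholder). Keep in mind that storage modifications take time to complete and there are limits on how frequently storage can be modified.

```shell
# Placeholder instance identifier. Raise provisioned IOPS before the
# migration window; --apply-immediately avoids waiting for the next
# maintenance window.
aws rds modify-db-instance \
    --db-instance-identifier my-sqlserver-instance \
    --iops 35000 \
    --apply-immediately

# Watch the modification progress.
aws rds describe-db-instances \
    --db-instance-identifier my-sqlserver-instance \
    --query 'DBInstances[0].DBInstanceStatus'
```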
If you’re looking for ways to improve backup and restore times for your large databases, the results and observations in this post should help you form a good strategy. The post lists some of the tools and techniques that reduce the time taken for backup and restore of large databases. More importantly, it can help you estimate the performance gain from each configuration, which you can use to plan the migration of your large databases.
Try one of the configurations discussed in the post and leave a comment to let us know what results you saw in your environment!
About the Author
Anuja Malik has been working with databases for 15 years. Ex-Microsoft, she was the Database Architect for Thomson Reuters, where she led the datacenter exit program and helped Thomson Reuters move to AWS. Anuja is a builder; PowerShell and Python are her favorite tools. She is passionate about managed database services and helping customers adopt best practices for operational excellence and cost savings.