AWS Database Blog

Improve native backup and restore performance in Amazon RDS for SQL Server

Amazon Relational Database Service (Amazon RDS) for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. As a fully managed database service, Amazon RDS for SQL Server takes automated backups of your DB instance during the backup window. If required, you can restore your DB instance to a specific point in time within your backup retention period.

In addition to automated snapshots, Amazon RDS for SQL Server also supports native SQL Server backup and restore. You might use native backup and restore to migrate individual databases from one RDS instance to another, or to refresh your dev or QA environment by taking a backup of the production database and restoring it there.

Native backup and restore is a time-tested method for SQL Server migrations. You can use differential and log backups on the source side to shorten database migration timelines. However, can you rely on this strategy for large databases, such as those 4 TB in size? Which methods improve the performance of the backup and restore process, and thereby reduce the downtime required for migration activities? Is one method better than the others?

In this post, we demonstrate how to improve native backup and restore performance by tuning the configurations and testing different strategies along with estimates in Amazon RDS for SQL Server.

Native backup and restore in Amazon RDS for SQL Server

For the context of this post, it’s important to understand how native backup and restore works in Amazon RDS for SQL Server. For an overview, see Support for native backup and restore in SQL Server.

To learn about the technical implementation of performing backup and restore in Amazon RDS for SQL Server, see Importing and exporting SQL Server databases. This documentation provides different configurations that you can use with Amazon RDS for SQL Server backup and restore, such as using compression or multi-file backups.
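For example, restoring a multi-file native backup from Amazon S3 uses the rds_restore_database stored procedure. The following is a minimal sketch; the database name, bucket name, and file name pattern are placeholders:

```sql
-- Restore a database from a multi-file native backup stored in Amazon S3.
-- 'mydatabase' and 'mybucket' are placeholder names; the * wildcard in the
-- ARN matches the numbered files produced by a multi-file backup.
exec msdb.dbo.rds_restore_database
    @restore_db_name='mydatabase',
    @s3_arn_to_restore_from='arn:aws:s3:::mybucket/backup*.bak';
```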

Set up lab servers

The goal of our testing was to find out which configuration, including instance types and storage types, gives the best performance for backups and restores. We started with two lab servers:

  • Server A – RDS SQL 2017 R5.8XLarge
  • Server B – RDS SQL 2017 R5.24XLarge

Then we created a dummy database filled with random large tables. The database was grown to 1 TB and 4 TB in size for the tests.

No other active workloads were running on these lab machines; the only load on these servers was the backup and restore queries. The backup and restore performance in your Amazon RDS environment depends on your user or application queries and any other active workloads.

The following table summarizes the storage configuration that was tested with each lab server.

| Server Instance | Storage Size | Storage Type (GP2/IO1) | Provisioned IOPS |
|---|---|---|---|
| RDS SQL 2017 R5.8XLarge | 1.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.8XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | IO1 | 30,000 |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | IO1 | 35,000 |

In your environment, you should also consider testing with the newer R5b instance types. For more information, see New – Amazon EC2 R5b Instances Provide 3x Higher EBS Performance.

Set up test cases

We tested the following configurations for performing backup and restore for 1 TB and 4 TB databases:

  • Backup and restore with GP2 storage
  • Backup and restore with compression enabled and GP2 storage
  • Backup and restore with compression enabled, multiple files (four), and GP2 storage. The backup was split across four files:

```sql
exec msdb.dbo.rds_backup_database
    @source_db_name='mydatabase',
    @s3_arn_to_backup_to='arn:aws:s3:::mybucket/backup*.bak',
    @type='FULL',
    @number_of_files=4;
```
  • Backup and restore with compression enabled, multiple files (eight), and GP2 storage. The backup was split across eight files:

```sql
exec msdb.dbo.rds_backup_database
    @source_db_name='mydatabase',
    @s3_arn_to_backup_to='arn:aws:s3:::mybucket/backup*.bak',
    @type='FULL',
    @number_of_files=8;
```
  • Backup and restore with multiple files (four) and GP2 storage
  • Backup and restore with multiple files (eight) and GP2 storage
  • Backup and restore with io1 storage
  • Backup and restore with compression enabled and io1 storage
  • Backup and restore with compression enabled, multiple files (four), and io1 storage
  • Backup and restore with compression enabled, multiple files (eight), and io1 storage
  • Backup and restore with multiple files (four) and io1 storage
  • Backup and restore with multiple files (eight) and io1 storage
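For the test cases with compression, compression is enabled at the instance level rather than per backup command, and long-running backup and restore tasks can be monitored with rds_task_status. The following is a sketch using placeholder names:

```sql
-- Enable compression for native backups on this instance; this applies
-- to subsequent rds_backup_database tasks.
exec rdsadmin..rds_set_configuration
    @name='S3 backup compression',
    @value='true';

-- Check the progress of backup and restore tasks for a database
-- ('mydatabase' is a placeholder).
exec msdb.dbo.rds_task_status @db_name='mydatabase';
```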

Test results

The following table summarizes the results we found for the 1 TB database.

All values are in minutes; "x" denotes the baseline configuration.

| Configuration | Backup (1 TB), r5.8xlarge | Backup (1 TB), r5.24xlarge | Backup Performance | Restore (1 TB), r5.8xlarge | Restore (1 TB), r5.24xlarge | Restore Performance |
|---|---|---|---|---|---|---|
| GP2 | 179 | 175 | x | 247 | 273 | x |
| GP2+Compression | 90 | 89 | ~2x faster | 138 | 127 | ~2x faster |
| GP2+Compression+Multiple Files (4) | 66 | 66 | ~2.7x faster | 79 | 79 | ~3x faster |
| GP2+Compression+Multiple Files (8) | 66 | 66 | ~2.7x faster | 82 | 79 | ~3x faster |
| GP2+Multiple Files (4) | 65 | 65 | ~2.7x faster | 73 | 74 | ~3x faster |
| GP2+Multiple Files (8) | 65 | 71 | ~2.7x faster | 78 | 75 | ~3x faster |
| IOPS (9,000) | 172 | 177 | ~same performance | 175 | 383 | |
| IOPS+Compression | 85 | 87 | ~2.1x faster | 82 | 124 | ~3x faster |
| IOPS+Compression+Multiple Files (4) | 33 | 34 | ~5.4x faster | 37 | 40 | ~6.5x faster |
| IOPS+Compression+Multiple Files (8) | 33 | 33 | ~5.4x faster | 36 | 37 | ~6.5x faster |
| IOPS+Multiple Files (4) | 55 | 57 | ~3.2x faster | 50 | 76 | ~4x faster |
| IOPS+Multiple Files (8) | 49 | 53 | ~3.5x faster | 48 | 68 | ~4x faster |

The following table summarizes the results we captured for the 4 TB database.

All values are in minutes; "x" denotes the baseline configuration. Blank cells were not captured in the test runs.

| Configuration | Backup (4 TB), r5.8xlarge | Backup (4 TB), r5.24xlarge | Backup Performance | Restore (4 TB), r5.8xlarge | Restore (4 TB), r5.24xlarge | Restore Performance |
|---|---|---|---|---|---|---|
| GP2 | 542 | 537 | x | 740 | 763 | x |
| GP2+Compression | 223 | 224 | ~2.4x faster | 281 | 280 | ~2.6x faster |
| GP2+Compression+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | 281 | ~2.6x faster |
| GP2+Compression+Multiple Files (8) | 204 | 202 | ~2.5x faster | 281 | | ~2.6x faster |
| GP2+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | | ~2.6x faster |
| GP2+Multiple Files (8) | 208 | 208 | ~2.5x faster | | | |
| IOPS (30,000) | 540 | | ~same performance | 826 | 757 | |
| IOPS (30,000)+Compression | 164 | | ~3x faster | 194 | 210 | ~3.6x faster |
| IOPS (30,000)+Compression+Multiple Files (4) | 102 | 102 | ~5.3x faster | 140 | 141 | ~5.2x faster |
| IOPS (30,000)+Compression+Multiple Files (8) | 101 | 101 | ~5.3x faster | 138 | 140 | ~5x faster |
| IOPS (30,000)+Multiple Files (4) | 173 | | ~3.1x faster | 168 | 175 | ~4x faster |
| IOPS (30,000)+Multiple Files (8) | 134 | | ~4x faster | 141 | 141 | ~5.2x faster |
| IOPS (35,000)+Compression+Multiple Files (4) | 94 | 93 | ~5.7x faster | 129 | 129 | ~5.7x faster |
| IOPS (35,000)+Compression+Multiple Files (8) | 93 | 93 | ~5.7x faster | 129 | 130 | ~5.7x faster |

Observations

Based on our results, we can make the following observations about backup and restore performance:

  • IOPS with compression and multiple files gives the best performance. This configuration can reduce the time taken for a backup and restore by more than five times. However, there are trade-offs for this performance:
    • IO1 storage costs more than GP2.
    • Compression requires CPU; it might affect other concurrent transactions running on SQL Server and degrade overall performance.
  • You can use multi-file backups to improve backup performance by approximately 2.5 times and restore performance by approximately 2.6 times. With this functionality, you don't have to change any configuration on your existing Amazon RDS for SQL Server instance, such as provisioning IOPS or enabling compression.
  • The reason we didn't see a difference between R5.8XLarge and R5.24XLarge is that we didn't have any active workloads running on these machines. Backup and restore were the only processes running on SQL Server, so there was no contention for resources. In a real-world scenario, when the backup and restore process competes with other database transactional processes, a bigger instance (with more resources) gives better performance.
  • The IOPS configuration affects backup and restore performance. When we increased the IOPS from 30,000 to 35,000, performance improved. Consider temporarily increasing the IOPS setting during the migration window to improve performance.
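One way to raise Provisioned IOPS ahead of a migration window is with the AWS CLI. The following is a sketch; the instance identifier and IOPS value are placeholders, and scaling storage can take time to complete:

```shell
# Temporarily raise Provisioned IOPS before a large backup or restore.
# "mydbinstance" and the IOPS value are placeholders for your environment.
aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --iops 35000 \
    --apply-immediately
```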

Summary

If you’re looking for ways to improve backup and restore times for your large databases, the results and observations in this post should help you form a good strategy. The post lists some of the tools and techniques that reduce the time taken to back up and restore large databases. More importantly, it can help you estimate the performance gain from each configuration, which you can use to plan migrations for your large databases.

Try one of the configurations discussed in the post and leave a comment to let us know what results you saw in your environment!


About the Author

Anuja Malik has been working with databases for 15 years. Previously at Microsoft, she was the Database Architect for Thomson Reuters, where she led the datacenter exit program and helped Thomson Reuters move to AWS. Anuja is a builder; PowerShell and Python are her favorite tools. She is passionate about managed database services and helping customers adopt best practices for operational excellence and cost savings.