AWS Database Blog

Improve native backup and restore performance in Amazon RDS for SQL Server

Amazon Relational Database Service (Amazon RDS) for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. As a fully managed database service, Amazon RDS for SQL Server takes automated backups of your DB instance during the backup window. If required, you can restore your DB instance to a specific point in time within your backup retention period.

In addition to automated snapshots, Amazon RDS for SQL Server also supports native SQL Server backup and restore. You might use native backup and restore to migrate individual databases from one RDS instance to another, or to refresh your dev or QA environment by taking a backup of the production database and restoring it there.

Native backup and restore is a time-tested method for SQL Server migrations. You can use differential and log backups on the source side to shorten database migration timelines. However, can you rely on this strategy for large databases, such as those 4 TB in size? Which methods improve the performance of the backup and restore process, and thereby reduce the downtime required for migration activities? Is one method better than the others?

In this post, we demonstrate how to improve native backup and restore performance by tuning the configurations and testing different strategies along with estimates in Amazon RDS for SQL Server.

Native backup and restore in Amazon RDS for SQL Server

For the context of this post, it’s important to understand how native backup and restore works in Amazon RDS for SQL Server. For an overview, see Support for native backup and restore in SQL Server.

To learn about the technical implementation of performing backup and restore in Amazon RDS for SQL Server, see Importing and exporting SQL Server databases. This documentation provides different configurations that you can use with Amazon RDS for SQL Server backup and restore, such as using compression or multi-file backups.
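For example, restoring a multi-file native backup from Amazon S3 uses the rds_restore_database stored procedure. The following is a minimal sketch; the database name, bucket name, and file name pattern are placeholders:

```sql
-- Restore a database from a multi-file native backup stored in Amazon S3.
-- 'mydatabase' and 'mybucket' are placeholder names; the * wildcard in the
-- ARN matches the numbered files produced by a multi-file backup.
exec msdb.dbo.rds_restore_database
    @restore_db_name='mydatabase',
    @s3_arn_to_restore_from='arn:aws:s3:::mybucket/backup*.bak';
```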

Set up lab servers

The goal of our testing was to find out which configuration, including instance types and storage types, gives the best performance for backups and restores. We started with two lab servers:

  • Server A – RDS SQL 2017 R5.8XLarge
  • Server B – RDS SQL 2017 R5.24XLarge

Then we created a dummy database filled with random large tables. The database was grown to 1 TB and 4 TB in size for the tests.

No other active workloads were running on these lab machines; the only load on these servers was the backup and restore queries. The backup and restore performance in your Amazon RDS environment depends on your user or application queries and any other active workloads.

The following table summarizes the storage configuration that was tested with each lab server.

| Server Instance | Storage Size | Storage Type (GP2/IO1) | Provisioned IOPS |
|---|---|---|---|
| RDS SQL 2017 R5.8XLarge | 1.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.8XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.8XLarge | 4.5 TB | IO1 | 30,000 |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.24XLarge | 1.5 TB | IO1 | 9,000 |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | GP2 | N/A |
| RDS SQL 2017 R5.24XLarge | 4.5 TB | IO1 | 35,000 |

In your environment, you should also consider testing with the newer R5b instance types. For more information, see New – Amazon EC2 R5b Instances Provide 3x Higher EBS Performance.

Set up test cases

We tested the following configurations for performing backup and restore for 1 TB and 4 TB databases:

  • Backup and restore with GP2 storage
  • Backup and restore with compression enabled and GP2 storage
  • Backup and restore with compression enabled, multiple files (four), and GP2 storage. The backup was split across four files:

```sql
exec msdb.dbo.rds_backup_database
    @source_db_name='mydatabase',
    @s3_arn_to_backup_to='arn:aws:s3:::mybucket/backup*.bak',
    @type='FULL',
    @number_of_files=4;
```
  • Backup and restore with compression enabled, multiple files (eight), and GP2 storage. The backup was split across eight files:

```sql
exec msdb.dbo.rds_backup_database
    @source_db_name='mydatabase',
    @s3_arn_to_backup_to='arn:aws:s3:::mybucket/backup*.bak',
    @type='FULL',
    @number_of_files=8;
```
  • Backup and restore with multiple files (four) and GP2 storage
  • Backup and restore with multiple files (eight) and GP2 storage
  • Backup and restore with io1 storage
  • Backup and restore with compression enabled and io1 storage
  • Backup and restore with compression enabled, multiple files (four), and io1 storage
  • Backup and restore with compression enabled, multiple files (eight), and io1 storage
  • Backup and restore with multiple files (four) and io1 storage
  • Backup and restore with multiple files (eight) and io1 storage
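For the test cases with compression, compression is enabled at the instance level rather than per backup command, and long-running backup and restore tasks can be monitored with rds_task_status. The following is a sketch using placeholder names:

```sql
-- Enable compression for native backups on this instance; this applies
-- to subsequent rds_backup_database tasks.
exec rdsadmin..rds_set_configuration
    @name='S3 backup compression',
    @value='true';

-- Check the progress of backup and restore tasks for a database
-- ('mydatabase' is a placeholder).
exec msdb.dbo.rds_task_status @db_name='mydatabase';
```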

Test results

The following table summarizes the results we found for the 1 TB database.

All values are in minutes; "x" denotes the baseline configuration.

| Configuration | Backup (1 TB), r5.8xlarge | Backup (1 TB), r5.24xlarge | Backup Performance | Restore (1 TB), r5.8xlarge | Restore (1 TB), r5.24xlarge | Restore Performance |
|---|---|---|---|---|---|---|
| GP2 | 179 | 175 | x | 247 | 273 | x |
| GP2+Compression | 90 | 89 | ~2x faster | 138 | 127 | ~2x faster |
| GP2+Compression+Multiple Files (4) | 66 | 66 | ~2.7x faster | 79 | 79 | ~3x faster |
| GP2+Compression+Multiple Files (8) | 66 | 66 | ~2.7x faster | 82 | 79 | ~3x faster |
| GP2+Multiple Files (4) | 65 | 65 | ~2.7x faster | 73 | 74 | ~3x faster |
| GP2+Multiple Files (8) | 65 | 71 | ~2.7x faster | 78 | 75 | ~3x faster |
| IOPS (9,000) | 172 | 177 | ~same performance | 175 | 383 | |
| IOPS+Compression | 85 | 87 | ~2.1x faster | 82 | 124 | ~3x faster |
| IOPS+Compression+Multiple Files (4) | 33 | 34 | ~5.4x faster | 37 | 40 | ~6.5x faster |
| IOPS+Compression+Multiple Files (8) | 33 | 33 | ~5.4x faster | 36 | 37 | ~6.5x faster |
| IOPS+Multiple Files (4) | 55 | 57 | ~3.2x faster | 50 | 76 | ~4x faster |
| IOPS+Multiple Files (8) | 49 | 53 | ~3.5x faster | 48 | 68 | ~4x faster |

The following table summarizes the results we captured for the 4 TB database.

All values are in minutes; "x" denotes the baseline configuration. Blank cells were not captured in the test runs.

| Configuration | Backup (4 TB), r5.8xlarge | Backup (4 TB), r5.24xlarge | Backup Performance | Restore (4 TB), r5.8xlarge | Restore (4 TB), r5.24xlarge | Restore Performance |
|---|---|---|---|---|---|---|
| GP2 | 542 | 537 | x | 740 | 763 | x |
| GP2+Compression | 223 | 224 | ~2.4x faster | 281 | 280 | ~2.6x faster |
| GP2+Compression+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | 281 | ~2.6x faster |
| GP2+Compression+Multiple Files (8) | 204 | 202 | ~2.5x faster | 281 | | ~2.6x faster |
| GP2+Multiple Files (4) | 202 | 201 | ~2.5x faster | 281 | | ~2.6x faster |
| GP2+Multiple Files (8) | 208 | 208 | ~2.5x faster | | | |
| IOPS (30,000) | 540 | | ~same performance | 826 | 757 | |
| IOPS (30,000)+Compression | 164 | | ~3x faster | 194 | 210 | ~3.6x faster |
| IOPS (30,000)+Compression+Multiple Files (4) | 102 | 102 | ~5.3x faster | 140 | 141 | ~5.2x faster |
| IOPS (30,000)+Compression+Multiple Files (8) | 101 | 101 | ~5.3x faster | 138 | 140 | ~5x faster |
| IOPS (30,000)+Multiple Files (4) | 173 | | ~3.1x faster | 168 | 175 | ~4x faster |
| IOPS (30,000)+Multiple Files (8) | 134 | | ~4x faster | 141 | 141 | ~5.2x faster |
| IOPS (35,000)+Compression+Multiple Files (4) | 94 | 93 | ~5.7x faster | 129 | 129 | ~5.7x faster |
| IOPS (35,000)+Compression+Multiple Files (8) | 93 | 93 | ~5.7x faster | 129 | 130 | ~5.7x faster |

Observations

Based on our results, we can make the following observations about backup and restore performance:

  • IOPS with compression and multiple files gives the best performance. This configuration can reduce the time taken for a backup and restore by more than five times. However, there are trade-offs for this performance:
    • IO1 storage costs more than GP2.
    • Compression requires CPU; it might affect other concurrent transactions running on SQL Server and degrade overall performance.
  • You can use multi-file backups to improve backup performance by approximately 2.5 times and restore performance by approximately 2.6 times. With this functionality, you don't have to change any configuration on your existing Amazon RDS for SQL Server instance, such as provisioning IOPS or enabling compression.
  • The reason we didn't see a difference between R5.8XLarge and R5.24XLarge is that we didn't have any active workloads running on these machines. Backup and restore were the only processes running on SQL Server, so there was no contention for resources. In a real-world scenario, when the backup and restore process competes with other database transactional processes, a bigger instance (with more resources) gives better performance.
  • The IOPS configuration affects backup and restore performance. When we increased the IOPS from 30,000 to 35,000, performance improved. Consider temporarily increasing the IOPS setting during the migration window to improve performance.
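One way to raise Provisioned IOPS ahead of a migration window is with the AWS CLI. The following is a sketch; the instance identifier and IOPS value are placeholders, and scaling storage can take time to complete:

```shell
# Temporarily raise Provisioned IOPS before a large backup or restore.
# "mydbinstance" and the IOPS value are placeholders for your environment.
aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --iops 35000 \
    --apply-immediately
```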

Summary

If you’re looking for ways to improve backup and restore times for your large databases, the results and observations in this post should help you form a good strategy. The post lists some of the tools and techniques that reduce the time taken to back up and restore large databases. More importantly, it can help you estimate the performance gain from each configuration, which you can use to plan migrations for your large databases.

Try one of the configurations discussed in the post and leave a comment to let us know what results you saw in your environment!


About the Author

Anuja Malik has been working with databases for 15 years. Previously at Microsoft, she was the Database Architect for Thomson Reuters, where she led the datacenter exit program and helped Thomson Reuters move to AWS. Anuja is a builder; PowerShell and Python are her favorite tools. She is passionate about managed database services and helping customers adopt best practices for operational excellence and cost savings.