How to migrate a large-scale SQL Server database over low bandwidth using AWS Application Migration Service

In this blog post, we will demonstrate how to migrate a Microsoft SQL Server database on a low bandwidth network using AWS Application Migration Service (AWS MGN), AWS Snowcone, and SQL Server Log Shipping technology.

Customers use different migration strategies to lift-and-shift their on-premises SQL Server workloads to the AWS Cloud. One of those approaches is to use AWS MGN, which uses block-level storage replication from the source server into the target server within an AWS account. When you’re ready for cutover, it automatically converts and launches your servers, reducing error-prone manual processes, costs, and downtime.

The AWS Snow Family (comprised of AWS Snowcone, AWS Snowball, and AWS Snowmobile) helps you transport terabytes or petabytes of data into and out of AWS with integrated AWS security, monitoring, storage management, and computing capabilities. Typically, if you are attempting to migrate SQL Server databases with terabytes of data using AWS MGN, you need a significant amount of network bandwidth. However, there are circumstances in which you will not have access to the optimal amount of network bandwidth to meet the requirements.

In this blog post, we will show you how to do these large-scale SQL Server migrations on a low bandwidth network. You can scale this solution and migrate terabytes and petabytes of data by selecting the appropriate AWS Snow device along with other AWS services, as we will discuss in the following solution. This hybrid migration solution will also help you save costs since your network bandwidth requirements will be reduced.

Solution overview

In this solution, we use a hybrid model to migrate both static data and ongoing changes:

For the initial load, we use an offline method to capture static data (AWS Snowcone).
For incremental changes, from the operating system and SQL Server metadata (system databases changes), we use an online method (AWS MGN).

Figure 1 illustrates the solution architecture:

Figure 1 – Solution Architecture

For this solution, we will perform a native full backup of SQL Server database from an on-premises database server (Node 1) to an AWS Snowcone device. This AWS Snowcone device backup will be copied to Amazon Simple Storage Service (Amazon S3) and then restored to a staging Amazon EC2 server (Node 3) to separate Amazon Elastic Block Store (Amazon EBS) volumes. Next, we will use AWS DataSync to transfer the transaction logs to the staging Amazon EC2 instance. The transaction logs will be synced to an Amazon FSx for Windows File Server attached to the staging Amazon EC2 instance. We will create custom log shipping jobs to continuously backup and restore database changes from on premises (Node 1) to the Amazon EC2 staging SQL Server (Node 3). Once we are ready to cut over, we will recover the database on the staging SQL Server, then copy and attach the Amazon EBS volumes with database to the Amazon EC2 MGN cutover server.

You should consider creating multiple resources for each AWS service, as needed, to avoid reaching AWS service quotas.

The following are the high-level steps for the solution:

Set up and configure AWS MGN to replicate volumes from on premises (Node 1) to AWS MGN replication server (Node 2).
Transfer on-premises full backup to Amazon S3 using AWS Snowcone.
Set up AWS DataSync for file transfers.
Set up custom log shipping from Node 1 to Node 3 to replicate transaction logs.
Migrate the on-premises database server to the destination Amazon EC2 server.

Prerequisites

The following prerequisites are required:

An AWS account and appropriate permissions to interact with resources in your AWS account.
An Amazon Virtual Private Cloud (Amazon VPC) to create AWS resources.
Network connectivity from on premises (Node 1) to AWS.
On-premises SQL Server database running on Microsoft Windows:
- Source database should be in full recovery model to permit log backups.
- User database files should reside on their own discrete drives.
Create an AWS Managed or self-managed Active Directory for Microsoft Windows authentication.
Map an Amazon FSx file system to the Amazon EC2 staging SQL Server (Node 3) for copying backup files.

Walkthrough

Step 1: Set up and configure AWS MGN to replicate volumes from on-premises (Node 1) to AWS MGN replication server (Node 2)

Before starting with the AWS MGN agent installation, ensure the on-premises source servers meet the requirements and generate the required credentials. Download the agent installer for the AWS Region and start the installation process. During the installation, the agent identifies the volumes in the source server and prompts for the disks to replicate. Choose the operating system drive (C:) and SQL Server binary drive (D:) for AWS MGN replication, as shown in Figure 2:

Figure 2 – Command prompt configuring AWS MGN replication

Step 2: Transfer on-premises full backup to Amazon S3 using AWS Snowcone

Create a SQL Server authentication login and grant sysadmin privilege on Node 1 for metadata to be replicated through AWS MGN. You can use this SQL Server login during the migration window to connect to the cutover instance.

These are the steps needed to transfer on-premises full backup to Amazon S3:

Perform a full backup of the on-premises SQL Server database to the on-premises backup drive.For this use case, we have created a sample database called “db1” and table called “testing” with 1 record to demonstrate the approach.
Depending on the backup size, you can choose AWS Snowcone SSD, which offers 14TB SSD storage, or an AWS Snowball Edge storage-optimized device, which offers 210TB of storage space.
Create an AWS Snow Family job by choosing the “import to Amazon S3” option, device type, S3 bucket(s), security, shipping, and notification preferences.
Once you receive the device, use AWS OpsHub to unlock the device to set up Amazon S3 compatible storage on the AWS Snow device.
Copy the backup file(s) into the device. Refer to Figure 3.
Power off and return the device to AWS.
Once the device is at the AWS facility, AWS will import the data into the previously chosen Amazon S3 bucket.
Using the AWS Management Console, monitor the AWS Snow Family job periodically to know about device shipping status and data import to Amazon S3 bucket status.
Review the device completion report and logs for more details.

Figure 3 - On-premises full backup file copied on the AWS Snowcone device

Figure 3 – On-premises full backup file copied on the AWS Snowcone device

Step 3: Set up AWS DataSync for file transfers

AWS DataSync will be used to transport the full database backup from Amazon S3 into Amazon FSx for Windows File Server (mounted on Node 3), as shown in Figure 4. AWS DataSync will also help transfer the transaction log backups from on premises to the Amazon FSX for Windows File Server on a continuous basis, as shown in Figure 5.

These are the steps needed to set up AWS DataSync for file transfers:

Create an agent, depending on the hypervisor type, and activate it.
Add the backup/file server as the source location.
Create the destination location for Amazon FSX file server.
Configure the transfer tasks (one task to copy full backup file and another task to copy transaction log backup files) by selecting the source and destination locations along with other configuration options.
Schedule the tasks and set up bandwidth as required.

Figure 4 – AWS DataSync task to copy full backup file

Figure 5 – AWS DataSync task to copy transaction log backup files

Use the following T-SQL code to restore full backup of database ‘db1’ on Node 3 from Amazon FSx for Windows File Server, as shown in Figure 6:

Restore database db1 from disk= '<path>\db_full.bak' with norecovery

6 - Restoring db1 database

Figure 6 – Restoring db1 database

Step 4: Set up custom log shipping from Node 1 to Node 3 to replicate transaction logs

SQL Server log shipping allows you to send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances through SQL Server agent jobs (LSBackup, LSCopy, LSRestore, LSAlert). Refer to About Log Shipping for more details. In our use case, we primarily have two SQL Server jobs:

Backup job – Runs on the primary instance (Node 1) and takes the transaction log backup to on-premises backup location.
Restore job – Runs on the secondary instance (Node 3) and restores the copied log backup file(s) on to secondary instance (Node 3).

Since the primary database is on premises and the secondary one is on AWS, it is ideal to use Amazon DataSync for copying log backup file(s) from Node 1 to Node 3, replacing the LSCopy job’s functionality. Amazon FSX for Windows File Server can be used as a file share on Node 3 to hold transaction log backup files. Do not lose any transaction log backup files that were taken after the AWS Snowcone’s full backup from Node 1 as it will break the log restore chain, if we need to restart the process.Use T-SQL to set up backup and restore jobs.

Note that the backup job, as well as the restore job, will create a corresponding LSAlert job in the source server (as shown in Figure 7) and destination server (as shown in Figure 8). You can configure a job failure notification for this job for immediate attention. For more information, refer to Notify an Operator of Job Status.

Figure 7 - LSAlert job in Node 1

Figure 7 – LSAlert job in Node 1

Figure 8 - LSAlert job in Node 3

Figure 8 – LSAlert job in Node 3

Backup Job T-SQL:

This log shipping job runs on the primary Node 1 instance. It takes the transaction log backups to the on-premises backup location (your file share location) to capture the ongoing/incremental changes to the source database.

Run the following T-SQL code on Node 1 to create the LSBackup job:

DECLARE @LS_BackupJobId AS uniqueidentifier
DECLARE @LS_PrimaryId AS uniqueidentifier
DECLARE @SP_Add_RetCode As int
EXEC @SP_Add_RetCode = master.dbo.sp_add_log_shipping_primary_database
@database = N'db1'
,@backup_directory = N'\\<source_server_IP_address>\backup\tlog'
,@backup_share = N'\\<source_server_IP_address>\backup\tlog'
,@backup_job_name = N'LSBackup_db1'
,@backup_retention_period = 7200
,@backup_compression = 1
,@backup_threshold = 120
,@threshold_alert_enabled = 1
,@history_retention_period = 5760
,@backup_job_id = @LS_BackupJobId OUTPUT
,@primary_id = @LS_PrimaryId OUTPUT
,@overwrite = 1
IF (@@ERROR = 0 AND @SP_Add_RetCode = 0)
BEGIN
DECLARE @LS_BackUpScheduleUID As uniqueidentifier
DECLARE @LS_BackUpScheduleID AS int
EXEC msdb.dbo.sp_add_schedule
@schedule_name =N'LSBackupSchedule_<source_server_IP_address>'
,@enabled = 1
,@freq_type = 4
,@freq_interval = 1
,@freq_subday_type = 4
,@freq_subday_interval = 15
,@freq_recurrence_factor = 0
,@active_start_date = 20220909
,@active_end_date = 99991231
,@active_start_time = 0
,@active_end_time = 235900
,@schedule_uid = @LS_BackUpScheduleUID OUTPUT
,@schedule_id = @LS_BackUpScheduleID OUTPUT
EXEC msdb.dbo.sp_attach_schedule
@job_id = @LS_BackupJobId
,@schedule_id = @LS_BackUpScheduleID
EXEC msdb.dbo.sp_update_job
@job_id = @LS_BackupJobId
,@enabled = 1
END
EXEC master.dbo.sp_add_log_shipping_alert_job

Restore Job T-SQL:

This log shipping job runs on the secondary Node 3 instance. It restores the copied log backup file(s) onto the secondary Node 3 instance to apply the ongoing/incremental changes from the source database.

Run the following T-SQL code on Node 3 to create the LSRestore job:

DECLARE @LS_Secondary__CopyJobId AS uniqueidentifier
DECLARE @LS_Secondary__RestoreJobId AS uniqueidentifier
DECLARE @LS_Secondary__SecondaryId AS uniqueidentifier
DECLARE @LS_Add_RetCode As int
EXEC @LS_Add_RetCode = master.dbo.sp_add_log_shipping_secondary_primary
@primary_server = N'<staging_server_IP_address>'
,@primary_database = N'db1'
,@backup_source_directory = N'\\<FSx_IP_address>\share\incremental'
,@backup_destination_directory = N'\\<FSx_IP_address>\share\incremental'
,@copy_job_name = N'LSCopy_<staging_server_IP_address>_db1'
,@restore_job_name = N'LSRestore_<staging_server_IP_address>_db1'
,@file_retention_period = 259200
,@overwrite = 1
,@copy_job_id = @LS_Secondary__CopyJobId OUTPUT
,@restore_job_id = @LS_Secondary__RestoreJobId OUTPUT
,@secondary_id = @LS_Secondary__SecondaryId OUTPUT
IF (@@ERROR = 0 AND @LS_Add_RetCode = 0)
BEGIN
DECLARE @LS_SecondaryCopyJobScheduleUID As uniqueidentifier
DECLARE @LS_SecondaryCopyJobScheduleID AS int
EXEC msdb.dbo.sp_add_schedule
@schedule_name =N'DefaultCopyJobSchedule'
,@enabled = 1
,@freq_type = 4
,@freq_interval = 1
,@freq_subday_type = 4
,@freq_subday_interval = 15
,@freq_recurrence_factor = 0
,@active_start_date = 20221001
,@active_end_date = 99991231
,@active_start_time = 0
,@active_end_time = 235900
,@schedule_uid = @LS_SecondaryCopyJobScheduleUID OUTPUT
,@schedule_id = @LS_SecondaryCopyJobScheduleID OUTPUT
EXEC msdb.dbo.sp_attach_schedule
@job_id = @LS_Secondary__CopyJobId
,@schedule_id = @LS_SecondaryCopyJobScheduleID
DECLARE @LS_SecondaryRestoreJobScheduleUID As uniqueidentifier
DECLARE @LS_SecondaryRestoreJobScheduleID AS int
EXEC msdb.dbo.sp_add_schedule
@schedule_name =N'DefaultRestoreJobSchedule'
,@enabled = 1
,@freq_type = 4
,@freq_interval = 1
,@freq_subday_type = 4
,@freq_subday_interval = 15
,@freq_recurrence_factor = 0
,@active_start_date = 20221001
,@active_end_date = 99991231
,@active_start_time = 0
,@active_end_time = 235900
,@schedule_uid = @LS_SecondaryRestoreJobScheduleUID OUTPUT
,@schedule_id = @LS_SecondaryRestoreJobScheduleID OUTPUT
EXEC msdb.dbo.sp_attach_schedule
@job_id = @LS_Secondary__RestoreJobId
,@schedule_id = @LS_SecondaryRestoreJobScheduleID
END
DECLARE @LS_Add_RetCode2 As int
IF (@@ERROR = 0 AND @LS_Add_RetCode = 0)
BEGIN
EXEC @LS_Add_RetCode2 = master.dbo.sp_add_log_shipping_secondary_database
@secondary_database = N'db1'
,@primary_server = N'<staging_server_IP_address>'
,@primary_database = N'db1'
,@restore_delay = 0
,@restore_mode =1
,@disconnect_users = 0
,@restore_threshold = 120
,@threshold_alert_enabled = 1
,@history_retention_period = 5760
,@overwrite = 1
END
IF (@@error = 0 AND @LS_Add_RetCode = 0)
BEGIN
EXEC msdb.dbo.sp_update_job
@job_id = @LS_Secondary__CopyJobId
,@enabled = 0
EXEC msdb.dbo.sp_update_job
@job_id = @LS_Secondary__RestoreJobId
,@enabled = 1
END

Step 5: Migrate the on-premises database server to the destination Amazon EC2 server

Up to this step, you have completed the setup to replicate the database server and its changes from Node 1 to Node 3. It’s now time to launch the test cutover instance from AWS MGN to test the source server’s functionality within the AWS environment. From an AWS MGN standpoint, validate the test and cutover indicators.

Once you are ready for the cutover:

1. Stop all incoming application traffic to Node 1 so no write activities are occurring.

2. Disable backup job from Node 1 (as shown in Figure 9) and restore job from Node 3 (as shown in Figure 10).

Figure 9 - Backup job disabled status

Figure 9 – Backup job disabled status

Note that LSCopy job gets created as part of the restore job creation, but it will be disabled and won’t be used in this migration (as shown in Figure 10).

Figure 10 - Restore Job disabled status

Figure 10 – Restore Job disabled status

3. Monitor log shipping job status and validate if all the transaction logs are restored.

Run the following T-SQL code on Node 3 to check the transaction log restore status:

SELECT b.type, b.first_lsn, b.last_lsn, b.checkpoint_lsn, b.database_backup_lsn, a.*
FROM msdb..restorehistory a
INNER JOIN msdb..backupset b ON a.backup_set_id = b.backup_set_id
WHERE a.destination_database_name = '<Db_name>'
ORDER BY restore_date DESC

If all the transaction logs are not restored, manually re-run the backup job (Node 1), data sync job (AWS console) and restore job (Node 3) or wait for its scheduled run to catch up.
Make a note of the record count of a source table on Node 1.

Run the following T-SQL code to get a record count of a source table on Node 1:

Use <Db_name>
go
Select count (*) from <source_table_name>

4. Initiate failover to Node 3 by recovering the secondary database.

Run the following T-SQL code on Node 3 to recover the database, as shown in Figure 11:

Restore log <db1> with recovery

Figure 11 – Recovering db1 database

5. Launch the cutover machine from AWS MGN (Amazon EC2 MGN cutover server – Node 4) in the same Availability Zone as the Amazon EC2 staging SQL Server (Node 3).

6. Back up the database on Node 3 for additional redundancy.

7. Make a note of the database file locations from Node 3 using the following T-SQL code:

select name as logical_name, db_name(dbid) as db_name, filename from sysaltfiles where db_name(dbid)='db1'.

8. To prevent data corruption during the migration, detach the user database on Node 3 by running the following T-SQL code. Refer to Figure 12.

USE [master]
GO
ALTER DATABASE [db1] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

EXEC master.dbo.sp_detach_db @dbname = N'db1', @skipchecks = 'false'

Figure 12 - Detaching db1 database

Figure 12 – Detaching db1 database

9. Shut down/manually stop Node 3 to prevent write activities to the instance.

10. Note that to detach Amazon EBS volumes (X, Y, Z user database drives) from Amazon EC2 staging SQL Server (Node 3) and attach them to the Amazon EC2 MGN cutover server (Node 4), both nodes need to be in the same Availability Zone. If they are in same Availability Zone, then perform the following:

Detach Amazon EBS volumes (X, Y, Z user database drives) from Amazon EC2 staging SQL Server (Node 3).
Attach Amazon EBS volumes (X,Y,Z database drives) to Amazon EC2 MGN cutover server (Node 4).

11. If they are in different Availability Zones, then perform the following:

Create snapshots from the Node 3 Amazon EBS volumes (X,Y, Z user database drives).
Create Amazon EBS volumes from the snapshots (as shown in Figure 13) taken in the previous step. Note that when an Amazon EBS volume is created, it should be created in the Availability Zone of the Amazon EC2 MGN cutover server Node 4.

Figure 13 – Create EBS Volume

Attach Amazon EBS volumes (X,Y,Z database drives) to Amazon EC2 MGN cutover server (Node 4).

12. When you connect to the cutover instance through SQL Server Management Studio (SSMS), you will see the database in a recovery pending state. This is because the SQL Server instance only has the metadata, but not the associated database files yet. You can clean up the metadata by running the following T-SQL code:

Drop database <db_name>

13. You can now attach a user database by running the following T-SQL code. Refer to Figure 14.

CREATE DATABASE [db1] ON
(FILENAME=N'X:\sql_db_files\db1.mdf'),
(FILENAME=N'X:\sql_db_files\db1_log.ldf')
For ATTACH

Figure 14 - Attaching db1 database

Figure 14 – Attaching db1 database

14. This is a successful migration if the record count for the destination table on the Amazon EC2 MGN cutover server (Node 4) matches the record count for the source table on the on-premises node (Node 1).

Run the following T-SQL code to get the record count from Node 4:

Use <Db_name>
go
Select count (*) from <destination_table_name>

15. Redirect your application, dependent services, traffic by modifying your connection string to use the Amazon EC2 MGN cutover server (Node 4) so that the write activities can resume from this instance.

Cleanup

To avoid incurring future charges, delete the resources you created as part of this blog post:

Conclusion

AWS MGN is a simplified, highly automated, lift-and-shift (rehost) solution for migrating servers into AWS.AWS Snow Family, helps you transport terabytes or petabytes of data into and out of AWS.

In this blog post, we combined the simplicity of AWS MGN, scalability of AWS Snow Family and flexibility of SQL Server logshipping, leveraging online and offline migration capabilities and demonstrated a hybrid solution on how to migrate a SQL Server database on a low bandwidth network.

AWS has significantly more services, and more features within those services, than any other cloud provider, making it faster, easier, and more cost effective to move your existing applications to the cloud and build nearly anything you can imagine. Give your Microsoft applications the infrastructure they need to drive the business outcomes you want. Visit our .NET on AWS and AWS Database blogs for additional guidance and options for your Microsoft workloads. Contact us to start your migration and modernization journey today.

Microsoft Workloads on AWS