AWS Database Blog

Migrating to Amazon RDS for SQL Server using transactional replication with native backup and restore: Part 3

If you have large mission-critical workloads running on an on-premises Microsoft SQL Server database, you may be looking for ways to migrate to AWS with minimal to near-zero downtime. In this post, we show you a solution for migrating your on-premises SQL Server database to Amazon Relational Database (Amazon RDS) for SQL Server using the SQL Server native backup and restore method in combination with transactional replication to minimize downtime. This method is useful when you have to migrate all the database objects including views, stored procedures, and functions as part of the database migration. This solution is beneficial when the size of the SQL Server database is large. Furthermore, this approach helps you move data to Amazon RDS for SQL Server and allows you to take advantage of native tools using existing infrastructure.

In Part 1 and Part 2 of this series, we showed you how to configure transactional replication from on-premises Microsoft SQL Server or SQL Server on an Amazon Elastic Compute Cloud (Amazon EC2) instance to Amazon RDS for SQL Server. As mentioned in these posts, by default, a subscription to a transactional publication is initialized with a snapshot generated by the Snapshot Agent that contains database objects along with the data.

In scenarios where the size of the source database is large, you should initialize a subscription without using a snapshot. Instead, use the SQL Server native backup and restore feature to perform the initial sync.

Solution overview

The solution uses an EC2 instance with SQL Server database engine installation as the on-premises source database and Amazon RDS for SQL Server as the target database for migration.

To minimize downtime, we take a two-stage approach for database migration, as described in the following diagram. The initial stage uses SQL Server full and transaction log backup to restore the Amazon RDS database. This restores the database to a point in time, meaning that you won’t have data from transactions that happened after the source backup was taken. The second stage uses SQL Server transactional replication to replicate ongoing data changes from the on-premises database to Amazon RDS.

Prerequisites

To get started, you must have the following prerequisites:

  • An on-premises instance of SQL Server 2019 or an instance of SQL Server 2019 on Windows Server 2016 running on an Amazon EC2 host. We call this setup the “source SQL Server” for the rest of the post.
  • An RDS for SQL Server DB instance.
  • Connectivity between the source SQL Server and Amazon RDS for SQL Server.
  • A login on the source SQL Server that is part of a sysadmin fixed server role to configure a distributor on the source SQL Server.
  • Familiarity with SQL Server Management Studio (SSMS).

Configure distribution database on the source SQL Server

You can skip this section if the distribution is already configured on the source SQL Server. Complete the following steps to enable distribution:

  1. Log in to the source SQL Server using a login that is part of the sysadmin server role.
  2. In SQL Server Management Studio, navigate to the replication node.
  3. Choose the replication node (right-click) and choose Configure distribution.
  4. Follow the instructions in the section Configuring distribution referenced in the post Migrating to Amazon RDS for SQL Server using transactional replication: Part 2.

Configure publication on the source server

Your next step is to create a publication and include all the objects that you want to migrate to the Amazon RDS environment.

  1. Log in to the source SQL Server using SQL Server Management Studio.
  2. Open the context menu (right-click) for Local Publications under the replication node, then choose New Publication.
  3. Choose the database.

In the following screenshot, we chose the source database dms_sample.

  1. Choose Next.
  2. For Publication Type, choose Transactional publication.
  3. Choose all the objects you want to replicate and choose Next.
  4. Choose Next to skip filtering table rows.
  5. In the snapshot window, do not select any of the check boxes.

We don’t use Snapshot Agent to perform the initial data sync. Instead, we perform native backup and restore as shown in the subsequent steps.

  1. Choose Next.
  2. Configure the Snapshot Agent and log reader agent security as needed and create a publication.
  3. Create the publication and provide a publication name.
  4. After creating the publication, you need to run the following script to set the Allow_Initialize_From_Backup parameter to true at the publisher. You can either do this using T-SQL or SSMS. See the following code:
USE [dms_sample]
GO
EXEC sp_changepublication
@publication = N'OnPrem-RDS',
@property = N'allow_initialize_from_backup',
@value = 'true'
GO

Configure your RDS for SQL Server instance with native backup and restore

To set up your RDS for SQL Server instance for native backup and restore, you need three components:

  1. Take a full backup of the user database from your source SQL Server:
BACKUP DATABASE [dms_sample] TO DISK = 'C:\Backups\dms_sample.bak' WITH STATS= 10, 
COMPRESSION;
  1. Copy the backup to the S3 bucket. To perform this copy, you can either use the AWS Command Line Interface (AWS CLI), as shown in the following screenshot, or the Amazon S3 console.
  2. On the Amazon RDS instance, restore the full backup with no recovery:
exec msdb.dbo.rds_restore_database
@restore_db_name='dms_sample',
@s3_arn_to_restore_from='arn:aws:s3:::testsqlserver4321/dms_sample.bak',
@type='FULL',
@with_norecovery=1;
  1. Take a log backup of the user database from your source SQL Server:
BACKUP LOG [dms_sample] TO DISK = 'C:\Backups\dms_sample.trn' WITH STATS= 10, 
COMPRESSION;

This saves time in the sync process when the replication is being initialized.

  1. Copy the log backup over to the S3 bucket. To perform this copy, you can either use the AWS CLI, as shown in the following screenshot, or the Amazon S3 console.
  2. On the Amazon RDS instance, restore the log backup with recovery to bring the database online:
exec msdb.dbo.rds_restore_log
@restore_db_name='dms_sample',
@s3_arn_to_restore_from='arn:aws:s3:::testsqlserver4321/dms_sample.trn',
@with_norecovery=0; 

If a row is inserted by a user at the publisher and that insert satisfies a constraint on the table, it might not be required to enforce the same constraint when the row is inserted by a replication agent at the subscriber. The same is applicable for triggers.

  1. Disable all triggers and constraints on the subscriber database in Amazon RDS for SQL Server:
USE [dms_sample]                                                                         
GO
--Disable all triggers on the subscriber RDS database:
SELECT 'ALTER TABLE ' + SCHEMA_NAME(schema_id)+ '.' + [name] + ' DISABLE TRIGGER ALL' FROM SYS.TABLES WHERE TYPE = 'U';
--Disable all constraints on the subscriber RDS database:
SELECT 'ALTER TABLE ' + SCHEMA_NAME(schema_id)+ '.' + [name] + ' NOCHECK CONSTRAINT ALL' FROM SYS.TABLES WHERE TYPE = 'U';

We can’t use the GUI to create the subscription because we’re initializing from backup and these options aren’t available in SSMS. Instead, we need to use T-SQL.

  1. Create a push subscription on the publisher by running the following query on the source SQL Server instance:
-- Add the Subscription on the Publisher 
USE dms_sample
GO

EXEC sp_addsubscription
@publication = N'OnPrem-RDS',
@subscriber = N'rds-sqlserver.XXXXX.us-east-1.rds.amazonaws.com', -- RDS Endpoint
@destination_db = N'dms_sample',
@sync_type = N'initialize with backup',
@backupdevicetype= 'Disk',
@backupdevicename='C:\Backups\dms_sample.trn', --Location of the T-Log backup
@subscription_type = N'push',
@update_mode = N'read only',
@article = N'all',
@subscriber_type = 0
GO
 
EXEC sp_addpushsubscription_agent @publication = N'OnPrem-RDS'
,@subscriber = N'rds-sqlserver.XXXXX.us-east-1.rds.amazonaws.com'
,@subscriber_db = N'dms_sample'
,@job_login = NULL
,@job_password = NULL
,@subscriber_security_mode = 0
,@subscriber_login = N'admin'   --SQL Login
,@subscriber_password = N'XXXX'  
,@frequency_type = 64
,@frequency_interval = 0
,@frequency_relative_interval = 0
,@frequency_recurrence_factor = 0
,@frequency_subday = 0
,@frequency_subday_interval = 0
,@active_start_time_of_day = 0
,@active_end_time_of_day = 235959
,@active_start_date = 20210319
,@active_end_date = 99991231
,@enabled_for_syncmgr = N'False'
,@dts_package_location = N'Distributor'
GO
  1. Monitor the replication status using the replication monitor at the publisher.

Additionally, you can view undistributed commands using the following query:

USE distribution; 
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT DISTINCT
@@SERVERNAME
,A.Article
,P.Publication
,Agents.[name]
,UndelivCmdsInDistDB
,DelivCmdsInDistDB,  
UndelivCmdsInDistDB + DelivCmdsInDistDB  AS TotalTrans
FROM dbo.MSdistribution_status
AS s
INNER JOIN dbo.MSdistribution_agents AS Agents ON Agents.[id] = S.agent_id
INNER JOIN dbo.MSpublications AS P ON P.publication = Agents.publication
INNER JOIN dbo.MSarticles AS A ON A.article_id = S.article_id and P.publication_id = A.publication_id
 WHERE UndelivCmdsInDistDB <> 0 AND Agents.subscriber_db NOT LIKE 'virtual'
ORDER BY UndelivCmdsInDistDB DESC
OPTION(RECOMPILE);

Cut over to Amazon RDS for SQL Server

When the source and target databases are in sync, stop activity on your source SQL Server and ensure that replication is complete (there are no undistributed commands in the distribution database). Then you can cut over your production workloads to the RDS for SQL Server instance. You can use the following code to clean up replication and enable the constraints and triggers that were previously disabled:

USE dms_sample
GO
DECLARE @publication AS sysname;
DECLARE @subscriber AS sysname;
SET @publication = N'OnPrem-RDS';
SET @subscriber = N'rds-sqlserver.XXXXX.us-east-1.rds.amazonaws.com';
 
USE [dms_sample]
EXEC sp_dropsubscription
  @publication = @publication,
  @article = N'all',
  @subscriber = @subscriber;
GO
 
EXEC sp_subscription_cleanup
    @publisher = N'EC2AMAZ-15OM00V', --Hostname of Source SQL Server 
    @publisher_db = N'dms_sample',
    @publication = N'OnPrem-RDS';

DECLARE @publicationDB AS sysname;
DECLARE @publication AS sysname;
SET @publicationDB = N'dms_sample';
SET @publication = N'OnPrem-RDS';
 
-- Remove a transactional publication.
USE [dms_sample]
EXEC sp_droppublication @publication = @publication;
 
-- Remove replication objects from the database.
USE [master]
EXEC sp_replicationdboption
  @dbname = @publicationDB,
  @optname = N'publish',
  @value = N'false';
GO 

--Enable all triggers on the subscriber RDS database:
SELECT 'ALTER TABLE ' + SCHEMA_NAME(schema_id)+ '.' + [name] + ' ENABLE TRIGGER ALL' FROM SYS.TABLES WHERE TYPE = 'U';

--Enable all constraints on subscriber RDS database:
SELECT 'ALTER TABLE ' + SCHEMA_NAME(schema_id)+ '.' + [name] + ' WITH CHECK CHECK CONSTRAINT all' FROM SYS.TABLES WHERE TYPE = 'U';

Summary

In this post, we showed you how to configure transactional replication with native backup and restore that replicates data from an on-premises SQL Server or SQL Server on an EC2 instance. You can use this strategy to migrate your large mission-critical workloads to an RDS for SQL Server instance with minimal to near-zero downtime.

If you have any comments or questions, please share them in the comments section.


About the Author

Sid Vantair is a Cloud Support Engineer with Amazon Web Services. In his spare time, he loves to spend time with his family and exploring new travel destinations.