Microsoft Workloads on AWS

How to set up disaster recovery for SQL Server Always On Availability Groups using AWS Elastic Disaster Recovery

Traditional disaster recovery methods for Microsoft SQL Server require duplication of infrastructure, operating systems, licensing and other third-party applications, rendering it a highly expensive IT project – even prohibitively expensive for some organizations.

AWS Elastic Disaster Recovery (AWS DRS) provides organizations with a more modern approach to protecting Microsoft SQL Server environments: cloud-based disaster recovery into AWS. AWS DRS not only leverages the cloud to significantly reduce disaster recovery infrastructure costs, it also enables recovery of Microsoft SQL Server by replicating the operating system and all of the database, application, and system files running on the machine.

In this post, we will demonstrate how to set up a disaster recovery (DR) solution for SQL Server Always On availability groups (AG) deployed on Amazon Elastic Compute Cloud (EC2) platform to another AWS Region. You can use similar steps to set up DR from on premises to AWS using DRS.

Why would you want to use AWS DRS? Using AWS DRS means:

  • You don’t have to pay any SQL Server licensing costs until you fail over.
  • You can BYOL or use a LI instance for SQL Server in the DR region.
  • You can utilize a single DR solution for SQL Server and other workloads.

Solution Overview

As outlined in this prescriptive guidance, the diagram in Figure 1 illustrates the procedure for replicating and protecting an on-premises SQL Server database to AWS using AWS DRS.

AWS Elastic Disaster Recovery Architecture

Figure 1: AWS Elastic Disaster Recovery Architecture

AWS DRS can protect individual nodes of the SQL Server Always On availability group; however, when brought up on AWS, the DR copy of the cluster will be launched as a group of individual SQL Server instances.

Note: You need to make sure Active Directory exists in AWS for your Always On availability group setup. To learn more about how to protect a domain-joined workload with AWS DRS, please read this blog post.

This solution will work for both the SQL Server Standard edition and SQL Server Enterprise edition for any supported version of SQL Server.

Prerequisites

  • A VPC in a secondary Region
  • Two availability zones (AZs) in a secondary Region with at least a subnet in each AZ
  • Active directory domain available in secondary Region

Here are the high-level steps for the proposed solution:

  1. Set up AWS Elastic Disaster Recovery
  2. Install AWS DRS agent on source machines
  3. Replicate using AWS DRS
  4. Test failover at DR site

Set up AWS Elastic Disaster Recovery

As outlined in the AWS DRS Getting Started section:

  1. Log into the AWS DRS Console.
  2. Configure Replication Settings.

Install AWS DRS agent at source machines

  1. Generate required AWS credentials for agent install.
  2. Install agent on source SQL Server (Linux or Windows) server.

Agent installation process captured in Figure 2.

Figure 2 DRS Agent installation setup

Figure 2: DRS Agent installation setup

Replication using AWS DRS

Verify that the server is added to the AWS DRS console. If the server is successfully added, on the console you will see an ETA for the server to be replicated to AWS. Based on the size of the disk(s) and available network bandwidth, the console will provide the estimated time to migrate, e.g., in this case, a 100G disk took 2 hours for the initial replication.

Figure 3 DRS Replication initialization

Figure 3: AWS DRS Replication initialization

Test Failover at DR Site

Figure 4 Test Failover

Figure 4: Test Failover

When you are ready for testing, select the source machine that you have configured and update the launch settings for the target instance. Since you will be testing the replication to validate the process and not the performance, you will first change the blueprint to select cost-optimized components, then you will modify different destination options, such as instance type, target subnet, EBS volume type, network settings etc.

1. You can modify the launch settings and update the EC2 launch template to change instance type, target subnet, EBS volume types, network settings, etc.

EC2 Launch template

Figure 5: EC2 Launch template

Note: When performing a drill, consider using a launch template that isolates your instances by using a separate subnet or different security groups to avoid network conflicts. If launching a recovery, due to a DR event, ensure that you have selected the proper version of the launch template that meets your production requirements.

2. You can modify disk type and change to gp3/io1 or launch type to dedicated host/dedicated instance/on-demand server. There is a price and performance consideration with each storage type for production implementation, so please select the option based on your requirements.

3. Once all settings are updated, you can launch the target instance.

Initiate drill

Figure 6 : Initiate drill

4. Select Initiate Drill under Initiate recovery job and select the appropriate Point in Time snapshot. View job details and check the progress of the job.

View job details

View recovery job details

Figure 7: View recovery job details

3. After successful completion of the Job status, login to the AWS Console in the DR Region and validate the EC2 SQL Server Primary and Secondary servers.

Check EC2 instances at DR region

Figure 8: Check EC2 instances at DR region

4. You can also check the status of recovery instance.

Recovery server status

Recovery instance details

Figure 9: Recovery instance details

5. Add your IP address or the IP address of Bastian host in Security group to allow RDP to this new instance.You can RDP and make connection to Primary SQL Server instance.

6. You can connect to the DR region host using the same username and password as the Primary server. If you are using Windows authentication, you can use the same username and password to connect to new server.

7. Open SQL Server Management Studio and connect to the local SQL Server database.

Check SQL Server instance at Secondary region

Figure 10: Check SQL Server instance at Secondary region

Check SQL Server instance at Secondary region.

8. After you have created the initial project and started replication, you can add some data on Primary server. For example, we have updated the city with “Test city” in some rows;the same incremental changes are replicated to your disaster recovery instance. When you launch the new instance as of point in time, as you can see the most recent data changed in the source SQL Server tables are also available in destination SQL Server tables. E.g. updated rows with city name column are replicated to secondary server, Block level replication incrementally synchronized these changes with the target.

Run queries to validate data using SSMS

Figure 11: Run queries to validate data using SSMS

9. You can connect to the DR server using SQL Server Management Studio and verify that all databases, logins, jobs, master, and MSDB objects are replicated. Here is the DR Server screenshot:

Verify SQL Server objects are replicated

Figure 12: Verify SQL Server objects are replicated

10. Here is the source region Primary server screenshot. As you can see, the IP address of both servers are different, but the same instances/databases are replicated on the DR server.

Check Primary server ip address and database status

Figure 13: Check Primary server ip address and database status

You now have a secondary server ready with a SQL Server installation, but you need to set up the high availability (HA) for the secondary server. After the migration, SQL Server on EC2 will work as a standalone SQL Server node.

After the SQL nodes are replicated to AWS, you will need to deploy the EC2 instances and reconfigure the IP addresses for the SQL Always On Listener. Follow our documentation and article about configuring SQL Server Always On in AWS.

Licensing

When migrating Windows machines into AWS, you can Bring Your Own (BYOL) Windows License by setting up a Dedicated Host in the EC2 launch template. When this is done, the licenses from the source machine will be automatically transferred to the target machine in AWS.

Conclusion:

You were able to successfully set up DR for SQL Server deployment with Always On availability group with a failover cluster in one Region using AWS Elastic Disaster Recovery in another Region. SQL Server databases were consistent without any data loss.

With AWS Elastic Disaster Recovery, you will save on licensing costs for both SQL Server and Windows Server until a DR event occurs.


AWS can help you assess how your company can get the most out of cloud. Join the millions of AWS customers that trust us to migrate and modernize their most important applications in the cloud. To learn more on modernizing Windows Server or SQL Server, visit Windows on AWS. Contact us to start your migration journey today.