Microsoft Workloads on AWS

How to deploy a SQL Server failover cluster with Amazon EBS Multi-Attach on Windows Server

Updated on February 15, 2024.

In this blog post, we’ll walk you through creating a Microsoft SQL Server failover cluster on Amazon Web Services (AWS) with Amazon Elastic Block Store (Amazon EBS) Multi-Attach, using the recently introduced Amazon EBS Multi-Attach on io2 volumes with persistent reservations feature. We will also highlight the potential cost savings this approach offers.

Introduction

Microsoft SQL Server provides two main high availability and disaster recovery strategies: Failover Cluster Instances (FCIs) and Availability Groups (AGs). Both options use the Windows Server Failover Cluster (WSFC) but have distinct purposes and features.

A Failover Cluster Instance (FCI) is a SQL Server instance installed across multiple nodes in a Windows Server Failover Cluster (WSFC). If one node fails, all SQL Server components, including system databases, logins, agent jobs, and certificates, fail over to another node. Deploying SQL Server FCIs requires shared storage; previously, we had to rely on Amazon FSx for NetApp ONTAP or Amazon FSx for Windows File Server, as well as some third-party shared storage solutions.

Introduced on September 18, 2023, NVMe Reservations for Multi-Attach on Amazon EBS io2 volumes provides another way to implement shared storage for Windows Server failover clusters. Amazon EBS io2 volumes are automatically replicated within their Availability Zone (AZ), offering 99.999% durability.

NVMe Reservations is a set of industry-standard storage fencing protocols that enable you to manage access to a block device shared between multiple instances. This is similar to SCSI Persistent Reservations (SCSI PR) used in on-premises Storage Area Network (SAN) devices. To deploy Windows Server failover clusters using Amazon EBS io2 volumes, you must use the latest Windows drivers that translate SCSI reservation commands to NVMe reservation commands.

Amazon EBS Multi-Attach with NVMe Reservations allows the creation of SQL Server FCI with Amazon EBS io2 volumes as the shared storage on the Windows Server failover clusters.

Solution overview

In the architecture diagram shown in Figure 1, the SQL Server failover cluster uses Amazon EBS io2 Multi-Attach volumes with persistent reservations as shared storage for high availability.

Figure 1 Architecture Overview

Prerequisites

Implementing a SQL Server failover cluster on Windows Server using Amazon EBS Multi-Attach requires the following prerequisites:

  1. Amazon Virtual Private Cloud (Amazon VPC) with at least one AZ and two private subnets in the same AZ.
  2. Security groups to ensure the secure flow of traffic between the instances deployed in the Amazon VPC.
  3. Two or more Amazon EC2 instances running Windows Server 2016 or later, placed in separate subnets within the same AZ. The instances should either be launched using an AMI released after August 2023 or have the latest drivers installed.
  4. An existing or new Active Directory (AD) deployment with network access to support the Windows Server failover cluster and SQL Server deployment. AD can be deployed using AWS Directory Service for Microsoft Active Directory, or on Amazon EC2 using AWS Launch Wizard for Active Directory.
  5. An AD domain user with the necessary permissions to set up a failover cluster.
  6. An AD domain user for the SQL Server service account.
  7. A spread placement group in which the instances are launched; this ensures that each instance is placed on distinct hardware.
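
The spread placement group from the last prerequisite can also be created from the command line. The following is a minimal sketch, not part of the original walkthrough. It assumes the AWS CLI is installed and configured with credentials and a default Region on the machine running PowerShell, and the group name `sql-fci-spread` is a hypothetical placeholder.

```powershell
# Assumption: the AWS CLI is installed and configured (credentials + default Region).
# 'sql-fci-spread' is a hypothetical name; choose your own.
$placementGroup = 'sql-fci-spread'

# Create a placement group that spreads instances across distinct hardware.
aws ec2 create-placement-group --group-name $placementGroup --strategy spread

# Confirm the group exists and uses the spread strategy.
aws ec2 describe-placement-groups `
    --group-names $placementGroup `
    --query 'PlacementGroups[].[GroupName,Strategy,State]' `
    --output table
```

Launch the cluster nodes into this group so that a single hardware failure is less likely to affect more than one node.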

Walkthrough

This walkthrough will show you how to use Amazon EBS io2 Multi-Attach volumes for SQL Server failover clusters in a single AZ.

Step 1. Creating Amazon EC2 instances

  • Create two or more Amazon EC2 instances in separate subnets.
  • Under Advanced network configuration or on Manage IP addresses of an existing instance, assign two secondary IP addresses, as shown in Figure 2.

Figure 2 Secondary IP addresses
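
If you prefer scripting to the console, the secondary IP addresses can be assigned to an existing instance with the AWS CLI. This is a hedged sketch rather than the post's own method: it assumes the AWS CLI is installed and configured, that each instance has a single network interface, and the instance ID shown is a placeholder.

```powershell
# Assumption: the AWS CLI is installed and configured; the instance ID is a placeholder.
$instanceId = 'i-0123456789abcdef0'

# Look up the instance's first network interface
# (assumes the instance has only one interface).
$eniId = aws ec2 describe-instances `
    --instance-ids $instanceId `
    --query 'Reservations[0].Instances[0].NetworkInterfaces[0].NetworkInterfaceId' `
    --output text

# Ask EC2 to assign two secondary private IP addresses from the subnet range.
aws ec2 assign-private-ip-addresses `
    --network-interface-id $eniId `
    --secondary-private-ip-address-count 2

# List the addresses so you can note which one to use for the cluster
# and which one for the SQL Server network name.
aws ec2 describe-network-interfaces `
    --network-interface-ids $eniId `
    --query 'NetworkInterfaces[0].PrivateIpAddresses[].[PrivateIpAddress,Primary]' `
    --output table
```

Repeat this for every node. Later steps use the first secondary address for the Windows cluster and the second for the SQL Server role.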

Step 2. Provisioning and attaching Amazon EBS io2 volumes to your instances

  • Create the required Amazon EBS io2 volumes, ensuring that Multi-Attach is enabled for Cluster Quorum (at least 4 GiB), Data, Logs, and any additional storage needed.
  • Attach the volumes to the instances created in Step 1.
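
The two actions above can be sketched with the AWS CLI as follows. The Availability Zone, instance IDs, size, IOPS, and device name are illustrative assumptions, not values from this post; size the Data and Logs volumes to your workload.

```powershell
# Assumptions: the AWS CLI is installed and configured; all values below are placeholders.
$az          = 'us-east-1a'
$instanceIds = @('i-0123456789abcdef0', 'i-0fedcba9876543210')

# Create a 4 GiB io2 volume with Multi-Attach enabled (for example, the quorum disk).
$volumeId = aws ec2 create-volume `
    --availability-zone $az `
    --volume-type io2 `
    --size 4 `
    --iops 1000 `
    --multi-attach-enabled `
    --query 'VolumeId' `
    --output text

# Wait until the volume is ready, then attach it to every cluster node.
aws ec2 wait volume-available --volume-ids $volumeId
foreach ($id in $instanceIds) {
    aws ec2 attach-volume --volume-id $volumeId --instance-id $id --device xvdf
}
```

Run the same sequence once per shared disk, using a different device name for each volume on a given instance.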

Step 3. Initialize and format volumes

  • After attaching the volume(s), RDP into one of the Windows Server EC2 instances as a user with permissions to Create Failover Cluster.
  • Open the Disk Management tool. You’ll see the attached volume as a raw, unallocated disk.
  • Initialize the disk.
  • Online the disk.
  • Create a new simple volume.
  • Format the volume using NTFS. Assign a drive letter.
  • Repeat for any additional Amazon EBS io2 volumes you’ve created.
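
The Disk Management steps above can also be performed with the built-in Windows Storage cmdlets. This is a sketch under the assumption that every RAW disk on the node is one of the newly attached shared volumes; the volume labels are hypothetical. It is destructive, so review the output of `Get-Disk` first and run it on one node only.

```powershell
# Run in an elevated PowerShell session on ONE node only.
# Assumption: every disk reported as RAW is a newly attached shared io2 volume.
$rawDisks = Get-Disk | Where-Object PartitionStyle -eq 'RAW'

foreach ($disk in $rawDisks) {
    # Bring the disk online if Windows left it offline.
    if ($disk.IsOffline) {
        Set-Disk -Number $disk.Number -IsOffline $false
    }

    # Initialize with a GPT partition table.
    Initialize-Disk -Number $disk.Number -PartitionStyle GPT

    # Create one partition spanning the disk, assign a drive letter, and format as NTFS.
    New-Partition -DiskNumber $disk.Number -UseMaximumSize -AssignDriveLetter |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel "Shared$($disk.Number)" -Confirm:$false
}
```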

Step 4. Configure SCSI persistent reservations

  • On all Amazon EC2 instances used for the failover cluster, open Windows PowerShell ISE and select Run as Administrator. Make sure to open the Script Pane, where you will paste the scripts.
  • Enable SCSI persistent reservations with the following command, which sets the EnableSCSIPersistentReservations to a value of 1 in the registry:
# Enable translation of SCSI persistent reservations in the AWS NVMe driver
$registryPath = "HKLM:\SYSTEM\CurrentControlSet\Services\AWSNVMe\Parameters\Device"
Set-ItemProperty -Path $registryPath -Name EnableSCSIPersistentReservations -Value 1
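
Because the setting must be present on every node, it can help to read the value back before continuing. The check below is a small sketch and not part of the original procedure. It uses the same registry path and value name as the command above.

```powershell
# Read the value back on the local node to confirm the registry change took effect.
$registryPath = 'HKLM:\SYSTEM\CurrentControlSet\Services\AWSNVMe\Parameters\Device'
$setting = Get-ItemProperty -Path $registryPath -Name EnableSCSIPersistentReservations -ErrorAction SilentlyContinue

if ($setting.EnableSCSIPersistentReservations -eq 1) {
    Write-Output "SCSI persistent reservations are enabled on $env:COMPUTERNAME."
} else {
    Write-Warning "EnableSCSIPersistentReservations is not set to 1 on $env:COMPUTERNAME."
}
```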

Step 5. Create failover cluster

  • In the Windows PowerShell ISE, install clustering features:
Install-WindowsFeature -Name RSAT-AD-PowerShell,Failover-Clustering -IncludeManagementTools
  • Reboot instances after the PowerShell command completes.
  • After the reboot, open Windows PowerShell ISE on the first Amazon EC2 instance used for failover clustering and select Run as Administrator. Make sure to open up the Script Pane, where you will paste the scripts.
  • Run the cluster validation process:
Test-Cluster -Node Node1, Node2
  • Review cluster validation report, ensuring that storage checks are successful. The validation report is created in the C:\Windows\Cluster\Reports directory on the failover cluster host.
  • From the AWS console, copy the first secondary IP address of each node and use them in the create cluster command:
New-Cluster -Name ClusterName -Node Node1, Node2 -StaticAddress Node1SecondaryIP, Node2SecondaryIP
  • During the SQL Server installation, the setup process will create the SQL Server network name. This may fail if the organizational unit (OU) where the cluster is located doesn’t have permission to create child objects. You can pre-stage the SQL Server network name or provide permissions to create a SQL Server network name.
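
Once `New-Cluster` completes, a quick check from any node can confirm that both nodes joined and that the shared disks were picked up. This is a minimal sketch using cmdlets from the FailoverClusters module installed earlier in this step; the exact resource names in the output depend on your environment.

```powershell
# Run in an elevated PowerShell session on a cluster node.
Import-Module FailoverClusters

# Both nodes should report a state of Up.
Get-ClusterNode | Format-Table Name, State

# The shared io2 volumes should appear as Physical Disk resources.
Get-ClusterResource | Format-Table Name, State, OwnerGroup, ResourceType

# Shows which resource, if any, is acting as the quorum witness.
Get-ClusterQuorum
```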

Step 6. Install new SQL Server failover cluster

  • On the first Amazon EC2 instance used for failover clustering, run the SQL Server Setup and select a New SQL Server Failover Cluster Installation.
  • Follow installation instructions to install SQL Server.
  • You may get a message on the Cluster Resource Group page, as shown in Figure 3, that Available Storage and Cluster Group can’t be used. This is expected, because clustering reserves the Cluster Quorum Disk and Cluster Groups.

Figure 3 Cluster Resource Group

  • Select the shared disks on the Cluster Disk Selection page, as shown in Figure 4:

Figure 4 Cluster Disk Selection

  • On the Cluster Network Configuration page, uncheck DHCP and specify the second secondary IP address of the current node, as shown in Figure 5.

Figure 5 Cluster Network Configuration

Step 7. Add a node to a SQL Server failover cluster

  • On each secondary Amazon EC2 instance used for failover clustering, run the SQL Server Setup, and then select Add node to a SQL Server failover cluster.
  • Follow installation instructions to install SQL Server.
  • On the Cluster Network Configuration page, uncheck DHCP and specify the second secondary IP address of the current node, as shown in Figure 6. Click Next.

Figure 6 Cluster Network Configuration

  • Select Yes, confirming that you are deploying SQL Server in a multi-subnet configuration, as shown in Figure 7.

Figure 7 Multi-subnet warning

  • After the installation, open Failover Cluster Manager and connect to the local cluster. Under the Roles node, select the SQL Server role and validate that its resources are online, as shown in Figure 8. Note that only one IP address will be online, because the nodes are in different subnets.

Figure 8 Failover Cluster Manager – SQL Server Role
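
To confirm that the instance actually fails over, you can move the SQL Server role to the other node and re-check its resources. The sketch below assumes a default SQL Server instance, whose role is typically named `SQL Server (MSSQLSERVER)`, and a target node named `Node2`. Both are assumptions, so adjust them to match what Failover Cluster Manager shows.

```powershell
# Run in an elevated PowerShell session on a cluster node.
Import-Module FailoverClusters

# Assumptions: default-instance role name and a node called Node2; adjust as needed.
$sqlRole    = 'SQL Server (MSSQLSERVER)'
$targetNode = 'Node2'

# Show which node currently owns the role.
Get-ClusterGroup -Name $sqlRole | Format-Table Name, OwnerNode, State

# Move the role to the other node (this briefly interrupts SQL Server connections).
Move-ClusterGroup -Name $sqlRole -Node $targetNode

# Verify the role's resources after the move; the disks and SQL Server should be Online.
Get-ClusterGroup -Name $sqlRole | Get-ClusterResource | Format-Table Name, State, ResourceType
```

After the move, the IP address resource for the new owner's subnet should be online and the other one offline, matching the multi-subnet behavior noted above.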

Cleanup

  • Terminate Amazon EC2 instances:
    • In the AWS Management Console, navigate to the “Instances” section.
    • Locate and select the EC2 instances used for the failover cluster.
    • Choose the “Instance State” menu and click “Terminate.”
    • Confirm the termination when prompted.
  • Delete Amazon EBS volumes:
    • Navigate to the AWS Management Console.
    • Under the “Elastic Block Store” section, select “Volumes.”
    • Locate the Amazon EBS volumes created for this setup and select them.
    • Choose the “Actions” menu and click “Delete Volume.”
    • Confirm the deletion when prompted.
  • If you deployed Managed AD:
    • Navigate to the AWS Management Console.
    • From Services, select Directory Service.
    • Locate the Directory created for this setup and select it.
    • Choose the “Actions” menu and click “Delete Directory.”
  • If you deployed AD using AWS Launch Wizard:
    • Navigate to the AWS Management Console.
    • In the search box, search for Launch Wizard.
    • Select Microsoft Active Directory.
    • Locate the Directory created for this setup and select it.
    • Choose the “Actions” menu and click “Delete Directory.”
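
The instance and volume cleanup above can also be scripted. This is a hedged sketch that assumes the AWS CLI is installed and configured; all IDs are placeholders, and the directory deletion applies only if you deployed AWS Managed Microsoft AD for this walkthrough. These operations are irreversible.

```powershell
# Assumptions: the AWS CLI is installed and configured; every ID below is a placeholder.
$instanceIds = @('i-0123456789abcdef0', 'i-0fedcba9876543210')
$volumeIds   = @('vol-0123456789abcdef0', 'vol-0fedcba9876543210')

# Terminate the cluster nodes and wait until termination completes,
# which also detaches the shared volumes.
aws ec2 terminate-instances --instance-ids $instanceIds
aws ec2 wait instance-terminated --instance-ids $instanceIds

# Delete each shared io2 volume.
foreach ($volumeId in $volumeIds) {
    aws ec2 delete-volume --volume-id $volumeId
}

# Only if you created a managed directory for this setup (placeholder ID):
# aws ds delete-directory --directory-id d-0123456789
```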

Conclusion

We walked you through deploying a Microsoft SQL Server failover cluster using Amazon EBS io2 volumes on Windows Server. This solution saves cost by requiring just one set of volumes for the failover cluster, whereas Always On Availability Groups need separate volumes for each high availability node. It also simplifies failover cluster setup, because you can build the cluster directly on Amazon EBS io2 volumes. While Amazon EBS io2 volumes can only be attached to instances in the same AZ, you can increase availability by deploying the FCI in a spread placement group, ensuring that instances do not share underlying physical hardware.



Rafet Ducic

Rafet Ducic is a Senior Solutions Architect at Amazon Web Services (AWS). He applies his more than 20 years of technical experience to help Global Industrial and Automotive customers transition their workloads to the cloud cost-efficiently and with optimal performance. With domain expertise in Database Technologies and Microsoft licensing, Rafet is adept at guiding companies of all sizes toward reduced operational costs and top performance standards.