AWS Compute Blog

Automating the Creation of Consistent Amazon EBS Snapshots with Amazon EC2 Systems Manager (Part 2)

Nicolas Malaval, AWS Professional Consultant

In my previous blog post, I discussed the challenge of creating Amazon EBS snapshots when you cannot turn off the instance during backup because this might exclude any data that has been cached by any applications or the operating system. I showed how you can use EC2 Systems Manager to run a script remotely on EC2 instances to prepare the applications and the operating system for backup and to automate the creating of snapshots on a daily basis. I gave a practical example of creating consistent Amazon EBS snapshots of Amazon Linux running a MySQL database.

In this post, I walk you through another practical example to create consistent snapshots of a Windows Server instance with Microsoft VSS (Volume Shadow Copy Service).

Understanding the example

VSS (Volume Shadow Copy Service) is a Windows built-in service that coordinates backup of VSS-compatible applications (SQL Server, Exchange Server, etc.) to flush and freeze their I/O operations.

The VSS service initiates and oversees the creation of shadow copies. A shadow copy is a point-in-time and consistent snapshot of a logical volume. For example, C: is a logical volume, which is different than an EBS snapshot. Multiple components are involved in the shadow copy creation:

  • The VSS requester requests the creation of shadow copies.
  • The VSS provider creates and maintains the shadow copies.
  • The VSS writers guarantee that you have a consistent data set to back up. They flush and freeze I/O operations, before the VSS provider creates the shadow copies, and release I/O operations, after the VSS provider has completed this action. There is usually one VSS writer for each VSS-compatible application.

I use Run Command to execute a PowerShell script on the Windows instance:

$EbsSnapshotPsFileName = "C:/tmp/ebsSnapshot.ps1"

$EbsSnapshotPs = New-Item -Type File $EbsSnapshotPsFileName -Force

Add-Content $EbsSnapshotPs '$InstanceID = Invoke-RestMethod -Uri http://169.254.169.254/latest/meta-data/instance-id'
Add-Content $EbsSnapshotPs '$AZ = Invoke-RestMethod -Uri http://169.254.169.254/latest/meta-data/placement/availability-zone'
Add-Content $EbsSnapshotPs '$Region = $AZ.Substring(0, $AZ.Length-1)'
Add-Content $EbsSnapshotPs '$Volumes = ((Get-EC2InstanceAttribute -Region $Region -Instance "$InstanceId" -Attribute blockDeviceMapping).BlockDeviceMappings.Ebs |? {$_.Status -eq "attached"}).VolumeId'
Add-Content $EbsSnapshotPs '$Volumes | New-EC2Snapshot -Region $Region -Description " Consistent snapshot of a Windows instance with VSS" -Force'
Add-Content $EbsSnapshotPs 'Exit $LastExitCode'

First, the script writes in a local file named ebsSnapshot.ps1 a PowerShell script that creates a snapshot of every EBS volume attached to the instance.

$EbsSnapshotCmdFileName = "C:/tmp/ebsSnapshot.cmd"
$EbsSnapshotCmd = New-Item -Type File $EbsSnapshotCmdFileName -Force

Add-Content $EbsSnapshotCmd 'powershell.exe -ExecutionPolicy Bypass -file $EbsSnapshotPsFileName'
Add-Content $EbsSnapshotCmd 'exit $?'

It writes in a second file named ebsSnapshot.cmd a shell script that executes the PowerShell script created earlier.

$VssScriptFileName = "C:/tmp/scriptVss.txt"
$VssScript = New-Item -Type File $VssScriptFileName -Force

Add-Content $VssScript 'reset'
Add-Content $VssScript 'set context persistent'
Add-Content $VssScript 'set option differential'
Add-Content $VssScript 'begin backup'

$Drives = Get-WmiObject -Class Win32_LogicalDisk |? {$_.VolumeName -notmatch "Temporary" -and $_.DriveType -eq "3"} | Select-Object DeviceID

$Drives | ForEach-Object { Add-Content $VssScript $('add volume ' + $_.DeviceID + ' alias Volume' + $_.DeviceID.Substring(0, 1)) }

Add-Content $VssScript 'create'
Add-Content $VssScript "exec $EbsSnapshotCmdFileName"
Add-Content $VssScript 'end backup'

$Drives | ForEach-Object { Add-Content $VssScript $('delete shadows id %Volume' + $_.DeviceID.Substring(0, 1) + '%') }

Add-Content $VssScript 'exit'

It creates a third file named scriptVss.txt containing DiskShadow commands. DiskShadow is a tool included in Windows Server 2008 and above, that exposes the functionality offered by the VSS service. The script creates a shadow copy of each logical volume stored on EBS, runs the shell script ebsSnapshot.cmd to create a snapshot of underlying EBS volumes, and then deletes the shadow copies to free disk space.

diskshadow.exe /s $VssScriptFileName
Exit $LastExitCode

Finally, it runs DiskShadow in script mode.

This PowerShell script is contained in a new SSM document and the maintenance window executes a command from this document every day at midnight on every Windows instance that has a tag “ConsistentSnapshot” equal to “WindowsVSS”.

Implementing and testing the example

First, use AWS CloudFormation to provision some of the required resources in your AWS account.

  1. Open Create a Stack to create a CloudFormation stack from the template.
  2. Choose Next.
  3. Enter the ID of the latest AWS Windows Server 2016 Base AMI available in the current region (see Finding a Windows AMI) in pWindowsAmiId.
  4. Follow the on-screen instructions.

CloudFormation creates the following resources:

  • A VPC with an Internet gateway attached.
  • A subnet on this VPC with a new route table, to enable access to the Internet and therefore to the AWS APIs.
  • An IAM role to grant an EC2 instance the required permissions.
  • A security group that allows RDP access from the Internet, as you need to log on to the instance later on.
  • A Windows instance in the subnet with the IAM role and the security group attached.
  • A SSM document containing the script described in the section above to create consistent EBS snapshots.
  • Another SSM document containing a script to restore logical volumes to a consistent state, as explained in the next section.
  • An IAM role to grant the maintenance window the required permissions.

After the stack creation completes, choose Outputs in the CloudFormation console and note the values returned:

  • IAM role for the maintenance window
  • Names of the two SSM documents

Then, manually create a maintenance window, if you have not already created it. For detailed instructions, see the “Example” section in the previous blog post.

After you create a maintenance window, assign a target where the task will run:

  1. In the Maintenance Window list, choose the maintenance window that you just created.
  2. For Actions, choose Register targets.
  3. For Owner information, enter WindowsVSS.
  4. Under the Select targets by section, choose Specifying tags. For Tag Name, choose ConsistentSnapshot. For Tag Value, choose WindowsVSS.
  5. Choose Register targets.

Finally, assign a task to perform during the window:

  1. In the Maintenance Window list, choose the maintenance window that you just created.
  2. For Actions, choose Register tasks.
  3. For Document, select the name of the SSM document that was returned by CloudFormation, with which to create snapshots.
  4. Under the Target by section, choose the target that you just created.
  5. Under the Role section, select the IAM role that was returned by CloudFormation.
  6. Under Execute on, for Targets, enter 1. For Stop after, enter 1 errors.
  7. Choose Register task.

You can view the history either in the History tab of the Maintenance Windows navigation pane of the Amazon EC2 console, as illustrated on the following figure, or in the Run Command navigation pane, with more details about each command executed.

Restoring logical volumes to a consistent state

DiskShadow―the VSS requester in this case―uses the Windows built-in VSS provider. To create a shadow copy, this built-in provider does not make a complete copy of the data. Instead, it keeps a copy of a block data before a change overwrites it, in a dedicated storage area. The logical volume can be restored to its initial consistent state, by combining the actual volume data with the initial data of the changed blocks.

The DiskShadow command create instructs the VSS service to proceed with the creation of shadow copies, including the release of I/O operations by the VSS writers after the shadow copies are created. Therefore, the EBS snapshots created by the next command exec may not be fully consistent.

Note: A workaround could be to build your own VSS provider in charge of creating EBS snapshots. Doing so would enable the EBS snapshots to be created before I/O operations are released. We will not develop this solution in this blog post.

Therefore, you need to “undo” any I/O operations that may have happened between the moment when the shadow copy was created and the moment when the EBS snapshots were created.

A solution consists of creating an EBS volume from the snapshot, attaching it to an intermediate Windows instance and to “revert” the VSS shadow copy to restore the EBS volume to a consistent state. For sake of simplicity, use the Windows instance that was backed up as the intermediate instance.

To manually restore an EBS snapshot to a consistent state:

  1. In the Amazon EC2 console, choose Instances.
  2. In the search box, enter Consistent EBS Snapshots – Windows with VSS. The search results should display a single instance. Note the Availability Zone for this instance.
  3. Choose Snapshots.
  4. Select the latest snapshot with the description “Consistent snapshot of Windows with VSS” and choose Actions, Create Volume.
  5. Select the same Availability Zone as the instance and choose Create, Volumes.
  6. Select the volume that was just created and choose Actions, Attach Volume.
  7. For Instance, choose Consistent EBS Snapshots – Windows with VSS and choose Attach.
  8. Choose Run Command, Run a command.
  9. In Command document, select the name of a SSM document to restore snapshots returned by CloudFormation. For Target instances, select the Windows and choose Run.

Run Command executes the following PowerShell script on the Windows instance. It retrieves the list of offline disks—which corresponds in this case to the EBS volume that you just attached—and for each offline disk, takes it online, revert existing shadow copies and takes it offline again.

$OfflineDisks = (Get-Disk |? {$_.OperationalStatus -eq "Offline"})

foreach ($OfflineDisk in $OfflineDisks) {
  Set-Disk -Number $OfflineDisk.Number -IsOffline $False
  Set-Disk -Number $OfflineDisk.Number -IsReadonly $False
  Write-Host "Disk " $OfflineDisk.Signature " is now online"
}

$ShadowCopyIds = (Get-CimInstance Win32_ShadowCopy).Id
Write-Host "Number of shadow copies found: " $ShadowCopyIds.Count

foreach ($ShadowCopyId in $ShadowCopyIds) {
  "revert " + $ShadowCopyId | diskshadow
}

foreach ($OfflineDisk in $OfflineDisks) {
  $CurrentSignature = (Get-Disk -Number $OfflineDisk.Number).Signature
  if ($OfflineDisk.Signature -eq $CurrentSignature) {
    Set-Disk -Number $OfflineDisk.Number -IsReadonly $True
    Set-Disk -Number $OfflineDisk.Number -IsOffline $True
    Write-Host "Disk " $OfflineDisk.Number " is now offline"
  }
  else {
    Set-Disk -Number $OfflineDisk.Number -Signature $OfflineDisk.Signature
    Write-Host "Reverting to the initial disk signature: " $OfflineDisk.Signature
  }
}

The EBS volume is now in a consistent state and can be detached from the intermediate instance.

Conclusion

In this series of blog posts, I showed how you can use Amazon EC2 Systems Manager to create consistent EBS snapshots on a daily basis, with two practical examples for Linux and Windows. You can adapt this solution to your own requirements. For example, you may develop scripts for your own applications.

If you have questions or suggestions, please comment below.