AWS Storage Blog

Automating application-consistent Amazon EBS Snapshots for Windows applications

Customers have been running Microsoft workloads on AWS for over 16 years. Through conversations with these customers, a common challenge we’ve found is that as they back up their Windows applications to fulfill data protection needs, they often spend significant time and manual effort managing the orchestration of backup workflows. The time- and labor-intensive process exposes customers to human errors that can lead to missed backups and increased storage costs.

If you have Amazon Elastic Compute Cloud (Amazon EC2) instances running Windows applications such as Windows Server, you may want to consider using Volume Shadow Copy Service (VSS) to create VSS application-consistent EBS Snapshots. VSS works in conjunction with Amazon Elastic Block Store (Amazon EBS) Snapshots to make sure that application data can be backed up without taking the applications offline, and that any ongoing I/O operations on VSS-aware applications are quiesced before EBS Snapshots are initiated. With Amazon Data Lifecycle Manager, a policy-based lifecycle management solution for EBS Snapshots, you can now automate the creation, retention, and deletion of VSS-enabled EBS Snapshots.

In this post, we walk through the use of Amazon Data Lifecycle Manager and AWS Systems Manager to automate the creation and retention of VSS-enabled EBS Snapshots for your EC2 instances running Windows applications. The solution empowers you to create VSS application-consistent snapshots of your Windows applications with confidence. These snapshots serve as reliable backups that you can depend on for disaster recovery, data migration, or other critical operational needs.

Solution overview

Previously, we outlined how to create application-consistent snapshots using Amazon Data Lifecycle Manager and custom scripts, including the necessary steps to create Amazon Data Lifecycle Manager policies that use Systems Manager Agent to run scripts on your EC2 instances. To create VSS-enabled snapshots, Amazon Data Lifecycle Manager uses the AWSEC2-CreateVssSnapshot Systems Manager document to coordinate with the VSS agent on your instance to flush buffers to disk, freeze I/O, initiate the snapshots, and then thaw I/O, as shown in the following figure.

Architectural diagram for DLM pre-script and post-script automation for Windows application.

Prerequisites

You must first complete all the prerequisite steps to create VSS-enabled EBS Snapshots, including installing Systems Manager Agent, attaching the required AWS Identity and Access Management (IAM) role to the instance profile, and installing the VSS component. If you are using one of these Amazon Machine Images (AMIs) provided by AWS, then Systems Manager Agent is preinstalled. You must also make sure the IAM service role that is used for your Amazon Data Lifecycle Manager policy has the appropriate permissions to run the Systems Manager documents on the targeted EC2 instances. The easiest way to do this is to attach the AWSDataLifecycleManagerSSMFullAccess IAM policy to the IAM role.

Walkthrough

To automate and validate the creation of VSS-enabled snapshots, you must follow these steps:

  1. Create an Amazon Data Lifecycle Manager policy
  2. Validate that the snapshots created are VSS application-consistent

Step 1: Create an Amazon Data Lifecycle Manager policy

Now we create an Amazon Data Lifecycle Manager policy to automate the creation and management of VSS-enabled snapshots. The following steps outline how to create the policy through the Amazon EC2 console. However, you can also create the policy by using the API/CLI or AWS CloudFormation.

If you already have policies creating crash-consistent snapshots of instances running Windows applications, then you can modify those policies and enable the pre- and post-script feature. As long as all the other prerequisites have been met, your policies start creating VSS application-consistent snapshots the next time they run.

1. To get started, open the Amazon EC2 console, then select Lifecycle Manager under Elastic Block Store in the left-side navigation panel. Under Schedule-based policy, select EBS snapshot policy.

2. In Target resource types, select Instance, and then supply tags for all instances that you want to target, as shown in the following figure. In this example, we target all instances with the tag (VSSSnapshot:true). Add a description for the policy.

Create policy by targeting instances and adding tags of instances to target.

3. For IAM role, most customers should select Default role, as it contains all the permissions required for the policy actions, as shown in the following figure. When you create or modify policies through the console, the AWSDataLifecycleManagerSSMFullAccess IAM policy (which has all the permissions for this feature) is automatically attached to the Default role. If you are using the API/CLI to create or modify policies for this feature, then you must manually attach the IAM policy to the Default role. If you choose to use a Custom IAM role, then you must make sure the IAM role has all the required permissions to run SSM documents on the targeted instances.

Select default IAM role for most use cases.

4. On the next page, set up your snapshot creation schedule, as shown in the following figure. In this example, we are creating snapshots every 24 hours at 11:00 UTC and retaining them for 7 days.

Set creation frequency and retention period.

5. Under Advanced Settings, make sure you check the box to Enable pre and post scripts for this schedule, as shown in the following figure. Next, select the tile labeled VSS Backup.

Screenshot with VSS Backup tile selected

You can set Retry script if it fails to automatically retry executing the AWSEC2-CreateVssSnapshot document. You should consider this if you want a higher likelihood of VSS snapshots being created, and your application can withstand being quiesced repeatedly or in a quiesced state for a longer period of time.

We recommend that you also enable Default to crash-consistent snapshots if script fails. If enabled, Amazon Data Lifecycle Manager attempts to create crash-consistent snapshots if it is unable to successfully run the Systems Manager document. You can use the tags applied to the snapshots as well as Amazon EventBridge to later determine if the EBS Snapshots were created as part of successful executions of the Systems Manager document.

6. Under Advanced Settings, you can also set the policy to automate other actions such as Cross-Region copy and Cross-account sharing. In this example, we are setting the policy to make sure the most recent VSS snapshots of each EC2 instance have Fast Snapshot Restore enabled in the us-east-1a Availability Zone, as shown in the following figure, so that volumes created from those snapshots instantly deliver all of their provisioned performance.

Screenshot of Fast Snapshot Restore enabled.

7. If you enable Cross-Region copy, then you should also enable Copy tags from source so that the Amazon Data Lifecycle Manager system tags indicating the snapshot is VSS application-consistent are copied together with the snapshot, as shown in the following figure.

Screenshot of Cross-Region copy
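For readers who create the policy through the API rather than the console, the following is a minimal sketch of a policy definition that mirrors the walkthrough example (instances tagged VSSSnapshot:true, snapshots every 24 hours at 11:00 UTC, 7-day retention, VSS Backup with crash-consistent fallback). The function name, schedule name, and retry count are illustrative assumptions and do not come from this post. The sketch only builds the PolicyDetails document and leaves out Fast Snapshot Restore and Cross-Region copy, so verify the field names against the current Amazon Data Lifecycle Manager API reference before relying on it.

```python
from typing import Any, Dict


def build_vss_policy_details() -> Dict[str, Any]:
    """Build a PolicyDetails document for a VSS-enabled snapshot policy.

    Values mirror the console walkthrough; names marked "hypothetical"
    are placeholders chosen for this sketch.
    """
    schedule = {
        "Name": "daily-vss-snapshots",  # hypothetical schedule name
        "CreateRule": {
            "Interval": 24,
            "IntervalUnit": "HOURS",
            "Times": ["11:00"],  # start time in UTC
            "Scripts": [
                {
                    # Corresponds to selecting the "VSS Backup" tile.
                    "ExecutionHandler": "AWS_VSS_BACKUP",
                    # "Default to crash-consistent snapshots if script fails".
                    "ExecuteOperationOnScriptFailure": True,
                    # "Retry script if it fails"; the count is an assumption.
                    "MaximumRetryCount": 1,
                }
            ],
        },
        # Retain each snapshot for 7 days.
        "RetainRule": {"Interval": 7, "IntervalUnit": "DAYS"},
    }
    return {
        "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
        "ResourceTypes": ["INSTANCE"],
        "TargetTags": [{"Key": "VSSSnapshot", "Value": "true"}],
        "Schedules": [schedule],
    }


if __name__ == "__main__":
    import json

    # With boto3 installed and credentials configured, this document could be
    # passed as PolicyDetails to boto3.client("dlm").create_lifecycle_policy(),
    # together with ExecutionRoleArn, Description, and State.
    print(json.dumps(build_vss_policy_details(), indent=2))
```

Keeping the definition in a plain function makes it easy to review or diff before anything is sent to AWS.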

Step 2: Validate that the snapshots created are application-consistent

Once your Amazon Data Lifecycle Manager policy has created an EBS Snapshot, you can check if it is a VSS-enabled application-consistent snapshot.

1. Navigate to the Amazon EC2 console and select Snapshots.

Select the snapshot(s) you would like to validate

2. Select the snapshot and select Tags in the bottom panel. If you see the tag key ‘aws:dlm:lifecycle-policy-id’, then the snapshot was created (and is managed) by Amazon Data Lifecycle Manager. If you see the tag ‘AppConsistent:true’, then the policy successfully created a VSS application-consistent snapshot, as shown in the following figure. If you see either the ‘aws:dlm:pre-script: FAIL’ or the ‘aws:dlm:post-script: FAIL’ system tag, then the snapshot is crash-consistent and not application-consistent. You can also use CloudWatch and EventBridge to monitor the success and failure of your policy in creating VSS application-consistent snapshots.

Checking snapshot to see if it is VSS enabled
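The tag checks above can also be scripted. The following sketch classifies a snapshot from its tag list, in the Key/Value shape that the EC2 DescribeSnapshots API returns. The function name and the returned labels are hypothetical, and the tag keys and values are taken from this post as written. Because the exact failure value may differ, the check accepts any value that starts with "FAIL".

```python
from typing import Dict, Iterable

POLICY_ID_TAG = "aws:dlm:lifecycle-policy-id"
SCRIPT_TAGS = ("aws:dlm:pre-script", "aws:dlm:post-script")


def classify_snapshot(tags: Iterable[Dict[str, str]]) -> str:
    """Return a consistency label for a snapshot based on its tags."""
    tag_map = {tag["Key"]: tag["Value"] for tag in tags}

    # Snapshots without the policy ID tag are not managed by the policy.
    if POLICY_ID_TAG not in tag_map:
        return "not-managed-by-dlm"

    # A failed pre- or post-script means the snapshot is crash-consistent.
    script_failed = any(
        tag_map.get(key, "").strip().upper().startswith("FAIL")
        for key in SCRIPT_TAGS
    )
    if script_failed:
        return "crash-consistent"

    if tag_map.get("AppConsistent", "").strip().lower() == "true":
        return "application-consistent"

    # Managed by the policy, but no consistency tags were found.
    return "unknown"


if __name__ == "__main__":
    example_tags = [
        {"Key": POLICY_ID_TAG, "Value": "policy-0123456789abcdef0"},
        {"Key": "AppConsistent", "Value": "true"},
    ]
    print(classify_snapshot(example_tags))
```

The failure check runs before the AppConsistent check so that a snapshot carrying both tags is reported conservatively as crash-consistent.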

Cleaning up

Clean up the snapshots created during the previous steps to make sure you do not incur storage charges. You can do this by navigating to the Snapshots screen, searching for all snapshots created by the policy, selecting all the snapshots, and then selecting Actions followed by Delete snapshot.

Similarly, you should delete the Amazon Data Lifecycle Manager policy to make sure no future snapshots are created by the policy. You can do this by navigating to the Lifecycle Manager screen, selecting the policy, and then selecting Actions followed by Delete lifecycle policy.
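If you prefer to script the snapshot cleanup, the following sketch finds the snapshots that carry the policy ID tag and optionally deletes them. It assumes an EC2 client object such as the one returned by boto3.client("ec2") and assumes that the value of the ‘aws:dlm:lifecycle-policy-id’ tag is the policy ID. The function name and the dry_run flag are hypothetical. It defaults to listing only, so nothing is deleted unless you opt in.

```python
from typing import Any, List

POLICY_ID_TAG = "aws:dlm:lifecycle-policy-id"


def delete_policy_snapshots(
    ec2_client: Any, policy_id: str, dry_run: bool = True
) -> List[str]:
    """List (and optionally delete) snapshots created by one policy.

    ec2_client is expected to behave like boto3.client("ec2").
    Returns the IDs of the matching snapshots.
    """
    paginator = ec2_client.get_paginator("describe_snapshots")
    pages = paginator.paginate(
        OwnerIds=["self"],  # only snapshots owned by this account
        Filters=[{"Name": f"tag:{POLICY_ID_TAG}", "Values": [policy_id]}],
    )
    snapshot_ids = [
        snapshot["SnapshotId"]
        for page in pages
        for snapshot in page["Snapshots"]
    ]

    if not dry_run:
        for snapshot_id in snapshot_ids:
            ec2_client.delete_snapshot(SnapshotId=snapshot_id)

    return snapshot_ids
```

Run it once with the default dry_run=True to review the list, then again with dry_run=False to delete. The policy itself can then be removed in the console as described above.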

Conclusion

In this post, we went through how to automate the creation and retention of VSS application-consistent EBS Snapshots with Amazon Data Lifecycle Manager. We hope this reduces the amount of time and effort required to enhance the data protection of your Windows applications running on EC2 instances.

With Amazon Data Lifecycle Manager, you have the ability to exclude the root/boot volume when creating a set of application-consistent snapshots. You can also set your policy to manage Fast Snapshot Restore on the most recent set of snapshots so that you can create new EBS volumes that deliver maximum performance without needing to be initialized. Furthermore, you can automatically share the snapshots with different accounts and copy snapshots to different AWS Regions. Best of all, Amazon Data Lifecycle Manager policies are free to create, and they save you from having to use third-party tools or develop/maintain complex custom scripts. Similarly, you can also use AWS Backup to automate the creation of Windows VSS backups.

As a final takeaway, we encourage you to try this on your own environments. You can also learn more about this feature by reading our technical documentation and exploring different use cases for using pre- and post-scripts with Amazon Data Lifecycle Manager.

We welcome your feedback. If you have questions or suggestions, leave them in the comments section.

Mengfan Zhu

Mengfan Zhu is a Software Development Engineer for Amazon Elastic Block Store. She is passionate about using technology to deliver innovative and effective solutions for customers’ challenges. With a background in mathematics and computer science, she brings a strong analytical mindset and a deep understanding of the software development lifecycle.

Owen Medina

Owen Medina is a Software Development Engineer for Amazon Elastic Block Store (Amazon EBS). He is driven by a deep intellectual curiosity and a keen interest in engineering exceptional solutions for complex and practical problems. Owen has a wealth of experience in all facets of the software development lifecycle, with a focus on architecting highly scalable and reliable systems.

Hari Mani

Hari Mani is a Senior Product Manager for Windows Platform and is part of the Amazon EC2 team. During his free time, Hari likes to dance salsa and play golf.