AWS Storage Blog

Using AWS Systems Manager to upgrade from CloudEndure Disaster Recovery to AWS Elastic Disaster Recovery

AWS Elastic Disaster Recovery (DRS) is the recommended service for disaster recovery to AWS. It is the next generation of CloudEndure Disaster Recovery (CEDR) and is built on CloudEndure Disaster Recovery technology.

Now you can upgrade replicating source servers from CloudEndure Disaster Recovery to Elastic Disaster Recovery. This is accomplished via the CEDR Server Upgrade Tool. The process is composed of three steps:

  • Run an assessment
    • The assessment validates that servers replicating with CEDR can be upgraded to Elastic Disaster Recovery.
  • Start the upgrade
    • This step is run on each replicating source server. It imports a launchable CEDR snapshot into Elastic Disaster Recovery so that you can perform a drill before completing the upgrade.
  • Finalize the upgrade
    • Finalizing the upgrade completes the process: replication starts in Elastic Disaster Recovery and the CloudEndure Agent is removed.

This process requires Python code to be executed on each replicating source server that you plan on upgrading to Elastic Disaster Recovery. One way to perform this remote code execution at scale is to use AWS Systems Manager.

In this post, we describe a method of performing the upgrade using Systems Manager Command documents. This method gives you a scalable, repeatable upgrade process, because the same code runs on every machine that is registered with Systems Manager.

Solution overview


The upgrade process is composed of two separate Python scripts:

  • The first is a project assessment tool that tests the upgrade readiness of the replicating servers in your project. This makes sure that the replication settings and blueprints for the source servers can be duplicated in Elastic Disaster Recovery.
  • The second is the server upgrade tool. It imports the point-in-time snapshot, replaces the CloudEndure Agent with the AWS Replication Agent, and imports the server configuration with all associated settings into the Elastic Disaster Recovery console. Using this tool does NOT require full re-replication of the server. However, it does require a rescan.

Prerequisites

The following prerequisites must be met before continuing:

All Systems Manager documents and instructions can be found in the associated GitHub Repository.

Adding the Command documents to Systems Manager

To perform the upgrade, you must import the Command documents from the repository into Systems Manager. Once the Command documents are imported, they can be run on any Systems Manager managed node.

Clone the repository:

git clone https://github.com/aws-samples/cloudendure-to-drs-automation.git

Change into the repository command folder:

cd cloudendure-to-drs-automation/cmd

Create the Systems Manager documents:

aws ssm create-document \
  --content file://assessment.yaml \
  --name "cedr-to-drs-assessment" \
  --document-type "Command" \
  --document-format YAML

aws ssm create-document \
  --content file://start_upgrade.yaml \
  --name "cedr-to-drs-start-upgrade" \
  --document-type "Command" \
  --document-format YAML

aws ssm create-document \
  --content file://finalize_upgrade.yaml \
  --name "cedr-to-drs-finalize-upgrade" \
  --document-type "Command" \
  --document-format YAML
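
The three create-document calls differ only in the file name and the document name, so they can be collapsed into a loop. The following is a minimal sketch, assuming it is run from the cmd folder of the cloned repository. The function names `doc_name` and `create_documents` and the `DRY_RUN` variable are illustrative and are not part of the repository.

```shell
#!/bin/sh
# Map a YAML file from the cmd folder to the document name used in this post.
doc_name() {
  case "$1" in
    assessment.yaml)       echo "cedr-to-drs-assessment" ;;
    start_upgrade.yaml)    echo "cedr-to-drs-start-upgrade" ;;
    finalize_upgrade.yaml) echo "cedr-to-drs-finalize-upgrade" ;;
    *) echo "unknown document file: $1" >&2; return 1 ;;
  esac
}

# Create all three Command documents.
# With DRY_RUN=1 (a hypothetical switch) the commands are printed, not executed.
create_documents() {
  for file in assessment.yaml start_upgrade.yaml finalize_upgrade.yaml; do
    name=$(doc_name "$file") || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "aws ssm create-document --content file://$file --name $name --document-type Command --document-format YAML"
    else
      aws ssm create-document \
        --content "file://$file" \
        --name "$name" \
        --document-type "Command" \
        --document-format YAML || return 1
    fi
  done
}
```

Setting `DRY_RUN=1` before calling `create_documents` prints the three commands so that you can review them before anything is created in your account.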

Note that we have uploaded three separate documents to Systems Manager. These correspond to the three steps of the upgrade:

  1. assessment.yaml performs the project assessment.
  2. start_upgrade.yaml initializes the upgrade process on all servers in the designated project.
  3. finalize_upgrade.yaml ends the upgrade process by removing the server from CloudEndure Disaster Recovery.

Running an upgrade

The following steps allow you to run an upgrade:

  1. Run the project assessment on any Systems Manager managed node. This outputs a report that tells you whether any servers need remediation before upgrading to Elastic Disaster Recovery. Make sure that you replace all $VARIABLES with the values from your environment:
aws ssm send-command \
  --targets Key="InstanceIds",Values="$INSTANCEID" \
  --document-name "cedr-to-drs-assessment" \
  --parameters "apiToken=$APITOKEN","projectId=$PROJECTID" \
  --cloud-watch-output-config "CloudWatchOutputEnabled=true,CloudWatchLogGroupName=cedr-upgrade" \
  --region $REGION
  2. Run the start-upgrade document on all Systems Manager managed nodes to be upgraded. This begins the upgrade process by importing the most recent completed snapshot for testing. Make sure that you replace all $VARIABLES with the values from your environment. In this example, we are targeting all instances with a tag key of “cedr-attribute” and a tag value of “upgrade”. This tag must be set on every instance you plan to upgrade. You can also select instances manually or by resource group.
aws ssm send-command \
  --targets Key="tag:cedr-attribute",Values="upgrade" \
  --document-name "cedr-to-drs-start-upgrade" \
  --parameters "apiToken=$APITOKEN","projectId=$PROJECTID","awsAccessKey=$AWSACCESSKEY","awsSecretAccessKey=$AWSSECRETACCESSKEY" \
  --cloud-watch-output-config "CloudWatchOutputEnabled=true,CloudWatchLogGroupName=cedr-upgrade" \
  --region $REGION
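
Because an unset $VARIABLE expands to an empty string, a forgotten value reaches the document as an empty parameter instead of stopping the command. A small pre-flight check can catch this before anything is sent. This is a sketch only; `require_vars` is a hypothetical helper, not something provided by the repository or the AWS CLI.

```shell
#!/bin/sh
# Return 0 only if every named variable is set to a non-empty value.
# Each missing name is reported on stderr.
require_vars() {
  missing=0
  for var in "$@"; do
    # Indirectly read the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use before the start-upgrade command:
#   require_vars APITOKEN PROJECTID AWSACCESSKEY AWSSECRETACCESSKEY REGION \
#     && aws ssm send-command ...
```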

Once the installation is complete, you can see the server is added to the Elastic Disaster Recovery console on the Source servers page.


Important note

Now you can launch a drill instance for the source server to verify that the server was imported as expected before finalizing the upgrade.
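
A drill can also be started from the AWS CLI instead of the console. The sketch below assumes the `aws drs start-recovery` command with its `--is-drill` flag, and that you already know the Elastic Disaster Recovery source server ID shown on the Source servers page. The function name `start_drill` is illustrative; check the options against the CLI reference for your installed version before relying on it.

```shell
#!/bin/sh
# Launch a drill instance for one Elastic Disaster Recovery source server.
# $1: source server ID, $2: AWS Region.
start_drill() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "usage: start_drill SOURCE_SERVER_ID REGION" >&2
    return 2
  fi
  aws drs start-recovery \
    --is-drill \
    --source-servers "sourceServerID=$1" \
    --region "$2"
}

# Example use (SOURCESERVERID is a placeholder you supply):
#   start_drill "$SOURCESERVERID" "$REGION"
```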

  3. Run the finalize-upgrade document on all Systems Manager managed nodes to be upgraded. This finalizes the upgrade process and removes the CloudEndure Agent from the source server. Make sure that you replace all $VARIABLES with the values from your environment. In this example, we are targeting all instances with a tag key of “cedr-attribute” and a tag value of “upgrade”. This tag must be set on every instance you plan to upgrade. You can also select instances manually or by resource group.
aws ssm send-command \
  --targets Key="tag:cedr-attribute",Values="upgrade" \
  --document-name "cedr-to-drs-finalize-upgrade" \
  --parameters "apiToken=$APITOKEN","projectId=$PROJECTID","awsAccessKey=$AWSACCESSKEY","awsSecretAccessKey=$AWSSECRETACCESSKEY" \
  --cloud-watch-output-config "CloudWatchOutputEnabled=true,CloudWatchLogGroupName=cedr-upgrade" \
  --region $REGION
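
send-command returns as soon as the command is queued, so a successful call does not mean that the document succeeded on every node. One way to follow up is to capture the command ID (for example by adding `--query "Command.CommandId" --output text` to send-command) and then list the per-instance results. The helper names below are illustrative, and the set of statuses treated as failures is an assumption you may want to adjust.

```shell
#!/bin/sh
# Print "InstanceId<TAB>Status" for each invocation of a command.
# $1: command ID returned by send-command, $2: AWS Region.
invocation_status() {
  aws ssm list-command-invocations \
    --command-id "$1" \
    --region "$2" \
    --query "CommandInvocations[].[InstanceId,Status]" \
    --output text
}

# Read invocation_status output on stdin and print the instances whose
# invocation ended without succeeding.
failed_instances() {
  awk '$2 == "Failed" || $2 == "TimedOut" || $2 == "Cancelled" { print $1 }'
}

# Example use:
#   invocation_status "$COMMANDID" "$REGION" | failed_instances
```

Instances still in progress are not printed, so an empty result only means that nothing has failed so far.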

The finalize-upgrade command imports the source server snapshot into Elastic Disaster Recovery, and the rescan begins. At this point, the CloudEndure Agent is uninstalled and replaced with the AWS Replication Agent.


Once rescan is complete, the server reaches Ready for recovery status and you can see your source servers in the Elastic Disaster Recovery console as they populate over time.
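
If you would rather check progress from a terminal than from the console, the replication state of each source server can be listed with the AWS CLI. The sketch below assumes that `aws drs describe-source-servers` can be called without filters in your CLI version, and that `CONTINUOUS` is the replication state reported once the rescan has finished. Both are assumptions to confirm in your own account, and the function names are illustrative.

```shell
#!/bin/sh
# Print "sourceServerID<TAB>dataReplicationState" for each source server.
# $1: AWS Region.
replication_states() {
  aws drs describe-source-servers \
    --region "$1" \
    --query "items[].[sourceServerID,dataReplicationInfo.dataReplicationState]" \
    --output text
}

# Read replication_states output on stdin and print the servers that are
# not yet in the wanted state ($1).
not_in_state() {
  awk -v want="$1" '$2 != want { print $1 }'
}

# Example use:
#   replication_states "$REGION" | not_in_state CONTINUOUS
```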


Cleaning up

Once you have successfully upgraded all of your servers from CloudEndure Disaster Recovery to Elastic Disaster Recovery, and the servers have reached the point-in-time retention period, you can delete your CloudEndure Disaster Recovery account.

Conclusion

In this post, we showed how you can use AWS Systems Manager to perform remote code execution to upgrade from CloudEndure Disaster Recovery to AWS Elastic Disaster Recovery. This allows for unattended upgrades at scale.

Performing your upgrades at scale in an unattended fashion makes the upgrade process faster, so you can start using the enhanced capabilities of DRS sooner than if you performed the upgrade manually on each machine.

Contributions and feedback are welcome on the associated GitHub repository.

Kevin Lewin


Kevin is a Cloud Operations Specialist Solution Architect at Amazon Web Services. He focuses on helping customers achieve their operational goals through observability and automation.

Ken Sze


Based in the Boston area, Ken is a Solutions Architect on the AWS CloudEndure team. Ken joined AWS in 2019 as part of the CloudEndure acquisition. He has been working with customers for over 10 years as an IT professional, helping them design backup and disaster recovery solutions. Outside of work, he enjoys riding his motorcycle and finding good places to eat.