Synchronize Amazon EC2 instance tags and instance type with AWS Elastic Disaster Recovery source servers

When performing disaster recovery, you recover your original systems and IT infrastructure to their original state at an alternate, available site. The recovered servers should match the original compute infrastructure to reduce the risk of underprovisioning or overprovisioning your recovery environment. Matching the original infrastructure improves the likelihood that your recovery servers perform similarly and helps optimize the cost of your recovery environment. Restoring the metadata related to your servers is also important. In AWS, tags assign metadata to your resources and enable identification and cost allocation. Tagging is also a best practice of the Operational Excellence pillar of the AWS Well-Architected Framework.

AWS Elastic Disaster Recovery makes it easy for you to establish disaster recovery for your Amazon Elastic Compute Cloud (Amazon EC2) instances. You can use Elastic Disaster Recovery to replicate and recover your EC2 instances to a different Availability Zone, to a different AWS Region, or to another AWS account in a different AWS Region. If you use Elastic Disaster Recovery for EC2 instances, you also need to replicate the instance tags and update the EC2 instance type in the corresponding Elastic Disaster Recovery source servers.

In this post, I present a solution that synchronizes the tags from your EC2 instances, running in the same AWS account or in different AWS accounts, to your source servers in AWS Elastic Disaster Recovery. The solution also provides an option to update the recovery server instance type to match the instance type of the original EC2 instance, so that your recovery instances launch with the same EC2 instance type. In addition, the solution adds tags identifying the original account, Region, instance ID, and instance type to each Elastic Disaster Recovery source server.

Solution overview

Tagging your source servers with the same tags as the corresponding EC2 instances enables you to organize your Elastic Disaster Recovery source servers by important attributes, such as application or workload properties. Tags can help you coordinate the sequence of server failover as well as the automation that must be applied before or after the source servers are recovered. When you’re ready to fail back your recovered EC2 instances, it’s important to correlate each recovery instance to its original AWS account, Region, and source EC2 instance ID. Tags can be used to preserve this metadata about your source servers.

The drs_synch_ec2_tags_and_instance_type AWS Lambda function performs the tag and EC2 instance type synchronization in the solution. The Lambda function performs the following high-level actions:

Read all the source servers in the Elastic Disaster Recovery service AWS account and region specified.
Read all the EC2 instances in the AWS account and region specified.
For each Elastic Disaster Recovery source server, determine any matching EC2 instance.
If a match is found to an EC2 instance:
1. Create a tag on the Elastic Disaster Recovery source server for each tag present on the EC2 instance.
2. Create the following additional tags on the Elastic Disaster Recovery source server corresponding to the EC2 instance:
  - source:account: The AWS Account ID where the EC2 instance is running.
  - source:region: The AWS Region where the EC2 instance is running.
  - source:instance-id: The EC2 instance ID of the EC2 instance.
  - source:instance-type: The AWS instance type for the EC2 instance.
3. Update the unique EC2 launch template for the Elastic Disaster Recovery source server with the same instance type as the corresponding EC2 instance.
4. Disable the Instance type right-sizing feature for the Elastic Disaster Recovery source server so that the instance type specified in the launch template will be used.
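
The matching step in the list above could be sketched as follows. This is a minimal sketch and not the code from the drs-tools repository: it assumes the match is made on the awsInstanceID identification hint that Elastic Disaster Recovery records for a source server, and the helper names are hypothetical. The input dictionaries follow the response shapes of the DescribeSourceServers and DescribeInstances APIs.

```python
from typing import Dict, List, Optional, Tuple


def index_instances_by_id(instances: List[dict]) -> Dict[str, dict]:
    """Index EC2 instance descriptions by their InstanceId."""
    return {instance["InstanceId"]: instance for instance in instances}


def find_matching_instance(
    source_server: dict, instances_by_id: Dict[str, dict]
) -> Optional[dict]:
    """Return the EC2 instance a source server replicates from, or None.

    Assumption: the match relies on the awsInstanceID identification hint.
    Source servers without that hint (for example, non-EC2 servers) never match.
    """
    hints = source_server.get("sourceProperties", {}).get("identificationHints", {})
    return instances_by_id.get(hints.get("awsInstanceID", ""))


def match_source_servers(
    source_servers: List[dict], instances: List[dict]
) -> List[Tuple[dict, dict]]:
    """Pair every source server with its EC2 instance, skipping unmatched servers."""
    instances_by_id = index_instances_by_id(instances)
    pairs = []
    for server in source_servers:
        instance = find_matching_instance(server, instances_by_id)
        if instance is not None:
            pairs.append((server, instance))
    return pairs
```

Indexing the instances once keeps the matching pass linear in the number of source servers, which matters when an account holds many instances.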

The source code for the solution and complete deployment instructions can be found in the drs-tools GitHub repository.

Prerequisites

You need permissions in each AWS account where the solution is configured to deploy the related AWS services. The solution is deployed via two AWS CloudFormation templates:

drs_synch_ec2_tags_and_instance_type_lambda.yaml: This template deploys the drs_synch_ec2_tags_and_instance_type AWS Lambda function, the AWS Identity and Access Management (IAM) role assumed by the Lambda function, and an Amazon CloudWatch Events rule to execute the Lambda function on a schedule.
yaml: This template deploys the IAM role that the drs_synch_ec2_tags_and_instance_type Lambda function assumes to describe and update Elastic Disaster Recovery source servers.

Configuration

You can selectively initiate tag synchronization or instance type updates by sending the following event payload to the Lambda function, with each key set to either true or false:

{
  "synch_tags": true,
  "synch_instance_type": true
}
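
On the receiving side, a handler might read the two flags from the event as shown here. This is a minimal sketch with a hypothetical helper name. It assumes that a missing key, or any value other than a JSON true, leaves the corresponding feature disabled; the actual function may choose different defaults.

```python
from typing import Tuple


def parse_sync_options(event: dict) -> Tuple[bool, bool]:
    """Return (synch_tags, synch_instance_type) from the invocation event.

    Assumption: only a literal JSON true enables a feature, so a missing key
    or a string such as "true" is treated as disabled.
    """
    synch_tags = event.get("synch_tags") is True
    synch_instance_type = event.get("synch_instance_type") is True
    return synch_tags, synch_instance_type
```

Being strict about the value type avoids accidentally enabling instance type updates when a caller sends a non-empty string such as "false".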

The rate-based CloudWatch Events rule created by the drs_synch_ec2_tags_and_instance_type_lambda.yaml CloudFormation template enables both of these features and executes the function daily. You can also initiate the Lambda function manually by providing the payload described here with your desired options.
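
A manual invocation could look like the following minimal sketch. The helper names are hypothetical, and the function name is assumed to match the name used in this post; check the deployed stack for the actual name. The client is passed in as a parameter and is expected to be a boto3 Lambda client.

```python
import json

# Assumption: the deployed function keeps the name used in this post.
FUNCTION_NAME = "drs_synch_ec2_tags_and_instance_type"


def build_payload(synch_tags: bool, synch_instance_type: bool) -> bytes:
    """Serialize the two feature flags into the event payload."""
    payload = {"synch_tags": synch_tags, "synch_instance_type": synch_instance_type}
    return json.dumps(payload).encode("utf-8")


def invoke_sync(
    lambda_client,
    synch_tags: bool = True,
    synch_instance_type: bool = True,
    function_name: str = FUNCTION_NAME,
) -> int:
    """Invoke the function synchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=build_payload(synch_tags, synch_instance_type),
    )
    return response["StatusCode"]
```

For example, invoke_sync(boto3.client("lambda"), synch_instance_type=False) would request a tags-only run in the Region of the default session.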

When synch_tags is set to true, the solution replicates the tags of each original EC2 instance to its corresponding Elastic Disaster Recovery source server. It also updates the launch settings for the Elastic Disaster Recovery source server so that the server tags are transferred to drill and recovery instances. In addition to replicating the original EC2 instance tags, the solution adds the following tags to each corresponding Elastic Disaster Recovery source server:

source:account: The AWS Account ID where the EC2 instance is running.
source:region: The AWS Region where the EC2 instance is running.
source:instance-id: The EC2 instance ID of the EC2 instance.
source:instance-type: The AWS instance type for the EC2 instance.
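
The tag set described above can be assembled as in this minimal sketch. The helper name is hypothetical, and skipping tags whose keys start with aws: is an assumption made here because that prefix is reserved by AWS and such tags cannot be created by users; the actual solution may handle them differently.

```python
from typing import Dict


def build_source_server_tags(instance: dict, account_id: str, region: str) -> Dict[str, str]:
    """Build the tags to apply to the source server that matches an EC2 instance.

    `instance` follows the shape of an entry returned by DescribeInstances.
    """
    # Copy the instance tags. Assumption: keys with the reserved "aws:" prefix
    # are skipped because they cannot be written by users.
    tags = {
        tag["Key"]: tag["Value"]
        for tag in instance.get("Tags", [])
        if not tag["Key"].startswith("aws:")
    }
    # Add the four source:* tags that record where the instance came from.
    tags["source:account"] = account_id
    tags["source:region"] = region
    tags["source:instance-id"] = instance["InstanceId"]
    tags["source:instance-type"] = instance["InstanceType"]
    return tags
```

The resulting dictionary has the key-to-value map shape that the Elastic Disaster Recovery TagResource API accepts in its tags parameter.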

When synch_instance_type is set to true, the solution retrieves the current launch template for the Elastic Disaster Recovery source server and determines whether the current launch template version is configured with a different instance type than the corresponding EC2 instance. If the instance types differ, the solution creates a new launch template version that uses the same instance type as the original EC2 instance and sets it as the default launch template version for the Elastic Disaster Recovery source server. The solution also disables instance type right-sizing for the source server.
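
The instance type update described above could be sketched as follows. This is a minimal sketch, not the repository code: the function name is hypothetical, the clients are passed in and are expected to be boto3 drs and ec2 clients, and comparing against the $Default template version is an assumption about what "current" means here.

```python
def sync_instance_type(drs_client, ec2_client, source_server_id: str, instance_type: str) -> bool:
    """Align a source server's launch template with the original instance type.

    Returns True when a new launch template version was created.
    """
    launch_config = drs_client.get_launch_configuration(sourceServerID=source_server_id)
    template_id = launch_config["ec2LaunchTemplateID"]

    # Assumption: the default version is the one to compare against.
    current = ec2_client.describe_launch_template_versions(
        LaunchTemplateId=template_id, Versions=["$Default"]
    )["LaunchTemplateVersions"][0]
    current_type = current["LaunchTemplateData"].get("InstanceType")

    created = False
    if current_type != instance_type:
        # Base the new version on the current one and override only the type.
        new_version = ec2_client.create_launch_template_version(
            LaunchTemplateId=template_id,
            SourceVersion=str(current["VersionNumber"]),
            LaunchTemplateData={"InstanceType": instance_type},
        )["LaunchTemplateVersion"]["VersionNumber"]
        ec2_client.modify_launch_template(
            LaunchTemplateId=template_id, DefaultVersion=str(new_version)
        )
        created = True

    # Disable right-sizing so the launch template's instance type is honored.
    if launch_config.get("targetInstanceTypeRightSizingMethod") != "NONE":
        drs_client.update_launch_configuration(
            sourceServerID=source_server_id,
            targetInstanceTypeRightSizingMethod="NONE",
        )
    return created
```

Creating a version only when the types differ keeps repeated scheduled runs from accumulating identical launch template versions.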

Deployment

The Elastic Disaster Recovery service can be implemented for EC2 instances running in multiple AWS accounts or a single AWS account. The solution can be used for either of these approaches.

Single account

In a single account Elastic Disaster Recovery design, you will have your EC2 instances running in one or more AWS Regions (workloads Region(s)) and Elastic Disaster Recovery configured in an alternative disaster recovery Region.

Figure 1 Single account, multiple regions diagram

In this scenario, you deploy the drs_synch_ec2_tags_and_instance_type Lambda function into each Region where your EC2 instances are running. The provided CloudFormation template also includes a CloudWatch rate-based event that will execute the Lambda function on a schedule of your choice (default: daily).

Moreover, you’ll deploy the drs_synch_ec2_tags_and_instance_type IAM role into your AWS account in a Region of your choice. IAM roles are a global resource so the role can be used for all Regions. This role will be assumed by the Lambda function to describe and update the source servers in Elastic Disaster Recovery.

Follow the complete instructions provided in the GitHub repository for single account deployment.

Multi-account

In a multi-account Elastic Disaster Recovery deployment, you may configure the Elastic Disaster Recovery service in multiple AWS accounts where your EC2 instances are running. When you are ready to perform a drill or recovery, you can recover your instances into the same AWS account or into a different AWS target account:

Figure 2 Multi account, multiple regions diagram

For this type of design, follow the same deployment instructions as the single account instructions for each AWS account where you have EC2 instances replicating to Elastic Disaster Recovery. Source servers are always updated in the account where the Elastic Disaster Recovery servers are replicating.

Your multi-account deployment design may have multiple Amazon EC2 workload accounts replicating into a common Elastic Disaster Recovery target account:

Figure 3 Multiple DRS accounts, single target DRS recover account diagram

In this scenario, you deploy the drs_synch_ec2_tags_and_instance_type Lambda function into each Region and each AWS account where your EC2 instances are running. The provided CloudFormation template also includes a CloudWatch rate-based event that will execute the Lambda function on a schedule of your choice (default: daily) in each account and Region.

You also deploy the drs_synch_ec2_tags_and_instance_type IAM role into the AWS account where Elastic Disaster Recovery is configured. IAM roles are a global resource, so the role can be used for all AWS Regions. The Lambda function running in each of your Amazon EC2 workload accounts assumes this role to describe and update the source servers in the Elastic Disaster Recovery service with the related EC2 instance tags.
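
The cross-account role assumption described above could be sketched as follows. This is a minimal sketch with hypothetical helper names. It assumes that the IAM role name matches the name used in this post and that the role's trust policy allows the calling account; the client is passed in and is expected to be a boto3 STS client.

```python
# Assumption: the deployed IAM role keeps the name used in this post.
ROLE_NAME = "drs_synch_ec2_tags_and_instance_type"


def build_role_arn(account_id: str, role_name: str = ROLE_NAME, partition: str = "aws") -> str:
    """Build the ARN of the role in the Elastic Disaster Recovery account."""
    return f"arn:{partition}:iam::{account_id}:role/{role_name}"


def assume_drs_role(sts_client, drs_account_id: str, session_name: str = "drs-tag-sync") -> dict:
    """Assume the role and return credentials as boto3 client keyword arguments."""
    response = sts_client.assume_role(
        RoleArn=build_role_arn(drs_account_id),
        RoleSessionName=session_name,
    )
    credentials = response["Credentials"]
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }
```

The returned dictionary can be unpacked into a client constructor, for example boto3.client("drs", region_name="us-west-2", **assume_drs_role(boto3.client("sts"), "111122223333")), where the account ID and Region are placeholders.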

Follow the complete instructions provided in the GitHub repository for multi-account deployment.

Cleaning up

To clean up the deployed solution, delete the CloudFormation stacks that you deployed.

Conclusion

In this post, I presented a solution for synchronizing your EC2 instance tags with your Elastic Disaster Recovery source servers to help you preserve your tag management strategy. These tags can be used to differentiate recovery requirements, recovery sequence, cost allocation, automation, and failback targets. The solution also updates each Elastic Disaster Recovery source server with the instance type of the original EC2 instance. Ensuring that your recovery instances use the original instance type improves the likelihood that they perform similarly and helps optimize the cost of your recovery environment.

You can use the solution presented here along with other solutions to manage Elastic Disaster Recovery at scale, such as managing Elastic Disaster Recovery launch templates at scale. Explore the drs-tools GitHub repository, which includes a host of solutions that can help you enhance your Elastic Disaster Recovery deployments and achieve better outcomes from them.

Feel free to post any questions or comments below.