How to restore archived Amazon EC2 backup recovery points from the Amazon S3 Glacier storage classes

This is the second post in a two-part series. In part one, we described a process to automatically archive Amazon EC2 backup recovery points from AWS Backup to an Amazon S3 bucket in one of the Amazon S3 Glacier storage classes. In this post, we describe the process to restore an archived EC2 backup recovery point from any of the Amazon S3 Glacier storage classes. Once restored, an Amazon Machine Image (AMI) will be available for use in an AWS Region.

Technical details on restore times

In this section, we discuss the different scenarios and technical details you need to know before performing a restore operation.

The scenarios are:

Restore archived EC2 backup recovery points from the Amazon S3 Glacier Instant Retrieval storage class.
Restore archived EC2 backup recovery points from the Amazon S3 Glacier Flexible Retrieval storage class.
Restore archived EC2 backup recovery points from the Amazon S3 Glacier Deep Archive storage class.

Amazon S3 objects stored in the Amazon S3 Glacier Instant Retrieval storage class can be retrieved in milliseconds. S3 Glacier Instant Retrieval delivers the fastest access to archive storage, with the same throughput and millisecond access as the S3 Standard and S3 Standard-Infrequent Access (S3 Standard-IA) storage classes.

Amazon S3 objects that are stored in the Amazon S3 Glacier Flexible Retrieval or Amazon S3 Glacier Deep Archive storage classes are not immediately accessible. To access an object in these storage classes, you must first restore a temporary copy for a specified duration (number of days) using the RestoreObject API. The temporary copy of the object (stored in the Amazon S3 Standard storage class) is available alongside the archived object that’s in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage class. When you restore an archived object, you are paying for both the archive and a copy that you restored temporarily for a specified amount of time. For information about pricing, see Amazon S3 pricing.
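The restore request described above can be sketched in Python as follows. This is a minimal illustration, not the code deployed by the solution: the helper names `build_restore_request` and `request_temporary_copy` are hypothetical, and `s3_client` is assumed to be a Boto3 S3 client whose `restore_object` method maps to the RestoreObject API.

```python
def build_restore_request(days: int = 1, tier: str = "Standard") -> dict:
    """Build the RestoreRequest body for the S3 RestoreObject API."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def request_temporary_copy(s3_client, bucket: str, key: str,
                           days: int = 1, tier: str = "Standard") -> dict:
    """Start an asynchronous restore of an archived object.

    s3_client is assumed to be a Boto3 S3 client, for example
    boto3.client("s3"). The archived object stays in its Glacier
    storage class; S3 adds a temporary copy that lasts `days` days.
    """
    return s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=build_restore_request(days, tier),
    )
```

The call returns as soon as the request is accepted. The temporary copy becomes readable only after the retrieval job finishes.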

Restored objects in the Amazon S3 Glacier Flexible Retrieval and Amazon S3 Glacier Deep Archive storage classes are stored only for the number of days that you specify. To calculate the expiry date, Amazon S3 adds the number of days that you specify to the time you request to restore the object, and then rounds to the next day at midnight UTC. After the expiry date passes, the temporary copy is removed. Archived objects in the Amazon S3 Glacier Flexible Retrieval storage class have Expedited, Standard, and Bulk retrieval options. Archived objects in the Amazon S3 Glacier Deep Archive storage class have Standard and Bulk retrieval options. Refer to archive retrieval options for more information.
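To make the expiry rule concrete, here is a small sketch of the calculation. The function name `estimate_expiry` is hypothetical, and the code assumes that a shifted time falling exactly on midnight UTC still rounds forward to the following midnight. Amazon S3 computes the real expiry date itself, so treat this only as an estimate.

```python
from datetime import datetime, time, timedelta, timezone


def estimate_expiry(request_time: datetime, days: int) -> datetime:
    """Estimate when the temporary copy of a restored object expires.

    Adds `days` to the restore request time, then rounds up to the
    next midnight UTC, following the rule described above.
    """
    if request_time.tzinfo is None:
        raise ValueError("request_time must be timezone-aware")
    shifted = request_time.astimezone(timezone.utc) + timedelta(days=days)
    next_day = shifted.date() + timedelta(days=1)
    return datetime.combine(next_day, time.min, tzinfo=timezone.utc)
```

For example, a restore requested on January 10 at 15:30 UTC with a duration of 3 days gives an estimated expiry of January 14 at 00:00 UTC.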

Note:

You can restore an AMI only in the AWS Region where the recovery point (AMI) was first stored to an S3 bucket using the CreateStoreImageTask API. Review the documentation for the usage requirements of the CreateRestoreImageTask API.

Solution overview

This solution uses an event-driven architecture built on the following services: Amazon S3, AWS Step Functions, AWS Lambda, AWS IAM, and Amazon SNS. The workflow takes an archived Amazon EC2 backup recovery point as input and performs a series of steps that depend on the Amazon S3 Glacier storage class of the object. To restore an AMI from an archived EC2 recovery point, you start the workflow with an input payload, which is described in the “Test the solution” section later in this post.

If an archived EC2 backup recovery point is stored in the S3 Glacier Instant Retrieval storage class, the workflow will directly perform the CreateRestoreImageTask API action to generate a new AMI. If an archived EC2 backup recovery point belongs to either S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, the workflow will first restore a temporary copy of the archived object and then perform the CreateRestoreImageTask API action to generate a new AMI.

Figure 1: Solution architecture for restoring archived Amazon EC2 backup recovery points from Amazon S3 Glacier storage classes
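The branching on storage class described above can be sketched as follows. The function and constant names are hypothetical, `s3_client` is assumed to be a Boto3 S3 client, and rejecting any storage class outside the three S3 Glacier classes is a choice made for this example rather than documented behavior of the solution.

```python
from typing import List

# Storage class values that S3 reports for the S3 Glacier classes.
RESTORE_FIRST = {"GLACIER", "DEEP_ARCHIVE"}  # Flexible Retrieval, Deep Archive
DIRECT_ACCESS = {"GLACIER_IR"}               # Instant Retrieval


def get_storage_class(s3_client, bucket: str, key: str) -> str:
    """Read the storage class of an object with the HeadObject API."""
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # S3 omits StorageClass for objects in S3 Standard.
    return response.get("StorageClass", "STANDARD")


def plan_api_calls(storage_class: str) -> List[str]:
    """Return the API actions needed to restore an AMI, in order."""
    if storage_class in RESTORE_FIRST:
        return ["RestoreObject", "CreateRestoreImageTask"]
    if storage_class in DIRECT_ACCESS:
        return ["CreateRestoreImageTask"]
    raise ValueError(f"unsupported storage class: {storage_class}")
```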

Workflow steps

You can set up the entire automated workflow, described in this section in detail, using the AWS CloudFormation template provided in the following “Getting started” section in this blog.

1. The user starts an AWS Step Functions execution with an input payload that identifies the S3 bucket and the archived recovery point to restore. Refer to the “Test the solution” section below for the input payload structure. The state machine definition performs a series of steps, which include invoking multiple Lambda functions.

2. In this step, the storage class of the archived recovery point is determined using the S3 Boto3 API.

a. If the S3 Glacier storage class is either S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, then the object is first restored using the RestoreObject API; this is an asynchronous task. A workflow status variable is set to “preparing-for-restore” and the process is monitored using the HeadObject API every 60 seconds until it completes. Once a temporary copy of the object is successfully restored from S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, the workflow status variable is set to “ready-to-restore” and the workflow moves on to the next step (workflow step 3).

b. If the S3 Glacier storage class is S3 Glacier Instant Retrieval, then the workflow status variable value is changed to “ready-to-restore” and the workflow will move on to the next step (workflow step 3).

3. In this step, the EC2 CreateRestoreImageTask API is invoked to create an EC2 AMI from the archived backup object (.bin file). This is an asynchronous task, and it is monitored in step 4.

4. The status of the AMI creation is monitored using the EC2 DescribeImages API. Once the AMI creation is complete, the AMI is available for users to launch EC2 instances.

5. An email notification is sent to the users subscribed to the Amazon SNS topic (configured during solution deployment). The email body contains a summary of the execution and the ID of the restored AMI. If any errors occur during the execution, an email containing the exception details is sent.
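Steps 2a, 3, and 4 can be read as two polling loops. The sketch below flattens them into plain Python for illustration only: in the solution, waiting is handled by the Step Functions state machine rather than by a single long-running function. The function names, the `max_polls` limit, and the list of AMI states treated as failures are assumptions. `s3_client` and `ec2_client` are assumed to be Boto3 clients.

```python
import time


def wait_for_temporary_copy(s3_client, bucket, key, poll_seconds=60,
                            max_polls=3000, sleep=time.sleep):
    """Poll HeadObject until the temporary copy is ready (step 2a)."""
    for _ in range(max_polls):
        restore = s3_client.head_object(Bucket=bucket, Key=key).get("Restore", "")
        # S3 reports ongoing-request="true" while the restore is running
        # and ongoing-request="false" once the temporary copy exists.
        if 'ongoing-request="false"' in restore:
            return "ready-to-restore"
        sleep(poll_seconds)  # status remains "preparing-for-restore"
    raise TimeoutError("temporary copy was not ready in time")


def restore_ami(ec2_client, bucket, key, name, poll_seconds=60,
                max_polls=3000, sleep=time.sleep):
    """Create an AMI from the stored object and wait for it (steps 3 and 4)."""
    task = ec2_client.create_restore_image_task(
        Bucket=bucket, ObjectKey=key, Name=name
    )
    image_id = task["ImageId"]
    for _ in range(max_polls):
        images = ec2_client.describe_images(ImageIds=[image_id])["Images"]
        state = images[0]["State"] if images else "pending"
        if state == "available":
            return image_id
        if state in ("failed", "error", "invalid"):
            raise RuntimeError(f"AMI {image_id} entered state {state}")
        sleep(poll_seconds)
    raise TimeoutError(f"AMI {image_id} was not available in time")
```

Passing `sleep` as a parameter keeps the loops easy to exercise with stub clients, without waiting in real time.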

The following diagram illustrates the AWS Step Functions workflow.

Figure 2: AWS Step Functions state machine definition showing each step involved in the restore process

Getting started

In this section, we cover this solution’s prerequisites and instructions to deploy the solution using AWS CloudFormation.

Prerequisites

This post assumes that you have already deployed the solution provided in part one of this series. You also need the following:

1. Note the following resources created by the part one solution:

a. Name of Amazon S3 bucket

b. Name of the Lambda execution role

2. A user/group email address to receive the status notifications.

3. IAM permissions to create an AWS CloudFormation stack.

Deploy the solution

We created an AWS CloudFormation template that you can launch to deploy the entire solution within minutes in the same AWS Region where you’ve deployed the solution in part one. This template creates the following resources in your account:

Amazon SNS topic to receive workflow status notifications.
AWS IAM Policy for the AWS Lambda functions to publish messages to the SNS topic.
AWS IAM Role for the AWS Step Functions state machine to execute.
AWS Lambda functions to perform the tasks involved.
AWS Step Functions state machine with definition.
Amazon CloudWatch log group to capture the state machine logs.

Create an AWS CloudFormation stack in the account/Region where you’ve deployed the solution provided in part one.

1. Select here to launch the stack. This automatically opens the AWS CloudFormation console with the template loaded. If you’re not logged in to your AWS account, you will be prompted to sign in.
2. Select the Region where you want to deploy this solution.
3. Provide a Stack name and the following parameters, then select the check box I acknowledge that AWS CloudFormation might create IAM resources. When you’re ready, select Create stack.

a. LambdaExecutionRoleName – Name of the Lambda Execution Role created in part one of the blog series. You can retrieve this from the CloudFormation resources.

b. UserEmail – Email address of a User/Group to receive status notifications of the restore workflow.

Figure 3: Configuring stack names and parameters for the CloudFormation stack
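If you prefer to script the deployment instead of using the console, the same two parameters can be passed through the CloudFormation CreateStack API. The sketch below is an assumption-laden illustration: the helper names are hypothetical, `cfn_client` is assumed to be a Boto3 CloudFormation client, `template_url` must be supplied by you because this post does not state the template location, and `CAPABILITY_NAMED_IAM` is assumed to be the capability that corresponds to the IAM acknowledgment check box.

```python
def build_stack_parameters(lambda_execution_role_name: str, user_email: str) -> list:
    """Build the Parameters list for the two stack parameters above."""
    return [
        {"ParameterKey": "LambdaExecutionRoleName",
         "ParameterValue": lambda_execution_role_name},
        {"ParameterKey": "UserEmail", "ParameterValue": user_email},
    ]


def create_restore_stack(cfn_client, stack_name: str, template_url: str,
                         lambda_execution_role_name: str, user_email: str) -> dict:
    """Create the stack with a Boto3 CloudFormation client.

    template_url is a placeholder that the caller must provide.
    """
    return cfn_client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        Parameters=build_stack_parameters(lambda_execution_role_name, user_email),
        # Assumption: acknowledges that the template creates IAM resources.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
```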

Test the solution

The solution runs on demand, whenever you need to restore an AMI from an archived EC2 backup recovery point. To start it, navigate to the AWS Step Functions console, select the state machine that was deployed in the previous section, and select Start execution. The execution time of the workflow depends on the S3 Glacier storage class of the archived recovery point.

Figure 4: Start an execution of AWS Step Functions state machine

You will be prompted to enter values for the execution in JSON format as shown below.

Figure 5: Input values in the JSON format for the execution

Sample payload:

{
    "recoveryPoint": "Enter the full S3 object key",
    "recoveryPointBucket": "Enter the name of the S3 bucket",
    "restoreRequestDays": "Enter in days the lifetime of the active copy",
    "restoreRequestTier": "Enter the retrieval tier – Standard or Bulk or Expedited"
}

recoveryPoint	The complete object key value of an archived EC2 backup recovery point.
recoveryPointBucket	The name of the S3 bucket created in part one of the blog series to store the archived EC2 backup recovery points. You can get the name from the CloudFormation resources.
restoreRequestDays	Integer value. Lifetime of the active copy in days. Default value is set to 1.
restoreRequestTier	Retrieval tier at which the restore will be processed. Allowed values are Standard, Bulk, or Expedited.

Note that the Expedited retrieval tier is not available for the S3 Glacier Deep Archive storage class. Refer to archive retrieval options for more information.
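As an alternative to the console, the payload can be validated and submitted programmatically. The following sketch uses hypothetical helper names and assumes `sfn_client` is a Boto3 Step Functions client. It also assumes that `restoreRequestDays` is sent as a JSON number, following the field description above. Adjust this if the deployed Lambda functions expect a string. The optional `storage_class` argument is not part of the payload. It exists only so that the Deep Archive restriction can be checked before the execution starts.

```python
import json

ALLOWED_TIERS = ("Standard", "Bulk", "Expedited")


def build_execution_input(recovery_point: str, bucket: str, days: int = 1,
                          tier: str = "Standard", storage_class: str = None) -> str:
    """Validate the fields and return the execution input as JSON text."""
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"tier must be one of {ALLOWED_TIERS}")
    if not isinstance(days, int) or days < 1:
        raise ValueError("days must be a positive integer")
    if storage_class == "DEEP_ARCHIVE" and tier == "Expedited":
        raise ValueError("Expedited is not available for S3 Glacier Deep Archive")
    return json.dumps({
        "recoveryPoint": recovery_point,
        "recoveryPointBucket": bucket,
        "restoreRequestDays": days,
        "restoreRequestTier": tier,
    })


def start_restore(sfn_client, state_machine_arn: str, execution_input: str) -> str:
    """Start the state machine and return the execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=execution_input,
    )
    return response["executionArn"]
```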

Validate the solution

Upon successful completion, the workflow sends you an SNS notification with the ID of the newly restored AMI and the AWS Region it is available in, as shown below. You can locate the AMI by navigating to the EC2 console and searching for the “Restored AMI ID” you received in the email from SNS.

Figure 6: Sample email a user/group receives from the SNS topic on the restore workflow status

Cleaning up

To delete the resources that were created through this solution, navigate to the CloudFormation console, select the stack, and delete it. All resources provisioned through this CloudFormation stack will be deleted. The AMIs restored through the restore workflow will not be removed. To remove them, deregister the AMIs from the EC2 console.

Figure 7: Deleting a CloudFormation stack
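The same cleanup can be scripted. In this sketch, the function name `clean_up` is hypothetical, and `cfn_client` and `ec2_client` are assumed to be Boto3 CloudFormation and EC2 clients. The list of restored AMI IDs must come from you, for example from the SNS notification emails.

```python
def clean_up(cfn_client, ec2_client, stack_name: str, restored_ami_ids=()) -> None:
    """Delete the solution stack and deregister restored AMIs."""
    # Stack deletion is asynchronous; this call only starts it.
    cfn_client.delete_stack(StackName=stack_name)
    # Restored AMIs are not part of the stack, so remove them separately.
    for ami_id in restored_ami_ids:
        ec2_client.deregister_image(ImageId=ami_id)
```

Deregistering an AMI does not delete the Amazon EBS snapshots behind it, so review those separately if you want to remove them as well.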

Recommendations

The archival and restore process described in both parts of this series is based on a single AWS Backup ‘backup vault’ within an AWS Region. Currently, if you have multiple backup vaults, you may need to deploy the solution once for each backup vault. We encourage you to customize and extend the solution, for example by using tags, so that it supports multiple backup vaults without redeploying the entire solution.

Conclusion

In this post, we started with a brief recap of archiving Amazon EC2 backup recovery points, which we discussed in part one of the blog series. Next, we built a workflow to restore an archived EC2 backup recovery point from an Amazon S3 Glacier storage class. Then, we walked through the different cases that arise from the different Amazon S3 Glacier storage classes. This two-part blog series helps you retain your AWS Backup recovery points (Amazon EC2 backups in this case) for a longer period of time in Amazon S3 by storing them in the long-term, secure, durable, and low-cost Amazon S3 Glacier storage classes.

Thanks for reading this blog post! Don’t hesitate to leave your feedback or comments in the comments section.