AWS Storage Blog

Delete multiple AWS Backup recovery points using AWS Tools for PowerShell

Backing up data is an essential part of an enterprise data protection strategy, whether organizations need to comply with regulations or protect against ransomware attacks. Managing your backups is as important as taking them. Part of backup management is preventing stale “backups” (also known as “recovery points”). Stale backups can include recovery points older than the required retention period, or recovery points for a resource that no longer exists. Timely management is crucial, as it ensures that administrators don’t spend time testing and validating backups unnecessarily.

During the management process, backup administrators might need to clean up hundreds or thousands of backup recovery points, or more, depending on the size of the infrastructure. This is often cumbersome, tedious, and time-consuming work.

Let’s walk through how to accomplish proper management via AWS Backup. To avoid unbounded growth of stale recovery points, a properly configured backup policy in AWS Backup is key. For example, the Amazon EBS storage for Amazon EC2 recovery points can grow quickly if the backup retention period is set to Always.

Fortunately, that configuration is easy to revise and update without a need for repeating the process of creating a backup plan. However, changing the retention period of a backup plan is not retroactive and applies only to the new recovery points. In this case, all recovery points taken previously will continue to follow the previous retention they inherited from the previous plan version.

Earlier this month, AWS Backup launched support for batch operations on backups. You can now delete multiple recovery points via the AWS Backup console. Customers who are looking to streamline their data protection processes can use this functionality to take action on multiple backups at the same time. Moreover, a bulk deletion of multiple recovery points based on specific criteria, such as creation date or a specific AWS resource, is easier to perform at scale in an automated fashion with the help of AWS Tools for PowerShell.

In this blog, I cover how to use AWS Tools for PowerShell to automate different use cases of cleaning up AWS Backup recovery points. The examples in this post illustrate working with recovery points for Amazon EC2 instances, Amazon RDS, Amazon DynamoDB tables, and Amazon FSx, but they are still applicable to recovery points for other AWS resources supported by AWS Backup.

AWS Backup basics

Before I jump into the use cases, I want to lay the foundation with a quick refresher on the basic concepts of AWS Backup to avoid any confusion over terminology. Customers use AWS Backup to centralize and automate data protection across AWS services. AWS Backup also supports backups across multiple accounts within AWS Organizations, as well as cross-Region and cross-account backups.

The core elements of AWS Backup are:

  • Backup vault: The container in which backups are stored. In the vault settings, you can select an AWS KMS encryption key to encrypt the backup file. You can also assign resource tags.
  • Backup plan: The policy that defines how AWS Backup should back up AWS resources, such as which resources to back up, at what backup frequency, and for what retention period. A backup plan contains a backup rule, which is the subset of the configuration that defines backup schedules, backup windows, and lifecycle rules.
  • Backup policies: The backup plan that enables, activates, and configures AWS Backup across AWS Organizations. A backup policy has almost the same configuration attributes as a backup plan. Note that the term backup plan refers to the backup configuration at the AWS account level, while the term backup policy refers to the backup configuration at the AWS organization level.
  • Recovery points: Each recovery point represents a successful backup of an AWS resource. Each recovery point has a unique ID, which can be used to manage multiple recovery points. The recovery point ID prefix varies based on the AWS resource type. For example, the recovery point ID for Amazon EC2 instances starts with image followed by the image ID (for example, image/ami-0ecdf967356c809c7). For Amazon EBS volumes, the recovery point ID starts with snapshot followed by the snapshot ID (for example, snapshot/snap-05f426fd8kdjb4224). For Amazon DynamoDB tables, the recovery point ID starts with table followed by the table name, followed by the backup ID (for example, table/MyDynamoDBTable/backup/01547087347000-c8b6kdk3). The AWS Backup recovery points documentation has a table that lists the AWS resource types that AWS Backup supports and provides examples of their corresponding recovery point IDs.
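
The prefix rule described for recovery point IDs can be sketched as a small helper. This is only an illustration: Get-RecoveryPointKind is a hypothetical function (it is not part of AWS Tools for PowerShell), the labels it returns are my own, and it covers only the three prefixes mentioned above.

```powershell
# Hypothetical helper, not an AWS cmdlet: guess the resource kind from the
# recovery point ID prefixes described in this post (image, snapshot, table).
function Get-RecoveryPointKind {
    param([Parameter(Mandatory)][string]$RecoveryPointId)

    switch -Regex ($RecoveryPointId) {
        '^image/'    { return 'EC2 instance' }
        '^snapshot/' { return 'EBS volume' }
        '^table/'    { return 'DynamoDB table' }
        default      { return 'Other' }
    }
}

# Example IDs taken from the list above
Get-RecoveryPointKind -RecoveryPointId 'image/ami-0ecdf967356c809c7'
Get-RecoveryPointKind -RecoveryPointId 'table/MyDynamoDBTable/backup/01547087347000-c8b6kdk3'
```

For other resource types, the AWS Backup recovery points documentation remains the reference for the exact ID formats.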

Prerequisites

To run the commands in AWS Tools for PowerShell outlined in this post, you need the following:

  • An AWS account.
  • An AWS Identity and Access Management (IAM) user with:
    • Permissions to manage AWS Backup operations. If your IAM user does not have full administrator access, then you can assign one of the AWS IAM managed policies explained in the AWS Backup access control documentation. We recommend using the AWSBackupFullAccess IAM managed policy to run the use cases and scripts explained in this post.
    • An active access key ID and secret access key.
  • AWS Backup configured with at least two recovery points, and one of them with the retention period set to Always. If you don’t have any available recovery points, you can create new recovery points quickly using the on-demand backup option.
  • AWS Tools for PowerShell.

Also, make sure that you have installed and configured AWS Tools for PowerShell before starting.
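
As a rough sketch of that setup, the following commands install the AWS Backup module and store the IAM user's access key in a named profile. The profile name backup-admin and the key placeholders are assumptions for illustration only; substitute your own values, and adjust the Region to the one you work in.

```powershell
# Install only the AWS Backup module from the modular AWS.Tools family
Install-Module -Name AWS.Tools.Backup -Scope CurrentUser

# Store the IAM user's access key in a named profile.
# 'backup-admin' is a hypothetical profile name; the key values are placeholders.
Set-AWSCredential -AccessKey '<access-key-id>' -SecretKey '<secret-access-key>' -StoreAs 'backup-admin'

# Use that profile and a default Region for the rest of the session
Set-AWSCredential -ProfileName 'backup-admin'
Set-DefaultAWSRegion -Region 'eu-west-2'
```

With a default Region set for the session, the -Region parameter in the later examples becomes optional, although the examples keep it for clarity.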

Getting started

The PowerShell module for AWS Backup has 50 cmdlets to automate different AWS Backup tasks, such as creating, modifying, and removing backup vaults, backup plans, backup jobs, and recovery points.

The main cmdlets I demonstrate in this tutorial are:

  • The Get-BAKBackupVaultList cmdlet to list all the available backup vaults under AWS Backup in a specific Region.
  • The Get-BAKRecoveryPointsByBackupVaultList cmdlet to list the recovery points stored in a specific backup vault.
  • The Remove-BAKRecoveryPoint cmdlet to remove recovery points.
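
To explore what is available on your own machine, a quick sketch like the following can help. It assumes the modular AWS.Tools.Backup module is installed; the number of cmdlets it reports depends on the installed version and may differ from the figure quoted above.

```powershell
# Assumes the modular AWS.Tools.Backup module is installed; the monolithic
# AWSPowerShell.NetCore and AWSPowerShell modules use a different module name.
$backupCmdlets = Get-Command -Module AWS.Tools.Backup

# How many AWS Backup cmdlets this installed version provides
$backupCmdlets.Count

# Only the cmdlets that deal with recovery points
$backupCmdlets | Where-Object Name -like '*RecoveryPoint*' | Select-Object Name

# Parameter reference for the removal cmdlet
Get-Help Remove-BAKRecoveryPoint -Parameter *
```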

The following scenarios demonstrate several use cases to use AWS Backup cmdlets to clean up your backup recovery points.

Removing recovery points with an infinite retention period

The objective of this scenario is to delete all recovery points that have an infinite retention period, in other words, the recovery points that have no expiry date. To identify those recovery points, look up the value of the recovery point lifecycle. The backup lifecycle contains information on the number of days before a recovery point is deleted or moved to cold storage.

If a recovery point has a defined retention period, then the following command’s output would show a number of days before the transition; otherwise, the lifecycle value is empty (null).

The first command lists all of the recovery points stored in the default backup vault of the eu-west-2 (London) Region.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default  -Region eu-west-2 | Out-GridView

The Out-GridView cmdlet displays the results in a grid that is easy to filter and sort. In the grid, notice that the CalculatedLifecycle column, which represents a recovery point’s property, is empty for some recovery points. This means that the recovery points have no lifecycle because their retention periods are set to Always.

Screenshot: Out-GridView output listing the recovery points, with an empty CalculatedLifecycle column for some of them.

In the preceding screenshot, the CalculatedLifecycle property shows Amazon.Backup.Model.CalculatedLifecycle rather than the actual lifecycle information. This is because the CalculatedLifecycle property holds a nested object rather than a simple value. To see the values inside such an object, use the Select-Object cmdlet (alias Select) with the -ExpandProperty parameter, as the following command does for the Lifecycle property.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default  -Region eu-west-2 | Select -ExpandProperty Lifecycle

Screenshot: expanded Lifecycle values; for recovery points without a defined retention period, the value is empty (null).

Knowing the possible values of the recovery point lifecycle, you can filter the recovery points list down to those whose Lifecycle property equals null. Then, you can pass that list to the Remove-BAKRecoveryPoint cmdlet to remove them.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 | Where Lifecycle -eq $null | Remove-BAKRecoveryPoint -Region eu-west-2 -Force 

The preceding command finds all the recovery points in the Default backup vault in the eu-west-2 (London) Region that have no lifecycle information and deletes them. The -Force parameter suppresses the confirmation prompt, so review the list before you run it.
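
Because the deletion cannot be undone, it can be worth previewing the candidates first. The following is a minimal sketch of the same null-Lifecycle filter, written so that it runs without an AWS connection: Select-InfiniteRetention is a hypothetical helper, and the sample objects and their ARNs are made up to stand in for real recovery points.

```powershell
# Hypothetical helper: keep only recovery points whose Lifecycle is empty,
# which this post treats as "retention period set to Always".
function Select-InfiniteRetention {
    param([Parameter(ValueFromPipeline)]$RecoveryPoint)
    process {
        if ($null -eq $RecoveryPoint.Lifecycle) { $RecoveryPoint }
    }
}

# Stand-in objects so the sketch runs offline; the ARNs are made up
$sample = @(
    [pscustomobject]@{ RecoveryPointArn = 'arn:example:rp-1'; Lifecycle = $null }
    [pscustomobject]@{ RecoveryPointArn = 'arn:example:rp-2'; Lifecycle = [pscustomobject]@{ DeleteAfterDays = 35 } }
)

# Review this list before deleting anything
$candidates = @($sample | Select-InfiniteRetention)
$candidates | Select-Object RecoveryPointArn
```

Against a real vault, you would replace $sample with the output of Get-BAKRecoveryPointsByBackupVaultList and pipe $candidates to Remove-BAKRecoveryPoint only after checking the list.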

Removing recovery points based on creation date

Another common use case is deleting recovery points created within a specific time range, for instance, all recovery points created before or after a specific date. It can also be combined with the preceding scenario to delete recovery points with an infinite retention period that were created before a specific date. To achieve this goal, use the Get-BAKRecoveryPointsByBackupVaultList cmdlet with the ByCreatedBefore and ByCreatedAfter parameters.

Both parameters accept DateTime values, which you can supply in several ways.

The first option is passing a specific date to ByCreatedBefore or ByCreatedAfter. For example, the following command lists all recovery points created after January 15, 2021:

$date = [DateTime]"15-JAN-2021"
Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 -ByCreatedAfter $date

Screenshot: recovery points created after January 15, 2021.

Alternatively, you can pass a date computed from a number of past days, months, or years. The following command lists all recovery points created more than 14 days ago:

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 -ByCreatedBefore (Get-Date).AddDays(-14)

Screenshot: recovery points created more than 14 days ago.

To complete the task and remove the retrieved list of recovery points, pass the results to the Remove-BAKRecoveryPoint cmdlet using the pipeline operator.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 -ByCreatedBefore (Get-Date).AddDays(-14) | Remove-BAKRecoveryPoint -Region eu-west-2 -Force

Additionally, you can combine the ByCreatedBefore and ByCreatedAfter parameters in a single command to target a narrower range of creation dates. The following command lists only the recovery points created on February 22, 2021.

# Window covering February 22, 2021 only: from midnight on the 22nd up to midnight on the 23rd
$beforeDate = [DateTime]"02/23/2021"
$afterDate = [DateTime]"02/22/2021"
 
Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName EC2 -Region eu-west-2 -ByCreatedAfter $afterDate -ByCreatedBefore $beforeDate | select CreationDate, ResourceType, ResourceArn

Screenshot: recovery points created on February 22, 2021.
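
When you target a single day, it is easy to get the boundaries off by one, because a DateTime created from a date alone points at midnight at the start of that day. A small sketch of a helper that computes both boundaries is shown below. Get-DayWindow is a hypothetical function, not an AWS cmdlet, and how the service treats the exact boundary instants and time zones is something to verify in your own account.

```powershell
# Hypothetical helper: compute the two boundaries that enclose a single day.
function Get-DayWindow {
    param([Parameter(Mandatory)][DateTime]$Day)

    # .Date drops the time portion, leaving midnight at the start of the day
    $start = $Day.Date
    [pscustomobject]@{
        CreatedAfter  = $start
        CreatedBefore = $start.AddDays(1)
    }
}

$window = Get-DayWindow -Day ([DateTime]::new(2021, 2, 22))
$window
```

The two values can then be passed to the cmdlet used in this post, for example: Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName EC2 -Region eu-west-2 -ByCreatedAfter $window.CreatedAfter -ByCreatedBefore $window.CreatedBefore.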

Removing recovery points for a specific resource

When AWS resources (for example, Amazon EC2 instances) are removed, their backup recovery points remain in the backup vault, so you can restore a resource that was deleted by mistake. However, in some valid scenarios, you may no longer need the resource or its recovery points.

In that case, use the ByResourceArn parameter to get the list of recovery points associated with a specific AWS resource, then delete them. Here, “resource” means the source Amazon EC2 instance, Amazon DynamoDB table, or any other AWS resource supported by AWS Backup from which the recovery point was taken. If you know the resource’s ARN, pass it to the ByResourceArn parameter as in the following example.

$resourceArn = "arn:aws:dynamodb:eu-west-2:111111222222:table/AWSStorageBlogs"

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 -ByResourceArn $resourceArn | Remove-BAKRecoveryPoint -Region eu-west-2 -Force

An easy way to list all AWS resources linked to recovery points is to list all the recovery points in a specific vault and then group the results by the ResourceArn property. The ResourceArn value ends with the AWS resource identifier. For instance, the ResourceArn for an Amazon EC2 instance ends with an instance ID, for Amazon RDS it ends with a database name, for Amazon FSx it ends with the Amazon FSx file system ID, and so on.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 | group ResourceArn | select Name

Screenshot: recovery points grouped by the ResourceArn property.
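
Grouping also gives you a count per resource, which helps you spot the resources that hold the most recovery points. The following sketch shows the idea on stand-in objects so that it runs without an AWS connection; the EC2 instance ARN is made up for illustration.

```powershell
# Stand-in recovery points so the sketch runs offline; the instance ARN is made up
$sample = @(
    [pscustomobject]@{ ResourceArn = 'arn:aws:dynamodb:eu-west-2:111111222222:table/AWSStorageBlogs' }
    [pscustomobject]@{ ResourceArn = 'arn:aws:ec2:eu-west-2:111111222222:instance/i-0123456789abcdef0' }
    [pscustomobject]@{ ResourceArn = 'arn:aws:dynamodb:eu-west-2:111111222222:table/AWSStorageBlogs' }
)

# One row per resource, with the number of recovery points it owns, largest first
$summary = $sample |
    Group-Object ResourceArn |
    Sort-Object Count -Descending |
    Select-Object Count, Name

$summary
```

Against a real vault, you would replace $sample with the output of Get-BAKRecoveryPointsByBackupVaultList.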

You can also filter the list of recovery points by looking for a ResourceArn value that contains the identifier of an AWS resource. For example, the command to find and remove the recovery points for the Amazon DynamoDB table called “AWSStorageBlogs” would look like the following:

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 | Where ResourceArn -like '*AWSStorageBlogs*' | Remove-BAKRecoveryPoint -Region eu-west-2 -Force

Screenshot: recovery points whose ResourceArn contains the identifier of an AWS resource.

Similar to the ByResourceArn parameter, the Get-BAKRecoveryPointsByBackupVaultList cmdlet has a ByResourceType parameter to list recovery points based on the AWS resource type (for example, Amazon EC2, Amazon RDS, Amazon FSx, or Amazon DynamoDB). This parameter is useful when you want to remove all recovery points related to a specific resource type. For example, suppose you have recently modernized and re-architected a legacy application and migrated it from Amazon EC2 to AWS Lambda. In that case, you no longer need the Amazon EC2 instances or their recovery points. Accordingly, you can remove any recovery points related to any Amazon EC2 instance, irrespective of the ResourceArn. The following command (and screenshot) shows the AWS resource types that are being protected by AWS Backup recovery points, and you can see there are Amazon EC2, Amazon RDS, Amazon FSx, and DynamoDB resources:

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 | group ResourceType 

Screenshot: recovery points grouped by resource type (Amazon EC2, Amazon RDS, Amazon FSx, and DynamoDB).

Let’s say I want to remove all the recovery points that belong to EC2 instances. I can use the following command to list all the existing recovery points, filter that list based on the ResourceType property, which is EC2 in this case, and finally remove them.

Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName Default -Region eu-west-2 | Where ResourceType -eq 'EC2' | Remove-BAKRecoveryPoint -Region eu-west-2 -Force

Removing recovery points in multiple backup vaults

The preceding three use cases assume that all the recovery points are stored in the default backup vault. But what if there are multiple backup vaults? The following code shows how to remove recovery points with an infinite retention period that are stored across multiple backup vaults. First, I get the list of all available backup vaults using the Get-BAKBackupVaultList cmdlet to see whether I have any vaults other than the default. The following screenshot illustrates the command and its results (I have the default backup vault and two other vaults):

Get-BAKBackupVaultList -Region eu-west-2 

Screenshot: output of Get-BAKBackupVaultList listing the default backup vault and two other vaults.

To remove the recovery points that have an infinite retention period from every vault, first store the list of backup vaults returned by the Get-BAKBackupVaultList cmdlet. Then, iterate through the list using a ForEach loop and, inside the loop, use the Remove-BAKRecoveryPoint cmdlet to remove the matching recovery points in each vault.

$BackupVaults = Get-BAKBackupVaultList -Region eu-west-2

# Remove the recovery points with no lifecycle from every vault in the Region
foreach ($vault in $BackupVaults)
{
    Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName $vault.BackupVaultName -Region eu-west-2 | Where Lifecycle -eq $null | Remove-BAKRecoveryPoint -Region eu-west-2 -Force
}
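
Before running a loop that deletes across every vault, a per-vault count of the candidates can serve as a sanity check. The following is a sketch under stated assumptions: Get-InfiniteRetentionByVault is a hypothetical wrapper, and its ListPoints parameter exists only so that the logic can be tried with fake data. By default it calls the same Get-BAKRecoveryPointsByBackupVaultList cmdlet used throughout this post.

```powershell
# Hypothetical wrapper: count the recovery points with no lifecycle in each vault.
# $ListPoints is injectable so the logic can be exercised without AWS.
function Get-InfiniteRetentionByVault {
    param(
        [Parameter(Mandatory)][string[]]$VaultName,
        [string]$Region = 'eu-west-2',
        [scriptblock]$ListPoints = {
            param($Vault, $Region)
            Get-BAKRecoveryPointsByBackupVaultList -BackupVaultName $Vault -Region $Region
        }
    )

    foreach ($name in $VaultName) {
        # Same filter as in the loop above: an empty Lifecycle means "Always"
        $points = @(& $ListPoints $name $Region | Where-Object { $null -eq $_.Lifecycle })
        [pscustomobject]@{ BackupVaultName = $name; Candidates = $points.Count }
    }
}

# Offline demonstration with a fake lister that returns made-up recovery points
$fakeLister = {
    param($Vault, $Region)
    if ($Vault -eq 'Default') {
        [pscustomobject]@{ Lifecycle = $null }
        [pscustomobject]@{ Lifecycle = [pscustomobject]@{ DeleteAfterDays = 7 } }
    }
}

$report = Get-InfiniteRetentionByVault -VaultName 'Default', 'EC2' -ListPoints $fakeLister
$report
```

Against a real account, you would omit -ListPoints and pass the vault names from Get-BAKBackupVaultList, for example (Get-BAKBackupVaultList -Region eu-west-2).BackupVaultName.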

Cleaning up

After completing the commands outlined in this post, you may want to deactivate and remove the access key you created earlier for the IAM user to avoid unintended use of the credentials and any charges that could result.

For more information about managing access keys via the AWS Management Console, refer to the IAM documentation.

Conclusion

In this blog post, I covered using AWS Tools for PowerShell to automate the bulk deletion of recovery points. I reviewed three different use cases to delete recovery points based on creation date, retention period settings, and associated resources. In addition to the cmdlets I used in the examples, the AWS Backup module for PowerShell provides around 50 cmdlets to automate various administrative tasks of the AWS Backup service.

By using AWS Tools for PowerShell to manage AWS Backup, customers no longer have to spend time and effort cleaning up recovery points. This greatly reduces management overhead, enabling administrators to focus on the other elements of their business and infrastructure, ultimately saving time and money.

To learn more about AWS Backup and AWS Tools for PowerShell, check out the following resources:

Thanks for reading this blog post. If you have any comments or questions, please don’t hesitate to leave a comment in the comments section.

Sherif Talaat

Sherif is a Senior Solutions Architect at AWS based in Saudi Arabia. He supports AWS customers in their journey to modernize, transform, and migrate on-premises workloads to AWS.