AWS Storage Blog
Manage costs for replicated delete markers in a disaster recovery setup on Amazon S3
Many businesses recognize the critical importance of safeguarding their essential data from potential disasters such as fires, floods, or ransomware events. Designing an effective disaster recovery (DR) strategy includes thoughtfully evaluating and selecting cost-effective solutions that fulfill compliance requirements.
By using Amazon S3 features such as S3 object tags, S3 Versioning, and S3 Lifecycle, you can design a cost-effective solution to preserve, retrieve, and restore data and meet compliance requirements. Additionally, you can use S3 Replication to replicate data, including delete markers, to a secondary AWS Region, which can help quickly recover and restore data in case of accidental deletion. Source and destination Regions may also retain delete markers for different retention periods and help you restore data quickly if necessary. While replicating delete markers, users should manage their data lifecycle to control associated costs.
In this post, we cover enabling S3 Versioning and applying object tags to versioned objects, then using S3 Replication to keep the source and destination Region buckets synchronized with the same delete markers when an object is deleted. We then use the object tags with S3 Lifecycle to manage deleted objects and automate their cleanup at different specified intervals in the source and destination Regions. This approach helps protect your data by retaining delete markers and object versions in both the source and destination Regions, while also automating the cleanup process to save cost.
Solution overview
Replicating objects requires enabling S3 Versioning on both buckets, and it is important to set Lifecycle policies to delete older versions so that they do not keep accruing storage costs. When an object is deleted or its current version expires, a delete marker is added instead of the data being removed, which lets you restore the object after an accidental deletion. As part of the S3 Replication rule, we replicate the delete marker to keep the source and destination buckets synchronized. We tag the objects in the source and destination buckets, and then set Lifecycle policies to delete the object versions and delete markers.
At a high level, the solution works as follows:
1. A user or application deletes the object(s) from the source bucket in the us-east-1 Region. This creates a delete marker for the deleted object(s).
2. S3 Replication replicates the delete marker from the source bucket into the destination bucket in the us-west-2 Region.
3. The S3 Lifecycle rules in the source and destination buckets clean up the deleted objects in the following sequence:
a. Non-current versions of the deleted object(s) are permanently deleted after the specific time configured in the S3 Lifecycle rule.
b. All expired object delete markers are permanently deleted, thereby cleaning up the source and destination buckets.
Figure 1: S3 Lifecycle rule to manage delete marker
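To make the sequence above concrete, the following is a minimal sketch that models the version stack of a single object key in plain Python. It is a simplified illustration and not how Amazon S3 is implemented: the function names, the dictionary layout, and the idea of running both cleanup steps in one pass are assumptions made for this example.

```python
def _push(versions, new_version):
    """Put new_version on top of the stack; the old current version becomes noncurrent."""
    demoted = [
        dict(v, noncurrent_days=0) if v["noncurrent_days"] is None else v
        for v in versions
    ]
    return [new_version] + demoted


def put_object(versions, version_id):
    """An upload adds a new current version (noncurrent_days=None means current)."""
    return _push(versions, {"id": version_id, "delete_marker": False, "noncurrent_days": None})


def delete_object(versions, marker_id):
    """A delete without a version ID adds a delete marker as the current version."""
    return _push(versions, {"id": marker_id, "delete_marker": True, "noncurrent_days": None})


def advance_days(versions, days):
    """Age every noncurrent version by the given number of days."""
    return [
        v if v["noncurrent_days"] is None else dict(v, noncurrent_days=v["noncurrent_days"] + days)
        for v in versions
    ]


def run_lifecycle(versions, noncurrent_days):
    # Step a: permanently delete noncurrent versions that are old enough.
    kept = [
        v for v in versions
        if v["noncurrent_days"] is None or v["noncurrent_days"] < noncurrent_days
    ]
    # Step b: a delete marker with no versions left behind it is "expired" and is removed.
    if len(kept) == 1 and kept[0]["delete_marker"]:
        return []
    return kept


# Two uploads followed by a delete, then a lifecycle pass one day later.
history = delete_object(put_object(put_object([], "v1"), "v2"), "dm1")
after_one_day = run_lifecycle(advance_days(history, 1), noncurrent_days=1)
```

In this model the delete marker can only be removed after every non-current version behind it is gone, which is why the Lifecycle rules in this post pair the two actions.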
Prerequisites
Review the replication requirements to replicate objects to a different bucket.
Walkthrough
The following steps walk you through this solution:
1. Enable S3 Versioning on objects to retrieve them in case of accidental deletion.
2. Apply S3 object tags on those objects.
3. Set up an S3 Replication rule to replicate these objects to a different bucket.
4. Use the object tags in S3 Lifecycle rules to manage the storage lifecycle of the versioned objects and to manage deleted objects.
5. Finally, test the source and destination buckets for object cleanup.
Step 1: Enable S3 Versioning
To set up object replication, one of the requirements is to set up versioning on the bucket. Set up S3 Versioning on the bucket by following the directions in the S3 User Guide.
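If you prefer to script this step instead of using the console, the following is a minimal sketch that assumes you use the AWS SDK for Python (Boto3). The helper names are hypothetical; `put_bucket_versioning` is the Boto3 S3 client call, and the client is passed in so the request builder can be inspected without AWS credentials.

```python
def versioning_request(bucket, status="Enabled"):
    """Build the keyword arguments for the S3 client's put_bucket_versioning call."""
    if status not in ("Enabled", "Suspended"):
        raise ValueError("status must be 'Enabled' or 'Suspended'")
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": status}}


def enable_versioning(s3_client, buckets):
    """Enable versioning on each bucket; s3_client is assumed to be boto3.client('s3')."""
    for bucket in buckets:
        s3_client.put_bucket_versioning(**versioning_request(bucket))
```

Replication needs versioning on both the source and the destination bucket, so you would call `enable_versioning` with both bucket names, possibly with a separate client for each Region.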
Step 2: Apply S3 object tags
Tag the objects so that you have tags to use in your S3 Lifecycle rules. You can tag individual objects by following the directions in the S3 User Guide.
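As a sketch of how tagging could be scripted, the following example builds the request for the Boto3 `put_object_tagging` call. The helper names are hypothetical, and the tag shown in the usage note matches the `mark-for-delete` tag used later in this post.

```python
def tagging_request(bucket, key, tags):
    """Build the keyword arguments for the S3 client's put_object_tagging call."""
    tag_set = [{"Key": k, "Value": v} for k, v in sorted(tags.items())]
    return {"Bucket": bucket, "Key": key, "Tagging": {"TagSet": tag_set}}


def tag_objects(s3_client, bucket, keys, tags):
    """Tag each object; s3_client is assumed to be boto3.client('s3')."""
    for key in keys:
        # put_object_tagging replaces the object's whole tag set, so include
        # any existing tags that you want to keep.
        s3_client.put_object_tagging(**tagging_request(bucket, key, tags))
```

For example, `tag_objects(client, "example-source-bucket", ["reports/q1.csv"], {"mark-for-delete": "true"})` would tag one object, where the bucket and key names are placeholders.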
Step 3: Replicate objects
To replicate S3 objects, start by following the directions in the S3 User Guide. Additionally, to meet compliance or business requirements for data replication, you can enable S3 Replication Time Control (RTC). This will replicate most objects that you upload to Amazon S3 in seconds, and 99.99 percent of those objects within 15 minutes. Make sure to enable the replication of delete markers and metadata changes to the destination bucket, as shown in Figure 2.
Figure 2: Enabling S3 Replication in the source bucket with options that replicate the delete marker and metadata changes
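The console settings in Figure 2 correspond roughly to the replication configuration sketched below for the Boto3 `put_bucket_replication` call. This is an illustration under stated assumptions: the rule ID, role ARN, and bucket ARN are placeholders, and we assume that the "metadata changes" option maps to the `ReplicaModifications` setting.

```python
def replication_configuration(role_arn, destination_bucket_arn, enable_rtc=False):
    """Build a ReplicationConfiguration that replicates all objects and delete markers."""
    destination = {"Bucket": destination_bucket_arn}
    if enable_rtc:
        # S3 Replication Time Control uses a 15-minute threshold and requires metrics.
        destination["ReplicationTime"] = {"Status": "Enabled", "Time": {"Minutes": 15}}
        destination["Metrics"] = {"Status": "Enabled", "EventThreshold": {"Minutes": 15}}
    rule = {
        "ID": "replicate-all-with-delete-markers",  # hypothetical rule name
        "Priority": 1,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: the rule covers the whole bucket
        "DeleteMarkerReplication": {"Status": "Enabled"},
        # Assumption: this is the API counterpart of the "metadata changes" option.
        "SourceSelectionCriteria": {"ReplicaModifications": {"Status": "Enabled"}},
        "Destination": destination,
    }
    return {"Role": role_arn, "Rules": [rule]}


def apply_replication(s3_client, source_bucket, configuration):
    """Attach the configuration; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_bucket_replication(
        Bucket=source_bucket, ReplicationConfiguration=configuration
    )
```

The rule uses a prefix filter instead of a tag filter because delete marker replication is not supported for tag-based replication rules.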
Step 4: Set up S3 Lifecycle rule
Set up the S3 Lifecycle rules on the source bucket and the destination bucket. In this step we consider the following two scenarios:
Scenario 1: You want to apply this rule across all objects in the source and destination buckets, and there is no requirement to maintain non-current versions of the objects.
1. Create the S3 Lifecycle rule with the Apply to all objects in the bucket option selected, as shown in Figure 3.1.
2. Select Permanently delete noncurrent versions of objects and Delete expired object delete markers or incomplete multipart uploads from Lifecycle rule actions.
Figure 3.1: Enabling S3 Lifecycle configuration across all objects inside the bucket
3. Selecting Delete expired object delete markers or incomplete multipart uploads from the S3 Lifecycle rule actions provides multiple options, as shown in Figure 3.2, and you can choose Delete expired object delete markers.
4. You can set the number of days after objects become non-current to 1, and leave the number of newer versions to retain empty or 0 so that every non-current version is removed once that period passes. Note that your business may need to keep non-current versions for more than one day, and in that case you should configure the necessary number of days.
Figure 3.2: Selecting options to delete expired delete markers and creating the rule
Select Create rule to create this S3 Lifecycle policy that applies to all the objects in the bucket and removes older versions and delete markers.
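For reference, a rule like the one created in Scenario 1 could be expressed as the following Lifecycle configuration for the Boto3 `put_bucket_lifecycle_configuration` call. This is a sketch and not an export of the console rule: the rule ID and helper names are assumptions.

```python
def scenario_one_lifecycle(noncurrent_days=1):
    """One rule for the whole bucket: expire noncurrent versions and expired delete markers."""
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be a positive integer")
    rule = {
        "ID": "expire-noncurrent-and-delete-markers",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to all objects in the bucket
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
        "Expiration": {"ExpiredObjectDeleteMarker": True},
    }
    return {"Rules": [rule]}


def apply_lifecycle(s3_client, bucket, configuration):
    """Replace the bucket's Lifecycle configuration; s3_client is assumed to be a Boto3 S3 client."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )
```

Because this call replaces any existing Lifecycle configuration on the bucket, include every rule you want to keep. Passing a different `noncurrent_days` value for the source and destination buckets gives each Region its own retention period.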
Scenario 2: You want to apply this rule to only certain objects in the source and destination buckets, and there is a requirement to maintain a minimum number of non-current versions of an object.
1. Create an S3 Lifecycle configuration with the option Limit the scope of this rule using one or more filters selected, as shown in Figure 4.1.
2. Use either the Prefix or the Object tags option under the Filter type section. The Prefix option can be used if you want to apply this rule to all objects under a certain prefix. Figure 4.1 shows the Object tags option, which applies this rule only to the objects that are tagged with key = mark-for-delete and value = true. This is useful when you want to tag only selected objects for cleanup and keep the non-current versions of other objects intact.
Figure 4.1: Enabling the S3 Lifecycle rule that is limited only to the objects with the specified tag
3. Select Permanently delete noncurrent versions of objects under the Lifecycle rule actions section, as shown in Figure 4.2.
4. Specify the number of days after which the non-current versions of objects should be deleted. In Figure 4.2 we have used one day after the object becomes non-current.
Figure 4.2: Selecting options to remove non-current versions of objects and create the rule
5. Select Create rule to create this rule.
For this scenario we create another rule to clean up the delete markers and apply it across all the objects. Select the Apply to all objects in the bucket option as shown in Figure 4.3.
Figure 4.3: Creating a second rule that is applicable for all objects in the bucket
1. Select the Delete expired object delete markers or incomplete multipart uploads option under the Lifecycle rule actions section, as shown in Figure 4.4.
2. Select the Delete expired object delete markers option under the Delete expired object delete markers or incomplete multipart uploads section.
Figure 4.4: Select the options to delete the expired delete markers and create the rule
3. Select Create rule and you should see two rules: one to expire non-current versions of objects tagged with key = mark-for-delete and value = true, and another rule to delete the expired object delete markers.
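The two rules from Scenario 2 could be sketched as a single Lifecycle configuration, shown below in the shape that the Boto3 `put_bucket_lifecycle_configuration` call accepts. The rule IDs and the function name are hypothetical, and `versions_to_keep` is an optional parameter added here to illustrate retaining a minimum number of non-current versions.

```python
def scenario_two_lifecycle(noncurrent_days=1, versions_to_keep=None):
    """Two rules: a tag-scoped noncurrent expiration and a bucket-wide delete marker cleanup."""
    noncurrent = {"NoncurrentDays": noncurrent_days}
    if versions_to_keep is not None:
        # Number of newer noncurrent versions that are retained for each object.
        noncurrent["NewerNoncurrentVersions"] = versions_to_keep
    tagged_rule = {
        "ID": "expire-noncurrent-tagged-objects",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Tag": {"Key": "mark-for-delete", "Value": "true"}},
        "NoncurrentVersionExpiration": noncurrent,
    }
    marker_rule = {
        "ID": "remove-expired-delete-markers",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix: apply to all objects in the bucket
        "Expiration": {"ExpiredObjectDeleteMarker": True},
    }
    return {"Rules": [tagged_rule, marker_rule]}
```

The delete marker cleanup sits in its own bucket-wide rule because the expired object delete marker action is not available in rules that filter by object tags.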
Step 5: Test object deletion on the source bucket
To test these S3 Lifecycle rules, complete the following steps:
1. Upload a few files into the source bucket along with versions, as shown in Figure 5.1.
Figure 5.1: Source bucket with list of objects and their versions
2. Delete one of the objects with the show versions option turned off so that a delete marker is added instead of a specific version being permanently deleted, as shown in Figure 5.2.
Figure 5.2: Deleting the current version of one of the objects
3. Confirm the object deletion. This adds a delete marker, which becomes the current version of the object, as shown in Figure 5.3.
Figure 5.3: Confirmation for deleting the current version of one of the objects
4. Once you delete the object, you should see the delete marker replacing the current version, as shown in Figure 5.4.
Figure 5.4: Source bucket with the list of objects and their versions and delete marker
5. You should see all the objects along with the versions and delete markers replicated to the destination bucket, as shown in Figure 5.5.
Figure 5.5: Target bucket with the replicated objects along with versions and delete marker from the source bucket
After the S3 Lifecycle rules run, they expire the non-current versions and remove the delete marker from both the source and destination buckets. You can set different retention periods for the non-current versions and the delete markers in the source and destination buckets by configuring a different Lifecycle policy on each bucket.
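To check the result without the console, you could compare the delete markers in both buckets. The sketch below works on dictionaries shaped like the response of the Boto3 `list_object_versions` call. The helper names are hypothetical, and the sample listings are hand-written for illustration, not real output.

```python
def keys_hidden_by_delete_marker(listing):
    """Return the keys whose current version is a delete marker."""
    markers = listing.get("DeleteMarkers", [])
    return sorted(m["Key"] for m in markers if m.get("IsLatest"))


def delete_markers_in_sync(source_listing, destination_listing):
    """True when both buckets hide the same keys behind a delete marker."""
    return keys_hidden_by_delete_marker(source_listing) == keys_hidden_by_delete_marker(
        destination_listing
    )


# Hand-written sample listings (not real output); version IDs are placeholders.
source = {
    "Versions": [{"Key": "a.txt", "VersionId": "v1", "IsLatest": False}],
    "DeleteMarkers": [{"Key": "a.txt", "VersionId": "dm1", "IsLatest": True}],
}
destination_before_replication = {
    "Versions": [{"Key": "a.txt", "VersionId": "v1", "IsLatest": True}],
}
```

A single `list_object_versions` response holds at most 1,000 entries, so for larger buckets you would combine the pages before running this comparison.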
Cleaning up
If you followed along with this solution for testing purposes and want to avoid incurring future charges, then make sure to remove the S3 Replication configuration, suspend S3 Versioning (versioning cannot be disabled once it is enabled on a bucket), and delete the S3 object tags and S3 Lifecycle rules.
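The cleanup could be scripted along the following lines. This is a sketch: the helper names are assumptions, while the method names in the plan are Boto3 S3 client calls. Versioning is set to Suspended because a bucket cannot return to an unversioned state once versioning has been enabled.

```python
def cleanup_plan(bucket, tagged_keys=(), has_replication=True):
    """List the S3 client calls, as (method name, keyword arguments), that undo the setup."""
    plan = []
    if has_replication:
        # Only the source bucket carries the replication configuration.
        plan.append(("delete_bucket_replication", {"Bucket": bucket}))
    plan.append(("delete_bucket_lifecycle", {"Bucket": bucket}))
    plan.append(
        ("put_bucket_versioning",
         {"Bucket": bucket, "VersioningConfiguration": {"Status": "Suspended"}})
    )
    for key in tagged_keys:
        plan.append(("delete_object_tagging", {"Bucket": bucket, "Key": key}))
    return plan


def run_plan(s3_client, plan):
    """Execute each planned call; s3_client is assumed to be boto3.client('s3')."""
    for method_name, kwargs in plan:
        getattr(s3_client, method_name)(**kwargs)
```

Suspending versioning and removing the rules does not delete existing object versions, so any versions left in the test buckets continue to incur storage charges until you delete them.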
Conclusion
In this post, we take an approach to preserve, retrieve, and restore critical S3 data in a cost-effective way by using AWS features such as S3 Versioning, S3 Replication, object tagging, and S3 Lifecycle. The key steps involve enabling versioning on the source bucket, tagging objects, configuring replication to another AWS Region, and creating Lifecycle rules to manage non-current versions and delete markers.
By thoughtfully combining Amazon S3’s capabilities, you can achieve reliable and cost-optimized data protection that meets business continuity needs and compliance requirements. The core benefit is the ability to have objects replicated to lower-cost S3 storage classes while retaining previous versions and delete markers for a user-defined period. This allows for the cost-effective preservation, retrieval, and restoration of critical data to quickly recover from accidental deletions, malicious actions, or other data loss events.
We encourage you to use the Amazon S3 features outlined in this solution to safeguard your critical data in a cost-effective manner. By taking advantage of this approach, you can enhance your data protection and support business continuity, while also optimizing your cloud storage costs.