AWS Storage Blog
Reduce storage costs with fewer noncurrent versions using Amazon S3 Lifecycle
Keeping multiple copies and versions of data is a tried-and-true security and data protection protocol. In the event that one version is harmed or corrupted, another is ready as a backup. While increased security with multiple versions and copies is a plus, the added storage costs of (purposefully) redundant data must be carefully considered. It can often be difficult to manage the costs of such a versioning operation.
Amazon S3 Versioning protects your data from accidental overwrites or deletions by keeping multiple versions of an object in the same bucket, so an earlier version can be restored when something goes wrong. However, what if you are concerned about costs and decide not to use S3 Versioning?
In a bucket without versioning, uploading an object with a key that already exists overwrites the existing object. This keeps storage costs down, since S3 keeps only the latest object. However, in the case of human error, such as uploading a different file to a key that is already in use, the existing object is overwritten and lost. There is no way to recover permanently removed objects. That is why the Amazon S3 security best practices advise customers to enable versioning.
In this post, we show how you can get both the data protection of object versioning and manageable storage costs by using S3 Lifecycle configuration. We walk through how you can protect your data by maintaining object versioning while lowering storage costs by retaining fewer noncurrent versions, giving you both cost-efficiency and data security.
Find how much storage your noncurrent versions consume
Amazon S3 Storage Lens helps you identify, per bucket, how many bytes of noncurrent versions are adding to your storage costs. You can also check the actual noncurrent object versions by turning on Show versions in the Amazon S3 console. For a given key, every version other than the most recently modified one is a noncurrent version. In the following screenshot example, there is one current version and six noncurrent versions for the key images/product-detail.png.
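The same count can be sketched in code. The following is a minimal sketch, not the method S3 Storage Lens uses: it assumes version records shaped like the entries of the Versions list returned by the S3 ListObjectVersions API (with Key, IsLatest, and Size fields) and totals the noncurrent versions per key. The function name, the sample data, and the sizes are hypothetical and chosen only to mirror the seven-version example above.

```python
from collections import defaultdict


def summarize_noncurrent(versions):
    """Return {key: (noncurrent_count, noncurrent_bytes)}.

    Each record is a dict with the "Key", "IsLatest", and "Size" fields,
    as found in the "Versions" list of a ListObjectVersions response.
    """
    totals = defaultdict(lambda: [0, 0])
    for version in versions:
        if not version["IsLatest"]:  # every version except the current one
            totals[version["Key"]][0] += 1
            totals[version["Key"]][1] += version["Size"]
    return {key: tuple(counts) for key, counts in totals.items()}


# Hypothetical sample: one current (v7) and six noncurrent versions of one key.
sample_versions = [
    {
        "Key": "images/product-detail.png",
        "VersionId": f"v{i}",
        "IsLatest": i == 7,
        "Size": 1000 * i,  # made-up sizes in bytes
    }
    for i in range(1, 8)
]

summary = summarize_noncurrent(sample_versions)
for key, (count, size) in summary.items():
    print(f"{key}: {count} noncurrent versions, {size} bytes")

# With boto3 installed and credentials configured, real records could be
# collected like this (the bucket name is a placeholder):
#   import boto3
#   paginator = boto3.client("s3").get_paginator("list_object_versions")
#   for page in paginator.paginate(Bucket="amzn-s3-demo-bucket"):
#       records.extend(page.get("Versions", []))
```

This sketch ignores delete markers, which ListObjectVersions reports in a separate list and which do not carry object data.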
A versioning-enabled bucket keeps a separate version for every upload. Each object version might differ only slightly from the others, and in many cases retaining so many noncurrent versions is not required. For instance, with Amazon S3 File Gateway, a file change triggers an upload to S3, which creates another version. In such a scenario, numerous noncurrent versions may accumulate, causing unnecessary charges. To prevent this, you can create an S3 Lifecycle rule that keeps only the two newest noncurrent versions, thereby decreasing the cost of having too many noncurrent versions.
Create an S3 Lifecycle rule to keep a smaller number of object versions
Now let’s create an S3 Lifecycle rule to keep only a specific number of object versions. To save on storage costs, the rule deletes noncurrent versions one day after they become noncurrent. However, there are situations in which we need to fall back to a noncurrent version; for instance, if the latest version of the object is a corrupted file. Therefore, we make an exception and retain the two newest noncurrent versions for data safety.
We limit the scope of the rule to a specific prefix, images/, by using a prefix filter instead of applying the rule to the entire bucket.
Select Permanently delete noncurrent versions of objects in Lifecycle rule actions.
The next menu displays two fields.
- Days after objects become noncurrent – After the specified number of days since the object became noncurrent, it will be permanently removed.
- Number of newer versions to retain – The specified number of newer versions will be retained regardless of how many days they have been noncurrent.
To keep noncurrent versions in the bucket for the shortest possible time, enter “1” in Days after objects become noncurrent. To keep only the two newest noncurrent versions, enter “2” in Number of newer versions to retain. You can adjust these numbers to suit your use case.
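The same rule can also be expressed as a lifecycle configuration document instead of console fields. The following is a minimal sketch that builds such a document in Python, assuming the values from this walkthrough (prefix images/, 1 day, 2 newer versions). The field names NoncurrentVersionExpiration, NoncurrentDays, and NewerNoncurrentVersions come from the S3 lifecycle configuration API; the helper function name, the rule ID, and the bucket name in the comment are hypothetical.

```python
import json


def build_lifecycle_configuration(prefix, noncurrent_days, newer_noncurrent_versions):
    """Build a lifecycle configuration with one noncurrent-version expiration rule."""
    return {
        "Rules": [
            {
                "ID": "retain-newest-noncurrent-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # limits the rule to this prefix
                "NoncurrentVersionExpiration": {
                    # "Days after objects become noncurrent" in the console
                    "NoncurrentDays": noncurrent_days,
                    # "Number of newer versions to retain" in the console
                    "NewerNoncurrentVersions": newer_noncurrent_versions,
                },
            }
        ]
    }


config = build_lifecycle_configuration("images/", 1, 2)
print(json.dumps(config, indent=2))

# Applying it requires boto3 and AWS credentials, so it is left as a comment.
# The bucket name is a placeholder:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="amzn-s3-demo-bucket",
#       LifecycleConfiguration=config,
#   )
```

Note that put_bucket_lifecycle_configuration replaces the bucket's entire existing lifecycle configuration, so any rules already in place would need to be included in the same document.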
After you fill in these fields, a new screen displays what will happen to objects that match this lifecycle rule.
Since we specified only the noncurrent versions rule, the Noncurrent versions actions box on the right of the screen is the section we need to check. You can verify that only the two newest noncurrent versions will be retained, and all other noncurrent versions will be permanently deleted after one day of being noncurrent.
Once you have confirmed your settings, go ahead and create this rule by selecting Create rule.
What happens after the new S3 Lifecycle rule is created?
With the preceding configuration, S3 Lifecycle permanently deletes noncurrent versions once they have been noncurrent for the number of days specified in the Days after objects become noncurrent field, except for the newest noncurrent versions, up to the number specified in Number of newer versions to retain.
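To make the selection logic concrete, the following is a simplified sketch that models which versions of a single key such a rule would remove. It is an illustration under stated assumptions, not S3's implementation: it treats a version as becoming noncurrent at the moment the next version was uploaded, ignores delete markers, and ignores any rounding S3 applies to expiration times. The function name, version IDs, and dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def versions_to_expire(versions, now, noncurrent_days, newer_versions_to_retain):
    """Return IDs of noncurrent versions the rule would remove, newest first.

    `versions` is a list of (version_id, last_modified) tuples for one key.
    """
    # Newest first: index 0 is the current version and is never touched.
    ordered = sorted(versions, key=lambda v: v[1], reverse=True)
    expired = []
    for i in range(1, len(ordered)):
        version_id = ordered[i][0]
        # Assumption: a version became noncurrent when its successor arrived.
        became_noncurrent = ordered[i - 1][1]
        if i <= newer_versions_to_retain:
            continue  # among the newest noncurrent versions, always retained
        if now - became_noncurrent >= timedelta(days=noncurrent_days):
            expired.append(version_id)
    return expired


# Hypothetical data: seven versions uploaded on consecutive days.
uploads = [
    (f"v{day}", datetime(2024, 1, day, tzinfo=timezone.utc)) for day in range(1, 8)
]
now = datetime(2024, 1, 10, tzinfo=timezone.utc)

expired = versions_to_expire(uploads, now, noncurrent_days=1, newer_versions_to_retain=2)
print("Would be removed:", expired)
```

With this sample data, the current version and the two newest noncurrent versions remain, and the four older noncurrent versions are selected for removal.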
There could be a delay between when the lifecycle rule is satisfied and when the action for the rule is complete. Changes in billing are applied when the lifecycle rule is satisfied, even if the action hasn’t been completed.
Once the lifecycle rule action is completed, you can expect to see one current version and two noncurrent versions, unless a manual upload or removal has happened for this key in the meantime. You can verify what happened to the objects that matched the lifecycle rule by turning on Show versions in the S3 console.
The following screenshot shows that there is now only one current version and two noncurrent versions after the lifecycle rule completed. Without the lifecycle rule, you would pay for all the noncurrent versions that were retained.
In this post, we explained how you can minimize S3 storage costs while still ensuring data safety by maintaining object versioning. You can use S3 Storage Lens to find buckets that have a large percentage of noncurrent version bytes. For those buckets, you can create an S3 Lifecycle rule that permanently removes noncurrent versions, except for a specified number of the newest noncurrent versions. This helps you decrease storage costs and optimize spending by retaining only the noncurrent versions you need instead of storing all of them.
Thanks for reading about Amazon S3 storage cost optimization for versioning-enabled buckets. If you have any comments or questions, leave them in the comments section.