This Guidance shows how to automate the complex and repetitive tasks associated with deleting data stored in an Amazon S3 Glacier vault. It handles the entire process of emptying the vault of its archives: the solution downloads the S3 Glacier vault inventory, splits it into smaller chunks, and, for each chunk, submits multiple concurrent requests to delete all the archives in the list. Once all the archives have been successfully deleted, the S3 Glacier vault itself can be deleted through a separate process. This automated approach helps streamline the data deletion workflow, reduce the risk of human error, and ensure that S3 Glacier vaults are properly maintained on a regular basis.
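
The chunk-and-delete flow described above can be sketched locally in Python. This is an illustrative sketch only, not the Guidance's implementation, which orchestrates the work with Step Functions and Lambda. It assumes the inventory is in the JSON format returned by S3 Glacier, with archive entries under `ArchiveList`, each carrying an `ArchiveId`. The function names, the chunk size, and the worker count are hypothetical, and the actual delete call is passed in as a parameter so the logic can be exercised without AWS credentials.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def archive_ids_from_inventory(inventory_json: str) -> List[str]:
    """Extract archive IDs from a JSON-format S3 Glacier vault inventory."""
    inventory = json.loads(inventory_json)
    return [entry["ArchiveId"] for entry in inventory.get("ArchiveList", [])]


def split_into_chunks(items: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def delete_chunk(
    chunk: List[str],
    delete_archive: Callable[[str], None],
    max_workers: int = 8,  # hypothetical concurrency level
) -> List[str]:
    """Delete the archives in one chunk concurrently; return IDs that failed."""

    def attempt(archive_id: str):
        try:
            delete_archive(archive_id)
            return None
        except Exception:
            # Record the failure so the caller can retry or report it.
            return archive_id

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(attempt, chunk))
    return [archive_id for archive_id in results if archive_id is not None]


def empty_vault(
    inventory_json: str,
    delete_archive: Callable[[str], None],
    chunk_size: int = 1000,  # hypothetical chunk size
) -> List[str]:
    """Process the inventory chunk by chunk; return all archive IDs that failed."""
    failed: List[str] = []
    archive_ids = archive_ids_from_inventory(inventory_json)
    for chunk in split_into_chunks(archive_ids, chunk_size):
        failed.extend(delete_chunk(chunk, delete_archive))
    return failed
```

With boto3, `delete_archive` could be a small wrapper around the Glacier client's `delete_archive(accountId="-", vaultName=..., archiveId=...)` call. An empty list returned by `empty_vault` indicates that every delete request succeeded, at which point the vault itself can be deleted separately.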

Please note: see the Disclaimer section at the end of this page.

Architecture Diagram

[Architecture diagram description]

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • The detailed deployment and workflow status information provided by CloudFormation, Step Functions, Lambda, and Amazon SNS assists in the identification of potential issues and includes detailed messages to facilitate root cause analysis. For example, CloudFormation enables the deployment and visibility of all the created resources, allowing for the tracking of their deployment status. Step Functions provides visibility into the individual steps of the workflow, enabling the tracking of the Lambda function invocations and the overall performance and status of the process. Lambda also writes invocation and other operational events to Amazon CloudWatch Logs, while Amazon SNS informs the user through email of the workflow's status.

    Read the Operational Excellence whitepaper 
  • AWS Identity and Access Management (IAM) policies are scoped to provide the minimum permissions required by each component of this Guidance. IAM enables fine-grained access control to AWS resources and their associated actions. For instance, each Lambda function is granted only the permissions necessary to perform its designated task.

    Read the Security whitepaper 
  • The AWS services in this Guidance were selected for their ability to address its specific requirements, such as error handling, scalability, message delivery, and data storage. For example, Step Functions is used for its robust error handling, which lets the workflow manage throttling errors, AWS Software Development Kit (AWS SDK) errors, service errors, and timeout errors. Lambda functions are used for their scalability and high availability. In addition, Amazon SNS reliably delivers messages to the Lambda service, thereby invoking the necessary functions. Lastly, Amazon S3 provides high-performance, reliable, and durable storage.

    Read the Reliability whitepaper 
  • The services selected allow this Guidance to optimize its performance and operate at scale, using a purely serverless infrastructure. Specifically, the Step Functions distributed map orchestrates the parallel execution of tasks, such as Lambda functions. The AWS SDK running within the Lambda functions processes multiple API requests in parallel, while Amazon Athena queries and processes large amounts of data at scale using simple SQL queries.

    Additionally, Amazon S3 provides high-performance, scalable object storage to access the downloaded S3 Glacier vault inventory. The collective capabilities of these AWS services enable this Guidance to quickly and efficiently fulfill its intended function of emptying an S3 Glacier vault without the need to provision and manage large-scale Amazon Elastic Compute Cloud (Amazon EC2) instances or develop custom, complex scripts.

    Read the Performance Efficiency whitepaper 
  • Amazon S3 provides reliable and cost-effective storage for the S3 Glacier vault inventory, with the ability to enable lifecycle rules to expire unused data. Additionally, Step Functions offers a serverless and cost-effective workflow mechanism to orchestrate tasks, while Lambda provides scalable serverless compute.

    Athena enables querying and splitting of large data sets without expensive compute resources, and Amazon SNS publishes messages to subscribers in a cost-effective manner.

    Together, these AWS services deliver a comprehensive, serverless framework for managing the cost-effective storage, workflow orchestration, compute scaling, and data processing required to efficiently empty and delete S3 Glacier vaults.

    Read the Cost Optimization whitepaper 
  • The combined use of Amazon S3, Lambda, Amazon SNS, and Step Functions allows this Guidance, when configured, to sustainably provide data lifecycle management, serverless orchestration, message delivery, and compute resources to power the workflow. Specifically, the Amazon S3 lifecycle management capability automatically expires data that is no longer needed. Lambda is an event-driven compute service that is provisioned and allocated only when required, which optimizes energy usage. Together, Lambda and Step Functions provide serverless orchestration and compute resources to run code on demand in a sustainable manner. Finally, Amazon SNS provides a serverless messaging service to facilitate communication between applications and their subscribers.

    Read the Sustainability whitepaper 
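
The Reliability notes above mention that the workflow manages throttling and timeout errors. The general idea of retrying with exponential backoff can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Guidance's code: `ThrottlingError` is a hypothetical stand-in for a throttling response, and `call_with_backoff` and its parameters are invented for this example. The parameters are loosely modeled on the `IntervalSeconds` and `BackoffRate` fields of a Step Functions `Retry` block, except that `max_attempts` here counts total attempts rather than retries.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling response from a service."""


def call_with_backoff(
    operation: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...] = (ThrottlingError, TimeoutError),
    max_attempts: int = 5,       # total attempts, including the first call
    base_delay: float = 1.0,     # seconds to wait before the first retry
    backoff_rate: float = 2.0,   # multiplier applied to the delay after each retry
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying retryable errors with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            sleep(base_delay * backoff_rate ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Errors that are not listed in `retryable` propagate immediately, which matches the usual practice of retrying only transient failures.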

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
