Q: Why is Amazon Glacier now called Amazon S3 Glacier?
Customers have long thought of Amazon Glacier, our backup and archival storage service, as a storage class of Amazon S3. In fact, a very high percentage of the data stored in Amazon Glacier today comes directly from customers using S3 Lifecycle policies to move cooler data into Amazon Glacier. Now, Amazon Glacier is officially part of S3 and will be known as Amazon S3 Glacier (S3 Glacier). All of the existing Glacier direct APIs continue to work just as they have, but we’ve now made it even easier to use the S3 APIs to store data in the S3 Glacier storage class.
Q: What is Amazon S3 Glacier?
Amazon S3 Glacier is an extremely low-cost storage service that provides secure, durable, and flexible storage for data backup and archival. With Amazon S3 Glacier, customers can reliably store their data for as little as $0.004 per gigabyte per month. Amazon S3 Glacier enables customers to offload the administrative burdens of operating and scaling storage to AWS, so that they don’t have to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and repair, or time-consuming hardware migrations.
Q: How can businesses, government and other organizations benefit from Amazon S3 Glacier?
Amazon S3 Glacier enables any business or organization to easily and cost effectively retain data for months, years, or decades. With Amazon S3 Glacier, customers can now cost effectively retain more of their data for future analysis or reference, and they can focus on their business rather than operating and maintaining their storage infrastructure. Customers seeking compliance storage can deploy compliance controls using Vault Lock to meet regulatory and compliance archiving requirements.
Q: How should I choose between Amazon S3 Glacier and Amazon Simple Storage Service (Amazon S3)?
Amazon S3 is a durable, secure, simple, and fast storage service designed to make web-scale computing easier for developers. Use Amazon S3 if you need low latency or frequent access to your data. Use Amazon S3 Glacier if low storage cost is paramount, and you do not require millisecond access to your data.
Q: What kind of data can I store?
You can store virtually any kind of data in any format. You can also deploy compliance storage controls with Vault Lock to store regulatory and compliance archives in an immutable, Write Once Read Many (WORM) format. Please refer to the Amazon Web Services Licensing Agreement for details.
Q: What does Amazon do with my data in Amazon S3 Glacier?
Amazon will store your data and track its associated usage for billing purposes. Amazon will not otherwise access your data for any purpose outside of the Amazon S3 Glacier offering, except if required to do so by law. Please refer to the Amazon Web Services Licensing Agreement for details.
Q: How do I use Amazon S3 Glacier?
Amazon S3 now supports four new features to reduce your storage costs by making it even easier to build archival applications using the Amazon S3 Glacier storage class and by enabling one-click data replication to S3 Glacier in another AWS Region. S3 PUT to Glacier, S3 Cross-Region Replication to Glacier, S3 Restore Notifications, and S3 Restore Speed Upgrade are available using the S3 APIs, AWS Software Development Kits (SDKs), and AWS Management Console for simpler integration with your archival workloads and applications. To learn more, visit our Amazon S3 Developer Guide.
Amazon S3 Glacier provides a simple, standards-based REST web services interface as well as Java and .NET SDKs. The AWS Management Console can be used to quickly set up Amazon S3 Glacier. Data can then be uploaded and retrieved programmatically. View our documentation for more information on the Glacier direct APIs and SDKs.
Q: How durable is Amazon S3 Glacier?
Amazon S3 Glacier is designed to provide average annual durability of 99.999999999% for an archive. The service redundantly stores data in multiple facilities and on multiple devices within each facility. To increase durability, Amazon S3 Glacier synchronously stores your data across multiple facilities before returning SUCCESS on uploading archives. S3 Glacier performs regular, systematic data integrity checks and is built to be automatically self-healing.
Q: How reliable is Amazon S3 Glacier?
Amazon S3 Glacier gives any developer access to the same highly scalable, highly available, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. Amazon S3 Glacier is designed for 99.99% availability and is backed by the Amazon S3 Service Level Agreement.
Q: What is the backend infrastructure supporting the S3 Glacier storage class?
In general, AWS does not disclose the backend infrastructure and architecture for our compute, networking, and storage services, as we are more focused on the customer outcomes of performance, durability, availability, and security. However, this question is often asked by our customers. We use a number of different technologies which allow us to offer the prices we do to our customers. Our services are built using common data storage technologies specifically assembled into purpose-built, cost-optimized systems using AWS-developed software. S3 Glacier benefits from our ability to optimize the sequence of inputs and outputs to maximize efficiency accessing the underlying storage.
Q: What is S3 Glacier Deep Archive?
S3 Glacier Deep Archive is a new Amazon S3 storage class that provides secure and durable object storage for long-term retention of data that is accessed once or twice in a year. From just $0.00099 per GB-month (less than one-tenth of one cent, or about $1 per TB-month), S3 Glacier Deep Archive offers the lowest cost storage in the cloud, at prices significantly lower than storing and maintaining data in on-premises magnetic tape libraries or archiving data off-site. To learn more about S3 Glacier Deep Archive, visit Amazon S3 FAQs.
Q: How is data within Amazon S3 Glacier organized?
You store data in Amazon S3 Glacier as an archive. Each archive is assigned a unique archive ID that can later be used to retrieve the data. An archive can represent a single file or you may choose to combine several files to be uploaded as a single archive. You upload archives into vaults. Vaults are collections of archives that you use to organize your data.
Q: How much data can I store?
There is no maximum limit to the total amount of data that can be stored in Amazon S3 Glacier. Individual archives are limited to a maximum size of 40 terabytes.
Q: What is the minimum amount of data that I can store using Amazon S3 Glacier?
There is no minimum limit to the amount of data that can be stored in Amazon S3 Glacier and individual archives can be from 1 byte to 40 terabytes.
Q: How much does Amazon S3 Glacier cost?
With Amazon S3 Glacier, storage is priced from $0.004 per gigabyte per month, and you pay for what you use. There are no setup fees, and for most archive use cases your total costs will primarily be made up of your storage cost.
Upload requests are priced from $0.05 per 1,000 requests. In addition, archives stored in S3 Glacier have a minimum 90 days of storage, and archives deleted before 90 days incur a pro-rated charge equal to the storage charge for the remaining days. As Amazon S3 Glacier is designed to store data that is infrequently accessed and long lived, these charges will likely not apply to most of you.
We charge less where our costs are less. Some prices vary across Amazon S3 Glacier Regions and are based on the location of your vault. There is no Data Transfer charge for data transferred between Amazon EC2 and Amazon S3 Glacier within the same Region. Data transferred between Amazon EC2 and Amazon S3 Glacier across all other Regions (e.g. between the Amazon EC2 US West (Northern California) and Amazon S3 Glacier US East (Northern Virginia) Regions) will be charged at Internet Data Transfer rates on both sides of the transfer.
To learn more about Amazon S3 Glacier pricing, please visit the Amazon S3 Glacier pricing page.
Q: How is my storage charge calculated?
The volume of storage billed in a month is based on the average storage used throughout the month, measured in gigabyte-months (GB-Months). The size of each of your archives is calculated as the amount of data you upload plus an additional 32 kilobytes of data for indexing and metadata (e.g. your archive description). This extra data is necessary to identify and retrieve your archive. Here is an example of how to calculate your storage costs using US East (Northern Virginia) Region pricing:
If you upload 100,000 archives that are 1 gigabyte each, your total storage would be:
1.000032 gigabytes for each archive x 100,000 archives = 100,003.20 gigabytes
If you stored the archives for 1 month, you would be charged:
100,003.20 GB-Months x $0.004 = $400.01
If you upload 200,000 archives that are 0.5 gigabytes each, your total storage would be:
0.500032 gigabytes for each archive x 200,000 archives = 100,006.40 gigabytes
If you stored the archives for 1 month, you would be charged:
100,006.40 GB-Months x $0.004 = $400.03
Your storage is measured in “TimedStorage-ByteHrs,” which are added up at the end of the month to generate your monthly charges. For example, if you store an archive that is 1 gigabyte (inclusive of the 32 kilobyte overhead) for one day in the US East (Northern Virginia) Region, your storage usage would be:
1,073,741,824 bytes x 1 day x 24 hours = 25,769,803,776 Byte-Hours
Converting this to GB-Months (assuming a 30 day month) gives:
25,769,803,776 Byte-Hours x (1 GB / 1,073,741,824 bytes) x (1 month / 720 hours) = 0.03 GB-Months
So your storage charge for that day would be:
0.03 GB-Months x $0.004 = $0.00012
To learn more about Amazon S3 Glacier pricing and view prices for other regions, please visit the Amazon S3 Glacier pricing page.
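The storage arithmetic above can be reproduced with a short Python sketch. The $0.004 rate, the 32 kilobyte per-archive overhead (written as 0.000032 GB, as in the worked example), and the 720-hour month all come from this FAQ; the function and constant names are illustrative only and are not part of any AWS API.

```python
# Sketch of the storage-charge arithmetic from the worked examples above.
# All names here are hypothetical; only the numbers come from the FAQ.

PRICE_PER_GB_MONTH = 0.004           # US East (Northern Virginia) rate quoted above
OVERHEAD_GB_PER_ARCHIVE = 0.000032   # 32 KB of index/metadata, as written in the example
BYTES_PER_GB = 1_073_741_824         # conversion used in the Byte-Hours example


def billed_gigabytes(archive_count: int, archive_size_gb: float) -> float:
    """Total billable storage, including the per-archive overhead."""
    return archive_count * (archive_size_gb + OVERHEAD_GB_PER_ARCHIVE)


def storage_charge(archive_count: int, archive_size_gb: float, months: float = 1.0) -> float:
    """Storage charge in dollars for keeping the archives for `months` months."""
    return billed_gigabytes(archive_count, archive_size_gb) * months * PRICE_PER_GB_MONTH


def byte_hours_to_gb_months(byte_hours: int, hours_per_month: int = 720) -> float:
    """Convert metered Byte-Hours into GB-Months, assuming a 30-day month."""
    return byte_hours / BYTES_PER_GB / hours_per_month


if __name__ == "__main__":
    print(round(storage_charge(100_000, 1.0), 2))   # 400.01
    print(round(storage_charge(200_000, 0.5), 2))   # 400.03
    print(byte_hours_to_gb_months(25_769_803_776))  # one GB stored for one day
```

Note that the last conversion yields about 0.0333 GB-Months before rounding; the example above rounds this to 0.03 before multiplying by the rate.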
Q: Why do prices vary depending on which Amazon S3 Glacier Region I choose?
We charge less where our costs are less. For example, our costs are lower in the US East (Northern Virginia) Region than in the US West (Northern California) Region.
Q: How will I be charged and billed for my use of Amazon S3 Glacier?
There are no setup fees to begin using the service. At the end of the month, your credit card will automatically be charged for that month’s usage. You can view your charges for the current billing period at any time on the Amazon Web Services web site, by logging into your Amazon Web Services account, and clicking “Account Activity” under “Your Web Services Account”.
Q: How much data can I retrieve for free?
Amazon S3 Glacier offers a 10 GB retrieval free tier. You can retrieve 10 GB of your Amazon S3 Glacier data per month for free. The free tier allowance can be used at any time during the month and applies to Standard retrievals.
Q: How much does it cost to retrieve data from Amazon S3 Glacier?
There are three ways to retrieve data from Amazon S3 Glacier and each has a different per-GB retrieval fee and per-archive request fee (i.e. requesting one archive counts as one request). Expedited retrievals cost $0.03 per GB and $0.01 per request. Standard retrievals cost $0.01 per GB and $0.05 per 1,000 requests. Bulk retrievals cost $0.0025 per GB and $0.025 per 1,000 requests.
For example, using Expedited retrievals, if you requested 10 archives with a size of 1 GB each, the cost would be 10 x $0.03 + 10 x $0.01 = $0.40.
If you were using Standard retrievals to retrieve 500 archives that were 1 GB each, the cost would be 500GB x $0.01 + 500 x $0.05/1,000 = $5.025.
Lastly, using Bulk retrievals, if you were to retrieve 500 archives that are 1 GB each, the cost would be 500GB x $0.0025 + 500 x $0.025/1,000 = $1.2625.
To learn more about Amazon S3 Glacier pricing, please visit the Amazon S3 Glacier pricing page.
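As a rough illustration, the three retrieval examples above reduce to one formula: gigabytes times the per-GB fee plus requests times the per-request fee. The sketch below encodes the prices quoted in this answer; the table and function names are made up for this example.

```python
# Retrieval-fee sketch using the per-GB and per-request prices quoted above.
# RETRIEVAL_FEES and retrieval_cost are illustrative names, not an AWS API.

RETRIEVAL_FEES = {
    # tier: (dollars per GB, dollars per request)
    "Expedited": (0.03, 0.01),
    "Standard": (0.01, 0.05 / 1000),
    "Bulk": (0.0025, 0.025 / 1000),
}


def retrieval_cost(tier: str, gigabytes: float, requests: int) -> float:
    """Cost in dollars of retrieving `gigabytes` across `requests` archive requests."""
    per_gb, per_request = RETRIEVAL_FEES[tier]
    return gigabytes * per_gb + requests * per_request


if __name__ == "__main__":
    print(round(retrieval_cost("Expedited", 10, 10), 4))   # 0.4
    print(round(retrieval_cost("Standard", 500, 500), 4))  # 5.025
    print(round(retrieval_cost("Bulk", 500, 500), 4))      # 1.2625
```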
Q: How will I be charged when retrieving only a range of an archive?
Range retrievals are priced in precisely the same way as regular retrievals from Amazon S3 Glacier. You are charged a per-GB fee for only the amount of data retrieved in the range you specify.
Q: How will I be charged for deleting data that is less than 3 months old?
Amazon S3 Glacier is designed for use cases where data is retained for months, years, or decades. Deleting data from Amazon S3 Glacier is free if the archive being deleted has been stored for three months or longer. If an archive is deleted within three months of being uploaded, you will be charged an early deletion fee. In the US East (Northern Virginia) Region, you would be charged a prorated early deletion fee of $0.012 per gigabyte deleted within three months. So if you deleted 1 gigabyte of data 1 month after uploading it, you would be charged a $0.008 early deletion fee. If, instead you deleted 1 gigabyte after 2 months, you would be charged a $0.004 early deletion fee.
To view prices for other regions, please visit the Amazon S3 Glacier pricing page.
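The early deletion example can be read as "pay the storage rate for the months remaining in the three-month minimum". A minimal sketch of that reading follows, assuming whole months and the $0.004 rate quoted above; the function name and parameters are hypothetical.

```python
# Prorated early-deletion fee, following the month-based example above.
# Assumes the fee equals the storage rate for the months left in the minimum.

def early_deletion_fee(gigabytes: float, months_stored: float,
                       price_per_gb_month: float = 0.004,
                       minimum_months: float = 3) -> float:
    """Fee in dollars for deleting data before the minimum storage period ends."""
    remaining_months = max(0.0, minimum_months - months_stored)
    return gigabytes * remaining_months * price_per_gb_month


if __name__ == "__main__":
    print(round(early_deletion_fee(1, 1), 4))  # 0.008
    print(round(early_deletion_fee(1, 2), 4))  # 0.004
    print(round(early_deletion_fee(1, 3), 4))  # 0.0
```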
Q: What can I expect the total cost of ownership (TCO) to be?
Amazon S3 Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. Customers can reliably store large or small amounts of data for as little as $0.004 per gigabyte per month, a significant savings compared to on-premises solutions. To keep costs low yet suitable for varying retrieval needs, Amazon S3 Glacier provides three options for access to archives, from a few minutes to several hours. Your total cost of ownership (TCO) for your Amazon S3 Glacier storage will depend on your data access patterns. Below are several examples illustrating different use cases ranging from deep archives that are never retrieved to active workloads where large portions of data are accessed.
TCO example 1: Let’s assume that you upload 1 PB of data into Amazon S3 Glacier, that the average archive size is 1 GB and that you never retrieve any data. When you first upload the 1 PB, there are upload request fees of 1,048,576 GB x $0.05 / 1,000 = $52.43. Then the ongoing storage costs are 1,048,576 GB x $0.004 = $4,194.30 per month, or $50,331.65 per year.
TCO example 2: Now let’s assume the same storage as example 1 and also assume that you retrieve 3 TB (3,072 GB) a day on average using Bulk retrievals and that the average archive size was 1 GB for a total of 3,072 archives. That’s 90 TB retrieved per month or 8.8% of your data per month. The total retrieval fees per day would be 3,072 x $0.0025 + 3,072 x $0.025 / 1,000 = $7.76, which equates to $232.70 per month and $2,792.45 per year. Adding storage costs, your annual TCO is $50,331.65 + $2,792.45 = $53,124.10. In this example, retrieval fees make up just 5.3% of your total Glacier fees. Your total monthly cost per GB stored including retrieval fees is $0.004222/GB.
TCO example 3: Now let’s assume the same storage as example 1 and also assume that you retrieve 1 TB (1,024 GB) a day on average using Standard retrievals and that occasionally you use Expedited retrievals for urgent requests, averaging 10 GB per day. Here, we assume the average archive size is 1 GB. That’s 30.3 TB per month or 3% of your data per month. The total retrieval fees per day would be (1,024 x $0.01 + 1,024 x $0.05 / 1,000) + (10 x $0.03 + 10 x $0.01) = $10.69, which equates to $320.74 per month and $3,848.83 per year. Adding storage costs, your annual TCO is $50,331.65 + $3,848.83 = $54,180.48. In this example, retrieval fees make up just 7.1% of your total Glacier fees. Your total monthly cost per GB stored including retrieval fees is $0.0043/GB.
To learn more about Amazon S3 Glacier pricing, please visit the Amazon S3 Glacier pricing page.
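The TCO examples above follow one pattern: twelve months of storage plus daily retrieval fees scaled to a year of twelve 30-day months. The sketch below reproduces that pattern under the same assumptions; like the examples, it leaves the one-time upload request fee out of the annual figure. All names are illustrative.

```python
# Annual TCO sketch mirroring the examples above (30-day months, 12 months).
# Prices are the ones quoted in this FAQ; names are hypothetical.

STORAGE_RATE = 0.004  # dollars per GB-month

TIER_FEES = {
    # tier: (dollars per GB, dollars per request)
    "Expedited": (0.03, 0.01),
    "Standard": (0.01, 0.05 / 1000),
    "Bulk": (0.0025, 0.025 / 1000),
}


def annual_tco(stored_gb, daily_retrievals, days_per_month=30):
    """Return yearly storage, retrieval, and total cost in dollars.

    daily_retrievals is a list of (tier, gigabytes, requests) tuples
    describing an average day.
    """
    storage = stored_gb * STORAGE_RATE * 12
    per_day = sum(gb * TIER_FEES[tier][0] + requests * TIER_FEES[tier][1]
                  for tier, gb, requests in daily_retrievals)
    retrieval = per_day * days_per_month * 12
    total = storage + retrieval
    return {
        "storage": storage,
        "retrieval": retrieval,
        "total": total,
        "retrieval_share": retrieval / total,
    }


if __name__ == "__main__":
    one_pb_in_gb = 1_048_576
    never_retrieved = annual_tco(one_pb_in_gb, [])
    bulk_daily = annual_tco(one_pb_in_gb, [("Bulk", 3072, 3072)])
    mixed_daily = annual_tco(one_pb_in_gb, [("Standard", 1024, 1024),
                                            ("Expedited", 10, 10)])
    for result in (never_retrieved, bulk_daily, mixed_daily):
        print(round(result["total"], 2), f"{result['retrieval_share']:.1%}")
```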
Q: How will multipart upload requests to the S3 Glacier storage class appear on my bill?
For Initiate multipart and Upload part, you will be charged at S3 Standard PUT and POST request rates. For Complete multipart, you will be charged the S3 Glacier PUT and POST request rate.
Q: How will in-progress multipart uploads to the S3 Glacier storage class appear on my bill?
In-progress multipart parts for a PUT to the S3 Glacier storage class are billed as S3 Glacier Staging Storage at S3 Standard storage rates until the upload completes. Deleted in-progress multipart parts will not be subject to an S3 Glacier early delete fee. The 90 day early-delete window starts from the time the multipart upload is completed.
Q: How do I control access to my data?
By default, only you can access your data. In addition, you can control access to your data in Amazon S3 Glacier by using the AWS Identity and Access Management (AWS IAM) service. You simply set up an AWS IAM policy that specifies which users within an account have rights to operations on a given vault.
Q: Is my data encrypted?
Yes, all data in the service will be encrypted on the server side. Amazon S3 Glacier handles key management and key protection for you. Amazon S3 Glacier uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256). 256-bit is the largest key size defined for AES. Customers wishing to manage their own keys can encrypt data prior to uploading it.
Q: Does Amazon S3 Glacier support IAM permissions?
Yes, S3 Glacier supports API-level permissions through AWS Identity and Access Management (IAM) service integration.
For more information about IAM, refer to the AWS Identity and Access Management documentation.
Archives and vaults
Q: What is an archive?
An archive is a durably stored block of information. You store your data in Amazon S3 Glacier as archives. You may upload a single file as an archive, but your costs will be lower if you aggregate your data. TAR and ZIP are common formats that customers use to aggregate multiple files into a single file before uploading to Amazon S3 Glacier. The total volume of data and number of archives you can store are unlimited. Individual Amazon S3 Glacier archives can range in size from 1 byte to 40 terabytes. The largest archive that can be uploaded in a single Upload request is 4 gigabytes. For items larger than 100 megabytes, customers should consider using the Multipart upload capability. Archives stored in Amazon S3 Glacier are immutable, i.e. archives can be uploaded and deleted but cannot be edited or overwritten.
Q: How do I delete archives?
You can delete an archive at any time. You will stop being billed for your archive when your delete request succeeds at which point the archive itself will be inaccessible. Archives that are deleted within 3 months of being uploaded will be charged a deletion fee (see billing section for more details).
Q: How do I upload large archives?
When uploading large archives (100MB or larger), you can use multi-part upload to achieve higher throughput and reliability. Multi-part uploads allow you to break your large archive into smaller chunks that are uploaded individually. Once all the constituent parts are successfully uploaded, they are combined into a single archive.
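To make the chunking idea concrete, here is a small sketch that splits an archive of a given size into inclusive byte ranges, one per part. The 128 MiB default part size is an arbitrary choice for illustration, and the function name is hypothetical. The Glacier API places its own constraints on part size and part count, which this sketch does not enforce, so check the API documentation before choosing a value.

```python
# Split an archive into per-part byte ranges for a multi-part upload.
# This only computes offsets; it does not call any AWS API.

MIB = 1024 * 1024


def part_ranges(archive_size: int, part_size: int = 128 * MIB):
    """Return a list of inclusive (start, end) byte offsets, one per part."""
    if archive_size <= 0:
        raise ValueError("archive_size must be positive")
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    start = 0
    while start < archive_size:
        # The final part may be shorter than part_size.
        end = min(start + part_size, archive_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    for start, end in part_ranges(300 * MIB):
        print(f"bytes {start}-{end}")
```

Each tuple corresponds to one part that can be uploaded, and retried, independently of the others.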
Q: What is a vault?
A vault is a way to group archives together in Amazon S3 Glacier. You organize your data in Amazon S3 Glacier using vaults. Each archive is stored in a vault of your choice. You may control access to your data by setting vault-level access policies using the AWS Identity and Access Management (IAM) service. You can also attach notification policies to your vaults. These enable you or your application to be notified when data that you have requested for retrieval is ready for download. To learn more, see the documentation on setting up notifications using the Amazon Simple Notification Service (Amazon SNS).
Q: How many vaults can I create?
You can create up to 1,000 vaults per account per region.
Q: How do I effectively manage my Amazon S3 Glacier vaults?
Amazon S3 Glacier allows you to tag your Glacier vaults for easier resource and cost management. Tags are labels that you can define and associate with your vaults, and using tags adds filtering capabilities to operations such as AWS cost reports. For example, you can use tags to allocate S3 Glacier costs and usage across multiple departments in your organization or by any other categorization. You can tag your vaults by using the S3 Glacier Console or the Amazon Glacier direct APIs. For more information see Tagging Your Amazon S3 Glacier Vaults.
Q: How do I delete a vault?
You may delete any S3 Glacier vault that does not contain any archives using the AWS Management Console, the Amazon Glacier direct APIs or the SDKs. Once a vault has been deleted, you can then re-create a vault with the same name. If your vault contains archives, you must delete all the archives before deleting the vault.
Vault access policies
Q: What is a vault access policy?
A vault access policy is a resource-based policy that you can attach directly to your S3 Glacier vault (the resource) to specify who has access to the vault and what actions they can perform on it. To learn more please read Managing Vault Access Policies in the Amazon S3 Glacier developer’s guide.
Q: How are vault access policies different from access control based on AWS Identity and Access Management (IAM) policies?
Access permissions can be assigned in two ways: as user-based permissions or as resource-based permissions. Access control based on IAM policies is user-based where you would assign IAM policies to IAM users or groups to control the read, write, and delete permissions on your S3 Glacier vaults. Access control with vault access policies is resource-based where you would attach an access policy directly on a vault to govern access to all users. Vault access policies can make certain use cases simpler. For example, to protect information in a business-critical vault from unintended deletion, you can create a vault access policy that denies delete attempts from all users. This data protection procedure can be accomplished in a matter of minutes in the AWS Management Console without having to audit and revoke delete permissions assigned to users through IAM policies.
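As an illustration of the deny-delete use case described above, the sketch below builds a vault access policy document in the IAM policy language and prints it as JSON. The vault ARN, account ID, and statement ID are placeholders invented for this example, and the choice of actions to deny is an assumption about what "delete attempts" should cover; adapt both before use.

```python
# Build a vault access policy that denies delete actions for every principal.
# The ARN below is a placeholder, not a real vault.
import json


def deny_delete_policy(vault_arn: str) -> str:
    """Return a JSON policy document that denies deletes on the given vault."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-all-deletes",        # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",                 # applies to all users
                "Action": ["glacier:DeleteArchive", "glacier:DeleteVault"],
                "Resource": [vault_arn],
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    example_arn = "arn:aws:glacier:us-east-1:123456789012:vaults/examplevault"
    print(deny_delete_policy(example_arn))
```

Because an explicit deny applies to every principal, including the vault owner, the policy has to be edited or removed before anyone can delete archives again.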
Q: Can I use vault access policies to manage cross-account access?
Yes you can. For example, you can grant read-only access on your vault to a business partner in a different AWS account by simply adding that account to the vault’s access policy and specifying that only read activities are allowed.
Q: How does billing work in a cross-account access scenario?
The vault owner’s account will be billed for the charges incurred during cross-account access. For example, Alice (account A) grants Bob (account B) access to Alice’s “movies” vault and allows Bob to upload data. After Bob makes 1000 requests to upload 1GB of data, Alice’s account (account A) will be billed for the 1000 requests as well as the 1GB of data until the data is deleted. Bob’s account (account B) will not incur these charges.
Q: What is Vault Lock?
Vault Lock allows you to easily deploy and enforce compliance controls on individual S3 Glacier vaults via a lockable policy (Vault Lock policy). Once locked, the Vault Lock policy becomes immutable and S3 Glacier will enforce the prescribed controls to help achieve your compliance objectives. To learn more, please read Amazon S3 Glacier Vault Lock in the Amazon S3 Glacier developer’s guide.
Q: What type of compliance controls can I deploy with Vault Lock?
You can deploy a variety of compliance controls in a Vault Lock policy using the AWS Identity and Access Management (IAM) policy language. For example, you can easily set up “Write Once Read Many” (WORM) or time-based records retention for regulatory archives. To learn more, please read Amazon S3 Glacier Vault Lock in the Amazon S3 Glacier developer’s guide.
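A time-based retention control of the kind mentioned above might look like the following sketch, which denies archive deletion until an archive reaches a given age. It assumes the `glacier:ArchiveAgeInDays` condition key from the Glacier policy language; the ARN, statement ID, and 365-day period are placeholders chosen for illustration.

```python
# Build a Vault Lock policy document for time-based retention.
# Placeholder values only; review against the developer guide before locking,
# because a locked policy can no longer be changed.
import json


def retention_lock_policy(vault_arn: str, retention_days: int) -> str:
    """Deny DeleteArchive for archives younger than `retention_days`."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention-ends",  # hypothetical ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(retention_lock_policy(
        "arn:aws:glacier:us-east-1:123456789012:vaults/examplevault", 365))
```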
Q: How does Vault Lock enforce my compliance controls?
Vault Lock enforces your compliance controls via a lockable policy (Vault Lock policy). Once locked, the Vault Lock policy becomes immutable and S3 Glacier will only allow operations on your data that are explicitly permitted by the compliance controls you specified. Vault Lock also ensures that a locked policy cannot be deleted or altered until there are no more archives to protect in the vault. Learn more about Locking a Vault for compliance in the Amazon S3 Glacier developer’s guide.
Q: How is a Vault Lock policy different than a vault access policy?
Both policies govern access controls to your vault; however, a Vault Lock policy can be made immutable and provides strong enforcement for your compliance controls. You can use the Vault Lock policy to deploy regulatory and compliance controls that are typically restrictive and are “set and forget” in nature. In conjunction, you can use the vault access policy to implement access controls that are not compliance related, temporary, and subject to frequent modification. The two policies can be used in tandem to achieve governance and flexibility.
Q: What AWS electronic storage services have been assessed based on financial services regulations?
For customers in the financial services industry, Vault Lock provides added support for broker-dealers who must retain records in a non-erasable and non-rewritable format to satisfy regulatory requirements of SEC Rule 17a-4(f), FINRA Rule 4511, or CFTC Regulation 1.31. You can easily designate the records retention time frame to retain regulatory archives in the original form for the required duration, and also place legal holds to retain data indefinitely until the hold is removed.
Q: What AWS documentation supports the SEC 17a-4(f)(2)(i) and CFTC 1.31(c) requirement for notifying my regulator?
Provide notification to your regulator or “Designated Examining Authority (DEA)” of your choice to use Amazon S3 Glacier for electronic storage along with a copy of the Cohasset Assessment. For the purposes of these requirements, AWS is not a designated third party (D3P). Be sure to select a D3P and include this information in your notification to your DEA.
Q: What other controls can be applied with Amazon S3 Glacier Vault Lock?
In certain situations, you may be faced with the need to place a legal hold on your compliance archives for an indefinite period of time. A legal hold can be initiated on an S3 Glacier vault by creating a vault access policy that denies the use of Glacier’s Delete functions if the vault is tagged in a particular way. In addition to time-based retention and legal hold, Glacier Vault Lock can be used to implement a variety of compliance controls which can be made immutable for strong governance, such as enforcing multi-factor authentication on all data access/read activities to a vault with classified information.
Q: How can I retrieve data from the service?
When you make a request to retrieve data from S3 Glacier, you initiate a retrieval job for an archive. Once the retrieval job completes, your data will be available for 24 hours to download or to access using Amazon Elastic Compute Cloud (Amazon EC2). There are three options for retrieving data with varying access times and cost: Expedited, Standard, and Bulk retrievals.
Q: What are Standard retrievals?
Standard retrievals allow you to access any of your archives within several hours. Standard retrievals typically complete within 3 – 5 hours.
Q: How do I use Standard retrievals?
To make a Standard retrieval, set the “Tier” parameter in the InitiateJob API request to “Standard”. If no tier is specified, the request will default to Standard.
Q: How much do Standard retrievals cost?
Standard retrievals are priced at a flat rate of $0.01 per GB and $0.05 per 1,000 requests. For example, retrieving 500 archives that are 1 GB each would cost 500GB x $0.01 + 500 x $0.05/1,000 = $5.025.
Q: When should I use Standard retrievals?
Standard retrievals are a low-cost way to access your data within just a few hours. For example, you can use Standard retrievals to restore backup data, retrieve archived media content for same-day editing or distribution, or pull and analyze logs to drive business decisions within hours.
Q: What are Bulk retrievals?
Bulk retrievals are S3 Glacier’s lowest-cost retrieval option, enabling you to retrieve large amounts, even petabytes, of data inexpensively in a day. Bulk retrievals typically complete within 5 – 12 hours.
Q: How do I use Bulk retrievals?
To make a Bulk retrieval, set the “Tier” parameter in the InitiateJob API request to Bulk.
Q: How much do Bulk retrievals cost?
Bulk retrievals are priced at a flat rate of just $0.0025 per GB and $0.025 per 1,000 requests. For example, retrieving 500 archives that are 1 GB each would cost 500GB x $0.0025 + 500 x $0.025/1,000 = $1.2625.
Q: When should I use Bulk retrievals?
Bulk retrievals are designed to enable customers to cost-effectively pull large amounts of data for non-urgent use cases such as transcoding petabytes of raw video content or analyzing large genomics sequences.
Q: What are Expedited retrievals?
Expedited retrievals allow you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archives (250MB+), data accessed using Expedited retrievals are typically made available within 1 – 5 minutes. There are two types of Expedited retrievals: On-Demand and Provisioned. On-Demand requests are like EC2 On-Demand instances and are available the vast majority of the time. Provisioned requests are guaranteed to be available when you need them.
Q: What is a Provisioned capacity unit?
Provisioned Capacity guarantees that your retrieval capacity for Expedited retrievals will be available when you need it. Each unit of capacity ensures that at least 3 expedited retrievals can be performed every 5 minutes and provides up to 150MB/s of retrieval throughput.
Q: When should I provision retrieval capacity?
Retrieval capacity can be provisioned if you have specific Expedited retrieval rate requirements that need to be met. Without provisioned capacity, Expedited retrieval requests will be accepted if capacity is available at the time the request is made.
Q: How do I purchase provisioned capacity?
You can purchase provisioned capacity using the console, SDK, or the CLI.
Q: How much does provisioned capacity cost?
Each unit of provisioned capacity costs $100 per month from the date of purchase.
Q: How do I use Expedited retrievals?
To make an Expedited retrieval, set the “Tier” parameter in the InitiateJob API request to Expedited. There is no need to designate whether an Expedited retrieval is On-Demand or Provisioned. If you have purchased provisioned capacity, then all Expedited retrievals will automatically be served via your provisioned capacity.
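The three retrieval options differ only in the value of the “Tier” parameter, which the following sketch makes explicit by building the job parameters for an archive retrieval. The helper name is hypothetical, and the field names (`Type`, `ArchiveId`, `Tier`, `Description`) are assumed to match the InitiateJob request as exposed by the AWS SDK for Python; verify them against the SDK documentation for your version.

```python
# Build the parameters for an archive-retrieval job with a chosen tier.
# This helper only assembles a dictionary; it makes no network calls.

VALID_TIERS = ("Expedited", "Standard", "Bulk")


def archive_retrieval_params(archive_id: str, tier: str = "Standard",
                             description: str = "") -> dict:
    """Return job parameters; Standard mirrors the service default."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}, got {tier!r}")
    params = {
        "Type": "archive-retrieval",
        "ArchiveId": archive_id,
        "Tier": tier,
    }
    if description:
        params["Description"] = description
    return params


if __name__ == "__main__":
    params = archive_retrieval_params("EXAMPLE-ARCHIVE-ID", tier="Bulk")
    print(params)
    # With boto3 installed and credentials configured, the dictionary could be
    # passed along these lines (the vault name is a placeholder):
    #   import boto3
    #   boto3.client("glacier").initiate_job(
    #       vaultName="examplevault", jobParameters=params)
```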
Q: How much do Expedited retrievals cost?
Expedited retrievals are priced at a flat rate of $0.03 per GB and $0.01 per request. For example, retrieving 10 objects with a size of 1 GB each would cost 10 x $0.03 + 10 x $0.01 = $0.40.
Q: When should I use Expedited retrievals?
Expedited retrievals are optimized for the occasional urgent request for a small number of archives. For all but the largest archives (250MB+), data accessed using Expedited retrievals are typically made available within 1 – 5 minutes. If your application or workload requires a guarantee that your Expedited retrievals will be available when you need them, then you should consider using Provisioned capacity.
Q: Can I retrieve part of an archive?
Yes, range retrievals enable you to retrieve a specific range of an archive. Range retrievals are similar to regular retrievals in Amazon S3 Glacier. Both require the initiation of a retrieval job (See How can I retrieve data? for more information). You can use range retrievals to reduce or eliminate your retrieval fees (See How much data can I retrieve for free?).
When initiating a retrieval job using range retrievals, you provide a byte range that can start at zero (which would be the beginning of your archive), or at any 1MB interval thereafter (e.g. 1MB, 2MB, 3MB, etc). The end of the range can either be the end of your archive or any 1MB interval greater than the beginning of your range.
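The alignment rule above can be sketched as a helper that widens a desired byte span to the nearest 1MB boundaries. It assumes that 1MB means 1,048,576 bytes and that the range is written as a `start-end` string with an inclusive end offset, so an aligned end is one byte before a 1MB boundary or the last byte of the archive. Both points are assumptions to confirm against the API documentation, and the function name is illustrative.

```python
# Widen a byte span inside an archive to a megabyte-aligned retrieval range.
# Pure arithmetic; no AWS calls are made.

MIB = 1024 * 1024


def aligned_range(first_byte: int, last_byte: int, archive_size: int) -> str:
    """Return the smallest aligned 'start-end' range covering the span.

    first_byte and last_byte are inclusive offsets of the data you need.
    """
    if not 0 <= first_byte <= last_byte < archive_size:
        raise ValueError("span must lie inside the archive")
    # Round the start down to a 1 MiB boundary.
    start = (first_byte // MIB) * MIB
    # Round the end up to the next boundary, capped at the end of the archive.
    end_exclusive = min((last_byte // MIB + 1) * MIB, archive_size)
    return f"{start}-{end_exclusive - 1}"


if __name__ == "__main__":
    print(aligned_range(1_500_000, 2_500_000, 10 * MIB))  # 1048576-3145727
    print(aligned_range(0, 10, 100))                      # 0-99
```

This is useful when several files were aggregated into one archive and you recorded each file's offsets at upload time.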
Q: Why would I retrieve only a range of an archive?
There are several reasons why you might choose to perform a range retrieval. For example, you may have aggregated several files and uploaded them as a single archive. You may then need to retrieve a small selection of those files, in which case you could retrieve only the ranges of the archive that contain the required files. Another reason to perform a range retrieval is to manage how much data you download from Amazon S3 Glacier in a given period. When you make a request to retrieve data from S3 Glacier, you initiate a retrieval job for an archive. Once the retrieval job completes, your data will be available to download or access using Amazon Elastic Compute Cloud (Amazon EC2) for 24 hours. You could therefore retrieve an archive in parts in order to manage the schedule of your downloads.
Q: How do I view my jobs?
You can list your ongoing jobs for any of your vaults by calling the ListJobs API. The list of jobs provides information including each job’s creation date and time and its status (e.g. in progress, completed successfully, or failed, in which case the reasons for the failure are provided). The progress of a single job can be tracked by calling the DescribeJob API and providing the corresponding job ID. The status of the job will be returned immediately.
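A minimal Python sketch of listing jobs follows. It assumes a client object shaped like the AWS SDK for Python (boto3) Glacier client, with a list_jobs method that returns a "JobList" and an optional "Marker" for paging; the helper name summarize_jobs is hypothetical, and the client is passed in so that the sketch does not depend on any particular SDK setup.

```python
def summarize_jobs(client, vault_name):
    """Return (job_id, status_code, status_message) for every job in a vault.

    `client` is assumed to behave like a boto3 Glacier client.
    """
    summaries = []
    marker = None
    while True:
        # "-" is assumed to mean "the account that owns the credentials".
        kwargs = {"accountId": "-", "vaultName": vault_name}
        if marker:
            kwargs["marker"] = marker
        page = client.list_jobs(**kwargs)
        for job in page.get("JobList", []):
            summaries.append(
                (job["JobId"], job["StatusCode"], job.get("StatusMessage", ""))
            )
        # A missing or empty marker means there are no further pages.
        marker = page.get("Marker")
        if not marker:
            break
    return summaries
```

With boto3 installed and credentials configured, a client created for the Glacier service could be passed as the first argument. A single job could be checked in a similar way through the client's describe_job call.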
Data retrieval policies
Q: What are data retrieval policies?
Amazon S3 Glacier data retrieval policies let you define your own data retrieval limits with a few clicks in the AWS console. You can limit retrievals to “Free Tier Only”, or if you wish to retrieve more than the free tier, you can specify a “Max Retrieval Rate” to limit your retrieval speed and establish a retrieval cost ceiling. In both cases, Amazon S3 Glacier will not accept retrieval requests that would exceed the retrieval limits you defined. Retrieval policies apply to Standard retrievals.
To learn more please read Configuring Data Retrieval Policies in the Amazon S3 Glacier developer’s guide.
Q: Can I set a data retrieval policy for each AWS region?
Yes. You can set one data retrieval policy for each AWS region, which governs all data retrieval activities in that region under your account. Data retrieval policies are region-specific because data retrieval costs vary across AWS regions.
Please visit Amazon S3 Glacier Pricing for more information.
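If you prefer to set a data retrieval policy programmatically rather than in the console, the policy can be expressed as a small document. The Python sketch below builds one; it assumes the document shape used by the direct API (a "Rules" list holding a single rule with a "Strategy" and, for a rate limit, a "BytesPerHour" value), and it assumes that “Free Tier Only” corresponds to the strategy name FreeTier and “Max Retrieval Rate” to BytesPerHour. The helper name is hypothetical.

```python
def retrieval_policy(strategy, bytes_per_hour=None):
    """Build a data retrieval policy document containing one rule."""
    if strategy not in ("FreeTier", "BytesPerHour", "None"):
        raise ValueError(f"unknown strategy: {strategy}")
    rule = {"Strategy": strategy}
    if strategy == "BytesPerHour":
        # A rate limit only makes sense with a positive hourly ceiling.
        if bytes_per_hour is None or bytes_per_hour <= 0:
            raise ValueError("BytesPerHour needs a positive limit")
        rule["BytesPerHour"] = int(bytes_per_hour)
    return {"Rules": [rule]}
```

Because each region has its own policy, the same document would need to be applied once per region in which you want it to take effect.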
Q: Can I use data retrieval policies to “slow down” my retrievals or spread them out?
No. To help you manage data retrieval costs, data retrieval policies such as “Free Tier Only” and “Max Retrieval Rate” reject any data retrieval request that would exceed your predefined data retrieval limit. They do not change the 3 to 5 hour data retrieval latency or spread out your retrievals. You can use Amazon S3 Glacier’s range retrieval feature to spread out retrievals and lower the peak retrieval speed.
Q: What impact does the change in the retrieval free tier to 10 GB per month have on my data retrieval policy?
There is no impact on your policy in terms of the rate of data retrieval. If your retrieval policy was set to the previous free tier of 5% of your average monthly storage before the retrieval free tier changed on November 21, 2016, your policy keeps the same GB-per-hour retrieval rate that your 5% free tier gave you on that date. For example, if on that day your average monthly storage was 14,400 GB, your retrieval rate limit was 14,400 GB x 5% / 30 days / 24 hours = 1 GB per hour. Your new policy will remain at 1 GB per hour, but will be a “Max Retrieval Rate” rather than a “Free Tier Only” policy.
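The arithmetic in the example above can be reproduced with a short Python function. The function and parameter names are hypothetical; the 5% allowance and the 30-day month are taken from the example.

```python
def legacy_rate_gb_per_hour(avg_monthly_storage_gb, free_percent=5, days=30):
    """Hourly retrieval rate implied by the previous percentage-based free tier."""
    monthly_allowance_gb = avg_monthly_storage_gb * free_percent / 100
    return monthly_allowance_gb / days / 24


rate = legacy_rate_gb_per_hour(14_400)  # 720 GB per month, 24 GB per day
print(f"{rate:g} GB per hour")
```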
Q: Can I see what archives I have stored in Amazon S3 Glacier?
Yes. Although you will need to maintain your own index of the data you upload to Amazon S3 Glacier, an inventory of all archives in each of your vaults is maintained for disaster recovery or occasional reconciliation purposes. The vault inventory is updated approximately once a day. You can request a vault inventory as either a JSON or CSV file. It will contain details about the archives within your vault, including the size, creation date, and archive description (if you provided one during upload). The inventory will represent the state of the vault at the time of the most recent inventory update.
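Once you have downloaded a JSON inventory, reconciling it against your own index can start with a simple summary. The Python sketch below assumes the inventory contains an "InventoryDate" field and an "ArchiveList" whose entries carry "ArchiveId", "ArchiveDescription", "CreationDate", and "Size"; the function name and the sample values are purely illustrative.

```python
import json


def summarize_inventory(inventory_json):
    """Summarize a vault inventory supplied as a JSON string."""
    inventory = json.loads(inventory_json)
    archives = inventory.get("ArchiveList", [])
    return {
        "inventory_date": inventory.get("InventoryDate"),
        "archive_count": len(archives),
        "total_bytes": sum(archive["Size"] for archive in archives),
    }


# Illustrative sample only; the IDs, dates, and sizes are made up.
sample = json.dumps({
    "InventoryDate": "2019-01-01T00:00:00Z",
    "ArchiveList": [
        {"ArchiveId": "example-1", "ArchiveDescription": "logs",
         "CreationDate": "2018-12-01T00:00:00Z", "Size": 1048576},
        {"ArchiveId": "example-2", "ArchiveDescription": "",
         "CreationDate": "2018-12-02T00:00:00Z", "Size": 2097152},
    ],
})
summary = summarize_inventory(sample)
```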
Q: Can I obtain a real time list of my vaults?
Yes, you can list your vaults stored in Amazon S3 Glacier either by using the AWS Management Console or by calling the ListVaults API. In addition to the vault names, you will be able to see each vault’s creation date and creator, when its inventory was last updated, and a summary of its contents at that time.
Amazon S3 Glacier Select
Q: What is Amazon S3 Glacier Select?
Amazon S3 Glacier Select is a feature that allows you to run queries on your data stored in Amazon S3 Glacier, without the need to restore the entire object to a hotter tier like Amazon S3. With Amazon S3 Glacier Select, you can perform filtering and basic querying using a subset of SQL directly against your data in Amazon S3 Glacier. You provide a SQL query and a list of Amazon S3 Glacier objects, and Amazon S3 Glacier Select will run the query in-place and write the output results to a bucket you specify in Amazon S3.
Q: Why should I use Amazon S3 Glacier Select?
Amazon S3 Glacier Select enables you to perform analysis on your data in Amazon S3 Glacier without first staging it in a hotter storage tier like Amazon S3. This makes it cheaper, faster and easier to gather insights from your cold data in Amazon S3 Glacier. This can unlock exciting business value for your archives, opening up multiple scenarios of using Amazon S3 Glacier for Big Data, IoT, and custom analytics workloads.
Q: How does Amazon S3 Glacier Select compare to legacy archival solutions?
Legacy archival solutions, like on-premises tape libraries, have highly restricted data retrieval throughput and rarely have idle compute capacity nearby. The problem is even worse if tapes have been sent to an off-site storage facility. Running any kind of analysis on these solutions can easily take anywhere from weeks to even months. In contrast, with Amazon S3 Glacier Select it is easy to analyze your Amazon S3 Glacier data in-place quickly and inexpensively at latencies you choose ranging from minutes to hours.
Q: What are some scenarios in which I can use Amazon S3 Glacier Select?
You can use Amazon S3 Glacier Select when you need to perform pattern matching or custom analytics on your archived data stored in S3 Glacier. Some customers occasionally need to filter on specific keys in response to an audit that must be answered within a few hours. For example, a customer might need to query all of their usage logs for the past year to respond to a billing dispute. Higher-level Big Data applications can also use the Amazon S3 Glacier Select APIs to provide Amazon S3 Glacier as an additional data source, so that customers can use their tools and languages against their S3 Glacier data.
Q: What kind of latencies can I expect when querying against Amazon S3 Glacier?
S3 Glacier provides three retrieval options: Expedited, Standard, and Bulk. Each option has different retrieval times and costs. Amazon S3 Glacier Select works with each of these retrieval options, allowing you to choose the option best aligned to the speed at which you want your query to return results. For all but the largest archives (250MB+), data accessed using Expedited retrievals is typically made available within 1 – 5 minutes. Standard retrievals complete within 3 – 5 hours. Bulk retrievals complete within 5 – 12 hours. For more details on S3 Glacier retrievals, refer to the FAQs on S3 Glacier data retrievals.
Q: How do I get started using Amazon S3 Glacier Select?
You can create an Amazon S3 Glacier Select job using the Amazon S3 API, Amazon Glacier direct API, AWS SDK, and AWS CLI.
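As a rough illustration of what a select job request might contain when using the direct API through the AWS SDK for Python, the sketch below assembles the job parameters as a plain dictionary. The field names follow our understanding of the job parameters structure (a "select" job type, SQL expression, CSV serialization settings, and an S3 output location) and should be checked against the current API reference before use. The helper name, the CSV-with-header assumption, and the example query are hypothetical.

```python
def select_job_parameters(archive_id, sql, bucket, prefix, tier="Standard"):
    """Assemble job parameters for a select job over one CSV archive."""
    if tier not in ("Expedited", "Standard", "Bulk"):
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {
        "Type": "select",
        "ArchiveId": archive_id,
        "Tier": tier,
        "SelectParameters": {
            # Assumption: the archive is a CSV file with a header row.
            "InputSerialization": {"csv": {"FileHeaderInfo": "USE"}},
            "ExpressionType": "SQL",
            "Expression": sql,
            "OutputSerialization": {"csv": {}},
        },
        # Results are written under this prefix in the bucket you specify.
        "OutputLocation": {"S3": {"BucketName": bucket, "Prefix": prefix}},
    }


params = select_job_parameters(
    "example-archive-id",               # hypothetical archive ID
    "SELECT * FROM archive LIMIT 10",   # hypothetical query
    "example-results-bucket",
    "select-output/",
    tier="Bulk",
)
```

The tier argument corresponds to the Expedited, Standard, and Bulk retrieval options, so the same query can be submitted at whichever latency and cost you choose.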