AWS Official Blog

Amazon Glacier: Archival Storage for One Penny Per GB Per Month

by Jeff Barr | in Amazon S3

You Need Glacier
I’m going to bet that you (or your organization) spend a lot of time and a lot of money archiving mission-critical data. Whether you’re currently using disk, optical media, or tape-based storage, it’s probably a more complicated and expensive process than you’d like, one that has you spending time maintaining hardware, planning capacity, negotiating with vendors, and managing facilities.

True?

If so, then you are going to find our newest service, Amazon Glacier, very interesting. With Glacier, you can store any amount of data with high durability at a cost that will allow you to get rid of your tape libraries and robots and all the operational complexity and overhead that have been part and parcel of data archiving for decades.

Glacier provides extremely low-cost archive storage, at a price as low as $0.01 (one US penny, one one-hundredth of a dollar) per Gigabyte per month. You can store a little bit, or you can store a lot (Terabytes, Petabytes, and beyond). There’s no upfront fee and you pay only for the storage that you use. You don’t have to worry about capacity planning and you will never run out of storage space. Glacier removes the problems associated with under- or over-provisioning archival storage, maintaining geographically distinct facilities, and verifying hardware or data integrity, irrespective of the length of your retention periods.
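To make the pricing concrete, here is a minimal sketch of the arithmetic. It assumes the lowest quoted rate of $0.01 per Gigabyte per month applies and, purely for illustration, treats 1 Terabyte as 1,000 Gigabytes; the function name is hypothetical and the actual rate for your Region may be higher.

```python
from decimal import Decimal

# Lowest price quoted in this post: $0.01 per Gigabyte per month.
# Assumption: this rate applies to your Region; check the pricing page.
PRICE_PER_GB_MONTH = Decimal("0.01")


def monthly_storage_cost(gigabytes):
    """Return the monthly storage charge in US dollars for the given size."""
    return Decimal(gigabytes) * PRICE_PER_GB_MONTH


# Assumption for illustration only: 1 TB = 1,000 GB and 1 PB = 1,000,000 GB.
for label, gb in [("100 GB", 100), ("10 TB", 10_000), ("1 PB", 1_000_000)]:
    print(f"{label:>7}: ${monthly_storage_cost(gb):,.2f} per month")
```

Decimal is used instead of floating point so that the cents come out exact.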

Tell Me More
We introduced Amazon S3 in March of 2006. S3 growth over the past 6+ years has been strong and steady, and it now stores over one trillion objects. Glacier builds on S3’s reputation for durability and dependability with a new access model designed to let us offer archival storage to you at an extremely low cost.

To store data in Glacier, you start by creating a named vault. You can have up to 1000 vaults per region in your AWS account. Once you have created the vault, you simply upload your data (an archive in Glacier terminology). Each archive can contain up to 40 Terabytes of data and you can use multipart uploading or AWS Import/Export to optimize the upload process. Glacier will encrypt your data using AES-256 and will store it durably in an immutable form. Glacier will acknowledge your storage request as soon as your data has been stored in multiple facilities.
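If you plan to use multipart uploading, you will need to pick a part size. The sketch below shows one way to do that. It assumes (based on my reading of the Glacier API documentation, so please verify before relying on it) that part sizes must be a power of two between 1 MiB and 4 GiB and that an upload may have at most 10,000 parts. Under those assumptions the largest upload is 4 GiB × 10,000, roughly the 40 Terabytes mentioned above. The function names are hypothetical.

```python
MIB = 1024 * 1024

# Assumed limits; verify against the current Glacier API documentation.
MIN_PART = 1 * MIB        # smallest allowed part size
MAX_PART = 4096 * MIB     # largest allowed part size (4 GiB)
MAX_PARTS = 10_000        # maximum number of parts in one upload


def choose_part_size(archive_bytes):
    """Return the smallest power-of-two part size that needs at most MAX_PARTS parts."""
    if archive_bytes <= 0:
        raise ValueError("archive must contain at least one byte")
    part = MIN_PART
    while part <= MAX_PART:
        parts_needed = -(-archive_bytes // part)  # ceiling division
        if parts_needed <= MAX_PARTS:
            return part
        part *= 2
    raise ValueError("archive is too large for a single multipart upload")


def part_ranges(archive_bytes, part_size):
    """Yield inclusive (start, end) byte offsets for each part, in order."""
    for start in range(0, archive_bytes, part_size):
        yield start, min(start + part_size, archive_bytes) - 1
```

Choosing the smallest size that fits keeps each retry cheap if a part fails in transit; a larger part size means fewer requests, so you may prefer to start from a bigger minimum.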

Creating a vault in Amazon Glacier.

Glacier will store your data with high durability (the service is designed to provide average annual durability of 99.999999999% per archive). Behind the scenes, Glacier performs systematic data integrity checks and heals itself as necessary with no intervention on your part. There’s plenty of redundancy and Glacier can sustain the concurrent loss of data in two facilities.
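To put eleven nines in perspective, here is a back-of-the-envelope calculation. It assumes that the design target can be read as an independent annual loss probability for each archive; that is a simplification for illustration, not a guarantee made by the service. The function names are hypothetical.

```python
from fractions import Fraction

# Design target quoted above: 99.999999999% average annual durability per archive.
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)

# Simplifying assumption: treat the remainder as a per-archive annual loss probability.
ANNUAL_LOSS_RATE = 1 - ANNUAL_DURABILITY  # exactly 1 in 10**11


def expected_annual_losses(archive_count):
    """Expected number of archives lost per year under the assumption above."""
    return archive_count * ANNUAL_LOSS_RATE


def years_per_expected_loss(archive_count):
    """Average number of years between single-archive losses."""
    return 1 / expected_annual_losses(archive_count)


# With ten million archives, the model expects one loss every 10,000 years.
print(years_per_expected_loss(10_000_000))
```

Fractions keep the arithmetic exact; subtracting 0.99999999999 from 1 in floating point would introduce rounding error at this scale.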

At this point you may be thinking that this sounds just like Amazon S3, but Amazon Glacier differs from S3 in two crucial ways.

First, S3 is optimized for rapid retrieval (generally tens to hundreds of milliseconds per request). Glacier is not (we didn’t call it Glacier for nothing). With Glacier, your retrieval requests are queued up and honored at a somewhat leisurely pace. Your archive will be available for downloading in 3 to 5 hours.

Each retrieval request that you make to Glacier is called a job. You can poll Glacier to see if your data is available, or you can ask it to send a notification to the Amazon SNS topic of your choice when the data is available. You can then access the data via HTTP GET requests, including byte range requests. The data will remain available to you for 24 hours.
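The polling flow described above can be sketched as follows. This is a minimal illustration, not the official client: it assumes a client object that follows the calling convention of the boto3 Glacier client (`initiate_job`, `describe_job`, `get_job_output`), and the helper names, the 15-minute polling interval, and the poll limit are all hypothetical choices. In practice an SNS notification avoids the polling loop entirely.

```python
import time


def wait_for_job(client, vault_name, job_id, poll_seconds=900, max_polls=40,
                 sleep=time.sleep):
    """Poll a retrieval job until it completes and return its description."""
    for _ in range(max_polls):
        job = client.describe_job(accountId="-", vaultName=vault_name, jobId=job_id)
        if job["Completed"]:
            if job.get("StatusCode") == "Failed":
                raise RuntimeError(f"retrieval job {job_id} failed")
            return job
        sleep(poll_seconds)  # retrievals take hours, so poll slowly
    raise TimeoutError(f"retrieval job {job_id} did not complete in time")


def retrieve_first_bytes(client, vault_name, archive_id, byte_count=1024,
                         **wait_options):
    """Start a retrieval job, wait for it, then fetch a leading byte range."""
    job = client.initiate_job(
        accountId="-",  # "-" means the account that owns the credentials
        vaultName=vault_name,
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )
    wait_for_job(client, vault_name, job["jobId"], **wait_options)
    output = client.get_job_output(
        accountId="-",
        vaultName=vault_name,
        jobId=job["jobId"],
        range=f"bytes=0-{byte_count - 1}",  # HTTP-style byte range request
    )
    return output["body"].read()
```

Passing the client and the `sleep` function as arguments keeps the sketch testable without network access; with real credentials you would pass something like `boto3.client("glacier")`.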

Retrieval requests are priced differently, too. You can retrieve up to 5% of your average monthly storage, pro-rated daily, for free each month. Beyond that, you are charged a retrieval fee starting at $0.01 per Gigabyte (see the pricing page for details). So for data that you’ll need to retrieve in greater volume or more frequently, S3 may be a more cost-effective service.
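Here is a small sketch of how the free allowance works out. It assumes that "pro-rated daily" means the monthly 5% allowance is divided evenly across the days of the month; that is one reading of the sentence above, so check the pricing page for the exact rules. The function name is hypothetical, and the sketch deliberately does not attempt to compute fees beyond the free allowance.

```python
FREE_RETRIEVAL_PERCENT = 5  # free share of average monthly storage, from this post


def daily_free_retrieval_gb(average_storage_gb, days_in_month):
    """Return the free retrieval allowance per day, in Gigabytes.

    Assumption: the monthly allowance is spread evenly over the month.
    """
    monthly_allowance_gb = average_storage_gb * FREE_RETRIEVAL_PERCENT / 100
    return monthly_allowance_gb / days_in_month


# 12,000 GB stored on average in a 30-day month:
# 5% is 600 GB for the month, or 20 GB per day.
print(daily_free_retrieval_gb(12_000, 30))
```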

Notifications for retrieval jobs.

Second, S3 allows you to assign the name of your choice to each object. In order to keep costs as low as possible, Glacier will assign a unique ID to each of your archives at upload time.
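Because Glacier chooses the archive ID, you will probably want to keep your own record of which ID belongs to which file. The following is a minimal sketch of such a record, stored as a local JSON file. The class name, method names, and file layout are hypothetical; a production system would more likely use a database and would back the index up somewhere safe.

```python
import json
from pathlib import Path


class ArchiveIndex:
    """Map your own file names to the archive IDs that Glacier assigns."""

    def __init__(self, path):
        self.path = Path(path)
        # Load the existing index if there is one, otherwise start empty.
        if self.path.exists():
            self.entries = json.loads(self.path.read_text())
        else:
            self.entries = {}

    def record(self, name, vault_name, archive_id):
        """Remember the ID returned at upload time and persist the index."""
        self.entries[name] = {"vault": vault_name, "archive_id": archive_id}
        self.path.write_text(json.dumps(self.entries, indent=2))

    def lookup(self, name):
        """Return the vault and archive ID recorded for a name."""
        return self.entries[name]
```

You would call `record` right after each upload succeeds, and `lookup` when you need to start a retrieval job for a file you know only by name.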

Glacier In Action
I’m sure that you already have some uses in mind for Glacier. If not, here are some to get you started:

  • If you are part of an enterprise IT department, you can store email, corporate file shares, legal records, and business documents. The kind of stuff that you need to keep around for years or decades with little or no reason to access it.
  • If you work in digital media, you can archive your books, movies, images, music, news footage, and so forth. These assets can easily grow to tens of Petabytes and are generally accessed very infrequently.
  • If you generate and collect scientific or research data, you can store it in Glacier just in case you need to get it back later.

Get Started Now
Glacier is available for use today in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), Asia Pacific (Tokyo) and EU-West (Ireland) Regions.

You can access Glacier from the AWS Management Console or through the Glacier APIs. We have added Glacier support to the AWS SDKs and there’s also plenty of Glacier documentation.

If you’d like to know even more about Glacier, please join us for an online seminar on September 19th.

And there you have it. What do you think?

 — Jeff;

PS. If you are an engineer or engineering manager with an interest in massive-scale distributed storage systems, we’d love to hear from you. Please send your resume to glacier-jobs@amazon.com.