New – Range Retrieval for Amazon Glacier

Update (October 2019) – The range retrieval function described in this post is part of Amazon S3 Glacier’s InitiateJob function. You cannot use the S3 API to initiate a range retrieval on an object that has the Glacier storage class.

Amazon Glacier is designed for storing data that is infrequently accessed. Once you have stored your data, you can retrieve up to 5% of it (prorated daily) each month at no charge.

Today we are making it easier for you to remain within the 5% retrieval band by introducing Range Retrievals. You can use this new feature to fetch only the data you need from a larger file or to spread the retrieval of a large archive over a longer period of time.

Range Retrieval
Glacier’s existing archive retrieval function now accepts an optional RetrievalByteRange parameter. If you don’t provide this parameter, Glacier will retrieve the entire archive.

If you choose to provide this parameter, it must be in the form StartByte-EndByte. The value provided for StartByte must be megabyte aligned (a multiple of 1,048,576). The value provided for EndByte + 1 must also be megabyte aligned if the range ends somewhere within the archive. If you want to retrieve data from StartByte up to the end of the archive, simply specify an EndByte that is one less than the archive size.
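The alignment rules above can be checked on the client before a job is submitted. Here is a minimal sketch in Python; the helper name format_retrieval_range is hypothetical and not part of any AWS SDK, and it only applies the rules stated in this post.

```python
MEGABYTE = 1024 * 1024  # 1,048,576 bytes, the alignment unit described above


def format_retrieval_range(start_byte, end_byte, archive_size):
    """Return a 'StartByte-EndByte' string, or raise ValueError.

    Hypothetical helper: it only applies the rules from this post and
    does not call Glacier.
    """
    if not 0 <= start_byte <= end_byte < archive_size:
        raise ValueError("range must lie inside the archive")
    if start_byte % MEGABYTE != 0:
        raise ValueError("StartByte must be a multiple of 1,048,576")
    ends_at_archive_end = end_byte == archive_size - 1
    if (end_byte + 1) % MEGABYTE != 0 and not ends_at_archive_end:
        raise ValueError(
            "EndByte + 1 must be a multiple of 1,048,576 "
            "unless the range runs to the end of the archive"
        )
    return f"{start_byte}-{end_byte}"
```

For example, format_retrieval_range(0, 1048575, 5 * MEGABYTE) yields "0-1048575", which covers the first megabyte of a five megabyte archive.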

When you upload data to Glacier, you must also compute and supply a tree hash. Glacier checks the hash against the data to ensure that it has not been altered en route. A tree hash is generated by computing a hash for each megabyte-sized segment of the data, and then combining the hashes in tree fashion to represent ever-growing adjacent segments of the data.
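To make the tree structure concrete, here is a small in-memory sketch of a tree hash. It assumes SHA-256 as the hash function and assumes that an unpaired hash at the end of a level is carried up unchanged; neither detail is spelled out in this post, so check the Glacier documentation before relying on it. The function name tree_hash is illustrative only.

```python
import hashlib

MEGABYTE = 1024 * 1024


def tree_hash(data: bytes) -> str:
    """Compute a tree hash over 1 MB segments (sketch, see assumptions above)."""
    # Level 0: one SHA-256 digest per megabyte-sized segment.
    level = [
        hashlib.sha256(data[i:i + MEGABYTE]).digest()
        for i in range(0, len(data), MEGABYTE)
    ]
    if not level:  # empty input: hash of the empty byte string
        level = [hashlib.sha256(b"").digest()]

    # Combine adjacent pairs until a single root digest remains.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                next_level.append(level[i])  # odd one out moves up unchanged
        level = next_level
    return level[0].hex()
```

This version holds the whole payload in memory, which is fine for illustration; a real uploader would hash the segments as it streams them.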

If you would like to use tree hashes to confirm the integrity of the data that you download from Glacier (and you definitely should), then the range that you specify must also be tree-hash aligned. In other words, a tree hash must exist (at some level of the tree of hashes) for the exact range of bytes retrieved. If you specify such a range, Glacier will provide you with the corresponding tree hash when the retrieval job completes.
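One way to test whether a range is tree-hash aligned is to walk down the tree from the root and see whether some node covers exactly the requested segments. The sketch below assumes the tree is built by pairing adjacent hashes level by level and carrying an unpaired hash up unchanged, so that the left child of any node covers the largest power of two of segments that is smaller than the node's width. The function name is_tree_hash_aligned is hypothetical.

```python
MEGABYTE = 1024 * 1024


def is_tree_hash_aligned(start_byte, end_byte, archive_size):
    """Return True if some node of the archive's tree hash covers exactly
    bytes start_byte..end_byte (inclusive). Sketch under the tree-shape
    assumption stated above."""
    if not 0 <= start_byte <= end_byte < archive_size:
        return False
    if start_byte % MEGABYTE != 0:
        return False
    if (end_byte + 1) % MEGABYTE != 0 and end_byte != archive_size - 1:
        return False

    # Work in 1 MB segment indices; 'last' and 'hi' are exclusive bounds.
    first = start_byte // MEGABYTE
    last = -(-(end_byte + 1) // MEGABYTE)   # ceiling division
    lo, hi = 0, -(-archive_size // MEGABYTE)

    while (first, last) != (lo, hi):
        width = hi - lo
        if width <= 1:
            return False
        # Left child covers the largest power of two strictly below width.
        split = lo + (1 << ((width - 1).bit_length() - 1))
        if last <= split:
            hi = split          # range lies entirely in the left child
        elif first >= split:
            lo = split          # range lies entirely in the right child
        else:
            return False        # range straddles two children
    return True
```

Under these assumptions, in a 6.5 MB archive the first four megabytes (bytes 0 through 4,194,303) form an aligned range, while the first three megabytes do not, because no single node in the tree covers exactly segments 0 through 2.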

This new feature is available now and you can start using it today. The AWS SDK for Java and the AWS SDK for .NET have been updated and now include support for Range Retrievals.

For More Information
Here are some quick links that you can use to learn more about Range Retrievals in Glacier:

— Jeff;
