AWS Storage Blog

AWS and Tape Ark partner to migrate petabytes of tape data using AWS Snowball with Tape Gateway

The last century has seen some remarkable innovation that has created new possibilities in a wide range of computing areas. Most remarkable of all is that companies around the world can now have all of their data in one place, completely accessible, all of the time. Amazon S3 and all of the S3 storage classes provide this capability.

Before the cloud, most corporations worked with two types of data: active data and backup data. Active data was typically stored online on spinning disks. Backup data was often stored on physical tapes. Companies chose these formats because spinning disks were the best online option for performance, and physical tapes were a more economical option for backup data. Physical tapes worked well as a portable offline format and could easily be stored at offsite locations, but the data became harder to access as time passed and the tape media degraded.

Technology continues to evolve at an unprecedented pace. New capabilities arise every year that help companies streamline IT operations across the globe, and cloud storage services are among the most foundational of them. Large businesses treat data as an intellectual property asset and assign it an economic value in their annual reports alongside physical assets such as property, plant, and equipment. As a result, companies are increasingly moving their long-term physical tape archives off traditional tape media such as Linear Tape-Open (LTO) and onto reliable digital platforms. This allows the data to be quickly and accurately recalled when it is needed.

Cloud storage services, including the Amazon S3 Glacier storage classes, offer higher data durability than traditional storage media, including tape. S3 Glacier storage is attractively priced at as little as $1 USD per terabyte per month, and S3’s 11 nines (99.999999999%) of durability helps ensure that requested data is accessible and readable when needed. To leverage these benefits and make the transition to the cloud, storage specialists and IT leads are being asked to bring back their historical data collections from offsite storage and migrate the data to durable and easily available virtual tapes. For many, the only way to get the maximum benefit from legacy tape archives is to liberate the physical tapes and convert them to virtual tapes accessible in the cloud. In this blog post, I discuss how you can reduce your physical tape data storage and migrate petabytes of data to S3 Glacier Flexible Retrieval (formerly S3 Glacier) or S3 Glacier Deep Archive so that you can access your tape data easily.

Common challenges of migrating physical tape data to the cloud

Today, many organizations have some form of historical data stored on physical tapes. These tape collections have accumulated into the hundreds of millions of tapes worldwide, with most stored offsite, which makes data access both time-consuming and complicated. In many cases, the physical tape collections are so immense that it is impractical for in-house IT departments to ingest data for analysis when needed, and companies often find it difficult to determine what is actually on the historical media.

IT departments that need to ingest thousands, or even tens of thousands, of tapes into AWS encounter numerous issues that make the liberation of tape data challenging. Here are four common challenges:

  • First, organizations face internal capacity limitations, especially when running today’s backups while simultaneously reading in previous years of backup data. This task is often complex and creates resource conflicts.
  • Second, historical tapes are often on end-of-life media types for which you no longer have the tape drives to read them.
  • Third, historical tapes are often in data formats that are no longer supported by your architecture.
  • Finally, the internet bandwidth needed to upload petabyte-scale legacy tape data to AWS is often simply not available.
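To put the bandwidth challenge in perspective, here is a rough back-of-the-envelope sketch in Python. The 1 PB archive size, the link speeds, and the 80% sustained utilization factor are illustrative assumptions, not figures from this post, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def upload_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate the days needed to upload `data_tb` decimal terabytes
    over a link of `link_mbps` megabits per second.

    `utilization` is an assumed fraction of the link that the transfer
    can actually sustain (other traffic and protocol overhead take the rest).
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("sizes and speeds must be positive, utilization in (0, 1]")
    total_bits = data_tb * 1e12 * 8                   # TB -> bytes -> bits
    effective_bps = link_mbps * 1e6 * utilization     # usable bits per second
    return total_bits / effective_bps / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed example: a 1 PB (1,000 TB) tape archive over three link speeds.
    for mbps in (100, 1_000, 10_000):
        print(f"{mbps:>6} Mbps: {upload_days(1_000, mbps):,.1f} days")
```

Under these assumptions, 1 PB over a 1 Gbps link takes roughly 116 days of continuous transfer, which is why an offline path is attractive for large archives.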

With all of these challenges, it is no wonder companies choose not to migrate their legacy or contemporary tape data to the cloud, and instead choose the age-old “do nothing” strategy, which still leaves you unable to access and fully utilize your existing data assets.

How Tape Ark and AWS Snowball accelerate tape data migration to AWS

At Tape Ark, we partnered with the AWS Snow Family to solve these common challenges so that you can migrate your tape data to the cloud. Tape Ark specializes in migrating physical tape data to AWS. Together, we work to liberate your historical data easily so that you can focus on your daily business tasks without distraction or bandwidth constraints. We firmly believe that some of the most significant discoveries in the next century will be made using liberated tape data collections amassed over the last 50 years.

Today, Tape Ark achieves petabyte-scale tape migration to AWS using two primary approaches. The first uses AWS Storage Gateway’s Tape Gateway together with our high-bandwidth network connectivity to AWS. This allows Tape Ark to copy tape data directly from on-premises tape drives to Tape Gateway at high speed and scale. Each Tape Gateway has up to 1,500 virtual tape slots and essentially unlimited archiving capability when exporting tapes to S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive. Each gateway can copy up to 15 TiB of data per day over a capable network.
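As a rough planning aid, the per-gateway figure above can be turned into a best-case schedule. The sketch below is only an estimate: it takes the "up to 15 TiB per day" number as a ceiling, and it assumes that several gateways running in parallel scale linearly, which this post does not claim. The function names are hypothetical.

```python
import math

# "Up to" figure quoted above; sustained throughput may well be lower.
GATEWAY_TIB_PER_DAY = 15.0


def min_copy_days(data_tib: float, gateways: int = 1,
                  tib_per_day: float = GATEWAY_TIB_PER_DAY) -> float:
    """Best-case days to copy `data_tib` TiB through `gateways` Tape Gateways.

    Assumes linear scaling across gateways (an assumption, not a guarantee).
    """
    if data_tib < 0 or gateways < 1 or tib_per_day <= 0:
        raise ValueError("invalid sizing inputs")
    return data_tib / (gateways * tib_per_day)


def gateways_needed(data_tib: float, deadline_days: float,
                    tib_per_day: float = GATEWAY_TIB_PER_DAY) -> int:
    """Smallest number of gateways that could meet the deadline in the best case."""
    if data_tib < 0 or deadline_days <= 0 or tib_per_day <= 0:
        raise ValueError("invalid sizing inputs")
    return max(1, math.ceil(data_tib / (deadline_days * tib_per_day)))


if __name__ == "__main__":
    one_pib = 1024.0  # 1 PiB expressed in TiB (example archive size, assumed)
    print(f"1 gateway:  {min_copy_days(one_pib):.1f} days at best")
    print(f"30-day deadline needs at least {gateways_needed(one_pib, 30)} gateways")
```

For a 1 PiB archive, a single gateway needs at least about 68 days even at the maximum quoted rate, and a 30-day deadline would call for at least three gateways under the linear-scaling assumption.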

Our second approach uses the new AWS Snowball with Tape Gateway. This allows Tape Ark to copy physical tape data directly to an AWS Snowball Edge Storage Optimized device and migrate up to 80 TB of tape data per device to your desired AWS Region, where the data is securely stored in S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, without using a high-bandwidth connection to AWS. With your data in S3 Glacier storage classes, you can initiate retrieval of your archived tapes right through the AWS Storage Gateway management console and manage your virtual tapes through your backup application. You can manage access to your virtual tape library and enforce retention policies as you would with the offsite storage of physical tapes, and even make the tapes immutable with write once, read many (WORM) protection and Tape Retention Lock.
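When an archive exceeds what one device can hold, the tapes have to be split across several Snowball devices. The sketch below shows one simple way to plan that split, using a first-fit-decreasing heuristic. It is not Tape Ark's or AWS's actual planning method: the function name is hypothetical, the tape sizes are made-up examples, and treating the full 80 TB as usable for tape data is an assumption.

```python
from typing import List, Sequence


def plan_devices(tape_sizes_tb: Sequence[float],
                 device_capacity_tb: float = 80.0) -> List[List[int]]:
    """Group tapes onto devices with a first-fit-decreasing heuristic.

    Returns one list of tape indices per device. Each tape is kept whole
    on a single device. The result is a reasonable plan, not a proven optimum.
    """
    devices: List[List[int]] = []   # tape indices assigned to each device
    free_tb: List[float] = []       # remaining capacity of each device

    # Place the largest tapes first; ties keep their original order.
    ordered = sorted(enumerate(tape_sizes_tb), key=lambda pair: pair[1], reverse=True)
    for index, size in ordered:
        if size <= 0 or size > device_capacity_tb:
            raise ValueError(f"tape {index} has an unusable size: {size} TB")
        for d, free in enumerate(free_tb):
            if size <= free:            # first device with enough room
                devices[d].append(index)
                free_tb[d] -= size
                break
        else:                           # no existing device fits: add one
            devices.append([index])
            free_tb.append(device_capacity_tb - size)
    return devices


if __name__ == "__main__":
    # Made-up example: amounts of data (TB) held by five tape sets.
    plan = plan_devices([50, 40, 30, 30, 10])
    for number, tapes in enumerate(plan, start=1):
        print(f"Device {number}: tape sets {tapes}")
```

In this made-up example, 160 TB of tape data fits exactly onto two devices. In practice you would leave headroom rather than fill each device to its nominal capacity.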


Figure 1: How AWS Snowball with Tape Gateway works

As beta-testing partners, my team at Tape Ark and I can say that the performance and ease of use of AWS Snowball with Tape Gateway really caught our attention. I found that the Snowball devices perform extremely well. The security, durability, and offline transport capabilities make the Snowball Edge tape data devices a real game changer when it comes to migrating data to the cloud.

The main benefit of AWS Snowball with Tape Gateway is that any organization that wants to move data off ailing physical tapes can migrate it offline and quickly remove the risks associated with physical tapes. This applies whether the tapes hold backup, archive, or long-term retention data.

AWS Snowball with Tape Gateway adds an offline tape ingest process for Tape Ark, as some customers cannot ship their physical tapes to a Tape Ark ingest facility due to compliance or regulatory reasons. In these situations, the tapes need to be ingested at your site. Tape Ark can ship the AWS Snowball with Tape Gateway device to your desired site, perform the ingest, and then ship the device to your desired AWS Region. By using AWS Snowball with Tape Gateway, we can scale the ingest process without introducing high-bandwidth networking. Without this device, the ingest process often takes weeks or months while competing for the available network bandwidth. The data you migrate with Snowball Edge is encrypted and secure in transit, and after your data is ingested into S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, the Snowball Edge device is securely erased. You can seamlessly access and manage your virtual tapes using the AWS Storage Gateway management console.
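This post describes retrieving and managing virtual tapes through the AWS Storage Gateway management console. If you prefer to script a retrieval, the same operation is exposed by the Storage Gateway `RetrieveTapeArchive` API. The following is a minimal sketch using the AWS SDK for Python (boto3). The helper names and the ARN values are hypothetical placeholders, and the sketch assumes that credentials, a Region, and an activated Tape Gateway are already in place.

```python
def retrieve_archived_tape(client, tape_arn: str, gateway_arn: str) -> str:
    """Ask Storage Gateway to retrieve an archived virtual tape to a gateway.

    `client` is expected to be a boto3 "storagegateway" client (or any object
    with a compatible `retrieve_tape_archive` method). Returns the ARN of the
    tape whose retrieval was started. Retrieval itself is asynchronous.
    """
    response = client.retrieve_tape_archive(
        TapeARN=tape_arn,        # archived virtual tape to bring back
        GatewayARN=gateway_arn,  # Tape Gateway that will receive the tape
    )
    return response["TapeARN"]


def example_usage() -> None:
    """Illustrative only: the ARNs below are placeholders, not real resources."""
    import boto3  # imported here so the helper above has no hard dependency

    client = boto3.client("storagegateway")
    started = retrieve_archived_tape(
        client,
        tape_arn="arn:aws:storagegateway:REGION:ACCOUNT_ID:tape/TAPE_BARCODE",
        gateway_arn="arn:aws:storagegateway:REGION:ACCOUNT_ID:gateway/GATEWAY_ID",
    )
    print(f"Retrieval started for {started}")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a real account.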

Conclusion

In this post, I discussed how managing physical tapes presents challenges beyond the cost of maintaining warehouses. You have limited access to your data, and are often working just to find ways to maintain outdated tapes. Sometimes you may not even be sure what data is on those physical tapes. This leads to undesirable costs, errors, and limited insight when the need for the data arises.

Tape Ark is pleased to add the AWS Snowball with Tape Gateway service to its toolkit so that, together, AWS and Tape Ark can help you remove the risk and cost associated with managing physical tapes. Review some of these suggested resources to learn more about Tape Ark and AWS Snowball with Tape Gateway:

Begin your migration to the cloud by cloning or converting your physical tapes with AWS Snowball with Tape Gateway, and securely store your virtual tapes in S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive. Contact us today to get started, or leave comments below that the AWS team can help address.

Guy Holmes


Guy has spent 20 years working in the tape and data storage industry. Within the last 5 years, Guy founded Tape Ark, where he has been a maverick for change, challenging the status quo to migrate legacy data to the cloud so that cloud-enabled technologies can be applied to foster actionable innovation and drive commercial breakthroughs. Guy is a strong advocate for data liberation and the role that historical data will play in making profound discoveries in the future. Guy has a degree in Physics and an MBA in Technology Management, and maintains memberships with PPDM, ASEG, PESA, and the Australian Institute of Company Directors. He is a regular guest speaker at various global industry conferences, writes for the IQ Magazine, and previously had a regular column in the Australian Society of Exploration Geophysics Preview Magazine. Guy is a father to five kids and has been married for 30 years. Outside of work he enjoys adventures in mountaineering.