AWS Storage Blog

Webair addresses unique backup needs with AWS Storage Gateway and Amazon S3 Glacier Deep Archive

Webair is a managed cloud and infrastructure solutions provider, specializing in cloud-based solutions such as Hosted Private Cloud, Hybrid Cloud, Disaster Recovery-as-a-Service, and Backups-as-a-Service. As a service provider, we are tasked by our customers with safeguarding over 30 PB of sensitive data and 10,000 servers, which poses unique challenges. Webair’s backup services protect not only data on virtual machines, physical servers, and databases, but also entire repositories of unstructured file data. Protecting and backing up data is one challenge, but what happens when customers want both immediate restore capabilities and long-term archival backups for legal and compliance assurances?

In this blog post, we’ll take a deep dive into how Webair leveraged AWS Storage Gateway’s Tape Gateway and Amazon S3 Glacier Deep Archive to cost-effectively protect massive quantities of data while meeting the unique requirements of one of our customers. These customer requirements include:

  • Keeping multiple offsite backups on diverse sets of media to emulate the air-gapped assurances of physical tape
  • Complying with global data sovereignty requirements
  • Ensuring all copies of the data are kept within the same country
  • Adhering to security standards and frameworks specific to the finance industry
  • Ensuring no third parties have access to their data

Satisfying unique backup requirements

Webair uses Veeam Backup and Replication as one of the data movers for our Backups-as-a-Service offering, and it was the data mover selected for this use case. After ingesting the customer’s data, we needed a way to get that data offsite and onto different media to meet the requirements listed above. We chose Amazon S3 Glacier Deep Archive for reasons of scalability, cost, and regional coverage.

When it came to making a decision, our main concern was scalability. AWS addressed that concern by allowing us to spin Tape Gateways up, scale them down, or remove them on demand. The second reason was cost: the pricing of S3 Glacier Deep Archive allowed us to offer this customer a solution to their tertiary offsite needs while staying within their budget. The third factor was that AWS has a presence in every geographic region where the data needed to remain sovereign.

We then needed a way to get that data from Webair’s own data centers directly into S3 Glacier Deep Archive. We found that we could use Tape Gateway to present virtual tape libraries to Veeam and run Tenant-to-Tape backup jobs immediately after the data landed on our bare metal backup repositories. This enabled Webair to keep the customer’s primary offsite backups in our facilities while storing a second copy of the data on “offline” virtual tapes, in the same countries, and with unique encryption keys. This not only met the customer’s compliance requirements but exceeded them.

This method was advantageous for a number of reasons. First, the solution is fully virtualized, straightforward, and scalable: we can spin up the necessary appliances in the necessary AWS Regions from a single pane of glass. This makes both deployment and administration easier, which is critical because the technology used to back up customer data, especially long-term archival data, must be simple to operate at scale. Second, the monitoring tools available in the AWS Management Console proved well suited to long-term monitoring and management by our dedicated backup and recovery teams.

Over-engineering a solution is always a recipe for disaster, and avoiding it was one of our main concerns when choosing the solution. First, using Tape Gateway helped us avoid involving another data mover, so we can move data to the tertiary, offline location as soon as it hits our infrastructure, all from within Veeam. Second, Tape Gateway lets us archive virtual tapes straight into S3 Glacier Deep Archive once they are ejected from the virtual tape drive and exported from the virtual tape library. By configuring our jobs appropriately, we ensure that once tenant backups complete, the virtual tapes are ejected and archived into the cost-effective S3 Glacier Deep Archive storage class. This also removes the hands-on time that systems administrators would otherwise spend physically loading, unloading, cataloging, and moving or shipping tapes offsite, which frees up personnel for other IT needs.

One resource that we found useful in our deployment was a Veeam whitepaper. It is essentially a walkthrough of the deployment process for the needed components.

Multi-phase backups deployment

Our deployment was broken up into several phases. First, we determined the necessary components and specifications and deployed them. Next, we automated as much of the process as possible, specifically the creation and importing of virtual tapes. Following that, we configured Tenant-to-Tape backup jobs within Veeam, which use the Tape Gateway virtual tape libraries. After that, we ensured these jobs and the data were secure. Finally, we enabled monitoring using Amazon CloudWatch dashboards so that we have visibility into AWS.

Sizing and deploying Tape Gateway

Using a simple formula provided in the previously mentioned whitepaper, we estimated the storage required for the cache disk and upload buffer. On VMware, deployment was a simple Open Virtualization Format (OVF) import of the Open Virtualization Appliance (OVA) provided by AWS, to which we added disks of the necessary sizes. Once the appliance was deployed in our local VMware environment, we activated Tape Gateway via the AWS Storage Gateway console.

Formula used to determine the upload buffer according to the previously mentioned whitepaper. The cache disk size will always be 1.1 times the size of the upload buffer discovered in the preceding formula.
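The formula image itself is not reproduced in this text, so the following is only a minimal sketch of the sizing arithmetic. It assumes the upload buffer takes the general form given in the AWS Storage Gateway documentation (ingest rate minus effective upload rate, multiplied by the write duration), which may differ in detail from the whitepaper’s version. The function names, parameter names, and example figures are illustrative and are not Webair’s actual values. Only the 1.1 multiplier for the cache disk comes from the text above.

```python
def upload_buffer_gib(app_throughput_mib_s, upload_throughput_mib_s,
                      compression_factor, write_hours):
    """Estimate the upload buffer in GiB.

    Assumed form (not taken from the whitepaper image):
    (application throughput - network throughput * compression factor)
    * duration of writes.
    """
    backlog_rate = app_throughput_mib_s - upload_throughput_mib_s * compression_factor
    # If the uplink keeps pace with ingest, no backlog accumulates.
    backlog_mib = max(backlog_rate, 0.0) * write_hours * 3600
    return backlog_mib / 1024


def cache_disk_gib(upload_buffer):
    """Cache disk is 1.1 times the upload buffer, as stated in the post."""
    return 1.1 * upload_buffer


if __name__ == "__main__":
    # Hypothetical inputs: 40 MiB/s from Veeam, 12 MiB/s uplink,
    # 2:1 compression, 12-hour backup window.
    buffer_gib = upload_buffer_gib(40, 12, 2.0, 12)
    print(f"upload buffer: {buffer_gib:.1f} GiB")
    print(f"cache disk:    {cache_disk_gib(buffer_gib):.1f} GiB")
```

Treat the result as a starting estimate and check it against the gateway’s actual buffer and cache utilization once backups are running.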

Deployment of a Windows VM that acted as Veeam Tape Server

We deployed a virtual Windows Server, and from there, we were able to present Tape Gateway to the Windows OS via iSCSI. This VM was then added to the Veeam backup infrastructure as a tape server.

Automation of tape creation and importing of tapes

No one wants to sit around changing tapes all day, even if they are virtual. A useful feature that AWS recently released for Tape Gateway is automatic tape creation, which enables customers to create tapes automatically when the number of available tapes reaches a certain threshold.
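The threshold behavior can be sketched in a few lines. This is an illustration of the idea only, assuming the gateway tops the pool back up to a configured minimum whenever the number of available tapes drops below it. The function and parameter names are hypothetical and do not correspond to any AWS API.

```python
def tapes_to_create(available_tapes: int, minimum_tapes: int) -> int:
    """Return how many new virtual tapes are needed to get back to the minimum.

    Assumption: the policy keeps at least `minimum_tapes` available and
    creates only the shortfall, never more.
    """
    if available_tapes < 0 or minimum_tapes < 0:
        raise ValueError("tape counts must be non-negative")
    return max(0, minimum_tapes - available_tapes)


if __name__ == "__main__":
    # Hypothetical example: overnight jobs consumed tapes, leaving 3 of a minimum of 5.
    print(tapes_to_create(available_tapes=3, minimum_tapes=5))
```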

We found that combining automatic tape creation with Windows scheduled tasks on our Veeam backup servers ensured that tapes were always available. We authored a PowerShell script that checks whether there are new tapes to import. The script then moves the imported tapes to the free media pool, where the tenant media pools can consume them.

Configuring Tenant-to-Tape backup jobs

Once the components are deployed, creating Tenant-to-Tape jobs works the same as creating any other tape or backup job within Veeam. There is nothing new to learn for the admin responsible for the Veeam environment. This is ideal because large organizations may have a dedicated infrastructure team handle the deployment of these components and then turn the keys over to a backup administrator. The administrator can then simply point and click to add the tape server to the Veeam infrastructure and start creating jobs.

The Tape Gateway virtual tape library shows up in Veeam just like any other physical tape library would, with the virtual drives emulated as IBM tape drives.

Tenant-to-Tape jobs inside of Veeam targeting AWS. Although each job has only one “object,” each job actually sends terabytes of data in total for the corresponding tenant, made up of many jobs and VMs.

Security

We configured each Veeam Tenant-to-Tape job to eject and export the virtual tape once the job finishes. This archives the virtual tape in Amazon S3 Glacier Deep Archive and takes the tape “offline,” which prevents data exposure and protects against threats like ransomware. When the Tenant-to-Tape jobs are created on the Webair side, a unique encryption key is created to further protect customer data. Webair can remount the tapes and recover from them, but doing so requires this second encryption key, providing yet another security layer. If an attacker managed to obtain the encryption key for the primary tenant backup copy jobs, they would still be unable to restore from these virtual tapes, because the second encryption key is held securely on the Webair side.

Veeam Backup and Replication Version 11 includes an Archive Tier feature to tier older data to S3 Glacier Deep Archive. However, we felt that virtual tapes with Tape Gateway better addressed the preceding concern, because the file format, encryption, technology, and cloud storage are entirely different, and because the tapes are stored offline and disconnected from the Veeam infrastructure. This improves on the 3-2-1 backup rule, and could even be thought of as a 4-3-2-1 backup rule.

Monitoring and alarms

Teams need to know if backups aren’t running, and they must be able to see trends, such as data transfer rates, and pinpoint possible issues. Amazon CloudWatch helped us address these needs. We configured alarms that fire if the upload buffer or cache disk utilization exceeds a certain threshold. Each alarm generates an email that lands in the ticketing system for our Network Operations Center (NOC).

This is a powerful addition to our existing Veeam ONE alarms. By combining CloudWatch alarms and Veeam ONE alarms, we can set up very detailed alerts. For instance, we can alert when a Tape Gateway’s buffer disk is close to a utilization threshold, when a Tenant-to-Tape job that is moving terabytes of data is running too long, or when a job has failed.

For monitoring, we used CloudWatch dashboards to set up a number of widgets (shown in the following screenshot). These widgets give administrators in our NOC information about our Tape Gateway infrastructure at a glance.

A CloudWatch Dashboard widget showing us the total TB of uploaded data from each of our Tape Gateways to AWS over the past 3 months.
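As a rough illustration of the buffer and cache alarms described above, the following sketch builds the parameters for a CloudWatch alarm on a Tape Gateway utilization metric. It is not Webair’s actual configuration. The gateway ID, gateway name, SNS topic ARN, 80% threshold, and evaluation settings are all placeholder assumptions, as is the use of an SNS topic with an email subscription to reach the ticketing system. The sketch relies on the `AWS/StorageGateway` namespace and its `UploadBufferPercentUsed` and `CachePercentUsed` metrics.

```python
def build_gateway_alarm(gateway_id, gateway_name, metric_name,
                        threshold_percent, sns_topic_arn):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"{gateway_name}-{metric_name}-high",
        "Namespace": "AWS/StorageGateway",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        "Statistic": "Average",
        "Period": 300,            # 5-minute datapoints (assumed)
        "EvaluationPeriods": 3,   # alarm after 15 minutes above threshold (assumed)
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    # Placeholder identifiers; substitute real values before use.
    for metric in ("UploadBufferPercentUsed", "CachePercentUsed"):
        params = build_gateway_alarm(
            gateway_id="sgw-EXAMPLE",
            gateway_name="tape-gateway-example",
            metric_name=metric,
            threshold_percent=80,
            sns_topic_arn="arn:aws:sns:us-east-1:111122223333:noc-tickets",
        )
        print(params["AlarmName"])
        # To create the alarm (requires boto3 and AWS credentials):
        # import boto3
        # boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameter construction separate from the API call makes it easy to review or template the alarm definitions for many gateways before anything is created.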

Conclusion

Tape Gateway and virtual tape libraries offer a large amount of flexibility for the purposes of backups, giving those who leverage the service a cost-effective, long-term backup solution with incredible monitoring and alerting capabilities. For this particular use case, we found that it allowed Webair to satisfy our customer’s unique needs regarding security and tertiary backups from both a backup requirement and financial standpoint. The ease of deployment and the scalability of Tape Gateway made it the perfect choice for handling our customer’s requirement to have another set of secure offsite backups outside of our data center. It also allowed us to scale up or down at will to meet demand. Not only did these tools help ensure the customer’s requirements were fully met, but they also allowed Webair to streamline the backup process and increase operational efficiencies internally for this particular use case.

Thanks for reading this post. If you have any feedback or questions, please leave them in the comments section.

Gregory Barney

With nearly two decades of experience in the Information Technology space, Greg Barney is the Lead Backups Engineer at Webair. As Lead Backups Engineer, Greg’s primary focus is on research and development of Backups-as-a-Service and Disaster Recovery-as-a-Service offerings.

Sagi Brody

Serving as Chief Technology Officer at Webair for over 20 years, Sagi Brody is responsible for all technical infrastructure, design, and operations for Webair. Under Sagi’s leadership, the company has evolved from a web host into one of the top 100 high-touch, agile cloud and fully managed infrastructure service providers in the world.