AWS Partner Network (APN) Blog

Cloud Deduplication, On-Demand: StorReduce, an APN Technology Partner

Develop. Disrupt. Repeat.

Our goal is to provide our APN Partners with the services, support, and resources they need to provide their end customers with innovative value-added services and solutions on the AWS platform. We love hearing stories about the unique products our APN Technology Partners have developed that integrate with the AWS platform, and today we’re going to tell you about one such product from APN Technology Partner StorReduce.

Our Partner SA team has worked closely with StorReduce, and below we discuss why the StorReduce team chose to work with AWS. We then discuss the company’s success in working with AWS Customer and fellow APN Technology Partner SpectrumData.

Who is StorReduce?

StorReduce helps enterprises that store unstructured data in Amazon Simple Storage Service (Amazon S3) or Amazon Glacier on AWS reduce the amount and cost of their storage by 50 to 95 percent. It also offers enterprises a new and more efficient way to migrate backup appliance data and large tape archives to AWS.

StorReduce’s deduplication software runs as an instance in the cloud or as a virtual machine in a datacenter and scales to petabytes of data. It removes redundant blocks of data before they are stored, so that only one copy of each block is kept. StorReduce provides throughput of up to 600 MB/s for both reads and writes and adds around 10 ms of latency on retrieval. It is suitable for deduplicating most data workloads, including backup, archive, data from mobile and wearable devices where the data is copied, and general unstructured file data.
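To make the idea of storing only one copy of each block concrete, here is a minimal sketch of generic block-level deduplication. It is an illustration only and not StorReduce’s actual algorithm, which this post does not describe. The fixed 4 KB block size, the use of SHA-256 digests, and the `DedupStore` class are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


class DedupStore:
    """Toy in-memory store that keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}   # SHA-256 digest -> block bytes (stored once)
        self.objects = {}  # object key -> ordered list of block digests

    def put(self, key, data):
        """Split data into blocks and store only blocks not seen before."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if already stored
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        """Rebuild the original data from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.objects[key])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


# Two hypothetical nightly backups that share most of their content.
backup_monday = b"A" * 8192 + b"B" * 4096
backup_tuesday = b"A" * 8192 + b"C" * 4096

store = DedupStore()
store.put("monday", backup_monday)
store.put("tuesday", backup_tuesday)

logical_bytes = len(backup_monday) + len(backup_tuesday)
print(f"logical: {logical_bytes} bytes, stored: {store.stored_bytes()} bytes")
```

In this toy case the two backups hold 24,576 logical bytes but only three unique blocks (12,288 bytes) are stored, and each backup can still be read back in full.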

StorReduce has an Amazon S3 interface, so that any data it deduplicates can seamlessly be used by AWS services such as Amazon Elastic MapReduce (Amazon EMR) for data mining, and Amazon CloudSearch.

See the diagram below to get an idea for how StorReduce works:

StorReduce and AWS

StorReduce chose to work with AWS because of AWS’s extensive range of enterprise cloud services. For instance, storage services like Amazon S3 and Amazon Glacier, and the ecosystem of tools and services that integrate with them, are important for the enterprise workloads with which StorReduce works. The global AWS footprint was another important factor for StorReduce in working with AWS, along with AWS’s commitment to reduce the cost of cloud for our customers.

For the StorReduce team, AWS is a natural choice for enterprises migrating to a public or hybrid cloud environment and for high growth companies born on the cloud. StorReduce chose the Amazon S3 compatible interface because it offers a simple integration point for its customers. The Amazon S3 compatible interface allows any application that communicates with Amazon S3 to take advantage of StorReduce for deduplication without modification. This includes third party products that copy data to and from Amazon S3, as well as AWS services such as Amazon EMR and Amazon CloudSearch.
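As a rough sketch of what "without modification" can look like in practice, an application that already uses an S3 client would typically only need its endpoint changed. The example below uses the `endpoint_url` parameter of `boto3.client`, which is a real boto3 option. The endpoint address, bucket name, and helper functions are hypothetical, and this post does not state how a StorReduce endpoint is actually addressed or authenticated.

```python
# Hypothetical address of a StorReduce instance; not a real endpoint.
STORREDUCE_ENDPOINT = "https://storreduce.example.internal"


def s3_client_kwargs(endpoint_url):
    """Build the keyword arguments that point an S3 client at a custom endpoint."""
    return {"service_name": "s3", "endpoint_url": endpoint_url}


def make_client(endpoint_url):
    """Create a boto3 S3 client that talks to the given S3-compatible endpoint."""
    import boto3  # imported lazily so the helper above has no dependencies
    return boto3.client(**s3_client_kwargs(endpoint_url))


def upload_backup(client, path, bucket, key):
    """Upload a local file exactly as the application would to Amazon S3."""
    client.upload_file(path, bucket, key)
```

A caller would then run something like `upload_backup(make_client(STORREDUCE_ENDPOINT), "backup.tar", "example-bucket", "backups/backup.tar")`. The upload code is the same as it would be against Amazon S3 directly; only the endpoint differs.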

Who is SpectrumData?

SpectrumData, headquartered in Australia, operates globally and provides migration services for companies to move legacy backup data into the AWS Cloud as either restored data sets or as virtual tapes. The company is highly experienced in all aspects of data migration, in particular the restoration, migration, and preservation of digital assets from legacy media and backup formats and from redundant, outdated tape and recording technologies.

Deduplication to the Cloud – The Challenge

SpectrumData needed to migrate its clients’ petabyte-scale tape archives (tens of thousands of tapes) to Amazon S3 and Amazon Glacier. To reduce the cost of storage and the bandwidth required to transfer the tape data to AWS, SpectrumData chose to deduplicate the data. Tape archives generally contain multiple copies of the same data sets, which deduplication can reduce to a single copy. This has the potential to reduce the amount of data stored to between one-half and one-twentieth of its original size.
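The arithmetic behind that range is straightforward: a 50 percent reduction leaves one-half of the data, and a 95 percent reduction leaves one-twentieth. The short sketch below works this through for a hypothetical 1 PB (1,000 TB) archive. The archive size and the function name are assumptions for illustration, not figures from SpectrumData.

```python
def stored_size_tb(logical_tb, reduction_percent):
    """Return the size left after removing reduction_percent of the data."""
    return logical_tb * (100 - reduction_percent) / 100


archive_tb = 1000  # hypothetical 1 PB tape archive

for reduction in (50, 95):
    remaining = stored_size_tb(archive_tb, reduction)
    print(f"{reduction}% reduction: {remaining} TB stored and transferred")
```

Because deduplication in this scenario happens before upload, the same reduction applies to the bandwidth needed for the transfer as well as to the storage that is billed afterwards.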

According to Guy Holmes, Director of SpectrumData, “It is difficult to migrate large tape archives to the cloud using existing on-premises deduplication offerings because they do not scale.  We can only put four tapes at a time through their hardware before we start to see a bottleneck forming. In order to upload large tape archives to the cloud in weeks not years, we need to put hundreds of tapes at a time through the hardware 24 hours per day.”

Why StorReduce

For tape migration, StorReduce’s software can be installed on-premises for a CAPEX-free, very fast migration of an enterprise’s large tape archives and backup appliance data onto the AWS Cloud. Installing StorReduce on-premises minimizes bandwidth during the transfer. See below:

After the transfer is completed, the on-premises StorReduce software can be removed and re-instated in the cloud:

The Benefits of Working with StorReduce and AWS

For SpectrumData, the global footprint of AWS made working with AWS a natural choice. The AWS footprint allows SpectrumData to store data in close proximity to its customers no matter where they are in the world. This improves performance by reducing latency and allows SpectrumData and its customers to comply with data sovereignty laws. Another reason the company decided to work with AWS is the pay-as-you-go pricing model embraced by AWS. SpectrumData pays for exactly the resources it uses, and there’s no need to estimate capacity or to make an upfront investment.

After AWS introduced SpectrumData to StorReduce, Holmes believed that StorReduce could overcome the challenges SpectrumData faced with its on-premises deduplication hardware.

SpectrumData conducted a proof of concept with StorReduce, running the same tests on the same data that it had previously run with a leading global deduplication hardware vendor. Holmes confirmed, “We’re delighted with StorReduce’s performance. The software deduplicates 24/7 and is more scalable than the hardware appliances we tested. These factors help us to achieve the necessary throughput for our clients. It also showed deduplication ratios trending to over 95 percent, which is equal to the leading global deduplication offerings we have tested.”

StorReduce enables SpectrumData to migrate large tape archives to AWS far more efficiently than the hardware appliances that were tested, reducing years of work to weeks.

Additional benefits:

  • StorReduce can reduce or remove CAPEX that would otherwise need to be spent on deduplication hardware.
  • With StorReduce, once the tape data has been migrated to the cloud, it is seamlessly accessible through the Amazon S3 API. Existing AWS services such as Amazon CloudSearch and Amazon EMR can therefore easily access that data. This configuration is challenging with on-premises deduplication offerings.
  • As the client’s data grows, StorReduce can quickly scale to meet their needs with no need to buy additional hardware.

Holmes concludes, “Working with StorReduce and AWS makes my business work.”

To learn more about how AWS can help with your storage and backup needs, visit our Storage and Backup details page.

Try StorReduce on AWS Marketplace now with one click to see how much you could save.  To learn more about how StorReduce can migrate your tape archive or backup appliance data to the AWS Cloud, click here.