A self-funded startup with big aspirations, Hanzo has an endless task of collecting data from the web and storing it in a browsable and searchable archive. With an infinite amount of web data to store, Hanzo researched co-location and virtualization options. But many of these options did not provide the scalability Hanzo needed to achieve its goal of indexing the World Wide Web. When Amazon Web Services (AWS) announced the beta launch of Amazon Elastic Compute Cloud (Amazon EC2) in August 2006, Hanzo got excited.
Hanzo co-founder and CEO Mark Middleton recalls this moment, “We instantly recognized this idea [Amazon EC2] as a totally disruptive concept and we could see a fantastic match for our products.”
Hanzo turned to Amazon Web Services as an all-inclusive, web-scale solution for hosting, data processing, and storage. They currently use Amazon EC2 to host their website, Hanzoweb.com, and to run their busy web crawlers, and they use Amazon Simple Storage Service (Amazon S3) to store the web data they collect, which currently stands at 1TB and is growing fast. Hanzo also uses Amazon Simple Queue Service (Amazon SQS) to transport data and coordinate their web crawlers, transmitting up to 100,000 requests and messages per day to sort and store archiving requests so they are never lost. These requests are processed in Amazon EC2, which helps ensure that no servers sit idle or do duplicate work and also protects the validity of the data.
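The queue-coordinated pattern described above can be illustrated with a minimal sketch. It is not Hanzo's implementation and does not call any AWS API: a standard-library queue stands in for Amazon SQS, a dictionary stands in for Amazon S3, and threads stand in for Amazon EC2 instances. The names `archive_key`, `fetch`, `worker`, and `run_crawl`, and the key naming scheme, are all hypothetical.

```python
import hashlib
import queue
import threading


def archive_key(url: str) -> str:
    """Derive a stable storage key from a URL (hypothetical naming scheme)."""
    return "archive/" + hashlib.sha256(url.encode("utf-8")).hexdigest()


def fetch(url: str) -> bytes:
    """Placeholder for a real crawler fetch; returns fake page content."""
    return f"<html>snapshot of {url}</html>".encode("utf-8")


def worker(requests: queue.Queue, store: dict, lock: threading.Lock) -> None:
    """Pull archiving requests until the queue is empty, then exit."""
    while True:
        try:
            # Each request is handed to exactly one worker, so no two
            # workers archive the same request.
            url = requests.get_nowait()
        except queue.Empty:
            return
        body = fetch(url)
        with lock:
            store[archive_key(url)] = body  # stand-in for an object store write
        requests.task_done()


def run_crawl(urls, worker_count=4):
    """Queue every request, process them with a pool of workers, return the store."""
    requests = queue.Queue()
    for url in urls:
        requests.put(url)
    store = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(requests, store, lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Because workers only ever take work from the shared queue, adding or removing workers changes throughput without changing the logic, which is the property that makes this kind of design a natural fit for on-demand compute.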
Since switching to AWS, Hanzo has seen about a 50% reduction in costs and lower TCO. Middleton says, “Amazon Web Services fits our model perfectly with a minimally intrusive yet powerful utility computing model. Since we switched to Amazon Web Services, we no longer need to think about hardware, racks, routers, firewalls—hardware, what’s that?”
Hanzo’s archiving requests are growing daily, and Amazon Web Services is helping them scale. A single request can be as small as a 10KB image or as large as a 50-page website with links and images, so demand is hard to predict, but Hanzo knows the resources are available when they need them. And with Amazon’s pay-as-you-go pricing, Hanzo does not have to devote large amounts of time and money to building their infrastructure; they simply pay for what they use. Amazon Web Services has enabled Hanzo’s web-scale computing business model: the ability to scale based on demand.
Having achieved their original social mission of providing an archived collection of the web, Hanzo has expanded their mission to sell software and services that allow businesses to archive and version their own web content. “We plan to sell our own products using a utility billing model derived from our underlying EC2/S3 model.” Hanzo is making these services available to the general public, to corporations, and to non-profit organizations like Internet Archive. Corporations are finding value in this archiving software and using the results for business intelligence, commercial advantage, and legal compliance.
Hanzo is based in London and Paris. Currently, they donate their web archive collections to Internet Archive, a non-profit organization dedicated to maintaining an online library of the WWW. Hanzo also provides a white label archiving solution, Forge, and an enterprise solution, Hanzo Enterprise. For more on Hanzo, go to http://www.hanzoarchives.com/ .