Today, Etsy uses Amazon Elastic MapReduce for web log analysis and recommendation algorithms. Because AWS easily and economically processes enormous amounts of data, it’s ideal for the type of processing that Etsy performs. Etsy copies its HTTP server logs every hour to Amazon S3, and syncs snapshots of the production database on a nightly basis. The combination of Amazon’s products and Etsy’s syncing/storage operation provides substantial benefits for Etsy. As Dr. Jason Davis, lead scientist at Etsy, explains, “the computing power available with [Amazon Elastic MapReduce] allows us to run these operations over dozens or even hundreds of machines without the need for owning the hardware.”
Dr. Davis goes on to say, “Amazon Elastic MapReduce enables us to focus on developing our Hadoop-based analysis stack without worrying about the underlying infrastructure. As our cycles shift between development and research, our software and analysis requirements change and expand constantly, and [Amazon Elastic MapReduce] effectively eliminates half of our scaling issues, allowing us to focus on what is most important.”
Etsy has realized improved results and performance by architecting their application for the cloud, with robustness and fault tolerance in mind, while providing a market for users to buy and sell handmade items online.
To learn more, visit http://www.etsy.com/ .