AWS Blog

Cloud MapReduce from Accenture

Accenture is a Global Solution Provider for AWS. As part of their plan to help their clients extend their IT provisioning capabilities into the cloud, they offer a complete Cloud Computing Suite including the Accenture Cloud Computing Accelerator, the Cloud Computing Assessment Tool, the Cloud Computing Data Processing Solution, and the Accenture Web Scaler.

Huan Liu and Dan Orban of Accenture Technology Labs sent me some information about one of their projects, Cloud MapReduce. Cloud MapReduce implements Google’s MapReduce programming model using Amazon EC2, S3, SQS, and SimpleDB as a cloud operating system.

According to the research report on Cloud MapReduce, the resulting system runs at up to 60 times the speed of Hadoop (this depends on the application and the data, of course). There’s no master node, so there’s no single point of failure and no processing bottleneck. Because it takes advantage of high-level constructs in the cloud for data (S3) and state (SimpleDB) storage, along with EC2 for processing and SQS for message queuing, the implementation is two orders of magnitude simpler than Hadoop. The research report includes details on the use of each service; they’ve also published some good info about the code architecture.
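To make the master-less idea a bit more concrete, here is a minimal sketch in Python of a word count where workers pull tasks from queues instead of being assigned work by a master. It is not the Cloud MapReduce code and it doesn't call any AWS API: the dictionaries and `queue.Queue` objects are in-memory stand-ins for S3, SimpleDB, and SQS, and every name in it (`input_store`, `map_queue`, `reduce_queue`, `status`, `map_worker`, `reduce_worker`, `run_job`) is made up for illustration.

```python
import queue
from collections import defaultdict

# In-memory stand-ins, assumed for illustration only (not the AWS APIs).
input_store = {                      # plays the role of S3: input splits
    "split-0": "the quick brown fox",
    "split-1": "the lazy dog the end",
}
map_queue = queue.Queue()            # plays the role of an SQS queue of map tasks
reduce_queue = queue.Queue()         # plays the role of an SQS queue of intermediate pairs
status = {}                          # plays the role of SimpleDB: per-task state


def map_fn(text):
    """User-defined map: emit (word, 1) for every word in a split."""
    for word in text.split():
        yield word, 1


def reduce_fn(word, counts):
    """User-defined reduce: sum the counts for one word."""
    return word, sum(counts)


def map_worker():
    """Pull map tasks until the queue is empty; nothing assigns work centrally."""
    while True:
        try:
            key = map_queue.get_nowait()
        except queue.Empty:
            return
        for pair in map_fn(input_store[key]):
            reduce_queue.put(pair)
        status[key] = "done"         # record completion in the shared state store


def reduce_worker():
    """Drain the intermediate queue, group by key, and apply the reduce function."""
    grouped = defaultdict(list)
    while True:
        try:
            word, count = reduce_queue.get_nowait()
        except queue.Empty:
            break
        grouped[word].append(count)
    return dict(reduce_fn(word, counts) for word, counts in grouped.items())


def run_job():
    """Enqueue one map task per input split, then run the two phases in order."""
    for key in input_store:
        map_queue.put(key)
    map_worker()
    return reduce_worker()


if __name__ == "__main__":
    print(run_job())
```

In a real deployment many workers on separate EC2 instances would poll the queues concurrently, and the system would have to deal with duplicate or lost messages; this single-process sketch leaves all of that out.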

Download the code, read the tutorial, and give it a shot!