• Amazon Web Services provides links to these packages as a convenience for our customers, but they have not been reviewed or screened by AWS.
  • Please review this software to ensure it meets your needs before using it.


Common Crawl Example Library + Hadoop 0.20.205 + Amazon Elastic MR Launch Script

Submitted by:  Lisa Green
Provider:  Community
OS:  Amazon Linux
License:  Public
AMI Manifest:  837454214164/Common Crawl Quick Start AMI
Root Device Type:  ebs
Architecture:  x86_64
Listed on:  Jul 27, 2012 17:48 GMT
Updated on:  Jul 27, 2012 17:48 GMT


This AMI lets you get started quickly with the Common Crawl Amazon Public Data Set. It includes Common Crawl sample code, a local Hadoop 0.20.205 instance, and a script for launching Common Crawl analysis jobs on Amazon Elastic MapReduce. Instructions are printed to the console as soon as you log in.

Read more about the Common Crawl data set at commoncrawl.org.