1000 Genomes Project

The 1000 Genomes Project, initiated in 2008, is an international public-private consortium that aims to build the most detailed map of human genetic variation available.


Size: 200 TB
Source: National Center for Biotechnology Information (NCBI)
Created On: October 17, 2010 9:59 PM GMT
Last Updated: July 18, 2012 4:34 PM GMT
The new full data set for the 1000 Genomes Project is now available on S3 at http://s3.amazonaws.com/1000genomes

The 1000 Genomes Project aims to build the most detailed map of human genetic variation, ultimately with data from the genomes of over 2,600 people from 26 populations around the world. The data in this release include results from sequencing the DNA of approximately the first 1,700 of those people; the remaining samples are expected to be sequenced in 2012, and the data will be released to researchers as soon as possible. The data presented here, over 200 TB, are intended for analysis on Amazon EC2 or Elastic MapReduce rather than for download.
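Because the data set is meant to be analyzed in place rather than downloaded, a sensible first step is to browse the bucket's contents without fetching any files. The following is a minimal sketch, not an official client. It assumes the bucket at the URL above permits anonymous listing through the standard S3 REST interface and that path-style addressing is accepted; the function names are illustrative only.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Bucket URL taken from this listing; anonymous list access is an assumption.
BUCKET_URL = "http://s3.amazonaws.com/1000genomes"
# XML namespace used by S3 ListBucketResult responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(prefix="", max_keys=20):
    """Build an S3 list request for one 'directory' level of the bucket."""
    query = urllib.parse.urlencode(
        {"prefix": prefix, "delimiter": "/", "max-keys": max_keys}
    )
    return f"{BUCKET_URL}?{query}"


def parse_listing(xml_text):
    """Return ([(key, size_in_bytes), ...], [sub_prefix, ...]) from a listing."""
    root = ET.fromstring(xml_text)
    objects = [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]
    prefixes = [
        item.findtext("s3:Prefix", namespaces=S3_NS)
        for item in root.findall("s3:CommonPrefixes", S3_NS)
    ]
    return objects, prefixes


def list_bucket(prefix="", max_keys=20):
    """Fetch and parse one page of the listing (performs a network request)."""
    with urllib.request.urlopen(build_list_url(prefix, max_keys), timeout=30) as resp:
        return parse_listing(resp.read())


if __name__ == "__main__":
    objects, prefixes = list_bucket()
    for sub_prefix in prefixes:
        print("DIR ", sub_prefix)
    for key, size in objects:
        print(f"{size:>15,d}  {key}")
```

Running the script prints the top-level prefixes and objects with their sizes. Passing one of the returned prefixes back to list_bucket() descends one level, so the layout can be explored from an EC2 instance before deciding which files an analysis actually needs.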

You can find more information about working with 1000 Genomes data on AWS at http://aws.amazon.com/1000genomes

