The YRI Trio Dataset provides complete genome sequence data for three Yoruba individuals from Ibadan, Nigeria. These are the first human genomes sequenced using Illumina's next-generation sequencing-by-synthesis technology. For each genome, the dataset contains >30x average depth of paired 35-base reads.
The dataset can be used for the following applications:
- The development of alignment algorithms
- The development of de novo assembly algorithms
- The development of algorithms that define genetic regions of interest, sequence motifs, structural variants, copy number variations, and site-specific polymorphisms
- Testing the viability of annotation engines that start with raw sequence data
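
To give a rough sense of the data volume implied by >30x average depth of paired 35-base reads, the sketch below applies the usual coverage relation (depth = total sequenced bases / genome size). The genome size of about 3.1 billion bases and the function name `read_pairs_for_depth` are assumptions made for illustration; neither comes from the dataset description, and the result is an estimate, not a count of the reads actually in the dataset.

```python
def read_pairs_for_depth(depth: int, genome_size_bp: int, read_length_bp: int) -> int:
    """Smallest number of read pairs whose total bases reach the given average depth.

    Uses depth = (pairs * 2 * read_length) / genome_size, rounded up.
    """
    bases_needed = depth * genome_size_bp
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division avoids floating-point rounding on large values.
    return -(-bases_needed // bases_per_pair)


# Assumed value: approximate haploid human genome size (not stated in the description).
GENOME_SIZE_BP = 3_100_000_000

pairs = read_pairs_for_depth(depth=30, genome_size_bp=GENOME_SIZE_BP, read_length_bp=35)
print(f"{pairs:,} read pairs per genome at 30x")
```

Under these assumptions, each genome corresponds to roughly 1.3 billion read pairs, or about 93 billion sequenced bases, which is worth keeping in mind when sizing storage and compute for alignment or assembly experiments.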