Oxford Nanopore Technologies Benchmark Datasets

Provided by: Oxford Nanopore Technologies, part of the AWS Open Data Sponsorship Program

This product is part of the AWS Open Data Sponsorship Program and contains data sets that are publicly available for anyone to access and use. No subscription is required. Unless specifically stated in the applicable data set documentation, data sets available through the AWS Open Data Sponsorship Program are not provided and maintained by AWS.

Description

The ont-open-data registry provides reference sequencing data from Oxford Nanopore Technologies to support: 1) exploration of the characteristics of nanopore sequence data; 2) assessment and reproduction of performance benchmarks; and 3) development of tools and methods. The deposited data showcase DNA sequences from a representative subset of sequencing chemistries. The datasets correspond to publicly available reference samples (e.g. Genome in a Bottle reference cell lines). Raw data are provided with metadata and scripts that describe sample and data provenance.

License

Attribution-NonCommercial 4.0 International (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/

The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.

How to cite

Oxford Nanopore Technologies Benchmark Datasets was accessed on DATE from https://registry.opendata.aws/ont-open-data .

Update frequency

Additional datasets will be added periodically. Updates and amendments will be made to existing entries when algorithmic advancements are made (e.g. improvements to basecalling algorithms).

Support information

Managed by: Oxford Nanopore Technologies

Contact: support@nanoporetech.com

Resources on AWS

Description

Oxford Nanopore Open Datasets

Resource type
S3 Bucket
Amazon Resource Name (ARN)
arn:aws:s3:::ont-open-data
AWS Region
eu-west-1

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://ont-open-data/
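
The ARN shown for a resource maps directly onto the s3:// URI used in the CLI command. The following is a minimal sketch of that mapping, assuming a POSIX-style shell with the AWS CLI installed. The helper names arn_to_s3_uri and ont_ls are illustrative only and are not part of any AWS or Oxford Nanopore tooling.

```shell
# Convert an S3 ARN as printed in this listing (arn:aws:s3:::bucket/prefix)
# into an s3:// URI with a trailing slash. Hypothetical helper name.
arn_to_s3_uri() {
  local arn="$1"
  local path="${arn#arn:aws:s3:::}"   # strip the fixed S3 ARN prefix
  if [ "$path" = "$arn" ]; then       # prefix was absent: not an S3 ARN
    echo "not an S3 ARN: $arn" >&2
    return 1
  fi
  printf 's3://%s/\n' "${path%/}"     # normalise to exactly one trailing slash
}

# List a resource anonymously. Needs the AWS CLI and network access.
# The region is the one stated in this listing (eu-west-1).
ont_ls() {
  local uri
  uri="$(arn_to_s3_uri "$1")" || return 1
  aws s3 ls --no-sign-request --region eu-west-1 "$uri"
}
```

For example, ont_ls arn:aws:s3:::ont-open-data would run the same listing as the command above.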
Description

Nanopore sequencing data of the Genome in a Bottle samples NA24385, NA24149, and NA24143 (HG002-HG004), generated using the LSK114 sequencing chemistry. The direct sequencer output is included: raw signal data stored in .fast5 files and basecalled data in .fastq files. Secondary analyses are also included, notably alignments of the sequence data to the reference genome and variant calls, along with statistics derived from these. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: NA24385, NA24149, and NA24143.

Resource type
S3 Bucket
Amazon Resource Name (ARN)
arn:aws:s3:::ont-open-data/giab_lsk114_2022.12
AWS Region
eu-west-1

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://ont-open-data/giab_lsk114_2022.12/
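
Once a basecalled .fastq file has been downloaded, a quick sanity check is to count its reads and bases. The following is a minimal sketch, assuming standard four-line FASTQ records (sequence and quality not wrapped across lines). The function name fastq_summary is hypothetical, and the file layout inside the bucket is not described here, so no object paths are shown.

```shell
# Summarise a FASTQ stream on stdin: read count, total bases, mean read length.
# Assumes each record is exactly four lines; the sequence is line 2 of each.
fastq_summary() {
  awk 'NR % 4 == 2 { n++; total += length($0) }
       END {
         mean = (n ? total / n : 0)
         printf "reads=%d bases=%d mean_len=%.1f\n", n, total, mean
       }'
}
```

Usage would look like fastq_summary < reads.fastq, or gunzip -c reads.fastq.gz | fastq_summary if the file is gzip-compressed, where reads.fastq is a placeholder for a file copied from the bucket.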
Description

Using nanopore sequencing, researchers have directly identified DNA and RNA base modifications at nucleotide resolution, including 5-methylcytosine, 5-hydroxymethylcytosine, N6-methyladenosine, and 5-bromodeoxyuridine in DNA; and N6-methyladenosine in RNA, with detection of other natural or synthetic epigenetic modifications possible through training basecalling algorithms. One of the most widespread genomic modifications is 5-methylcytosine (5mC), which most frequently occurs at CpG dinucleotides. Compared to whole-genome bisulfite sequencing, the traditional method of 5mC detection, nanopore technology can offer many advantages. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.

Resource type
S3 Bucket
Amazon Resource Name (ARN)
arn:aws:s3:::ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora
AWS Region
eu-west-1

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora/
Description

CpG dinucleotides frequently occur in high-density clusters called CpG islands (CGIs), and more than 60% of human genes have their promoters embedded within CGIs. Determining the methylation status of cytosines within CpGs is of substantial biological interest: alterations in methylation patterns within promoters are associated with changes in gene expression and with disease states such as cancer. Exploring methylation differences between tumour samples and normal samples can help to elucidate mechanisms associated with tumour formation and development. Nanopore sequencing enables direct detection of methylated cytosines (e.g. at CpG sites) without the need for bisulfite conversion. Oxford Nanopore’s Adaptive Sampling offers a flexible method to enrich regions of interest (e.g. CGIs) by depleting off-target regions during the sequencing run itself, with no upfront sample manipulation. Here we introduce Reduced Representation Methylation Sequencing (RRMS) to target 310 Mb of the human genome, covering regions that are highly enriched for CpGs: ~28,000 CpG islands, ~50,600 shores and ~42,700 shelves, as well as ~21,600 promoter regions.

Resource type
S3 Bucket
Amazon Resource Name (ARN)
arn:aws:s3:::ont-open-data/rrms_2022.07
AWS Region
eu-west-1

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://ont-open-data/rrms_2022.07/
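
The CpG density that characterises a CpG island can be quantified for any sequence. The following is a minimal sketch, assuming the commonly used observed/expected CpG ratio, (CpG count × length) / (C count × G count), together with the GC fraction. It is a generic illustration and not the method used to define the RRMS target regions. The function name cpg_stats is hypothetical.

```shell
# Print length, CpG count, GC fraction and observed/expected CpG ratio
# for one bare nucleotide sequence passed as the first argument.
cpg_stats() {
  printf '%s\n' "$1" | awk '{
    s = toupper($0)
    n = length(s)
    # gsub returns the number of matches; replacing a match with itself
    # leaves the sequence unchanged, so it works as a counter here.
    cpg = gsub(/CG/, "CG", s)
    c = gsub(/C/, "C", s)
    g = gsub(/G/, "G", s)
    gc = (n > 0) ? (c + g) / n : 0
    oe = (c > 0 && g > 0) ? cpg * n / (c * g) : 0
    printf "len=%d cpg=%d gc=%.3f oe=%.3f\n", n, cpg, gc, oe
  }'
}
```

For instance, cpg_stats ACGTCGCG reports 3 CpGs in 8 bases, a deliberately CpG-rich toy sequence.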

Usage examples