
Overview
The ont-open-data registry provides reference sequencing data from Oxford Nanopore Technologies to support: 1) exploration of the characteristics of nanopore sequence data; 2) assessment and reproduction of performance benchmarks; and 3) development of tools and methods. The deposited data showcase DNA sequences from a representative subset of sequencing chemistries. The datasets correspond to publicly available reference samples (e.g. Genome in a Bottle reference cell lines). Raw data are provided with metadata and scripts to describe sample and data provenance.
Features and programs
Open Data Sponsorship Program
Pricing
This is a publicly available data set. No subscription is required.
Legal
Content disclaimer
Delivery details
AWS Data Exchange (ADX)
AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.
Open data resources
Available with or without an AWS account.
- How to use
- To access these resources, reference the Amazon Resource Name (ARN) using the AWS Command Line Interface (CLI).
- Description
- Oxford Nanopore Open Datasets
- Resource type
- S3 bucket
- Amazon Resource Name (ARN)
- arn:aws:s3:::ont-open-data
- AWS region
- eu-west-1
- AWS CLI access (No AWS account required)
- aws s3 ls --no-sign-request s3://ont-open-data/
- Description
- Nanopore sequencing data of the Genome in a Bottle samples NA24385, NA24149, and NA24143 (HG002-HG004) using the LSK114 sequencing chemistry. The direct sequencer output is included: raw signal data stored in .fast5 files and basecalled data in .fastq files. Secondary analyses are also included, notably alignments of the sequence data to the reference genome and variant calls, along with statistics derived from these. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: NA24385, NA24149, and NA24143.
- Resource type
- S3 bucket
- Amazon Resource Name (ARN)
- arn:aws:s3:::ont-open-data/giab_lsk114_2022.12
- AWS region
- eu-west-1
- AWS CLI access (No AWS account required)
- aws s3 ls --no-sign-request s3://ont-open-data/giab_lsk114_2022.12/
- Description
- Using nanopore sequencing, researchers have directly identified DNA and RNA base modifications at nucleotide resolution, including 5-methylcytosine, 5-hydroxymethylcytosine, N6-methyladenosine, and 5-bromodeoxyuridine in DNA, and N6-methyladenosine in RNA. Other natural or synthetic epigenetic modifications can potentially be detected by training basecalling algorithms. One of the most widespread genomic modifications is 5-methylcytosine (5mC), which most frequently occurs at CpG dinucleotides. Compared to whole-genome bisulfite sequencing, the traditional method of 5mC detection, nanopore technology can offer many advantages. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.
- Resource type
- S3 bucket
- Amazon Resource Name (ARN)
- arn:aws:s3:::ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora
- AWS region
- eu-west-1
- AWS CLI access (No AWS account required)
- aws s3 ls --no-sign-request s3://ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora/
- Description
- CpG dinucleotides frequently occur in high-density clusters called CpG islands (CGIs), and >60% of human genes have their promoters embedded within CGIs. Determining the methylation status of cytosines within CpGs is of substantial biological interest: alterations in methylation patterns within promoters are associated with changes in gene expression and with disease states such as cancer. Exploring methylation differences between tumour samples and normal samples can help to elucidate mechanisms associated with tumour formation and development. Nanopore sequencing enables direct detection of methylated cytosines (e.g. at CpG sites) without the need for bisulfite conversion. Oxford Nanopore’s Adaptive Sampling offers a flexible method to enrich regions of interest (e.g. CGIs) by depleting off-target regions during the sequencing run itself, with no upfront sample manipulation. Here we introduce Reduced Representation Methylation Sequencing (RRMS) to target 310 Mb of the human genome, including regions that are highly enriched for CpGs: ~28,000 CpG islands, ~50,600 shores and ~42,700 shelves, as well as ~21,600 promoter regions.
- Resource type
- S3 bucket
- Amazon Resource Name (ARN)
- arn:aws:s3:::ont-open-data/rrms_2022.07
- AWS region
- eu-west-1
- AWS CLI access (No AWS account required)
- aws s3 ls --no-sign-request s3://ont-open-data/rrms_2022.07/
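Each resource above pairs an ARN with an equivalent `aws s3 ls --no-sign-request` command. The relationship between the two can be sketched in Python as follows. This is a minimal illustration, not part of the listing: the helper names `parse_s3_arn`, `s3_uri` and `ls_command` are hypothetical, and the code only builds the command as an argument list (it does not call AWS or check that the prefix exists).

```python
from typing import List, Tuple

S3_ARN_PREFIX = "arn:aws:s3:::"  # S3 ARNs carry no region or account fields


def parse_s3_arn(arn: str) -> Tuple[str, str]:
    """Split an S3 ARN into (bucket, key_prefix); key_prefix may be empty."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    bucket, _, key_prefix = arn[len(S3_ARN_PREFIX):].partition("/")
    if not bucket:
        raise ValueError(f"ARN has no bucket name: {arn!r}")
    return bucket, key_prefix


def s3_uri(arn: str) -> str:
    """Return the s3:// URI for an ARN, with a trailing slash for listing."""
    bucket, key_prefix = parse_s3_arn(arn)
    uri = f"s3://{bucket}/"
    if key_prefix:
        uri += key_prefix.rstrip("/") + "/"
    return uri


def ls_command(arn: str) -> List[str]:
    """Build the unsigned (no AWS account) listing command as an argv list."""
    return ["aws", "s3", "ls", "--no-sign-request", s3_uri(arn)]


if __name__ == "__main__":
    # Prints: aws s3 ls --no-sign-request s3://ont-open-data/rrms_2022.07/
    print(" ".join(ls_command("arn:aws:s3:::ont-open-data/rrms_2022.07")))
```

The returned list can be passed to `subprocess.run` on a machine where the AWS CLI is installed. The `--no-sign-request` flag is what allows access without AWS credentials.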
Resources
Vendor resources
Support
Contact
Managed By
Oxford Nanopore Technologies
How to cite
Oxford Nanopore Technologies Benchmark Datasets was accessed on DATE from https://registry.opendata.aws/ont-open-data .
License
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/ The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.



