    Oxford Nanopore Technologies Benchmark Datasets

    Open data | Deployed on AWS
    Overview

    The ont-open-data registry provides reference sequencing data from Oxford Nanopore Technologies to support: 1) exploration of the characteristics of nanopore sequence data; 2) assessment and reproduction of performance benchmarks; and 3) development of tools and methods. The deposited data showcase DNA sequences from a representative subset of sequencing chemistries. The datasets correspond to publicly available reference samples (e.g. Genome in a Bottle reference cell lines). Raw data are provided with metadata and scripts to describe sample and data provenance.

    Features and programs

    Open Data Sponsorship Program

    This dataset is part of the Open Data Sponsorship Program, an AWS program that covers the cost of storage for publicly available, high-value, cloud-optimized datasets.

    Pricing

    This is a publicly available data set. No subscription is required.

    Legal

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    AWS Data Exchange (ADX)

    AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.

    Open data resources

    Available with or without an AWS account.

    How to use
    To access these resources, reference the Amazon Resource Name (ARN) using the AWS Command Line Interface (CLI).
    Description
    Oxford Nanopore Open Datasets
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data
    AWS region
    eu-west-1
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://ont-open-data/
    Description
    Nanopore sequencing data of the Genome in a Bottle samples NA24385, NA24149, and NA24143 (HG002-HG004) using the LSK114 sequencing chemistry. The direct sequencer output is included: raw signal data stored in .fast5 files and basecalled data in .fastq files. Additional secondary analyses are also included, notably alignments of sequence data to the reference genome and variant calls, along with statistics derived from these. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: NA24385, NA24149, and NA24143.
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data/giab_lsk114_2022.12
    AWS region
    eu-west-1
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://ont-open-data/giab_lsk114_2022.12/
    Description
    Using nanopore sequencing, researchers have directly identified DNA and RNA base modifications at nucleotide resolution, including 5-methylcytosine, 5-hydroxymethylcytosine, N6-methyladenosine, and 5-bromodeoxyuridine in DNA; and N6-methyladenosine in RNA. Detection of other natural or synthetic epigenetic modifications is possible through training basecalling algorithms. One of the most widespread genomic modifications is 5-methylcytosine (5mC), which most frequently occurs at CpG dinucleotides. Compared to whole-genome bisulfite sequencing, the traditional method of 5mC detection, nanopore technology can offer many advantages. The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora
    AWS region
    eu-west-1
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://ont-open-data/gm24385_mod_2021.09/extra_analysis/bonito_remora/
    Description
    CpG dinucleotides frequently occur in high-density clusters called CpG islands (CGIs), and >60% of human genes have their promoters embedded within CGIs. Determining the methylation status of cytosines within CpGs is of substantial biological interest: alterations in methylation patterns within promoters are associated with changes in gene expression and with disease states such as cancer. Exploring methylation differences between tumour samples and normal samples can help to elucidate mechanisms associated with tumour formation and development. Nanopore sequencing enables direct detection of methylated cytosines (e.g. at CpG sites) without the need for bisulfite conversion. Oxford Nanopore's Adaptive Sampling offers a flexible method to enrich regions of interest (e.g. CGIs) by depleting off-target regions during the sequencing run itself, with no upfront sample manipulation. Here we introduce Reduced Representation Methylation Sequencing (RRMS) to target 310 Mb of the human genome, covering regions that are highly enriched for CpGs: ~28,000 CpG islands, ~50,600 shores, and ~42,700 shelves, as well as ~21,600 promoter regions.
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::ont-open-data/rrms_2022.07
    AWS region
    eu-west-1
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://ont-open-data/rrms_2022.07/
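
    The ARNs and CLI commands listed above follow one pattern: dropping the arn:aws:s3::: prefix from an ARN leaves the bucket and key prefix used in the s3:// URI. The sketch below illustrates that mapping. The helper name arn_to_s3_uri is hypothetical (it is not part of the AWS CLI), and the script only prints the listing command rather than running it, so it needs no network access.

```shell
#!/bin/sh
# Sketch only: arn_to_s3_uri is a hypothetical helper, not an AWS CLI feature.
# It converts an S3 ARN from the listing into the s3:// URI form used by the CLI.
arn_to_s3_uri() {
    case "$1" in
        arn:aws:s3:::?*)
            # Strip the fixed ARN prefix; what remains is "bucket" or "bucket/prefix".
            path=${1#arn:aws:s3:::}
            # Drop one trailing slash, if present, so the URI ends in exactly one.
            path=${path%/}
            printf 's3://%s/\n' "$path"
            ;;
        *)
            echo "not an S3 ARN: $1" >&2
            return 1
            ;;
    esac
}

# Build (but do not run) the anonymous listing command for one resource.
uri=$(arn_to_s3_uri "arn:aws:s3:::ont-open-data/giab_lsk114_2022.12")
echo "aws s3 ls --no-sign-request $uri"
```

    To copy data locally instead of listing it, the same URI can be passed to aws s3 sync together with --no-sign-request and a destination directory. Because these datasets may be large, adding --dryrun first to preview what would be transferred is a reasonable precaution.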

    Resources

    Support

    Managed By

    Oxford Nanopore Technologies

    How to cite

    Oxford Nanopore Technologies Benchmark Datasets was accessed on DATE from https://registry.opendata.aws/ont-open-data.

    License

    Attribution-NonCommercial 4.0 International (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/

    The following cell lines/DNA samples were obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research: GM24385.
