Genome Aggregation Database (gnomAD)

Sold by: gnomAD Production Team at the Broad Institute

The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. The summary data provided here are released for the benefit of the wider scientific community without restriction on use. The v4.1 data set (GRCh38) spans 730,947 exome sequences and 76,215 whole-genome sequences from unrelated individuals of [diverse ancestries](https://gnomad.broadinstitute.org/stats#diversity), sequenced as part of various disease-specific and population genetic studies. The gnomAD Principal Investigators and team are listed on the [team page](https://gnomad.broadinstitute.org/team), and the groups that have contributed data to the current release are listed on the [about page](https://gnomad.broadinstitute.org/about). To receive updates, sign up for the [gnomAD mailing list](http://broad.io/gnomad_list).


Features and programs

Open Data Sponsorship Program

This dataset is part of the Open Data Sponsorship Program, an AWS program that covers the cost of storage for publicly available high-value cloud-optimized datasets.


Pricing

This is a publicly available data set. No subscription is required.



Usage information


Delivery details

AWS Data Exchange (ADX)

AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.

Open data resources

Available with or without an AWS account.

How to use: To access these resources, reference the Amazon Resource Name (ARN) using the AWS Command Line Interface (CLI).

Description: gnomAD summary data aggregated from large-scale human genome and exome sequencing projects.
Resource type: S3 bucket
Amazon Resource Name (ARN): arn:aws:s3:::gnomad-public-us-east-1
AWS region: us-east-1
AWS CLI access (No AWS account required): aws s3 ls --no-sign-request s3://gnomad-public-us-east-1/
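
For readers who would rather script the listing than use the AWS CLI, the following is a minimal Python sketch of an equivalent unsigned request. It is not an official gnomAD or AWS client. It assumes the bucket allows anonymous `ListObjectsV2` requests over HTTPS (which the `--no-sign-request` command above suggests) and uses only the standard library. The function names `bucket_from_arn`, `build_list_url`, `parse_listing`, and `list_bucket` are hypothetical helpers introduced here for illustration, and pagination beyond the first page of results is deliberately left out.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 ListBucketResult responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def bucket_from_arn(arn):
    """Extract the bucket name from an S3 ARN such as arn:aws:s3:::my-bucket."""
    prefix = "arn:aws:s3:::"
    if not arn.startswith(prefix):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # Anything after the first "/" would be an object key, not the bucket.
    return arn[len(prefix):].split("/", 1)[0]


def build_list_url(bucket, prefix=""):
    """Build an unsigned ListObjectsV2 URL (virtual-hosted style)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "delimiter": "/", "prefix": prefix}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_data):
    """Return (sub-prefixes, object keys) from a ListBucketResult document."""
    root = ET.fromstring(xml_data)
    prefixes = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    keys = [e.text for e in root.findall("s3:Contents/s3:Key", S3_NS)]
    return prefixes, keys


def list_bucket(arn, prefix=""):
    """Fetch and parse the first page of a public bucket listing (needs network)."""
    url = build_list_url(bucket_from_arn(arn), prefix)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_listing(resp.read())
```

Calling `list_bucket("arn:aws:s3:::gnomad-public-us-east-1")` should return the top-level prefixes and keys, roughly what the `aws s3 ls` command above prints. Only the first page of results is returned, so a long listing would need the continuation token handled as well.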

Resources

Vendor resources

View this dataset on GitHub

Support

Contact

gnomad@broadinstitute.org

Managed By

gnomAD Production Team at the Broad Institute

How to cite

Genome Aggregation Database (gnomAD) was accessed on DATE from https://registry.opendata.aws/broad-gnomad.

License

MIT; terms of use

Similar products

Genome Aggregation Database (AWS Lake Formation Test Product)

By Amazon Web Services

Use Genome Aggregation Database (gnomAD) (AWS Lake Formation Test Product) to understand how to interact with data made available via AWS Lake Formation.


OpenMed NER Genome Detection Tiny

By OpenMed

Open-source NER model for gene entities in biomedical and clinical text. Trained on BC2GM and optimized for state-of-the-art precision, it enables reliable extraction with fast, easy deployment via Hugging Face Transformers.


OpenMed NER Genome Detection Medium

By OpenMed


DRAGEN Complete Suite

By Illumina Inc.

The DRAGEN Complete Suite enables ultra-rapid analysis of Next Generation Sequencing (NGS) data for large data sets, such as whole genomes, exomes, and genes/panels.


OpenMed NER Genomic Detection Large

By OpenMed

Open-source NER model for genetics entities in biomedical and clinical text. Trained on GELLUS and optimized for state-of-the-art precision, it enables reliable extraction with fast, easy deployment via Hugging Face Transformers.


Genomics Acceleration, On-prem to the AWS Cloud

By Zuehlke Engineering AG

This offering enables you to shift on-prem genomics data processing to the AWS Cloud without changing existing HPC application code. It provides a scalable compute and storage solution that allows samples to be processed faster and more frequently.
