Accelerate genomics discoveries
With AWS, genomics customers can dedicate more time and resources to science, speeding time to insights, achieving breakthrough research faster, and bringing lifesaving products to market.
AWS enables customers to innovate by making genomics data more accessible and useful. AWS delivers the breadth and depth of services to reduce the time between sequencing and interpretation, with secure and frictionless collaboration capabilities across multi-modal datasets. Plus, you can choose the right tool for the job to get the best cost and performance at a global scale, accelerating the modern study of genomics.
Benefits
Accelerate time to discovery
Powerful compute and machine learning options ensure scientists can execute workloads fast and with control. AWS offers the broadest selection of compute services—more than any other cloud provider—and only AWS offers compute instances that deliver 100 Gbps of networking throughput.
Keep costs low and performance high
The flexible pricing and on-demand nature of cloud computing allow researchers to tackle complex genomics projects without having to pay for idle infrastructure or scramble to add cores during spiky workloads. AWS provides pay-as-you-go pricing and virtually unlimited compute capacity.
Secure collaboration on a global scale
The global footprint of AWS Regions and the AWS network matches the global nature of science, with security and access controls that allow genomics researchers to manage data sharing. AWS is the most secure, extensive, and reliable global cloud infrastructure. The AWS Open Data Program hosts 40+ openly available life sciences and genomics datasets to enable frictionless collaboration, providing research and clinical communities with a single documented source of truth.
AWS Partners in Genomics
AWS offers the largest network of Partners, along with flexible workflow choice, and options for fully managed solutions to help you get to genomics insights faster.
Use cases
Data transfer and storage
The volume of genomics data makes it challenging to transfer data from sequencers quickly and in a controlled fashion, and then to find storage that can accommodate the required scale and performance at a price that is not cost-prohibitive. AWS enables researchers to manage large-scale data that has outpaced the capacity of on-premises infrastructure. By transferring data to the AWS Cloud, organizations can take advantage of high-throughput data ingestion, cost-effective storage options, secure access, and efficient searching to propel genomics research forward.
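To get a rough feel for why transfer throughput matters at this scale, the following minimal sketch estimates how long a dataset takes to move over links of different speeds. The dataset size, the link speeds, and the 80% efficiency factor are illustrative assumptions, not measured AWS figures, and the function name is hypothetical.

```python
def transfer_time_hours(dataset_terabytes: float,
                        link_gbps: float,
                        efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move a dataset over a network link.

    dataset_terabytes: dataset size in decimal terabytes (1 TB = 1e12 bytes).
    link_gbps: nominal link speed in gigabits per second.
    efficiency: assumed fraction of nominal bandwidth actually achieved
                (an illustrative placeholder, not a measured value).
    """
    if dataset_terabytes < 0:
        raise ValueError("dataset_terabytes must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    total_bits = dataset_terabytes * 1e12 * 8        # bytes -> bits
    effective_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / effective_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical 100 TB sequencing archive over three example link speeds.
    for gbps in (1, 10, 100):
        hours = transfer_time_hours(100, gbps)
        print(f"{gbps:>3} Gbps link: about {hours:.1f} hours for 100 TB")
```

The estimate ignores protocol overhead beyond the single efficiency factor, so treat it as a planning aid rather than a prediction.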
Related resources
Blog: Using Amazon FSx for Lustre for Genomics Workflows on AWS
Read the blog »
Blog: Genomic data compression storage and access
Read the blog »
Video: Building genomics data workflows with AWS Storage Gateway
Watch the video »
Workflow automation for secondary analysis
Genomics organizations can struggle to track the origins of data when performing secondary analyses, and to run reproducible, scalable workflows while minimizing IT overhead. AWS offers services for scalable, cost-effective data analysis and simplified orchestration for running and automating parallelizable workflows. Options for automating workflows enable reproducible research or clinical applications, while AWS native, partner (NVIDIA and Illumina DRAGEN), and open source (Cromwell and Nextflow) solutions provide flexible options for workflow orchestrators to help scale data analysis.
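One way to picture a parallelizable per-sample step is as an AWS Batch array job, where each child job handles one sample. The sketch below only builds the request parameters and shows how a child job could pick its sample; it does not call AWS. The queue name, job definition name, manifest location, and the `MANIFEST_URI` environment variable are hypothetical placeholders, and orchestrators such as Nextflow or Cromwell would normally generate requests like this for you.

```python
from typing import Dict, List


def build_array_job_request(sample_ids: List[str],
                            job_queue: str,
                            job_definition: str,
                            run_name: str,
                            manifest_uri: str) -> Dict:
    """Build keyword arguments for an AWS Batch SubmitJob call (array job).

    One child job is created per sample. AWS Batch array jobs accept a
    size between 2 and 10,000.
    """
    n = len(sample_ids)
    if not 2 <= n <= 10_000:
        raise ValueError("array jobs need between 2 and 10,000 children")
    return {
        "jobName": run_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": n},
        "containerOverrides": {
            # Hypothetical variable: tells each child where the sample list is.
            "environment": [{"name": "MANIFEST_URI", "value": manifest_uri}],
        },
    }


def pick_sample(manifest_lines: List[str], array_index: int) -> str:
    """Return the sample a child job should process.

    Inside a child job, AWS Batch sets AWS_BATCH_JOB_ARRAY_INDEX (0-based);
    the child would read it from the environment and pass it here.
    """
    return manifest_lines[array_index].strip()


if __name__ == "__main__":
    samples = ["sample-A", "sample-B", "sample-C"]  # hypothetical sample IDs
    request = build_array_job_request(
        samples,
        job_queue="hypothetical-genomics-queue",
        job_definition="hypothetical-alignment-jobdef",
        run_name="alignment-run-1",
        manifest_uri="s3://hypothetical-bucket/manifests/run-1.txt",
    )
    print(request)
    # With credentials configured, the request could be submitted with:
    #   boto3.client("batch").submit_job(**request)
    print(pick_sample(samples, 1))
```

Keeping the sample list in a manifest rather than in the job parameters means the same job definition can be reused for every run, which helps with reproducibility.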
Related resources
Genomics Secondary Analysis Solution
Read more »
Nextflow on AWS Quick Start
Read the guide »
Using Cromwell with AWS Batch
Read the blog »
Illumina DRAGEN on AWS
Read the guide »
Data aggregation and governance
Successful genomics research and interpretation often depend on multiple, diverse, multi-modal datasets from large populations. AWS enables organizations to harmonize multi-omic datasets and enforce robust data access controls and permissions across a global infrastructure, maintaining data integrity as research involves more collaborators and stakeholders. AWS makes it simpler to store, query, and analyze genomics data, and to link it with clinical information.
Related resources
Genomics Tertiary Analysis and Data Lakes Solution
Read more »
Hail on AWS Quick Start
Read the guide »
Interpretation and deep learning for tertiary analysis
Analysis requires integrated multi-modal datasets and knowledge bases, intensive computational power, big data analytics, and machine learning at scale. Historically, this can take weeks or months, delaying time to insights. AWS accelerates analysis of big genomics data by leveraging machine learning and high-performance computing. With AWS, researchers have access to greater computing efficiencies at scale, reproducible data processing, data integration capabilities to pull in multi-modal datasets, and public data for clinical annotation, all within a compliance-ready environment.
Related resources
Genomics Tertiary Analysis and Data Lakes Solution
Read more »
Genomics Tertiary Analysis and Machine Learning Solution
Read more »
Blog: Building scalable image processing pipeline for image-based transcriptomics
Read the blog »
Clinical applications
Several hindrances impede the scale and adoption of genomics for clinical applications, including speed of analysis, management of protected health information (PHI), and the need for reproducible and interpretable results. By leveraging the capabilities of the AWS Cloud, organizations can establish a differentiated capability in genomics to advance their applications in precision medicine and patient practice. AWS services enable the use of genomics in the clinic by providing the data capture, compute, and storage capabilities that a modernized clinical lab needs to decrease the time to results, all while adhering to the most stringent patient privacy regulations.
Related resources
Processing and securing clinical genomic data
Watch the video »
Improving outcomes with computational genomics
Read more »
Genomics diagnostics and discovery at scale
Read more »
Open datasets
As more life science researchers move to the cloud and develop cloud-native workflows, they bring reference datasets with them, often in their own personal buckets, leading to duplication, silos, and poor version documentation of commonly used datasets. The AWS Open Data Program (ODP) helps democratize data access by making it readily available in Amazon S3, providing the research community with a single documented source of truth. This increases study reproducibility, stimulates community collaboration, and reduces data duplication. The ODP also covers the cost of Amazon S3 storage, egress, and cross-region transfer for accepted datasets.
Related resources
Explore all available genomics open data sets
Explore the dataset »
Broad Institute gnomAD data on AWS
Read the blog »
NIH STRIDES Initiative supported by AWS
Read the blog »
Cost optimization
Researchers use massive genomics datasets that require large-scale storage and powerful computational processing, which can be cost-prohibitive. AWS presents cost-saving opportunities for genomics researchers across the data lifecycle, from storage to interpretation. AWS infrastructure and data services enable organizations to save time and money and to devote more resources to science.
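As a rough illustration of how compute pricing choices affect a project budget, the sketch below compares the cost of a batch of work at two per-vCPU-hour prices, such as an On-Demand price and a Spot price. All prices and workload sizes here are hypothetical placeholders chosen for the arithmetic, not AWS list prices; real Spot prices vary by instance type, Region, and time.

```python
def compute_cost(vcpu_hours: float, price_per_vcpu_hour: float) -> float:
    """Total cost of a workload at a flat per-vCPU-hour price."""
    if vcpu_hours < 0 or price_per_vcpu_hour < 0:
        raise ValueError("inputs must be non-negative")
    return vcpu_hours * price_per_vcpu_hour


def savings_fraction(baseline_price: float, discounted_price: float) -> float:
    """Fraction saved by paying discounted_price instead of baseline_price."""
    if baseline_price <= 0:
        raise ValueError("baseline_price must be positive")
    return 1 - discounted_price / baseline_price


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    workload_vcpu_hours = 10_000
    on_demand_price = 0.05   # assumed USD per vCPU-hour
    spot_price = 0.015       # assumed USD per vCPU-hour

    on_demand_total = compute_cost(workload_vcpu_hours, on_demand_price)
    spot_total = compute_cost(workload_vcpu_hours, spot_price)
    saved = savings_fraction(on_demand_price, spot_price)

    print(f"On-Demand total: ${on_demand_total:,.2f}")
    print(f"Spot total:      ${spot_total:,.2f}")
    print(f"Savings:         {saved:.0%}")
```

Spot capacity can be interrupted, so this comparison fits best for workloads that can be retried or checkpointed, which is often true of per-sample genomics jobs.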
Related resources
Blog: Saving Koalas using genomics research and cloud computing
Read the blog »
Amazon EC2 Spot Instances
Read more »
Blog: Amazon S3 storage classes
Read the blog »
Compliance in Genomics
With AWS, researchers have access to greater computing efficiencies at scale, reproducible data processing, data integration capabilities to pull in multi-modal datasets, and public data for clinical annotation—all within a compliance-ready environment.
Case studies & resources
View all Genomics customer case studies and related resources.

Fred Hutch Microbiome Research Initiative case study
Fred Hutch is focused on making therapeutic cancer drugs more effective, through translating gigabytes of raw genome datasets into insights. Using Nextflow, researchers orchestrate AWS Batch processes and analyze them with Amazon EC2 Spot Instances—saving money and providing more time for analysis.

Baylor College of Medicine case study
Baylor College of Medicine works to identify genes that contribute to aging and heart disease. Baylor partnered with DNAnexus for a solution built entirely on AWS, the DNAnexus PaaS, that uses Amazon S3 and Amazon Glacier to store Baylor’s more than 1 PB of genomics data.

Icahn School of Medicine at Mount Sinai case study
Icahn School of Medicine at Mount Sinai and Station X created GenePool, a human genomics data software platform. With AWS as its foundation, the platform can dynamically scale in minutes, and store datasets for translational and clinical genomics customers.
