AWS Public Sector Blog
Breaking Down Barriers: How AWS Democratizes Genomic Data for the World

Part 1 of 3: Democratizing Access to Genomic Data and Analytics
Every human deserves access to innovations that could save their life. Yet for decades, groundbreaking genomic research remained locked behind institutional walls, accessible only to well-funded laboratories.
At Amazon Web Services (AWS), we believe global IT infrastructure and advanced analytical capabilities are necessary tools to address these challenges. We’re transforming how researchers discover treatments, how clinicians diagnose conditions, and, ultimately, how millions of people receive care.
This blog is the first in a three-part series that explores how AWS is transforming genomic research through democratized data access, sovereign-by-design outbreak intelligence platforms, and population-scale biobanks that enable global collaboration while maintaining data security and privacy.
The Challenge: A World of Data, Islands of Access
Genomics has become a big data industry, and researchers everywhere face multifaceted challenges:
- Volume of storage. A single Next-Generation Sequencing (NGS) instrument can produce over 1TB of data per day, with estimates predicting organizations will generate between two and 40 exabytes of genomic data within the next decade.
- Computational barriers. Analyzing genomic data requires massive computing power previously available only to major research institutions.
- Geographic and sovereignty barriers. Researchers in low- and middle-income countries often lack infrastructure, and countries want to keep control over sensitive health data while still enabling collaborative research.
When only wealthy institutions can analyze genomic data, medical breakthroughs apply only to smaller, wealthier populations, leaving billions underserved by the promise of precision medicine.
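To put the storage figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses the lower bound of 1 TB per instrument per day quoted above and assumes, purely for illustration, that an instrument runs every day of the year and that decimal units apply (1 EB = 1,000,000 TB). The function names are hypothetical.

```python
# Back-of-the-envelope sizing based on the figures quoted in this post.
TB_PER_INSTRUMENT_PER_DAY = 1   # "over 1TB of data per day" (lower bound)
TB_PER_EB = 1_000_000           # decimal units: 1 EB = 1,000,000 TB
DAYS_PER_YEAR = 365             # assumption: the instrument runs every day


def tb_per_year(tb_per_day: float = TB_PER_INSTRUMENT_PER_DAY) -> float:
    """Data produced by one instrument in a year, in terabytes."""
    return tb_per_day * DAYS_PER_YEAR


def instrument_years(exabytes: float) -> float:
    """Instrument-years of continuous sequencing needed to produce `exabytes`."""
    return exabytes * TB_PER_EB / tb_per_year()


if __name__ == "__main__":
    print(f"One instrument: {tb_per_year():,.0f} TB per year")
    for eb in (2, 40):
        print(f"{eb} EB is roughly {instrument_years(eb):,.0f} instrument-years")
```

Under these assumptions, a single instrument produces about 365 TB a year, so exabyte-scale totals imply thousands of instrument-years of output. That is why storage volume alone becomes a barrier for smaller institutions.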
Cloud Infrastructure for Global Health Equity
AWS addresses these challenges through open data initiatives, purpose-built genomic services, and strategic credit programs that put powerful technology within reach of organizations supporting underserved populations.
AWS Registry of Open Data: 95+ Genomic Datasets and Growing
The AWS Registry of Open Data democratizes data access. The registry hosts 95+ genomic data sources—including the Sequence Read Archive (40PB+), Cancer Genome Atlas (2.5PB+), and Human Cell Atlas (300TB)—making them readily available in Amazon Simple Storage Service (Amazon S3). Researchers can analyze data where it lives, dramatically reducing time-to-insight and costs.
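As a minimal sketch of what "analyzing data where it lives" can look like, the Python below lists objects in a publicly readable S3 bucket over plain HTTPS, using the S3 ListObjectsV2 REST request with no credentials. It assumes the bucket allows anonymous listing and is reachable through the global virtual-hosted endpoint. The function names are hypothetical, and the bucket name has to come from the dataset's registry page.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def list_url(bucket: str, prefix: str = "", max_keys: int = 10) -> str:
    """Build an anonymous ListObjectsV2 URL for a public bucket."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text: str) -> list[tuple[str, int]]:
    """Extract (key, size in bytes) pairs from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]


def list_public_objects(bucket: str, prefix: str = "", max_keys: int = 10):
    """Fetch and parse one page of object listings (requires network access)."""
    with urllib.request.urlopen(list_url(bucket, prefix, max_keys), timeout=30) as resp:
        return parse_listing(resp.read().decode("utf-8"))
```

Calling `list_public_objects` with a bucket name from a registry entry returns the first few keys and their sizes without downloading any of the underlying data. In practice most teams would use the AWS CLI or an SDK; the standard-library version is shown only to keep the sketch self-contained.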
AWS Services for Genomic Workloads
- AWS HealthOmics – A fully managed service for specialized genomic storage and managed workflow execution at scale.
- High-performance computing – AWS Batch, AWS ParallelCluster, and GPU/FPGA-accelerated Amazon Elastic Compute Cloud (Amazon EC2) instances for variant calling and gene expression analysis.
- Secure, compliant storage – Amazon S3 and Amazon S3 Glacier Deep Archive for cost-effective long-term storage.
- Amazon Nova Forge – For developing specialized domain models, from drug discovery assistants to custom genomics models.
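To illustrate the long-term storage option in the list above, here is a hedged Python sketch that builds an S3 lifecycle rule moving older objects to the Deep Archive storage class. The `raw-reads/` prefix, the 90-day threshold, and the helper name `deep_archive_rule` are assumptions made for this example and are not recommendations. The resulting dictionary follows the shape of the S3 lifecycle configuration that SDK calls such as boto3's `put_bucket_lifecycle_configuration` accept.

```python
import json


def deep_archive_rule(prefix: str, days: int) -> dict:
    """Lifecycle rule moving objects under `prefix` to S3 Glacier Deep Archive."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}-after-{days}d",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # DEEP_ARCHIVE is the storage class name for S3 Glacier Deep Archive.
        "Transitions": [{"Days": days, "StorageClass": "DEEP_ARCHIVE"}],
    }


# Hypothetical policy: archive raw sequencing reads 90 days after upload.
lifecycle_configuration = {"Rules": [deep_archive_rule("raw-reads/", 90)]}

if __name__ == "__main__":
    print(json.dumps(lifecycle_configuration, indent=2))
```

Keeping the rule as plain data makes it easy to review before applying it to a bucket. The right threshold depends on how often the raw data is re-analyzed.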
Real-World Impact: Opening Doors for Researchers Worldwide
At AWS, we believe our cloud and AI services are powerful tools to address the world’s urgent and complex health challenges.
The true measure of democratization isn’t in technology specifications; it’s in whose lives are changed. In 2021, we launched the AWS Health Equity Initiative (HEI), a three-year $60M commitment to advance global health equity.
Here are two examples of these investments:
Korea University: Advancing Female Autism Research in East Asia
Autism research historically focused on male subjects and Western populations, leaving critical gaps in our understanding of how the condition manifests in women and in East Asian populations. Dr. Joon-Yong An at Korea University set out to change this by analyzing 1.4 petabytes of genomic data to identify sex-specific genetic factors.
Using AWS credits, Dr. An’s team processed whole-genome sequencing data from over 42,000 individuals across Korean and international autism cohorts, facilitating large-scale collaborative studies in East Asia, where autism research is limited.
Using scalable AWS services such as Amazon EC2 with GPU acceleration and Amazon S3, they applied a Category-Wide Association Study framework to prioritize noncoding variants associated with autism. The team also trained high-performance deep learning models to predict sex-specific genetic factors.
“By efficiently processing large-scale genomic datasets, our solution accelerates the discovery of female-specific risk factors, facilitating more accurate diagnoses and personalized interventions. Ultimately, our findings will contribute to reducing health disparities and improving health outcomes for autistic individuals,” reported Dr. An.
“Our work opened new avenues for researchers to explore sex-related genetic factors in various neurodevelopmental and psychiatric conditions. By democratizing access to computationally intensive genomic research, AWS empowered us to bridge gender disparities in autism diagnosis and treatment, ensuring that overlooked populations receive the medical attention and resources they deserve.”
Imagenomix: Precision Cancer Diagnostics for All Backgrounds
Approximately 98% of global cancer patients lack access to targeted genetic testing, largely due to the high cost (~$6,000 per patient) and slow turnaround (33+ days) of conventional NGS panels.
Using AWS infrastructure, Imagenomix developed IGX Predict™, an AI-powered platform that analyzes standard pathology slides to predict gene mutations in as little as three minutes, at a fraction of the cost of traditional NGS. By training their models on diverse patient populations, Imagenomix aims to ensure its diagnostic tools work accurately across all racial and ethnic backgrounds.
The social impact of this approach is profound. IGX Predict dramatically lowers barriers to patients’ access to NGS by achieving a 100% tissue success rate, compared with the 26% failure rate of traditional methods. This enables clinicians to rapidly identify the right patients for clinical trials and targeted therapies, regardless of where they are in the world.
“Finding the right patients is the biggest bottleneck in drug development, and that bottleneck disproportionately excludes patients from underrepresented populations,” said Travis Wold, CEO of Imagenomix. “Our mission is to make precision cancer diagnostics accessible to everyone — not just the few — by turning a standard pathology slide into genetic insight in minutes.”
With a growing product pipeline spanning lung, breast, and brain cancers, Imagenomix is positioning itself as the new standard in precision access, ensuring that advances in genomic medicine reach every patient, everywhere.
Bridging Data and Discovery: The AWS Open Data-NVIDIA Knowledge Graph Hackathon
Dr. An’s autism research and Imagenomix’s AI-driven precision diagnostics platform showcase how AWS credit programs and scalable infrastructure democratize computational power for genomic discovery. The second pillar of democratization, open access to foundational genomic datasets hosted freely on AWS, creates opportunities for collaborative innovation at scale. One example is a hackathon that united researchers across the Atlantic to advance trustworthy AI in biomedicine.
In October 2025, AWS partnered with NVIDIA to host a transatlantic hackathon bringing together 53 researchers from the US and UK. Over three days, seven teams built prototype systems combining knowledge graphs with GraphRAG to make AI outputs more trustworthy in biomedical research, using Amazon Neptune, Open Data on AWS, and NVIDIA’s PyTorch Geometric RAG resources.
Teams created solutions addressing critical challenges, from GeNETwork’s precision oncology knowledge graph integrating cancer genomics and pharmacological data, to BioGraphRAG’s citation-supported biomedical question answering system.
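The general idea of pairing a knowledge graph with retrieval, so that an AI answer can cite the facts it rests on, can be sketched in a few lines of Python. This is a toy illustration only. It is not any team's implementation, it does not use Amazon Neptune or PyTorch Geometric, and every entity, relation, and source label below is a placeholder.

```python
from collections import defaultdict

# Toy triples: (subject, relation, object, source). All names are placeholders.
TRIPLES = [
    ("GENE_A", "mutated_in", "CANCER_X", "source-1"),
    ("DRUG_1", "targets", "GENE_A", "source-2"),
    ("DRUG_2", "targets", "GENE_B", "source-3"),
]


def build_index(triples):
    """Map each entity to the triples that mention it."""
    index = defaultdict(list)
    for triple in triples:
        subject, _, obj, _ = triple
        index[subject].append(triple)
        index[obj].append(triple)
    return index


def retrieve(index, entity, hops=1):
    """Collect triples within `hops` edges of `entity`, in first-seen order."""
    seen, frontier, found = {entity}, [entity], []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for triple in index.get(node, []):
                if triple not in found:
                    found.append(triple)
                for neighbor in (triple[0], triple[2]):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
        frontier = next_frontier
    return found


def to_context(triples):
    """Render triples as citation-tagged lines to place in a model prompt."""
    return "\n".join(f"{s} {r} {o} [{src}]" for s, r, o, src in triples)


if __name__ == "__main__":
    index = build_index(TRIPLES)
    print(to_context(retrieve(index, "CANCER_X", hops=2)))
```

A question about `CANCER_X` would pull in only the connected facts, each tagged with its source, and leave out the unrelated `DRUG_2` triple. A language model given that context can then be asked to answer using only the cited lines.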
The Path Forward: From Access to Action
Democratizing access to genomic data is just the beginning. Part 2 of this blog series explores how AWS enables global pathogen surveillance and outbreak intelligence through sovereign-by-design platforms, which allow countries to collaborate on infectious disease tracking while maintaining control over their sensitive health data within national borders.
The future of global health depends on ensuring that genomic insights benefit everyone, not just those in wealthy nations or well-funded institutions. Through open data initiatives, purpose-built services, and strategic support for underserved researchers, AWS helps build that future one genome at a time.
- Learn more about AWS Social Impact
- Learn more about Registry of Open Data on AWS
- Learn more about AWS for Healthcare & Life Sciences
About the Authors:
This blog series was developed by AWS Skilling and Social Impact (SSI) in collaboration with AWS Healthcare and Life Sciences specialists. AWS SSI exists to fuel, align, and amplify the good that AWS does in the world, helping organizations transform health outcomes through cloud and AI technology.