AWS Public Sector Blog
Tag: genomics
Solving medical mysteries in the AWS Cloud: Medical data-sharing innovation through the Undiagnosed Diseases Network
It takes a medical village to discover and diagnose rare diseases. The National Institutes of Health’s Undiagnosed Diseases Network (UDN) is made up of a coordinating center, 12 clinical sites, a model organism screening center, a metabolomics core, a sequencing core, and a biorepository. For many years prior to the UDN, the experts at these sites were limited by antiquated data-sharing procedures. The UDN leadership realized that if they wanted to scale up and serve as many patients as possible, they needed to transform how they process, store, and share medical data—which led the UDN to the AWS Cloud.
How to set up Galaxy for research on AWS using Amazon Lightsail
Galaxy is a scientific workflow, data integration, and digital preservation platform that aims to make computational biology accessible to research scientists who do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system, running on everything from academic mainframes to personal computers. But researchers and organizations may worry about capacity and the accessibility of compute power for those with limited or restrictive budgets. In this blog post, we explain how to implement Galaxy on the cloud at a predictable cost within your research or grant budget with Amazon Lightsail.
Top announcements from the AWS Public Sector Partners leadership session at re:Invent 2021
During the 10th anniversary of re:Invent, I was thrilled to share announcements and achievements from AWS Partners and programs for the public sector around the world. Since its launch, AWS’s Public Sector Partner Program participation has increased by an average of 54% year over year, with partners providing solutions in mission areas across healthcare, space, energy, transportation, government, education, and nonprofit. In both the Global Partners Summit keynote at re:Invent 2021 and my public sector leadership session, I highlighted the new and upcoming AWS Partner solutions and accomplishments.
Cloud powers faster, greener, and more collaborative research, according to new IDC report
According to a new IDC report, the cloud is helping researchers conduct research faster than ever before by reducing data analysis and processing times, and is allowing researchers around the world to collaborate on solving universal problems. In addition to the positive impact on research, IDC also forecasts that continued adoption of cloud computing globally could prevent environmental emission of more than 1 billion metric tons of CO2 from 2021 through 2024, almost equivalent to removing the 2020 CO2 emissions of Germany and the U.K. combined.
Climate data, koala genomes, analysis ready radar data, and highly-queryable genomic data: The latest open data on AWS
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. We work with data providers to democratize access to data by making it available to the public for analysis on AWS; develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and encourage the development of communities that benefit from access to shared datasets. Our full list of publicly available datasets is on the Registry of Open Data on AWS. This quarter, we released 26 new or updated datasets including datasets on climate, koala genomes, analysis ready radar data, and highly-queryable genomic data. Check out some highlights.
Driving innovation in single-cell analysis on AWS
Computational biology is undergoing a revolution. However, the analysis of single cells is a hard problem to solve. Standard statistical techniques used in genomic analysis fail to capture the complexity present in single-cell datasets. Open Problems in Single-Cell Analysis is a community-driven effort using AWS to drive the development of novel methods that leverage the power of single-cell data.
Accelerating genome assembly with AWS Graviton2
One of the biggest scientific achievements of the twenty-first century was the completion of the Human Genome Project and the publication of a draft human genome. The project took over 13 years to complete and remains one of the largest private-public international collaborations ever. Since then, advances in sequencing technologies, computational hardware, and novel algorithms have reduced the time it takes to produce a human genome assembly to only a few days, at a fraction of the cost. This has made using the human genome draft for precision and personalized medicine more achievable. In this blog post, we demonstrate how to do a genome assembly in the cloud in a cost-efficient manner using ARM-based AWS Graviton2 instances.
A generalized approach to benchmarking genomics workloads in the cloud: Running the BWA read aligner on Graviton2
The AWS Cloud gives genomics researchers access to a wide variety of instance types and chip architectures, and this elasticity allows us to rethink genomics workflows when running workloads in the cloud. Given the increased performance of the Graviton2 instances, we wanted to explore whether they can be used for cost-effective and performant genomics workloads. Read on to learn about our generalized approach for determining the most effective instance type for running genomics workloads in the cloud.
Taking COVID in STRIDES: The National Center for Biotechnology Information makes coronavirus genomic data available on AWS
AWS and the National Institutes of Health’s (NIH) National Center for Biotechnology Information (NCBI) announced the creation of the Coronavirus Genome Sequence Dataset to support COVID-19 research. The dataset is hosted by the AWS Open Data Sponsorship Program and accessible on the Registry of Open Data on AWS, providing researchers quick and easy access to coronavirus sequence data at no cost for use in their COVID-19 research.
Stanford researchers accelerate autism research by sharing genomic data in the cloud
In 2014, the Wall Lab at Stanford University sought to answer one of the most pressing questions in neuroscience: What genes influence autism spectrum disorder (ASD)? According to the Centers for Disease Control and Prevention (CDC), this neurodevelopmental disorder affects roughly one in 54 children in America and is on the rise, nearly tripling since 1992. In the lab’s study of ASD genetics, they chose the cloud, along with a unique experimental approach, to speed the time to science.