AWS Public Sector Blog

Tag: genomics

Building a resilient and scalable clinical genomics analysis pipeline with AWS

At the Baylor College of Medicine Human Genome Sequencing Center (BCM HGSC), we aim to advance precision medicine and research in genomics. In that effort, we joined the ambitious All of Us Research Program funded by the National Institutes of Health (NIH) to help deliver genomic data to over one million individuals across the United States. In early 2019, we estimated that processing whole genome samples for this megaproject would require scaling to more than four times our center's production workload. We used AWS to support our new pipeline demands, which saved time, reduced costs, and created new opportunities for future development.

OpenFold, OpenAlex catalog of scholarly publications, and Capella Space satellite data: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. Our full list of publicly available datasets is on the Registry of Open Data on AWS, and the datasets are now also discoverable on AWS Data Exchange. This quarter, we released 15 new or updated datasets including OpenFold, OpenAlex, and radar data from Capella Space. Check out some highlights from the new or updated datasets.

Downscaled CMIP5, 1950 US Census, and open genomics data for Galaxy: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on Amazon Web Services (AWS). Our full list of publicly available datasets is on the Registry of Open Data on AWS. This quarter, we released 13 new or updated datasets including CMIP5, the 1950 US Decennial Census, and open genomics data for Galaxy. Read on for some highlights.

Preventing the next pandemic: How researchers analyze millions of genomic datasets with AWS

How do we avoid the next global pandemic? For researchers collaborating with the University of British Columbia Cloud Innovation Center (UBC CIC), the answer to that question lies in a massive library of genetic sequencing data. But there is a problem: the data library is so massive that traditional computing can’t comprehensively analyze or process it. So the UBC CIC team collaborated with computational virologists to create Serratus, an open-science viral discovery platform to transform the field of genomics—built on the massive computational power of the Amazon Web Services (AWS) Cloud.

Solving medical mysteries in the AWS Cloud: Medical data-sharing innovation through the Undiagnosed Diseases Network

It takes a medical village to discover and diagnose rare diseases. The National Institutes of Health’s Undiagnosed Diseases Network (UDN) is made up of a coordinating center, 12 clinical sites, a model organism screening center, a metabolomics core, a sequencing core, and a biorepository. For many years prior to the UDN, the experts at these sites were limited by antiquated data-sharing procedures. The UDN leadership realized that if they wanted to scale up and serve as many patients as possible, they needed to transform how they process, store, and share medical data—which led the UDN to the AWS Cloud.

How to set up Galaxy for research on AWS using Amazon Lightsail

Galaxy is a scientific workflow, data integration, and digital preservation platform that aims to make computational biology accessible to research scientists who do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system, running on everything from academic mainframes to personal computers. But researchers and organizations with limited or restrictive budgets may worry about capacity and access to compute power. In this blog post, we explain how to implement Galaxy on the cloud at a predictable cost within your research or grant budget with Amazon Lightsail.

Top announcements from the AWS Public Sector Partners leadership session at re:Invent 2021

During the 10th anniversary of re:Invent, I was thrilled to share announcements and achievements from AWS Partners and programs for the public sector around the world. Since its launch, AWS’s Public Sector Partner Program participation has increased by an average of 54% year over year, with partners providing solutions in mission areas across healthcare, space, energy, transportation, government, education, and nonprofit. In both the Global Partners Summit keynote at re:Invent 2021 and my public sector leadership session, I highlighted new and upcoming AWS Partner solutions and accomplishments.

Cloud powers faster, greener, and more collaborative research, according to new IDC report

According to a new IDC report, the cloud is helping researchers conduct research faster than ever before by reducing data analysis and processing times, and is allowing researchers around the world to collaborate on solving universal problems. In addition to the positive impact on research, IDC also forecasts that continued adoption of cloud computing globally could prevent environmental emission of more than 1 billion metric tons of CO2 from 2021 through 2024, almost equivalent to removing the 2020 CO2 emissions of Germany and the U.K. combined.

Climate data, koala genomes, analysis ready radar data, and highly-queryable genomic data: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. We work with data providers to democratize access to data by making it available to the public for analysis on AWS; develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and encourage the development of communities that benefit from access to shared datasets. Our full list of publicly available datasets is on the Registry of Open Data on AWS. This quarter, we released 26 new or updated datasets including datasets on climate, koala genomes, analysis-ready radar data, and highly queryable genomic data. Check out some highlights.

Driving innovation in single-cell analysis on AWS

Computational biology is undergoing a revolution. However, the analysis of single cells is a hard problem to solve. Standard statistical techniques used in genomic analysis fail to capture the complexity present in single-cell datasets. Open Problems in Single-Cell Analysis is a community-driven effort using AWS to drive the development of novel methods that leverage the power of single-cell data.
