AWS News Blog

Next Generation Genomics With AWS

My colleague Matt Wood wrote a great guest post to announce new support for one of our genomics partners.

Jeff;


I am happy to announce that AWS will be supporting the work of our partner, Seven Bridges Genomics, which has been selected as one of the National Cancer Institute (NCI) Cancer Genomics Cloud Pilots. The cloud has become the new normal for genomics workloads, and AWS has been actively involved since the earliest days, from being the first cloud vendor to host the 1000 Genomes Project to newer projects such as designing synthetic microbes and developing novel genomics algorithms that work at population scale. The NCI Cancer Genomics Cloud Pilots focus on the cloud’s potential to be a game changer for scientific discovery and innovation in the diagnosis and treatment of cancer.

The NCI Cancer Genomics Cloud Pilots will help address a problem in cancer genomics that is all too familiar to the wider genomics community: data portability. Today’s typical research workflow involves downloading large data sets (such as the previously mentioned 1000 Genomes Project or The Cancer Genome Atlas (TCGA)) to on-premises hardware and running the analysis locally. Genomic data sets are growing at an exponential rate and becoming more complex as phenotype-genotype discoveries are made, making the current workflow slow and cumbersome for researchers. These data sets are difficult to maintain locally and to share between organizations. As a result, genomic research and collaborations have become limited by the available IT infrastructure at any given institution.
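To give a rough sense of why the download-first workflow becomes slow as data sets grow, here is a minimal back-of-the-envelope sketch in Python. The function name `transfer_hours`, the `efficiency` factor, and the example data set size and link speed are all illustrative assumptions; none of them are figures from the NCI, TCGA, or Seven Bridges Genomics.

```python
def transfer_hours(dataset_terabytes: float,
                   link_gbit_per_s: float,
                   efficiency: float = 0.8) -> float:
    """Estimate the hours needed to copy a data set over a network link.

    dataset_terabytes: data set size in decimal terabytes (1 TB = 1e12 bytes).
    link_gbit_per_s:   nominal link speed in gigabits per second.
    efficiency:        assumed fraction of the nominal speed actually achieved.
    """
    if dataset_terabytes < 0:
        raise ValueError("dataset_terabytes must be non-negative")
    if link_gbit_per_s <= 0:
        raise ValueError("link_gbit_per_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    total_bits = dataset_terabytes * 1e12 * 8             # bytes -> bits
    usable_bits_per_s = link_gbit_per_s * 1e9 * efficiency
    return total_bits / usable_bits_per_s / 3600           # seconds -> hours


if __name__ == "__main__":
    # Hypothetical example: a 500 TB data set over a 1 Gbit/s institutional link.
    hours = transfer_hours(dataset_terabytes=500, link_gbit_per_s=1)
    print(f"Estimated transfer time: {hours:.0f} hours ({hours / 24:.1f} days)")
```

Under these assumed numbers the copy alone takes weeks before any analysis can start, and the estimate scales linearly with data set size.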

The NCI Cancer Genomics Cloud Pilots will take the natural step to solve this problem by bringing the computation to where the data is, rather than the other way around. The goal of the NCI Cancer Genomics Cloud Pilots is to create cloud-hosted repositories for cancer genome data that reside alongside the tools, algorithms, and data analysis pipelines needed to make use of the data. These Pilots will provide ways to provision computational resources within the cloud so that researchers can analyze the data in place. By co-locating data in the cloud with the necessary interface, algorithms, and self-provisioned resources, these Pilots will remove barriers to entry, allowing researchers to participate more easily in cancer research and accelerating the pace of discovery. This means more life-saving discoveries, such as better ways to diagnose stomach cancer or the identification of novel mutations in lung cancer that allow for new drug targets.

The Pilots will also allow cancer researchers to provision compute clusters that change as their research needs change. They will have the necessary infrastructure to support their research when they need it, rather than having to guess at the resources they will need in the future every time grant writing season starts. They will also be able to ask many more novel questions of the data, now that they are no longer constrained by a static set of computational resources.

Finally, the NCI Cancer Genomics Cloud Pilots will help researchers collaborate. When data sets are publicly shared, it becomes simple to exchange and share all the tools necessary to reproduce and expand upon another lab’s work. Other researchers will then be able to leverage that software within the community, or perhaps even in an unrelated field of study, resulting in even more ideas being generated.

Since 2009, Seven Bridges Genomics has developed a platform that lets biomedical researchers leverage AWS’s cloud infrastructure so they can focus on their science rather than on managing computational resources for storage and execution. Additionally, Seven Bridges has developed security measures to ensure compliance with the Health Insurance Portability and Accountability Act (HIPAA) for all data stored in the cloud. For the NCI Cancer Genomics Cloud Pilots, the team will adapt the platform to meet the specific needs of the cancer research community as they develop over the course of the Pilot. If you are interested in following the work being done by Seven Bridges Genomics or giving feedback as their work on the NCI Cancer Genomics Cloud Pilots progresses, you can do so here.

We look forward to the journey ahead with Seven Bridges Genomics. You can learn more about AWS and Genomics here.

— Matt Wood, General Manager, Data Science