AWS Public Sector Blog

Updates and early lessons from our COVID-19 HPC Consortium research partners

The concept of a COVID-19 High Performance Computing (HPC) Consortium emerged from a roundtable discussion at the White House in March, with input from industry, government, and academic leaders. Since the consortium was announced, AWS has been collaborating with teams on a growing number of projects, providing them with cloud computing resources.

I want to share three early learnings from this work.

  1. When empowered with the right tools, researchers can significantly accelerate the pace of their work. Tools including deep learning and artificial intelligence are reducing time to insight for researchers around the world.
  2. There is no silver bullet. The diversity of the efforts below reflects the many dimensions to the health challenge posed by COVID-19.
  3. We are only scratching the surface of what’s possible. The work of the HPC Consortium demonstrates the commitment of a community of researchers and public sector leaders to test new ideas, iterate on others, and tackle the challenge of COVID-19 head-on.

Today I would like to offer a look at some of the innovative projects on which we are collaborating with the world’s leading researchers. Read more about each effort below.

COVID Moonshot Request

Lead researcher: PostEra

PostEra is helping to lead Project Moonshot, an open science initiative to find a novel antiviral cure for COVID-19 and test it in animals within four months. The worldwide scientific community suggests drug candidates that might bind to, and neutralize, the SARS-CoV-2 main protease. Machine learning resources are needed to predict how to synthesize the suggested compounds, which can reduce the time needed to determine optimal ways to make them from weeks to days. This would be the first time that a drug was developed in an open source fashion.

Designing virus-specific sACE2 mimics for competitive inhibition of SARS-CoV-2

Lead researcher: Massachusetts Institute of Technology

Soluble angiotensin-converting enzyme 2 (sACE2) can inhibit SARS and SARS2 coronaviruses. Although a clinical trial of sACE2 for this purpose is ongoing, the protein has other properties that are likely to limit its safety and clinical effectiveness. The goal of this project is to design multiple sACE2 mutants that can be used as safe and effective antiviral therapeutics. This project uses deep learning to predict sACE2 variants that are effective and have few to no side effects.

SARS-CoV-2 Detection in Sequence Read Archive Data

Lead researcher: Los Alamos National Lab (LANL)

This project uses a genomic data-driven technique to analyze the temporal and spatial dynamics of the COVID-19 pandemic. Researchers will analyze all the data deposited over the last six months to the Sequence Read Archive (SRA), a repository of genomics data. Pathogen detection pipelines will identify which samples are likely to contain SARS-CoV-2. Results from these pipelines will be combined with sample metadata, providing a “map” of where the virus has been identified.
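To make the "map" idea concrete, here is a minimal sketch of combining detection results with sample metadata. It is not the LANL pipeline: the accession numbers, the metadata fields (country and collection month), and the function name `build_detection_map` are all made up for illustration.

```python
from collections import Counter

# Hypothetical pipeline output: sample accession -> was SARS-CoV-2 detected?
detections = {
    "SRR0000001": True,
    "SRR0000002": False,
    "SRR0000003": True,
    "SRR0000004": True,
    "SRR0000005": True,  # positive, but has no metadata record below
}

# Hypothetical sample metadata: accession -> (country, collection month).
metadata = {
    "SRR0000001": ("USA", "2020-03"),
    "SRR0000002": ("USA", "2020-03"),
    "SRR0000003": ("Italy", "2020-02"),
    "SRR0000004": ("USA", "2020-03"),
}


def build_detection_map(detections, metadata):
    """Count positive samples per (country, month).

    Positive samples with no metadata record are skipped, because they
    cannot be placed in space or time.
    """
    counts = Counter()
    for accession, positive in detections.items():
        if positive and accession in metadata:
            counts[metadata[accession]] += 1
    return counts


detection_map = build_detection_map(detections, metadata)
for (country, month), n in sorted(detection_map.items()):
    print(f"{country} {month}: {n} positive sample(s)")
```

Grouping by both place and time is what lets the same table describe spatial and temporal dynamics; a real analysis would likely use finer-grained locations and dates where the metadata provides them.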

Target Identification for Broad Antiviral Therapy using Functional Genetic Screening Datasets

Lead researcher: Children’s National Medical Centre

To develop antiviral therapy, it is important to understand how human genes interact with a virus. Functional genetic screening is a high-throughput and cost-effective way of studying how a virus interacts with the functions of many genes before and after infection. The goal of this proposal is to reanalyze public genetic screening datasets to identify genes that could serve as targets for broad antiviral therapy, including therapy for COVID-19.

tmCOVID: a text mining tool for COVID-19 scientific literature

Lead researcher: Emory University

More than 2,700 peer-reviewed articles related to COVID-19 have been published since last December, which is too much information for researchers to explore manually in any reasonable timeframe. The goal of this project is to develop and host an interactive web-based text mining tool that automatically extracts and summarizes key bio-concepts (genes, chemicals, diseases, species, mutations, and cell lines) in COVID-19 scientific literature. A prototype of this tool is available at www.tmcovid.com; this project will add full-text document summarization and generate textual summaries based on the user query.
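As a rough illustration of what extracting bio-concepts from text can look like, the sketch below uses simple dictionary matching. This is a stand-in technique chosen for brevity, not a description of how tmCOVID works; the lexicon, the sample sentence, and the function name `extract_concepts` are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical, tiny lexicon of terms grouped by concept type.
LEXICON = {
    "gene": ["ACE2", "TMPRSS2"],
    "chemical": ["remdesivir", "chloroquine"],
    "disease": ["COVID-19", "pneumonia"],
    "species": ["SARS-CoV-2", "human"],
}


def extract_concepts(text):
    """Return the lexicon terms found in text, grouped by concept type."""
    found = defaultdict(set)
    for concept_type, terms in LEXICON.items():
        for term in terms:
            # Require that the term is not embedded in a longer word or
            # hyphenated token (so "ACE2" does not match inside "sACE2").
            pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found[concept_type].add(term)
    return {concept_type: sorted(terms) for concept_type, terms in found.items()}


abstract = (
    "SARS-CoV-2 enters human cells via ACE2; "
    "remdesivir is being evaluated for COVID-19."
)
concepts = extract_concepts(abstract)
for concept_type, terms in concepts.items():
    print(f"{concept_type}: {', '.join(terms)}")
```

Dictionary matching misses synonyms and novel names, which is one reason text mining tools for biomedical literature typically rely on trained recognition models instead.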

The evolutionary history of SARS-CoV-2

Lead researcher: Iowa State University

The SARS-CoV-2 virus replicates and mutates as it spreads, and some variants change how the virus affects its hosts. However, we do not know much about how different strains of the virus – some potentially more virulent than others – are distributed across the globe. Using a variety of evolutionary models together with publicly available genomic datasets, researchers will reconstruct the phylogenetic relationships among strains of SARS-CoV-2 and publish the results on a public website.

Combined virtual screening and machine learning approach to finding novel SARS-CoV-2 protease inhibitors

Lead researcher: Kuano

The goal of this proposal is to combine machine learning and molecular modeling to improve virtual screening and drug discovery applications targeting COVID-19. The team has developed a genetic algorithm capable of searching the chemical space surrounding existing antiviral drugs, and a deep learning-based classification model based on existing binding data. These tools will be combined with docking and molecular dynamics simulations to enhance results. The output of this effort will be a set of machine learning models that can be used to score molecular designs for synthesis as antiviral compounds.

Compounds to prevent the spread of COVID-19

Lead researcher: Novel Tech Sciences

Novel Tech Sciences is applying molecular dynamics simulations to identify antiviral compounds from ayurvedic medicine that might bind to SARS-CoV-2 protein targets. The goal is to create a blend of these naturally derived phytochemicals that can be used as a therapy and to help prevent the spread of COVID-19.

Learn more about the COVID-19 High Performance Computing (HPC) Consortium.