AWS HPC Blog
Category: Life Sciences
Data Science workflows at insitro: using redun on AWS Batch
Matt Rasmussen, VP of Software Engineering at insitro, describes their recently released, open-source data science framework, redun, which allows data scientists to define complex scientific workflows that scale from their laptop to large-scale distributed runs on serverless platforms like AWS Batch and AWS Glue. In this post, Matt shows how redun lends itself to bioinformatics workflows, which typically involve wrapping Unix-based programs that require file staging to and from object storage. In the next blog post, Matt describes how redun scales to large and heterogeneous workflows by leveraging AWS Batch features such as Array Jobs and AWS Glue features such as Glue DynamicFrame.
Creating a digital map of COVID-19 virus for discovery of new treatment compounds
Quantum physics and high-performance computing have slashed research times for a consortium of researchers led by Qubit Pharmaceuticals. This post describes the discovery of chemical substances that may lead to new COVID-19 treatments in only six months using cloud technology.
Running a 3.2M vCPU HPC Workload on AWS with YellowDog
OMass Therapeutics, a biotechnology company identifying medicines against highly validated target ecosystems, used YellowDog on AWS to analyze and screen 337 million compounds in 7 hours, a task which would have taken two months using an on-premises HPC cluster. YellowDog, based in Bristol in the UK, ran the drug discovery application on an extremely large, multi-region cluster in AWS with the AWS ‘pay-as-you-go’ pricing model. It provided a central, unified interface to monitor and manage AWS Region selection, compute provisioning, job allocation and execution. The entire workload completed in 65 minutes, enabling scientists to start work on analysis the same day, significantly accelerating the drug discovery process. In this post, we’ll discuss the AWS and YellowDog services we deployed, and the mechanisms used to scale to 3.2M vCPUs using multiple EC2 instance types across multiple Regions in 33 minutes, running at a 95% utilization rate.
Virtual Screening of Novel Active Drug Compounds on AWS with Orion®
Computer-aided drug discovery (CADD) has been a key player in lowering the cost and speeding up the timeline for drug development. CADD uses high performance computing (HPC) resources to virtually screen databases with billions of molecules. It can speed up the search for potential drug molecules, and filter out molecules and compounds that are unsuitable. OpenEye Scientific developed Orion®, a cloud-based molecular design platform for CADD. Orion provides computational chemists with virtually unlimited HPC resources. These include data visualization, collaboration, and workflow management tools that help them perform calculations more efficiently. In this post, we describe the Orion architecture on AWS, and its capabilities to address the challenges in drug development.
GROMACS price-performance optimizations on AWS
Molecular dynamics (MD) is a simulation method for analyzing the movement and tracing trajectories of atoms and molecules where the dynamics of a system evolve over time. MD simulations are used across various domains, such as materials science, biochemistry, and biophysics, and are typically used in two broad ways to study a system. The importance of […]