AWS HPC Blog

Category: Industries

Benchmarking NVIDIA Clara Parabricks Somatic Variant Calling Pipeline on AWS

Somatic variants are genetic alterations which are not inherited but acquired during one’s lifespan, for example those that are present in cancer tumors. In this post, we will demonstrate how to perform somatic variant calling from matched tumor and normal genome sequence data, as well as tumor-only whole genome and whole exome datasets using an NVIDIA GPU-accelerated Parabricks pipeline, and compare the results with baseline CPU-based workflows.

Figure 2: Identification of redun jobs and grouping them into Array Jobs to run on AWS Batch. (Top) redun represents the workflow as an Expression Graph (top-left), and identifies reductions (red boxes) that are ready to be executed. The redun Scheduler creates a redun Job (J1, J2, J3) for each reduction and dispatches those jobs to Executors based on the task-specific configuration. The Batch Executor allows jobs to accumulate for up to three seconds (default) in order to identify compatible jobs for grouping into an Array Job, which are then submitted to AWS Batch (top-right). (Bottom) As jobs complete in AWS Batch, their success (green) or failure (red) is propagated back to the Executors and the Scheduler, and the results are eventually substituted back into the Expression Graph (bottom-left).

Data Science workflows at insitro: how redun uses the advanced service features from AWS Batch and AWS Glue

Matt Rasmussen, VP of Software Engineering at insitro, expands on his first post on redun, insitro’s data science tool for bioinformatics, to describe how redun makes use of advanced AWS features. Specifically, Matt describes how AWS Batch’s Array Jobs are used to support workflows with large fan-out, and how AWS Glue’s DynamicFrame is used to run computationally heterogeneous workflows with different back-end needs such as Spark, all in the same workflow definition.
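The grouping step described in the Figure 2 caption can be sketched roughly as follows. This is a toy illustration, not redun's implementation or API: `PendingJob`, `group_compatible`, and `plan_submissions` are hypothetical names, "compatible" is assumed here to mean the same container image and resource request, and the accumulation window (three seconds by default, per the caption) is represented only by the list of jobs that were buffered during it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PendingJob:
    """Hypothetical stand-in for a job waiting in the executor's buffer."""
    name: str
    image: str
    vcpus: int
    memory_gb: int


def group_compatible(jobs: List[PendingJob]) -> Dict[Tuple[str, int, int], List[PendingJob]]:
    """Bucket jobs by an assumed compatibility key (image, vCPUs, memory)."""
    groups: Dict[Tuple[str, int, int], List[PendingJob]] = defaultdict(list)
    for job in jobs:
        groups[(job.image, job.vcpus, job.memory_gb)].append(job)
    return dict(groups)


def plan_submissions(jobs: List[PendingJob], min_array_size: int = 2):
    """Decide, per bucket, whether to submit one array job or single jobs."""
    plan = []
    for group in group_compatible(jobs).values():
        names = [job.name for job in group]
        if len(names) >= min_array_size:
            plan.append(("array", names))
        else:
            plan.extend(("single", [name]) for name in names)
    return plan


# Jobs assumed to have accumulated during one buffering window.
buffered = [
    PendingJob("J1", "aligner:1.0", 4, 16),
    PendingJob("J2", "aligner:1.0", 4, 16),
    PendingJob("J3", "caller:2.1", 8, 32),
]
plan = plan_submissions(buffered)
print(plan)
```

In this sketch J1 and J2 share a key and would go out as one array submission, while J3 is submitted on its own. A real executor would also need the timer, the actual AWS Batch submission call, and the mapping from array indices back to individual jobs.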

Figure 1: Evaluating a sequence alignment workflow using graph reduction. In redun, workflows are represented as an Expression Graph (left) which contains concrete value nodes (grey) and Expression nodes (blue). The redun scheduler identifies tasks that are ready to execute by finding subtrees that can be reduced (red boxes), substituting task results back into the Expression Graph (red arrows). The scheduler continues to find reductions until the Expression Graph reduces to a single concrete value (grey box, far right). If any reduction has been done before (determined by comparing an Expression's hash), the redun scheduler can replay the reduction from a central cache and skip task re-execution.

Data Science workflows at insitro: using redun on AWS Batch

Matt Rasmussen, VP of Software Engineering at insitro, describes their recently released, open-source data science framework, redun, which allows data scientists to define complex scientific workflows that scale from their laptop to large-scale distributed runs on serverless platforms like AWS Batch and AWS Glue. In this post, Matt shows how redun lends itself to bioinformatics workflows, which typically involve wrapping Unix-based programs that require file staging to and from object storage. In the next blog post, Matt describes how redun scales to large and heterogeneous workflows by leveraging AWS Batch features such as Array Jobs and AWS Glue features such as Glue DynamicFrame.
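The graph reduction idea in the Figure 1 caption (reduce ready subtrees, substitute results, replay from a cache when the hash matches) can be illustrated with a small, self-contained sketch. It is an assumption-laden toy rather than redun's API: `Expr`, `expr_hash`, and `reduce_expr` are hypothetical names, the cache is a plain in-memory dict instead of a central store, and the hash covers only the function name and argument values.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Expr:
    """Hypothetical Expression node: a task applied to arguments."""
    func: Callable[..., Any]
    args: Tuple[Any, ...]


def expr_hash(func: Callable[..., Any], values: Tuple[Any, ...]) -> str:
    """Toy hash of a reducible expression (function name plus concrete args)."""
    payload = repr((func.__name__, values)).encode()
    return hashlib.sha256(payload).hexdigest()


def reduce_expr(node: Any, cache: Dict[str, Any], log: List[Tuple[str, str]]) -> Any:
    """Reduce an expression tree to a concrete value, replaying cached results."""
    if not isinstance(node, Expr):
        return node  # already a concrete value
    # Reduce child expressions first so this node only sees concrete values.
    values = tuple(reduce_expr(arg, cache, log) for arg in node.args)
    key = expr_hash(node.func, values)
    if key in cache:
        log.append(("replayed", node.func.__name__))
    else:
        cache[key] = node.func(*values)
        log.append(("executed", node.func.__name__))
    return cache[key]


# Toy tasks standing in for real alignment steps.
def align(reads: str, reference: str) -> str:
    return f"aligned({reads},{reference})"


def merge(a: str, b: str) -> str:
    return f"merged({a}+{b})"


workflow = Expr(merge, (Expr(align, ("r1.fq", "ref.fa")),
                        Expr(align, ("r2.fq", "ref.fa"))))
cache: Dict[str, Any] = {}
first_log: List[Tuple[str, str]] = []
result = reduce_expr(workflow, cache, first_log)
second_log: List[Tuple[str, str]] = []
again = reduce_expr(workflow, cache, second_log)
print(result, first_log, second_log)
```

On the first pass every task executes; on the second pass with the same cache, every reduction is replayed instead. This sketch evaluates sequentially, whereas the caption describes a scheduler that finds all ready subtrees so they can be dispatched independently.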

Figure 2: CDI transmits the frame buffer using EFA. SRD is a multipath, self-healing transport. This creates a kernel bypass method that effectively enables a memory copy from one framebuffer to another.

How we enabled uncompressed live video with CDI over EFA

We’re going to take you into the world of broadcast video, and explain how it led to us announcing today the general availability of EFA on smaller instance sizes. For a range of applications, this is going to save customers a lot of money because they no longer need to use the biggest instances in each instance family to get HPC-style network performance. But the story of how we got there involves our Elastic Fabric Adapter (EFA), some difficult problems presented to us by customers in the entertainment industry, and an invention called the Cloud Digital Interface (CDI). And it started not very far from Hollywood.

Running a 3.2M vCPU HPC Workload on AWS with YellowDog

OMass Therapeutics, a biotechnology company identifying medicines against highly validated target ecosystems, used YellowDog on AWS to analyze and screen 337 million compounds in 7 hours, a task which would have taken two months using an on-premises HPC cluster. YellowDog, based in Bristol in the UK, ran the drug discovery application on an extremely large, multi-region cluster in AWS with the AWS ‘pay-as-you-go’ pricing model. It provided a central, unified interface to monitor and manage AWS Region selection, compute provisioning, job allocation and execution. The entire workload completed in 65 minutes, enabling scientists to start work on analysis the same day, significantly accelerating the drug discovery process. In this post, we’ll discuss the AWS and YellowDog services we deployed, and the mechanisms used to scale to 3.2M vCPUs using multiple EC2 instance types across multiple regions in 33 minutes, running at a 95% utilization rate.

The Convergent Evolution of Grid Computing in Financial Services

The Financial Services industry makes significant use of high performance computing (HPC), but it tends to be in the form of loosely coupled, embarrassingly parallel workloads to support risk modelling. The infrastructure tends to scale out to meet ever increasing demand as the analyses look at more and finer grained data. At AWS we’ve helped many customers tackle scaling challenges and are noticing some common themes. In this post we describe how HPC teams are thinking about how they deliver compute capacity today, and highlight how we see the solutions converging for the future.

Virtual Screening of Novel Active Drug Compounds on AWS with Orion®

Computer-aided drug discovery (CADD) has been a key player in lowering the cost and speeding up the timeline for drug development. CADD uses high performance computing (HPC) resources to virtually screen databases with billions of molecules. It can speed up the search for potential drug molecules, and filter out molecules and compounds that are unsuitable. OpenEye Scientific developed Orion®, a cloud-based molecular design platform for CADD. Orion provides computational chemists with virtually unlimited HPC resources. These include data visualization, collaboration, and workflow management tools that help them perform calculations more efficiently. In this post, we describe the Orion architecture on AWS, and its capabilities to address the challenges in drug development.

Price-Performance Analysis of Amazon EC2 GPU Instance Types using NVIDIA’s GPU optimized seismic code

Seismic imaging is the process of positioning the Earth’s subsurface reflectors. It transforms the seismic data recorded in time at the Earth’s surface to an image of the Earth’s subsurface. This is done by back-propagating data from time to space in a given velocity model. Kirchhoff depth migration is a well-known technique used in geophysics for seismic imaging. Kirchhoff time and depth migration produce an image with higher resolution and generate an image of the subsurface for a subset class of the data, providing valuable information about the petrophysical properties of the rocks and helping to determine how accurate the velocity model is. This blog post looks at the price-performance characteristics of computing Kirchhoff migration methods on GPUs using NVIDIA’s GPU-optimized code.