AWS HPC Blog

Category: High Performance Computing

Figure 2: Identification of redun jobs and grouping them into Array Jobs to run on AWS Batch. (Top) redun represents the workflow as an Expression Graph (top-left) and identifies reductions (red boxes) that are ready to be executed. The redun Scheduler creates a redun Job (J1, J2, J3) for each reduction and dispatches those jobs to Executors based on the task-specific configuration. The Batch Executor lets jobs accumulate for up to three seconds (the default) so that it can identify compatible jobs and group them into an Array Job, which is then submitted to AWS Batch (top-right). (Bottom) As jobs complete in AWS Batch, success (green) and failure (red) are propagated back to the Executors and the Scheduler, and the results are eventually substituted back into the Expression Graph (bottom-left).
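The grouping step in the caption above can be illustrated with a minimal Python sketch. This is not redun's implementation: the `PendingJob` class, the `group_compatible` function, and the rule that jobs are "compatible" when they share a task name and resource request are all assumptions made for illustration, and the three-second accumulation window is represented simply by the list of jobs collected so far.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PendingJob:
    """Hypothetical stand-in for a job waiting in the executor's buffer."""
    name: str
    task: str
    vcpus: int
    memory_gb: int


def group_compatible(jobs: List[PendingJob]) -> List[List[PendingJob]]:
    """Group jobs collected during one accumulation window.

    Assumption: two jobs are compatible when they run the same task
    with the same resource request. Each group would become one array job.
    """
    groups: Dict[Tuple[str, int, int], List[PendingJob]] = defaultdict(list)
    for job in jobs:
        groups[(job.task, job.vcpus, job.memory_gb)].append(job)
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())


window = [
    PendingJob("J1", "align", 4, 16),
    PendingJob("J2", "align", 4, 16),
    PendingJob("J3", "call_variants", 8, 32),
]
array_jobs = group_compatible(window)
group_names = [[job.name for job in group] for group in array_jobs]
```

Here J1 and J2 share a key and end up in one group, while J3 forms a group of its own.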

Data Science workflows at insitro: how redun uses the advanced service features from AWS Batch and AWS Glue

Matt Rasmussen, VP of Software Engineering at insitro, expands on his first post on redun, insitro’s data science tool for bioinformatics, to describe how redun makes use of advanced AWS features. Specifically, Matt describes how AWS Batch’s Array Jobs are used to support workflows with large fan-out, and how AWS Glue’s DynamicFrame is used to run computationally heterogeneous workflows with different back-end needs such as Spark, all in the same workflow definition.

Figure 1: Evaluating a sequence alignment workflow using graph reduction. In redun, workflows are represented as an Expression Graph (left), which contains concrete value nodes (grey) and Expression nodes (blue). The redun scheduler identifies tasks that are ready to execute by finding subtrees that can be reduced (red boxes), then substitutes task results back into the Expression Graph (red arrows). The scheduler continues to find reductions until the Expression Graph reduces to a single concrete value (grey box, far right). If any reduction has been done before (determined by comparing an Expression's hash), the redun scheduler can replay the reduction from a central cache and skip task re-execution.
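The reduction and caching behaviour described in this caption can be sketched in a few lines of Python. This is a toy model under stated assumptions, not redun's API: `Expr`, `TASKS`, `CACHE`, `expr_hash`, and `evaluate` are hypothetical names, the hash is computed from the task name and its already-reduced arguments, and a simple recursive evaluation stands in for the scheduler's search for reducible subtrees.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Expr:
    """A pending task call: a task name plus arguments (values or Exprs)."""
    func: str
    args: Tuple[Any, ...]


# Hypothetical tasks standing in for real alignment steps.
TASKS: Dict[str, Callable[..., Any]] = {
    "align": lambda read, ref: f"aligned({read},{ref})",
    "merge": lambda *parts: "+".join(parts),
}

CACHE: Dict[str, Any] = {}   # stands in for the central cache
EXECUTIONS: List[str] = []   # records which tasks actually ran


def expr_hash(func: str, args: Tuple[Any, ...]) -> str:
    """Hash a task call by its name and concrete arguments (an assumption)."""
    return hashlib.sha256(repr((func, args)).encode()).hexdigest()


def evaluate(node: Any) -> Any:
    """Reduce an expression tree to a concrete value, bottom-up."""
    if not isinstance(node, Expr):
        return node  # already a concrete value
    args = tuple(evaluate(arg) for arg in node.args)
    key = expr_hash(node.func, args)
    if key not in CACHE:
        EXECUTIONS.append(node.func)
        CACHE[key] = TASKS[node.func](*args)
    return CACHE[key]  # replayed from the cache on later runs


workflow = Expr("merge", (Expr("align", ("r1", "ref")),
                          Expr("align", ("r2", "ref"))))
first = evaluate(workflow)
runs_after_first = len(EXECUTIONS)
second = evaluate(workflow)
runs_after_second = len(EXECUTIONS)
```

Evaluating the same workflow a second time returns the same value without running any task again, because every reduction is found in the cache.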

Data Science workflows at insitro: using redun on AWS Batch

Matt Rasmussen, VP of Software Engineering at insitro, describes their recently released, open-source data science framework, redun, which allows data scientists to define complex scientific workflows that scale from their laptop to large-scale distributed runs on serverless platforms like AWS Batch and AWS Glue. In this post, Matt shows how redun lends itself to bioinformatics workflows, which typically involve wrapping Unix-based programs that require file staging to and from object storage. In the next blog post, Matt describes how redun scales to large and heterogeneous workflows by leveraging AWS Batch features such as Array Jobs and AWS Glue features such as Glue DynamicFrame.

Migrating to AWS ParallelCluster v3 – Updated CLI interactions

The AWS ParallelCluster version 3 CLI differs significantly from ParallelCluster version 2. This post provides some guidance on mapping between versions to help you with migrating to ParallelCluster 3. We also summarize new CLI features in ParallelCluster 3 to expose the things you just couldn’t do previously.

Choosing between AWS Batch or AWS ParallelCluster for your HPC Workloads

It’s an understatement that AWS has a lot of services (more than 200 at the time of this post!). We’re usually the first to point out that there’s more than one way to solve a problem. HPC is no different in this regard, because we offer a choice: customers can run their HPC workloads using AWS […]

Getting the best OpenFOAM Performance on AWS

OpenFOAM is one of the most widely used Computational Fluid Dynamics (CFD) packages and helps companies in a broad range of sectors (automotive, aerospace, energy, and life sciences) to conduct research and design new products. In this post, we’ll discuss six practical things you can do as an OpenFOAM user to run your simulations faster and more cost-effectively.

Figure 2: AWS HTC-Grid’s Amazon EKS-based Compute Plane

Cloud-native, high throughput grid computing using the AWS HTC-Grid solution

We worked with our financial services customers to develop an open-source, scalable, cloud-native, high throughput computing solution on AWS — AWS HTC-Grid. HTC-Grid allows you to submit large volumes of short and long running tasks and scale environments dynamically. In this first blog of a two-part series, we describe the structure of HTC-Grid and its objective to provide a configurable blueprint for HPC grid scheduling on the cloud.

Optimize your Monte Carlo simulations using AWS Batch

Monte Carlo methods are a class of methods based on the idea of sampling to study mathematical problems for which analytical solutions may be unavailable. The basic idea is to create samples through repeated simulations that can be used to derive approximations about a quantity we’re interested in, and its probability distribution. In this […]

GROMACS performance on Amazon EC2 with Intel Ice Lake processors

We recently launched two new Amazon EC2 instance families based on Intel’s Ice Lake: the C6i and M6i. These instances provide higher core counts and take advantage of generational performance improvements in Intel’s Xeon Scalable processor family. In this post we show how GROMACS performs on these new instance families. We use methodologies similar to those in previous posts, where we characterized price-performance for CPU-only and GPU instances (Part 1, Part 2, Part 3), and provide instance recommendations for different workload sizes.
