AWS HPC Blog

Category: Technical How-to

Create a Slurm cluster for semiconductor design with AWS ParallelCluster

If you work in the semiconductor industry with electronic design automation (EDA) tools and workflows, this guide will help you build an HPC cluster on AWS configured for your needs. It covers AWS ParallelCluster and the customizations that cater to EDA.

Building a Scalable Predictive Modeling Framework in AWS – Part 2

In the first part of this three-part blog series, we introduced the aws-do-pm framework for building predictive models at scale in AWS. In this blog, we showcase a sample application for predicting the life of batteries in a fleet of electric vehicles, using the aws-do-pm framework.

Optimize your Monte Carlo simulations using AWS Batch

Monte Carlo methods are a class of methods based on the idea of sampling to study mathematical problems for which analytical solutions may be unavailable. The basic idea is to create samples through repeated simulations that can be used to derive approximations about a quantity we’re interested in, and its probability distribution. In this […]
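
As a rough illustration of the sampling idea in the excerpt above, here is a minimal sketch that approximates π by drawing random points in the unit square and counting how many fall inside the quarter circle. The function name `estimate_pi`, the sample count, and the seed are assumptions made for this example. They do not come from the post, and the sketch does not use AWS Batch.

```python
import random


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Approximate pi by sampling points uniformly in the unit square.

    The fraction of points that land inside the quarter circle of
    radius 1 approximates pi / 4.
    """
    rng = random.Random(seed)  # seeded generator so runs are repeatable
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

Each sample is independent of the others, so the loop can be split into separate batches whose counts are summed afterwards.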

Running Windows HPC Workloads using HPC Pack in AWS

This blog post shows you how to deploy an HPC cluster for Windows workloads. We have provided an AWS CloudFormation template that automates the creation process to deploy an HPC Pack 2019 Windows cluster. This will help you get started quickly to run Windows-based HPC workloads, while leveraging highly scalable, resilient, and secure AWS infrastructure. As an example, we show how to run a sample parametric sweep for EnergyPlus, an open source energy simulation tool maintained by the U.S. Department of Energy’s Building Technology Office.

Figure 1: High-level architecture of the file system.

Scaling a read-intensive, low-latency file system to 10M+ IOPs

Many shared file systems are used in supporting read-intensive applications, like financial backtesting. These applications typically exploit copies of datasets whose authoritative copy resides somewhere else. For small datasets, in-memory databases and caching techniques can yield impressive results. However, low-latency, flash-based, scalable shared file systems can provide both massive IOPs and bandwidth. They’re also easy to adopt because they use a file-level abstraction. In this post, I’ll share how to easily create and scale a shared, distributed, POSIX-compatible file system that performs at local NVMe speeds for files opened read-only.

Using AWS Batch Console Support for Step Functions Workflows

Last year, we published the Genomics Secondary Analysis Using AWS Step Functions and AWS Batch solution as a companion solution to the Genomics Data Transfer, Analytics, and Machine Learning Using AWS Services whitepaper. Since then, many customers have used the secondary analysis solution to automate their bioinformatics pipelines in AWS. A common pain point expressed […]

Cost-optimization on Spot Instances using checkpoint for Ansys LS-DYNA

A major portion of the cost of running finite element analysis (FEA) workloads on AWS comes from the usage of Amazon EC2 instances. Amazon EC2 Spot Instances offer a cost-effective architectural choice, allowing you to take advantage of unused EC2 capacity at a discount of up to 90% compared to On-Demand Instance prices. In this post, we describe how you can run fault-tolerant FEA workloads on Spot Instances using Ansys LS-DYNA’s checkpointing and auto-restart utility.