AWS HPC Blog

Alberto Falzone

Author: Alberto Falzone

Alberto Falzone is an HPC Professional Services Consultant at Amazon Web Services. He started out working for Nice Software in 2001 and joined AWS in 2016 throught the acquisition of Nice Software.

Using Spot Instances with AWS ParallelCluster and Amazon FSx for Lustre

Processing large amounts of complex data often requires leveraging a mix of different Amazon EC2 instance types. These types of computations also benefit from shared, high performance, scalable storage like Amazon FSx for Lustre. A way to save costs on your analysis is to use Amazon EC2 Spot Instances, which can help to reduce EC2 costs up to 90% compared to On-Demand Instance pricing. This post will guide you in the creation of a fault-tolerant cluster using AWS ParallelCluster. We will explain how to configure ParallelCluster to automatically unmount the Amazon FSx for Lustre filesystem and resubmit the interrupted jobs back into the queue in the case of Spot interruption events.

Building highly-available HPC infrastructure on AWS

In this blog post, we will explain how to launch highly available HPC clusters across an AWS Region. The solution is deployed using the AWS Cloud Developer Kit (AWS CDK), a software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation, hiding the complexity of integration between the components.