AWS HPC Blog
Category: AWS ParallelCluster
The plumbing: best-practice infrastructure to facilitate HPC on AWS
If you want to build enterprise-grade HPC on AWS, what’s the best path to get started? Should you create a new AWS account and build from scratch? In this post we’ll walk you through the best practices for getting setup cleanly from the start.
Automate your clusters by creating self-documenting HPC with AWS ParallelCluster
Today we’re going to show you how you can automate cluster deployment and create self-documenting infrastructure at the same time, which leads to more repeatable results that are easier to manage (and replicate).
Running protein structure prediction at scale using a web interface for researchers
Today, we’ll show you our open-source sample implementation of a web frontend and cloud HPC backend to support researchers using AI tools like AlphaFold for drug discovery and design.
Instance sizes in the Amazon EC2 Hpc7 family – a different experience
Hpc7g is the first Amazon EC2 HPC instance offering with multiple instance sizes, but this is quite different from the experience of getting smaller instances from other non-HPC instance families. Today, we want to take a moment to explore why this is different, and how it helps.
Customize Slurm settings with AWS ParallelCluster 3.6
With AWS ParallelCluster 3.6, you can directly specify Slurm settings in the cluster config file – improving reproducibility and another step towards self-documentation for your HPC infrastructure.
Introducing GPU health checks in AWS ParallelCluster 3.6
AWS ParallelCluster 3.6.0 can now detect GPU failures in HPC and AI/ML tasks. Health checks run at the start of Slurm jobs and if they fail, the job is requeued on another instance. This can increase reliability and prevent wasted spend.
Elastic visualization queues with NICE DCV in AWS ParallelCluster
In this blog post we’ll show you how to create an elastic pool of visualization nodes, by combining AWS ParallelCluster with NICE DCV in a novel way.
Checkpointing HPC applications using the Spot Instance two-minute notification from Amazon EC2
In this post we show you how to create an HPC cluster and capture the two-minute warning notifications from Amazon EC2 Spot to execute a checkpoint, reactively.
Install optimized software with Spack configs for AWS ParallelCluster
Today, we’re announcing the availability of Spack configs for AWS ParallelCluster. You can use these configurations to install optimized HPC applications quickly and easily on your AWS-powered HPC clusters.
Deploying Open OnDemand with AWS ParallelCluster
In this post, we describe an integration of Open OnDemand with AWS ParallelCluster so admins can provide web-based access to HPC resources beyond what they have at their site, by using the AWS cloud to add new capabilities and extend capacity.