AWS HPC Blog

Category: AWS ParallelCluster

Recent improvement to Open MPI AllReduce and the impact to application performance

Our team engineered Open MPI optimizations for EFA to enhance the performance of HPC codes running in the cloud. By improving MPI_Allreduce, they achieved scaling that matches commercial MPI implementations. Tests show gains for applications including Code Saturne and OpenFOAM on both Arm64 and x86 instances. Check out how these changes can speed up your HPC workloads in the cloud.
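The post doesn't spell out which collective algorithm was tuned, but MPI_Allreduce is commonly built on exchange patterns such as recursive doubling. As a minimal sketch (the function name and the rank simulation are illustrative, not the Open MPI implementation), here is that pattern simulated over n "ranks" in Python:

```python
def recursive_doubling_allreduce(values, op=lambda a, b: a + b):
    """Simulate a recursive-doubling allreduce over len(values) ranks.

    At step s, each rank r exchanges its partial result with partner
    r XOR 2**s and both combine the two values. After log2(n) steps,
    every rank holds the full reduction -- one classic pattern behind
    MPI_Allreduce for small messages (illustrative sketch only).
    """
    n = len(values)
    assert n & (n - 1) == 0, "power-of-two rank count, for simplicity"
    bufs = list(values)
    step = 1
    while step < n:
        # All ranks exchange with their partner "simultaneously".
        bufs = [op(bufs[r], bufs[r ^ step]) for r in range(n)]
        step *= 2
    return bufs
```

For example, `recursive_doubling_allreduce([1, 2, 3, 4])` leaves every rank holding the sum 10 after two exchange steps, versus n-1 steps for a naive ring.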

Securing HPC on AWS – isolated clusters

In this post, we’ll share two ways customers can operate HPC workloads using AWS ParallelCluster while completely isolated from the Internet. ParallelCluster supports many network configurations to cover a range of use cases. By isolation we mean situations where your HPC cluster is completely self-contained inside AWS, or where you have a private […]
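As a hedged sketch of what a self-contained setup might look like (not the post's exact configuration), a ParallelCluster v3 config could place both the head node and compute nodes in a private subnet with no public IPs. The subnet ID and instance types below are placeholders, and a truly isolated cluster also needs VPC endpoints for the AWS services ParallelCluster calls:

```yaml
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    SubnetId: subnet-0example0000000000   # private subnet, no route to an internet gateway
    ElasticIp: false                      # no public IP on the head node
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      ComputeResources:
        - Name: c5
          InstanceType: c5.xlarge
          MinCount: 0
          MaxCount: 8
      Networking:
        SubnetIds:
          - subnet-0example0000000000     # same private subnet for compute nodes
```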

Large scale training with NVIDIA NeMo Megatron on AWS ParallelCluster using P5 instances

Launching distributed GPT training? See how AWS ParallelCluster sets up a fast shared filesystem, SSH keys, host files, and more across nodes. Our guide walks through creating a Slurm-managed cluster to train NeMo Megatron at scale.
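To give a flavor of the pieces the guide wires together, a ParallelCluster v3 config fragment might declare a Slurm queue of P5 instances with EFA enabled plus an FSx for Lustre shared filesystem. This is a sketch under assumed values (queue names, counts, and storage capacity are illustrative, not the post's settings):

```yaml
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: gpu
      ComputeResources:
        - Name: p5
          InstanceType: p5.48xlarge   # GPU instance type; adjust to your capacity
          MinCount: 0
          MaxCount: 4
          Efa:
            Enabled: true             # fast inter-node networking for distributed training
SharedStorage:
  - MountDir: /fsx                    # fast shared filesystem visible on every node
    Name: fsxshared
    StorageType: FsxLustre
    FsxLustreSettings:
      StorageCapacity: 1200           # GiB; illustrative size
```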