AWS HPC Blog

Category: AWS ParallelCluster

Large scale training with NVIDIA NeMo Megatron on AWS ParallelCluster using P5 instances

Launching distributed GPT training? See how AWS ParallelCluster sets up a fast shared filesystem, SSH keys, host files, and more between nodes. Our guide has the details for creating a Slurm-managed cluster to train NeMo Megatron at scale.

Best practices for running molecular dynamics simulations on AWS Graviton3E

If you run molecular dynamics simulations, you need to read this. We walk through benchmarks of popular apps like GROMACS and LAMMPS on the new Hpc7g instances with Graviton3E processors. The result: up to 35% better vector performance versus Graviton3! Learn how to optimize your own workflows.

Protein language model training with NVIDIA BioNeMo framework on AWS ParallelCluster

In this new post, we discuss pre-training ESM-1nv for protein language modeling with NVIDIA BioNeMo on AWS. Learn how you can efficiently deploy and customize generative models like ESM-1nv on GPU clusters with ParallelCluster. Whether you’re studying protein sequences, predicting properties, or discovering new therapeutics, this post has tips to accelerate your protein AI workloads on the cloud.

Enhancing ML workflows with AWS ParallelCluster and Amazon EC2 Capacity Blocks for ML

No more guessing if GPU capacity will be available when you launch ML jobs! EC2 Capacity Blocks for ML let you lock in GPU reservations so you can start tasks on time. Learn how to integrate Capacity Blocks into AWS ParallelCluster to optimize your workflow in our latest technical blog post.