AWS HPC Blog

Tag: Slurm

Large scale training with NVIDIA NeMo Megatron on AWS ParallelCluster using P5 instances

Launching distributed GPT training? See how AWS ParallelCluster sets up a fast shared filesystem, SSH keys, host files, and more across the nodes. Our guide walks through creating a Slurm-managed cluster to train NVIDIA NeMo Megatron at scale.
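As a flavor of the kind of glue involved, here is a minimal sketch (not from the guide) of how a job script might expand the Slurm nodelist into a hostfile for a multi-node launcher. It assumes it runs inside a Slurm allocation where SLURM_JOB_NODELIST is set and `scontrol show hostnames` is available; the output filename and the OpenMPI-style `slots=` convention are illustrative choices, with 8 slots matching the GPU count on a p5.48xlarge node.

```python
import os
import subprocess

def write_hostfile(path="hostfile.txt", slots_per_node=8):
    """Expand the Slurm nodelist into a plain hostfile.

    Assumes this runs inside a Slurm allocation, where SLURM_JOB_NODELIST
    holds the compressed node list (e.g. "queue-st-p5-[1-4]") and
    `scontrol show hostnames` is available to expand it.
    """
    nodelist = os.environ["SLURM_JOB_NODELIST"]
    hostnames = subprocess.run(
        ["scontrol", "show", "hostnames", nodelist],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    with open(path, "w") as f:
        for host in hostnames:
            # One line per node; slots = processes per node (8 GPUs on p5.48xlarge)
            f.write(f"{host} slots={slots_per_node}\n")
    return hostnames

if __name__ == "__main__":
    hosts = write_hostfile()
    print(f"Wrote hostfile for {len(hosts)} nodes")
```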