AWS HPC Blog
Tag: MPI
Deploying generative AI applications with NVIDIA NIMs on Amazon EKS
Learn how to deploy AI models at scale on AWS using NVIDIA’s NIM and Amazon EKS! This step-by-step guide shows you how to create a GPU cluster for inference. Don’t miss part 1 of this 2-part blog series!
Optimizing MPI application performance on hpc7a by effectively using both EFA devices
Get the inside scoop on optimizing your MPI apps and configuration for AWS’s powerful new Hpc7a instances. Dual rail gives these instances huge networking potential at 300 Gb/s, if properly used. This post provides benchmarks, sample configs, and real speedup numbers to help you maximize network performance. Whether you run weather simulations, CFD, or other HPC workloads, you’ll find practical tips for your codes.
EFA: how fixing one thing, led to an improvement for … everyone
Today, we’re diving deep into the open-source frameworks that move MPI messages around, and showing you how work we did in the Open MPI and libfabric communities led to an improvement for EFA users – and everyone else, too.
Deep-dive into Hpc7a, the newest AMD-powered member of the HPC instance family
Today we discuss the performance results we saw from the new Hpc7a instance, running HPC workloads like CFD, molecular dynamics, and weather prediction codes.
Instance sizes in the Amazon EC2 Hpc7 family – a different experience
Hpc7g is the first Amazon EC2 HPC instance offering with multiple instance sizes, but this is quite different from the experience of getting smaller instances from other non-HPC instance families. Today, we want to take a moment to explore why this is different, and how it helps.
Application deep-dive into the AWS Graviton3E-based Amazon EC2 Hpc7g instance
In this post we’ll show you application performance and scaling results from Hpc7g, a new instance powered by AWS Graviton3E across a wide range of HPC workloads and disciplines.
Checkpointing HPC applications using the Spot Instance two-minute notification from Amazon EC2
In this post we show you how to create an HPC cluster and capture the two-minute warning notifications from Amazon EC2 Spot to execute a checkpoint, reactively.
In the search for performance, there’s more than one way to build a network
AWS worked backwards from an essential problem in HPC networking (MPI ranks need to exchange lots of data quickly) and found a different solution for our unique circumstances, without trading off the things customers love the most about cloud: that you can run virtually any application, at scale, and right away. Find out more about how Elastic Fabric Adapter (EFA) can help your HPC workloads scale on AWS.