This year, we’ve launched a lot of new capabilities for HPC customers, making AWS the best place for the length and breadth of their workflows. EFA went mainstream and is now available in sixteen instance families for fast fabric capabilities for scaling MPI and NCCL codes. We’ve written deep-dive studies to explore and explain the optimizations that will drive your workloads faster in the cloud than elsewhere. We released a major new version of AWS ParallelCluster with its own API for controlling the cluster lifecycle. AWS Batch became deeply integrated into AWS Step Functions and now supports fair-share scheduling, with multiple levers to control the experience. Today we’re signaling the arrival of a new HPC-dedicated instance family – the Hpc6a – and an enhanced EnginFrame that will bring the best of the cloud and on-premises together in a single interface.
HPC systems are traditionally access through a Command Line Interface (CLI) where the users submit and manage their computational jobs. Depending on their experience and sophistication, the CLI can be a daunting experience for users not accustomed in using it. Fortunately, the cloud offers many other options for users to submit and manage their computational jobs. In this blog post we will cover how to create a serverless API to interact with an HPC system in the the cloud built with AWS ParallelCluster.
The Slurm Workload Manager by SchedMD is a popular HPC scheduler and is supported by AWS ParallelCluster, an elastic HPC cluster management service offered by AWS. Traditional HPC workflows involve logging into a head node and running shell commands to submit jobs to a scheduler and check job status. Modern distributed systems often use representational […]
In September, we announced the release of AWS ParallelCluster 3, a major release with lots of changes and new features. To help get you started migrating your clusters, we provided the Moving from AWS ParallelCluster 2.x to 3.x guide. We know moving versions can be a quite an undertaking, so we’re augmenting that official documentation with additional color and context on a few key areas. With this blog post, we’ll focus on the configuration file format changes for ParallelCluster 3, and how they map back to the same configuration sections for ParallelCluster 2.
Many shared file systems are used in supporting read-intensive applications, like financial backtesting. These applications typically exploit copies of datasets whose authoritative copy resides somewhere else. For small datasets, in-memory databases and caching techniques can yield impressive results. However, low latency flash-based scalable shared file systems can provide both massive IOPs and bandwidth. They’re also easy to adopt because of their use of a file-level abstraction. In this post, I’ll share how to easily create and scale a shared, distributed POSIX compatible file system that performs at local NVMe speeds for files opened read-only.
In this blog post, we’ll describe an ensemble run of 20K simulations to accelerate the drug discovery process, while also optimizing for run time and cost. We used two popular open-source packages — GROMACS, which does a molecular dynamics simulations, and pmx, a free-energy calculation package from the Computational Biomolecular Dynamics Group at Max Planck Institute in Germany.
The Financial Services industry makes significant use of high performance computing (HPC) but it tends to be in the form of loosely coupled, embarrassingly parallel workloads to support risk modelling. The infrastructure tends to scale out to meet ever increasing demand as the analyses look at more and finer grained data. At AWS we’ve helped many customers tackle scaling challenges are noticing some common themes. In this post we describe how HPC teams are thinking about how they deliver compute capacity today, and highlight how we see the solutions converging for the future.
Recently, we talked about the advances NICE DCV has made to push pixels from cloud-hosted desktops or applications over the internet even more efficiently than before. Since we published that post on this blog channel, we’ve been asked by several customers whether all this efficient pixel-pushing could lead to outbound data charges moving up on their AWS bill. We decided to try it on your behalf, and share the details with you in this post. The bottom line? The charges are unlikely to be significant unless you’re doing intensive streaming (such as gaming) and other cost optimizations (like AWS Instance Savings Plans) that will have more impact on your bill.
This three-part series of posts cover the price performance characteristics of running GROMACS on Amazon Elastic Compute Cloud (Amazon EC2) GPU instances. Part 1 covered some background no GROMACS and how it utilizes GPUs for acceleration. Part 2 covered the price performance of GROMACS on a particular GPU instance family running on a single instance. […]
This three-part series of posts cover the price performance characteristics of running GROMACS on Amazon Elastic Compute Cloud (Amazon EC2) GPU instances. Part 1 covered some background no GROMACS and how it utilizes GPUs for acceleration. This post (Part 2) covers the price performance of GROMACS on a particular GPU instance family running on a […]