Slurm-based memory-aware scheduling in AWS ParallelCluster 3.2
Olly Perks, Snr Dev Advocate for HPC, HPC Engineering
Austin Cherian, Snr Product Manager for HPC
With the release of AWS ParallelCluster version 3.2, we now support new scheduling capabilities based on the memory requirements of your HPC jobs. ParallelCluster now supports memory-aware scheduling in Slurm, giving you control over the placement of jobs with specific memory requirements.
Resource utilization patterns of HPC workloads vary a lot depending on factors like your domain, industry and sector. For some workloads, memory capacity is more critical than CPU core-count – both from a functional and performance perspective.
Optimizing around memory requirements can also lead to better efficiencies, which can result in cost savings. A frequently-adopted technique is to co-locate jobs onto a single instance to consume not just the total core count but the total usable memory the instance can provide. However, optimizing to pack jobs into a single instance deterministically becomes problematic when jobs have large memory requirements and you can’t be sure they’ll fit. By default, Slurm will schedule jobs based on CPU requirements – making it difficult to efficiently co-locate memory intensive jobs.
Memory-aware scheduling means you define the memory requirements of your job, and let Slurm schedule accordingly. This extra information allows Slurm to co-locate memory intensive jobs – whilst protecting the requested memory capacity.
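To make the difference concrete, here is a toy sketch of co-locating jobs on a single node, first looking only at CPUs and then at CPUs and memory together. It is not Slurm's actual placement algorithm: the `Job` and `Node` classes, the greedy `place` function, and the job sizes are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int
    mem_gb: int  # memory the job actually needs (hypothetical figures)


@dataclass
class Node:
    cpus: int
    mem_gb: int


def place(jobs, node, memory_aware):
    """Greedily admit jobs onto one node; return the names of admitted jobs."""
    free_cpus, free_mem = node.cpus, node.mem_gb
    admitted = []
    for job in jobs:
        fits_cpu = job.cpus <= free_cpus
        fits_mem = job.mem_gb <= free_mem
        if fits_cpu and (fits_mem or not memory_aware):
            admitted.append(job.name)
            free_cpus -= job.cpus
            free_mem -= job.mem_gb  # may go negative when memory is ignored
    return admitted


node = Node(cpus=8, mem_gb=32)
jobs = [Job("a", 2, 12), Job("b", 2, 12), Job("c", 2, 12), Job("d", 2, 4)]

cpu_only = place(jobs, node, memory_aware=False)
mem_aware = place(jobs, node, memory_aware=True)
print(cpu_only)   # all four jobs admitted: 40 GB needed on a 32 GB node
print(mem_aware)  # job "c" is held back; a, b and d need 28 GB in total
```

In the CPU-only case the node is oversubscribed on memory, which is the situation that leads to jobs being aborted. With memory taken into account, the job that does not fit waits for another node.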
Prior to this feature, the task scheduling for memory-intensive jobs was a combination of estimation and guess-work. The consequence of getting it wrong was an increased risk of jobs running out of memory and getting aborted.
With the memory-aware scheduling feature you now have the ability to define the memory requirements of your jobs within Slurm. Slurm then constrains scheduling to only those Amazon Elastic Compute Cloud (Amazon EC2) instances which can guarantee meeting the memory requirement you specified.
In ParallelCluster 3.2, Slurm is now aware of the memory capacity of each Amazon EC2 instance type. By default, the memory-aware scheduling feature is disabled to ensure backwards compatibility. You can enable it via the ParallelCluster configuration file, and then make use of Slurm’s memory-related job submission flags, such as --mem-per-cpu.
You can read more about the usage of this feature in the official AWS ParallelCluster User Guide.
Enabling memory-aware scheduling
To enable memory-aware scheduling, you must first update your ParallelCluster version to the latest 3.2 release. You must then add the new EnableMemoryBasedScheduling attribute to the SlurmSettings section, under the Scheduling section of the cluster configuration file, and set it to true. Here’s a snippet of the configuration parameters to enable memory-aware scheduling:
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    EnableMemoryBasedScheduling: true  # Default is: false
Creating a cluster with this configuration enabled allows Slurm to be aware of the available memory on the Amazon EC2 compute nodes. You can now submit jobs using Slurm’s memory flags, for example with the command:
$ sbatch -N 1 -n 4 -c 1 --mem-per-cpu=4GB job.sh
This command submits a job to Slurm asking for 1 node, 4 tasks, and 1 CPU per task. It also sets a memory requirement of 4 GB per CPU, meaning the job requires 16 GB of memory in total. Slurm can now schedule it onto any node with at least 16 GB of free memory and 4 free CPUs.
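The arithmetic behind that request can be sketched in a few lines. The helper names `job_memory_gb` and `node_can_host` are hypothetical, and the check is a simplification of what a scheduler does, not Slurm's implementation.

```python
def job_memory_gb(tasks, cpus_per_task, mem_per_cpu_gb):
    """Total memory implied by a per-CPU memory request."""
    return tasks * cpus_per_task * mem_per_cpu_gb


def node_can_host(free_cpus, free_mem_gb, tasks, cpus_per_task, mem_per_cpu_gb):
    """True if a single node has enough free CPUs and free memory for the job."""
    needed_cpus = tasks * cpus_per_task
    needed_mem = job_memory_gb(tasks, cpus_per_task, mem_per_cpu_gb)
    return free_cpus >= needed_cpus and free_mem_gb >= needed_mem


# Values taken from: sbatch -N 1 -n 4 -c 1 --mem-per-cpu=4GB job.sh
total = job_memory_gb(tasks=4, cpus_per_task=1, mem_per_cpu_gb=4)
print(total)  # 16

print(node_can_host(8, 16, 4, 1, 4))  # True: 4 CPUs and 16 GB are available
print(node_can_host(8, 12, 4, 1, 4))  # False: enough CPUs, too little memory
print(node_can_host(2, 64, 4, 1, 4))  # False: enough memory, too few CPUs
```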
Since the exact memory available on an Amazon EC2 instance can vary depending on the memory consumption of the OS and other processes, Slurm will only be allowed to “see” 95% of the memory specified for an EC2 instance type. If you’re using extremely large memory instances (e.g., instances with terabytes of RAM), the remaining 5% can leave hundreds of gigabytes of memory unused. You can avoid that kind of wastage by tweaking the available memory of an instance type in a specific compute resource. To do that, set the SchedulableMemory parameter under the ComputeResources settings to an amount of memory based on your own estimate. Slurm will then “see” the specific amount of memory you defined, rather than the default 95% of what’s specified for the instance type. Here’s a snippet of the AWS ParallelCluster configuration parameters to enable memory-aware scheduling and specify the SchedulableMemory of a compute resource in a queue:
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    EnableMemoryBasedScheduling: true  # Default is: false
  SlurmQueues:
    - Name: <queue_name>
      ComputeResources:
        - Name: <compute_resource_name>
          ...
          SchedulableMemory: <amount in MiB>  # Default is: 95% of memory advertised by the EC2 instance
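To get a feel for how much memory the default 95% rule leaves unscheduled, the following sketch works through the numbers for a hypothetical instance advertising 4 TiB of memory. The function name and the choice to round down to a whole MiB are assumptions made for illustration; how ParallelCluster rounds internally is not stated here.

```python
MIB_PER_GIB = 1024


def default_schedulable_mib(advertised_mib, percent=95):
    """Memory Slurm would 'see' under the default rule, rounded down (assumed)."""
    return advertised_mib * percent // 100


# Hypothetical instance advertising 4 TiB of memory, expressed in MiB.
advertised_mib = 4 * 1024 * MIB_PER_GIB
schedulable_mib = default_schedulable_mib(advertised_mib)
unscheduled_mib = advertised_mib - schedulable_mib

print(schedulable_mib)                               # 3984588
print(unscheduled_mib)                               # 209716
print(f"{unscheduled_mib / MIB_PER_GIB:.1f} GiB")    # 204.8 GiB
```

If you measure that the OS and other processes on your nodes need much less than that, a value between the default and the advertised total is a candidate for SchedulableMemory, expressed in MiB.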
Getting started with memory-aware scheduling
With memory-aware scheduling you are now able to submit jobs based on their memory requirements. This enables more flexible co-location of jobs, while also protecting memory capacity.
To use memory-aware scheduling you must first update to the latest ParallelCluster version – this upgrade guide documents that process.
To enable the feature, you then need to modify the cluster configuration file, with the new memory-aware scheduling attributes. The AWS ParallelCluster User Guide details this procedure and gives more information about these attributes.
Once you’ve configured your cluster you can start experimenting with memory-aware scheduling. Check out the Slurm user documentation for more details on the scheduling behavior, and specifics of the job submission flags.
Memory-aware scheduling isn’t the only new feature in AWS ParallelCluster 3.2: there are also newly supported filesystem types and the ability to have multiple filesystem mounts. Keep following the HPC Blog Channel to stay up to date on these.