AWS HPC Blog
Support for Instance Allocation Flexibility in AWS ParallelCluster 3.3
With AWS ParallelCluster, you can build an autoscaling HPC cluster that adapts its size and cost to the amount of work you have to do.
ParallelCluster accomplishes this by monitoring how many Amazon EC2 virtual CPUs (vCPUs) are needed to run the pending jobs in a Slurm queue. If that number exceeds a configurable threshold, ParallelCluster adds more EC2 instances to the cluster and makes them available to run jobs. In previous versions of ParallelCluster, these new instances all had to launch from the same compute resource, and each compute resource could use only one instance type. If there were not enough instances of the requested type available in the Availability Zone mapped to the cluster, ParallelCluster could switch over to another compute resource and try the instance type configured there. It was not, however, possible to launch instances from multiple compute resources at the same time.
Today we’re announcing “multiple instance type allocation” in ParallelCluster 3.3.0. This feature enables you to specify multiple instance types to use when scaling up the compute resources for a Slurm job queue. Your HPC workloads will have more paths forward to get access to the EC2 capacity they need, helping you to get more computing done.
This post explains in detail how the new feature works and how to configure your cluster to use it.
What’s New
In previous versions of ParallelCluster, you defined one or more Slurm queues. You configured each queue with one or more compute resources, and each compute resource could be configured with a single EC2 instance type used to launch a collection of instances (Figure 1A). But only one compute resource (and, as a consequence, one instance type) could be used for instance launches at any given time.
However, many EC2 instance types are close enough in architecture that they can be used interchangeably: they have the same vCPU count, processor architecture, accelerators, network capabilities, and so on. Customers asked to be able to combine those instance types in a single Slurm compute resource, which is what ParallelCluster 3.3.0 now enables. Specifically, you can provide a list of instance types (Figure 1B) and set an allocation strategy (Figure 1C) to optimize the cost and total time to solution of your HPC jobs.
Using Multiple Instance Type Allocation with Slurm
To take advantage of this new capability, you'll need to upgrade ParallelCluster to 3.3.0. You can follow this online guide to help you upgrade. Next, edit your cluster configuration as described below. Finally, create a cluster using the new configuration.
Configuring Your Cluster
The schema for the ParallelCluster configuration file has changed to support flexible instance types. In this example, we define a Slurm queue called flex_od, powered by Amazon EC2 On-Demand Instances, with a single compute resource named cra. Note how this differs from earlier versions of the ParallelCluster configuration file: rather than defining a single InstanceType, we now have a parameter named Instances, which contains three InstanceType entries, one for each instance type we want to use.
...
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: flex_od
      CapacityType: ONDEMAND
      ComputeResources:
        - Name: cra
          Instances:
            - InstanceType: c6a.24xlarge
            - InstanceType: m6a.24xlarge
            - InstanceType: r6a.24xlarge
          MinCount: 0
          MaxCount: 100
      Networking:
        SubnetIds:
          - subnet-0123456789
...
Our new configuration tells ParallelCluster to launch up to 100 total c6a.24xlarge, m6a.24xlarge, and r6a.24xlarge On-Demand Instances to process jobs in the flex_od queue. You can switch to using Spot Instances by changing the CapacityType to SPOT. Also, you can still have multiple compute resources for each queue if you have a use case that requires that configuration.
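For example, you might keep a flexible compute resource for general work alongside a resource pinned to a specific instance type, all in one queue. Here's a sketch of what that could look like; the queue and resource names (mixed, flex, pinned) are illustrative, not from the example above:

```yaml
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: mixed
      CapacityType: ONDEMAND
      ComputeResources:
        # Flexible resource: interchangeable 96-vCPU x86 types
        - Name: flex
          Instances:
            - InstanceType: c6a.24xlarge
            - InstanceType: m6a.24xlarge
          MinCount: 0
          MaxCount: 50
        # Dedicated resource pinned to a single instance type
        - Name: pinned
          Instances:
            - InstanceType: hpc6a.48xlarge
          MinCount: 0
          MaxCount: 10
      Networking:
        SubnetIds:
          - subnet-0123456789
```

Each compute resource maps to its own set of Slurm nodes, so jobs can target either the flexible or the pinned capacity within the same queue.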
Selecting Instance Types
There are some rules that define which instance types can be combined in a compute resource. First, the CPUs must all share the same broad architecture (e.g. x86), though they can come from different manufacturers (such as Intel and AMD). They must have the same number of vCPUs or, if CPU hyperthreading is disabled, the same number of physical cores. Next, if the instances have accelerators, they must have the same number of accelerators, from the same manufacturer. Finally, if EFA is enabled for the queue, all instance types must support EFA.
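As an illustration of these rules, the following compute resource (the name cr64 is hypothetical) mixes an Intel-based and an AMD-based type. This combination is valid because both are x86_64 and both have 64 vCPUs:

```yaml
ComputeResources:
  - Name: cr64
    Instances:
      - InstanceType: c6i.16xlarge   # Intel, 64 vCPUs, x86_64
      - InstanceType: c6a.16xlarge   # AMD, 64 vCPUs, x86_64
    MinCount: 0
    MaxCount: 50
```

Combining, say, a 64-vCPU type with a 96-vCPU type in the same compute resource would be rejected, since Slurm needs every node in the resource to advertise the same CPU count.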
You can find instance types with matching criteria by searching in the AWS EC2 Console under Instance Types (Figure 2).
You can also use the AWS CLI with a search filter. Here's an example that finds all instance types with 64 x86 vCPUs, along with example output (Figure 3).
aws ec2 describe-instance-types --region REGION-ID \
--filters "Name=vcpu-info.default-vcpus,Values=64" "Name=processor-info.supported-architecture,Values=x86_64" \
--query "sort_by(InstanceTypes[*].{InstanceType:InstanceType,MemoryMiB:MemoryInfo.SizeInMiB,CurrentGeneration:CurrentGeneration,VCpus:VCpuInfo.DefaultVCpus,Cores:VCpuInfo.DefaultCores,Architecture:ProcessorInfo.SupportedArchitectures[0],MaxNetworkCards:NetworkInfo.MaximumNetworkCards,EfaSupported:NetworkInfo.EfaSupported,GpuCount:GpuInfo.Gpus[0].Count,GpuManufacturer:GpuInfo.Gpus[0].Manufacturer}, &InstanceType)" \
--output table
Choosing an Allocation Strategy
By default, ParallelCluster will optimize for cost by launching the least expensive instances first.
However, when you are using Spot Instances, you will probably want to maximize the chances that your jobs run to completion instead of being interrupted. This is especially true for workloads where checkpointing and restarting work in progress is expensive. You can configure a ParallelCluster queue with this optimization by adding an AllocationStrategy key to the queue and setting it to capacity-optimized, rather than its default value of lowest-price. Here's an example:
...
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: flex_spot
      CapacityType: SPOT
      AllocationStrategy: capacity-optimized
      ComputeResources:
        - Name: cra
          Instances:
            - InstanceType: c6i.xlarge
            - InstanceType: m6i.xlarge
            - InstanceType: r6i.xlarge
          MinCount: 0
          MaxCount: 100
      Networking:
        SubnetIds:
          - subnet-0123456789
...
Under this configuration, EC2 Fleet looks at real-time Spot capacity and launches instances into the pools that are most available. This offers the possibility of fewer interruptions to running HPC jobs. Depending on the cost of interruptions, this strategy, which may not always launch the least expensive instances, may still reduce the total cost or runtime of your jobs.
It’s also possible to change the AllocationStrategy dynamically, without having to stop and restart the compute fleet. This can help you respond to shifting Spot availability conditions. To accomplish this, set the cluster’s queue update strategy to either DRAIN or TERMINATE. DRAIN sets nodes affected by the changed configuration to the DRAINING state: they won’t accept new jobs, but will continue running any jobs already in progress. TERMINATE, on the other hand, immediately stops any running jobs on the affected nodes. You can consult the pcluster update-cluster documentation for more detail on cluster updates using this setting. Here’s an example of setting it to DRAIN:
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    QueueUpdateStrategy: DRAIN
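Putting this together, a dynamic allocation-strategy change is just a configuration edit plus an update. The sketch below reuses the flex_spot queue from the earlier example and switches it back to the default strategy; the elided fields stay as before:

```yaml
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    QueueUpdateStrategy: DRAIN
  SlurmQueues:
    - Name: flex_spot
      CapacityType: SPOT
      # Switched from capacity-optimized back to the default
      AllocationStrategy: lowest-price
      ...
```

You would then apply the change with pcluster update-cluster --cluster-name <your-cluster> --cluster-configuration <config-file>. With DRAIN in effect, affected nodes finish their running jobs before picking up the new strategy.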
Conclusion
With ParallelCluster 3.3, you can specify a list of EC2 instance types, and optionally an allocation strategy, to define aggregate capacity for your Slurm job queues. This gives you new flexibility in how you assemble compute capacity for your HPC workloads. You'll need to update your ParallelCluster installation and configuration to use this new capability.
We’d love to know what you think after trying out ParallelCluster multiple instance type allocation, and how we can improve it.