AWS HPC Blog

Using the ParallelCluster 3 Configuration Converter

In September 2021, we announced the release of AWS ParallelCluster 3, a major release with several changes and many new features. To help you start migrating your clusters, we provided the “Moving from AWS ParallelCluster 2.x to 3.x” guide. One key change is that the configuration is now expressed in YAML instead of INI syntax, so migrating from ParallelCluster version 2 to version 3 requires rewriting your configuration file in the new syntax.

To help with this, we’ve created a config converter tool that ships with the ParallelCluster command line interface (CLI), version 3.0.1 and later.

This post gives you an overview of the tool to help you get started.

Availability & Usage

This config converter tool is available in the standard executable path once ParallelCluster is installed. It can be invoked using the pcluster3-config-converter command. The tool takes a ParallelCluster 2 configuration file as an input and outputs a ParallelCluster 3 configuration file.

The following command line provides an example of how to use the tool to convert a ParallelCluster version 2 configuration file to a ParallelCluster 3 configuration file:

pcluster3-config-converter \
    --config-file <ParallelCluster 2 config file> \
    --output-file <ParallelCluster 3 config file>    

The tool translates the parameter specifications, taking into account the functional differences between ParallelCluster 2 and ParallelCluster 3. It reports these differences through verbose messages, which are classified as informational, warnings, or errors.
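If you have several configuration files to convert, the invocation above can be scripted. The following is a minimal Python sketch, not something that ships with ParallelCluster: the helper names `build_converter_command` and `convert` are hypothetical, and the sketch uses only the `--config-file`, `--output-file`, and `--cluster-template` options mentioned in this post.

```python
import shutil
import subprocess

TOOL = "pcluster3-config-converter"


def build_converter_command(config_file, output_file, cluster_template=None):
    """Return the argument list for one converter invocation (hypothetical helper)."""
    command = [
        TOOL,
        "--config-file", str(config_file),
        "--output-file", str(output_file),
    ]
    if cluster_template is not None:
        # Selects a specific [cluster ...] section instead of the tool's default choice.
        command += ["--cluster-template", cluster_template]
    return command


def convert(config_file, output_file, cluster_template=None):
    """Run the converter and return its exit code; its messages go to the terminal."""
    if shutil.which(TOOL) is None:
        raise FileNotFoundError(f"{TOOL} is not on PATH; is ParallelCluster 3 installed?")
    command = build_converter_command(config_file, output_file, cluster_template)
    return subprocess.run(command, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.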

An example

The two listings below show a sample ParallelCluster 2 configuration file and the ParallelCluster 3 configuration file that the configuration converter tool produced from it.

ParallelCluster version 2

[aws]
aws_region_name = us-east-1

[global]
cluster_template = mycluster

[vpc public]
vpc_id = vpc-864dddfb
master_subnet_id = subnet-cfcc8eee

[cluster mycluster]
key_name = AC_HPCDR_USNV
base_os = alinux2
scheduler = slurm
master_instance_type = c5n.18xlarge
vpc_settings = public
queue_settings = ondemand

[queue ondemand]
compute_resource_settings = ondemand_i1

[compute_resource ondemand_i1]
instance_type = c5.2xlarge
min_count = 0
max_count = 64

ParallelCluster version 3

Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.4xlarge
  Networking:
    SubnetId: subnet-cfcc8eee
  Ssh:
    KeyName: AC_HPCDR_USNV
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    ScaledownIdletime: 10
    Dns:
      DisableManagedDns: true
  SlurmQueues:
    - Name: ondemand
      ComputeSettings:
        LocalStorage:
          RootVolume:
            Size: 100
      CapacityType: ONDEMAND
      ComputeResources:
        - Name: compute-resource-1
          InstanceType: c5.2xlarge
          MinCount: 0
          MaxCount: 64
      Networking:
        SubnetIds:
          - subnet-cfcc8eee

ParallelCluster 2 configuration files used pointers to link sections together. The main section defining the cluster components was the [cluster] section. It held the configuration settings for the head node as well as pointers to the scheduler queue sections, which in turn held pointers to the compute resource sections.
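To make the pointer chain concrete, here is a small Python sketch that follows it with the standard library `configparser` module. It is only an illustration of how the sections reference one another, not how the converter is implemented; the function names `split_list` and `follow_pointers` are hypothetical, and the embedded config is a trimmed copy of the version 2 sample from this post.

```python
import configparser

# Trimmed copy of the version 2 sample used in this post.
V2_CONFIG = """
[global]
cluster_template = mycluster

[vpc public]
vpc_id = vpc-864dddfb
master_subnet_id = subnet-cfcc8eee

[cluster mycluster]
scheduler = slurm
vpc_settings = public
queue_settings = ondemand

[queue ondemand]
compute_resource_settings = ondemand_i1

[compute_resource ondemand_i1]
instance_type = c5.2xlarge
"""


def split_list(value):
    """Split a comma-separated pointer value into section names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def follow_pointers(parser, cluster_name):
    """Resolve the vpc, queue, and compute_resource sections a cluster points to."""
    cluster = parser[f"cluster {cluster_name}"]
    vpc = dict(parser[f"vpc {cluster['vpc_settings']}"])
    queues = {}
    for queue_name in split_list(cluster.get("queue_settings", "")):
        queue = parser[f"queue {queue_name}"]
        resource_names = split_list(queue.get("compute_resource_settings", ""))
        queues[queue_name] = {
            name: dict(parser[f"compute_resource {name}"]) for name in resource_names
        }
    return {"vpc": vpc, "queues": queues}


parser = configparser.ConfigParser()
parser.read_string(V2_CONFIG)
resolved = follow_pointers(parser, "mycluster")
print(resolved["queues"])
# prints {'ondemand': {'ondemand_i1': {'instance_type': 'c5.2xlarge'}}}
```

The nested result mirrors the shape of the version 3 file, where compute resources are defined inline under their queue rather than referenced by name.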

One cluster at a time

Among that sea of pointers, ParallelCluster 2 allowed you to define multiple clusters inside one config file. In contrast, a ParallelCluster 3 config file describes exactly one cluster, with distinct HeadNode and Scheduling sections that contain only the settings for the head node and the queues, respectively.

The queues defined under the Scheduling section also include the compute resource definitions for those queues. When converting from version 2 to version 3, the tool needs to know which [cluster] section you want translated to the new format. You can point the tool at the desired [cluster] section with the --cluster-template option, followed by the name of that cluster section. If you don’t use this option, the tool looks for the cluster_template parameter in the [global] section and otherwise falls back to the [cluster default] section.
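The selection order described above can be sketched in a few lines of Python. This mirrors the behavior as described in this post rather than the tool’s actual source code, and the function name `select_cluster_section` is hypothetical.

```python
import configparser


def select_cluster_section(parser, cluster_template=None):
    """Pick the [cluster ...] section to convert, following the described order."""
    # 1. An explicit name (the --cluster-template value) wins.
    # 2. Otherwise use cluster_template from [global], if present.
    # 3. Otherwise fall back to the section named "default".
    name = cluster_template or parser.get("global", "cluster_template", fallback="default")
    section = f"cluster {name}"
    if not parser.has_section(section):
        raise KeyError(f"No [{section}] section found in the configuration file")
    return section


parser = configparser.ConfigParser()
parser.read_string(
    "[global]\n"
    "cluster_template = mycluster\n"
    "[cluster mycluster]\n"
    "scheduler = slurm\n"
    "[cluster default]\n"
    "scheduler = slurm\n"
)
print(select_cluster_section(parser))             # prints cluster mycluster
print(select_cluster_section(parser, "default"))  # prints cluster default
```

If your version 2 file defines several clusters, run the converter once per cluster and write each result to its own output file, since a version 3 file holds a single cluster.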

Shared Storage

The configuration tool may add a SharedStorage section to the version 3 configuration file, even though no such section is present in the version 2 configuration file. This accommodates a functional difference between ParallelCluster 2 and 3 regarding default shared EBS volumes.

In ParallelCluster 2, a default EBS volume mounted at /shared was created if no other shared Amazon EBS volumes were specified. ParallelCluster 3 doesn’t define a default shared space, so you’ll need to explicitly define your shared storage configuration when you migrate. If no EBS volume is defined in the version 2 configuration file, the conversion tool assumes that your currently deployed cluster has the default EBS volume. To keep the deployed configurations equivalent, it adds a SharedStorage section that defines an EBS volume shared at /shared.
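Before converting, you can check whether your version 2 file relies on that implicit volume. The Python sketch below is an assumption-laden illustration, not the converter’s logic: it assumes shared EBS volumes are declared through the `ebs_settings` pointer in the [cluster] section, and the function name `relies_on_default_shared_volume` is hypothetical.

```python
import configparser


def relies_on_default_shared_volume(parser, cluster_section):
    """Return True when the v2 cluster section lists no shared EBS volumes.

    Assumption: shared EBS volumes are referenced via the ebs_settings pointer.
    """
    ebs_settings = parser.get(cluster_section, "ebs_settings", fallback="")
    return not ebs_settings.strip()


implicit = configparser.ConfigParser()
implicit.read_string("[cluster mycluster]\nscheduler = slurm\n")

explicit = configparser.ConfigParser()
explicit.read_string(
    "[cluster mycluster]\n"
    "scheduler = slurm\n"
    "ebs_settings = scratch\n"
    "[ebs scratch]\n"
    "shared_dir = /scratch\n"
)

print(relies_on_default_shared_volume(implicit, "cluster mycluster"))  # prints True
print(relies_on_default_shared_volume(explicit, "cluster mycluster"))  # prints False
```

When the check returns True, expect the converted file to contain a SharedStorage entry for /shared, and decide whether you still want that volume in the new cluster.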

We recommend you review this addition before using the new version 3 configuration file for deploying a cluster.

Conclusion

Migrating an AWS ParallelCluster definition from version 2 to version 3 is a relatively straightforward process. The conceptual components remain the same, with some changes in how the configuration file is organized. To reduce the burden of manually translating every part of a version 2 config to version 3, we’ve introduced the config converter tool described here. Using this tool, you can accelerate your migration by quickly creating the ParallelCluster version 3 configuration for your existing setup.

While the tool accurately converts to a ParallelCluster version 3 specification, make sure you review the new configuration file before using it to deploy a cluster.

For more information about AWS ParallelCluster 3 configuration specifications, check out the official documentation. You can also find more on the converter tool itself in the official documentation. We also have several videos explaining ParallelCluster in the HPC Tech Shorts channel.

Austin Cherian

Austin is a Senior Product Manager-Technical for High Performance Computing at AWS. Previously, he was a Senior Developer Advocate for HPC & Batch, based in Singapore. He’s responsible for growing AWS ParallelCluster so that customers have a smooth journey deploying their HPC workloads on AWS. Prior to AWS, Austin was the Head of Intel’s HPC & AI business for India, where he led the team that helped customers with a path to High Performance Computing on Intel architectures.