AWS HPC Blog

Expanded filesystems support in AWS ParallelCluster 3.2

Olly Perks, Snr Dev Advocate for HPC, HPC Engineering
Austin Cherian, Snr Product Manager for HPC

Data is critical to HPC, and ensuring your simulations have the data they need — when they need it — is essential. However, data can originate from many sources and needs to be consumed by diverse resources. Having the flexibility to add more, and different, types of storage to your cluster makes that data more readily available to your jobs.

Since launch, Amazon FSx has aimed to give you more options to launch, run, and scale feature-rich and cost-effective storage, powered by your choice of filesystems. AWS ParallelCluster helps by integrating with these filesystem choices, giving you the same flexibility so you can better architect your HPC storage.

AWS ParallelCluster version 3.2 introduces support for two new Amazon FSx filesystem types (Amazon FSx for NetApp ONTAP and Amazon FSx for OpenZFS). It also lifts the limit on the number of Amazon FSx and Amazon EFS filesystem mounts you can have on your cluster.

By increasing the options for filesystem access, your HPC workloads on AWS will have more pathways to the data they need, without you having to do the hard work. In today’s post, we’ll explain this in detail.

What’s New?

New filesystems

ParallelCluster already supports Amazon Elastic File System (EFS), Amazon Elastic Block Store (EBS), and Amazon FSx for Lustre. This release adds support for the FSx for NetApp ONTAP and FSx for OpenZFS filesystems.

Different filesystem types have specific characteristics making them better suited to particular data types and workflows. For example, Amazon FSx for OpenZFS is simple and powerful shared file storage built on OpenZFS, delivering high speed at low cost. You may already be using OpenZFS for its efficiency and performance features, like copy-on-write (which enables instant snapshots), integrated data resiliency, and its adaptive replacement cache – all built into the filesystem. You now have the choice to use the filesystem that is most appropriate for your needs, without worrying about compatibility with ParallelCluster.

More filesystems

Prior to this release, ParallelCluster supported mounting only one of each filesystem type (e.g. one EFS mount and one FSx for Lustre filesystem). These limited attach points forced you to consolidate your data storage and do more up-front planning of your overall HPC storage configuration.

With ParallelCluster 3.2, you can now mount up to 20 Amazon FSx filesystems and up to 20 Amazon EFS filesystems. These mounts are for existing filesystems, where your data already lives. They are not managed by ParallelCluster, so no data movement is required. This also means they persist when you delete your cluster, giving you more control by decoupling the cluster infrastructure from the data. The User Guides for each filesystem type document best practices for creation and management (FSx for Lustre, FSx for NetApp ONTAP and FSx for OpenZFS).
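If you don’t already have a filesystem to mount, you create one outside of ParallelCluster. As a minimal sketch, assuming the AWS CLI is installed and configured, an FSx for OpenZFS filesystem could be created like this (the subnet ID and sizing values are placeholders for illustration):

aws fsx create-file-system \
    --file-system-type OPENZFS \
    --storage-capacity 64 \
    --subnet-ids subnet-0123456789abcdef0 \
    --open-zfs-configuration DeploymentType=SINGLE_AZ_1,ThroughputCapacity=64

The response includes a root volume ID (fsvol-...), which is the kind of identifier you’d later reference in FsxOpenZfsSettings.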

Together, these two features add a significant level of flexibility to storage solutions within ParallelCluster.

Using these new filesystems

The cluster configuration YAML syntax for multi-filesystem mounts hasn’t changed with this release – you simply specify more entries in the SharedStorage section of the configuration file. ParallelCluster will create the cluster with the specified storage mounted and ready to use. Here is an example of a storage configuration with the newly added entries for FSx for NetApp ONTAP and FSx for OpenZFS:

SharedStorage:
  - MountDir: /shared/ebs1
    Name: ebs1
    StorageType: Ebs
    EbsSettings:
      ...
  - MountDir: /shared/efs10
    Name: efs10
    StorageType: Efs
    EfsSettings:
      ...
  - MountDir: /lustre1
    Name: LustreData1
    StorageType: FsxLustre
    FsxLustreSettings:
      FileSystemId: fs-01111111111111111
  - MountDir: /lustre2
    Name: LustreData2
    StorageType: FsxLustre
    FsxLustreSettings:
      FileSystemId: fs-02222222222222222
  - MountDir: /netappontap
    Name: NetApp
    StorageType: FsxOntap
    FsxOntapSettings:
      VolumeId: fsvol-01111111111111111
  - MountDir: /openzfs
    Name: OpenZFS
    StorageType: FsxOpenZfs
    FsxOpenZfsSettings:
      VolumeId: fsvol-02222222222222222

In this example we show six filesystem mounts: one EBS, one EFS, two FSx for Lustre, one FSx for NetApp ONTAP, and one FSx for OpenZFS.

FSx for Lustre takes a FileSystemId because we’re mounting the whole filesystem. FSx for NetApp ONTAP and FSx for OpenZFS both take a VolumeId because we’re mounting only a specific volume within the filesystem. This is a subtle difference, so keep an eye on it while you’re configuring.
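As a quick usage sketch, once a configuration like the one above is saved to a file, creating the cluster is a single CLI call. Here we assume a file named cluster-config.yaml and a cluster named hpc-cluster, both placeholders:

pcluster create-cluster \
    --cluster-name hpc-cluster \
    --cluster-configuration cluster-config.yaml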

There’s more information on this syntax, and documentation for these features, in the AWS ParallelCluster User Guide.

Conclusion

With ParallelCluster 3.2 you can now mount FSx for NetApp ONTAP and FSx for OpenZFS filesystems in addition to the existing filesystem types. You can also mount up to 20 Amazon FSx filesystems and up to 20 Amazon EFS filesystems, which greatly extends the flexibility of storage in ParallelCluster.

To make use of these new filesystem support features, you must first update to the latest ParallelCluster version – the upgrade guide shows you how to do that.
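As a sketch, assuming a pip-based installation, the upgrade typically looks like this:

pip3 install --upgrade aws-parallelcluster
pcluster version

The second command confirms the version you’re now running.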

Since these filesystems are not managed by ParallelCluster, you must create them yourself before mounting. For details on how to do that, check the documentation for each filesystem type (FSx for Lustre, FSx for NetApp ONTAP and FSx for OpenZFS). You can then update your cluster configuration with the new mount information. The ParallelCluster User Guide provides more information on syntax and usage.
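For an existing cluster, applying a configuration change is done with the update-cluster command. Here’s a minimal sketch, with hpc-cluster and updated-config.yaml as placeholder names; note that depending on the update policy of the settings you change, you may need to stop the compute fleet first:

pcluster update-compute-fleet --cluster-name hpc-cluster --status STOP_REQUESTED
pcluster update-cluster \
    --cluster-name hpc-cluster \
    --cluster-configuration updated-config.yaml
pcluster update-compute-fleet --cluster-name hpc-cluster --status START_REQUESTED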

Filesystem enhancements are not the only new features in ParallelCluster 3.2 – you can learn about new memory-aware scheduling features in our other blog post.

Oliver Perks

Oliver is a Senior Developer Advocate for HPC at AWS. Prior to joining AWS, Olly spent a decade in the HPC community, most recently as an HPC applications engineer at Arm, enabling customers across the globe to build applications for Arm-based architectures.

Austin Cherian

Austin is a Senior Product Manager-Technical for High Performance Computing at AWS. Previously, he was a Senior Developer Advocate for HPC & Batch, based in Singapore. He’s responsible for growing AWS ParallelCluster to ensure a smooth journey for customers deploying their HPC workloads on AWS. Prior to AWS, Austin was Head of Intel’s HPC & AI business for India, where he led the team that helped customers find a path to High Performance Computing on Intel architectures.