Introducing AWS ParallelCluster multiuser support via Active Directory

Using AWS ParallelCluster to set up your own exclusive HPC environment is extremely worthwhile, but there are situations where you need to share the cluster across multiple users as a common resource to ease administration or to enable greater collaboration. For example, a principal investigator in a lab might need to centrally administer all the resources used by their research group for a project covered by a grant. Engineering groups often require design and numerical simulation teams to work together on workflows that are just easier in a single HPC environment.

Today we’re announcing the release of ParallelCluster 3.1, which now supports multiuser authentication based on Active Directory (AD). Starting with v3.1.1, clusters can be configured to use an AD domain managed via one of the AWS Directory Service options, such as Simple AD or AWS Managed Microsoft AD (MSAD).

From Solution to Feature

This is not the first time we’ve supported multiuser authentication with AWS ParallelCluster. In the past we provided solutions for both LDAP and Active Directory-based user authentication enabled by automation scripts, AWS CloudFormation and AWS Lambda.

With this release, we’re introducing AD authentication as a configurable feature in ParallelCluster that works with your existing or new AWS-based AD environments, deployed using AWS Directory Service.

To enable this feature, we’re introducing a new DirectoryService section in the ParallelCluster configuration file. Parameters from this section are used to create an HPC cluster that can authenticate AD users in addition to the ec2-user. The following configuration sample shows the parameter set for the DirectoryService section.

DirectoryService:
  DomainName: dc=hpc,dc=example,dc=org
  DomainAddr: <domain-controller-address>
  PasswordSecretArn: arn:aws:secretsmanager:us-east-1:<account-id>:secret:Secret-Containing-ReadOnlyUser-Password
  DomainReadOnlyUser: cn=ReadOnlyUser,ou=Users,dc=hpc,dc=example,dc=org
  GenerateSshKeysForUsers: true
  LdapTlsCaCert: /full/path/to/CA/certificate/bundle
  LdapTlsReqCert: hard
  LdapAccessFilter: memberOf=cn=pcuser,ou=Users,ou=CORP,dc=corp,dc=pcluster,dc=com
  AdditionalSssdConfigs:
    debug_level: "0x1ff"

The DirectoryService section contains some basic configuration parameters that enable the head node of the cluster to interact with an AD domain controller. To begin with, we need the domain name and the domain controller address, specified by the DomainName and DomainAddr parameters. The configuration also requires the X.500 distinguished name (cn, ou, dc) of a read-only user in your directory as the value of the DomainReadOnlyUser parameter. The value of this parameter differs depending on the AD service being used (i.e. Simple AD vs MSAD). This user’s password must be stored as a secret in AWS Secrets Manager, and the secret’s Amazon Resource Name (ARN) is provided to ParallelCluster as the value of the PasswordSecretArn parameter.
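As a rough illustration of the basic parameters described above, the following Python sketch checks a DirectoryService section (already loaded as a dictionary) for the four basic parameters before you attempt to create a cluster. The helper name missing_directory_service_params is hypothetical, and this is not ParallelCluster’s own validation logic; it only mirrors the parameter names mentioned in this post.

```python
# Hypothetical pre-flight check; NOT part of ParallelCluster itself.
# The parameter names are the basic ones described in this post.
REQUIRED_PARAMS = (
    "DomainName",
    "DomainAddr",
    "PasswordSecretArn",
    "DomainReadOnlyUser",
)


def missing_directory_service_params(section):
    """Return the basic parameters that are absent or empty in the section."""
    return [name for name in REQUIRED_PARAMS if not section.get(name)]


if __name__ == "__main__":
    example = {
        "DomainName": "dc=hpc,dc=example,dc=org",
        "PasswordSecretArn": "arn:aws:secretsmanager:us-east-1:<account-id>:"
                             "secret:Secret-Containing-ReadOnlyUser-Password",
        "DomainReadOnlyUser": "cn=ReadOnlyUser,ou=Users,dc=hpc,dc=example,dc=org",
    }
    # DomainAddr is deliberately left out of this example.
    print(missing_directory_service_params(example))
```

Run as a script, the example prints ['DomainAddr'], because that is the only basic parameter left out of the dictionary.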

Creating the read-only user and storing its password in Secrets Manager are prerequisite steps to cluster creation. When you deploy a cluster, it will use this user’s credentials to authenticate to the AD domain controller. For further details on configuring the read-only user, please refer to the multiuser AD tutorial and feature user guide, which are part of the official AWS ParallelCluster documentation.

Apart from these basic parameters, there’s some additional configuration to extend the capabilities of the head node when authenticating AD users. Let’s have a look at them:

  • Password authentication for authenticated AD users is not enabled for compute nodes. If your workload needs password-less SSH to be enabled between cluster nodes, you’ll need to set the GenerateSshKeysForUsers parameter to true.
  • Any System Security Services Daemon (SSSD) parameters that need to be written into the SSSD config files of cluster instances can be passed using the AdditionalSssdConfigs parameter, which takes key-value pairs containing arbitrary SSSD parameters. The configuration sample above sets the SSSD debug_level parameter to very verbose logging to help debug any potential issues.
  • Certificates required for LDAP over TLS sessions (LDAPS) are pointed to by the LdapTlsCaCert parameter, which gives the path to a certificate bundle containing the certificates of the entire CA chain that issued certificates for the domain controllers the head node connects to. Further checks on the server certificate are enforced by providing an appropriate value for the LdapTlsReqCert parameter.

The LdapAccessFilter parameter specifies an LDAP filter that limits LDAP queries to a subset of the directory (this corresponds to the sssd-ldap parameter ldap_access_filter). Using the configuration sample above again, a filter can be set up to limit queries to users who are members of a specific AD group authorized to log in to the head node.
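To make the filter format concrete, here is a minimal Python sketch that builds a memberOf filter like the one in the configuration sample from a group’s distinguished name. The function names are hypothetical helpers, not part of ParallelCluster or SSSD; the escaping follows the general rules for LDAP search filter values (RFC 4515), which only matter if the group name contains characters such as parentheses or asterisks.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    # Replace character by character so nothing is escaped twice.
    return "".join(replacements.get(ch, ch) for ch in value)


def member_of_filter(group_dn):
    """Build a filter matching users who belong to the given group DN."""
    return "memberOf=" + escape_filter_value(group_dn)


if __name__ == "__main__":
    group = "cn=pcuser,ou=Users,ou=CORP,dc=corp,dc=pcluster,dc=com"
    print(member_of_filter(group))
```

For the group used in the sample, the result is the same string shown as the LdapAccessFilter value, since that distinguished name contains no characters that need escaping.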

How it works

The previous section may have given you an intuitive idea of how ParallelCluster sets up the authentication mechanism. Let’s look at the authentication flow to make things a bit clearer.

When you create a cluster using the configuration settings under DirectoryService, cluster creation includes a one-time step that retrieves the read-only user’s credentials (stored in AWS Secrets Manager) and writes them to the sssd.conf file on the head node. This lets the head node authenticate itself to the AD domain controller, which in turn allows the head node to query the directory service about other users trying to log in.
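The one-time step described above can be pictured with the following Python sketch, which renders an illustrative sssd.conf from DirectoryService parameters and an already-retrieved password. This is not the code ParallelCluster runs: apart from ldap_access_filter, which this post names explicitly, the mapping from DirectoryService parameters to sssd-ldap options, the domain/default section name, and the placement of AdditionalSssdConfigs entries in the domain section are all assumptions made for illustration. Retrieving the secret from Secrets Manager is left out, and the function name is hypothetical.

```python
import configparser
import io


def render_sssd_conf(directory_service, read_only_password):
    """Render an illustrative sssd.conf (assumed layout, see lead-in)."""
    domain = {
        "id_provider": "ldap",
        "auth_provider": "ldap",
        "access_provider": "ldap",
        # Assumed mapping of DirectoryService parameters to sssd-ldap options.
        "ldap_uri": directory_service["DomainAddr"],
        "ldap_search_base": directory_service["DomainName"],
        "ldap_default_bind_dn": directory_service["DomainReadOnlyUser"],
        "ldap_default_authtok": read_only_password,
    }
    optional = {
        "LdapTlsCaCert": "ldap_tls_cacert",
        "LdapTlsReqCert": "ldap_tls_reqcert",
        "LdapAccessFilter": "ldap_access_filter",
    }
    for pcluster_key, sssd_key in optional.items():
        if pcluster_key in directory_service:
            domain[sssd_key] = directory_service[pcluster_key]
    # AdditionalSssdConfigs carries arbitrary SSSD key-value pairs.
    domain.update(directory_service.get("AdditionalSssdConfigs", {}))

    parser = configparser.ConfigParser(interpolation=None)
    parser["sssd"] = {"domains": "default"}
    parser["domain/default"] = domain
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

Calling render_sssd_conf with the parameters from a DirectoryService section returns the file content as a string, which makes it easy to inspect how each parameter would surface in the SSSD configuration under these assumptions.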

When users try to log in to the head node via SSH (to start jobs), the head node authenticates them against the directory service and allows them to log in if authentication succeeds. They can then run jobs interactively or submit them to the queues on the cluster.

Figure 1: The Active Directory user authentication process.

Get started with AD-based multiuser authentication

The AD-based multiuser authentication feature will make it much easier to share secure HPC resources within teams and between groups in virtually any enterprise environment managing its authentication using Active Directory. Groups and teams can now set up a cluster as a common resource and share tools, techniques, and data quickly, boosting everyone’s productivity.

To use the feature, you should have an existing Active Directory setup, or create one using one of the AWS Directory Service options. To learn more about creating an Active Directory setup, check out the Active Directory Domain Services on AWS Quick Start guide.

For a quick test drive of the feature, including quickly setting up an AD system to test, have a look at the multiuser Active Directory tutorial and feature user guide which are part of AWS ParallelCluster official documentation. And let us know what you create.

Austin Cherian

Austin is a Senior Product Manager-Technical for High Performance Computing at AWS. Previously, he was a Senior Developer Advocate for HPC & Batch, based in Singapore. He's responsible for growing AWS ParallelCluster to ensure a smooth journey for customers deploying their HPC workloads on AWS. Prior to AWS, Austin was the Head of Intel’s HPC & AI business for India, where he led the team that helped customers with a path to High Performance Computing on Intel architectures.