AWS Security Blog

How to use AWS Secrets Manager to securely store and rotate SSH key pairs

August 31, 2021: AWS KMS is replacing the term customer master key (CMK) with AWS KMS key and KMS key. The concept has not changed. To prevent breaking changes, AWS KMS is keeping some variations of this term. More info.

October 4, 2019: We’ve updated the estimated solution cost for accuracy.


AWS Secrets Manager provides full lifecycle management for secrets within your environment. In this post, Maitreya and I will show you how to use Secrets Manager to store, deliver, and rotate SSH keypairs used for communication within compute clusters. Rotation of these keypairs is a security best practice, and sometimes a regulatory requirement. Traditionally, these keypairs have been associated with a number of tough challenges: for example, synchronizing key rotation across all compute nodes, enabling detailed logging and auditing, and managing which users are allowed to modify secrets.

Rotating the keypair on all of a compute cluster’s nodes must be done in a tightly coordinated fashion, and failures generally result in availability risks. Moreover, the keypairs themselves are highly sensitive security credentials that must be carefully controlled with fine-grained access controls, detailed monitoring, and audit logging. These are precisely the types of tough challenges that AWS Secrets Manager solves for you.

In this post, we’ll show you how to secure, rotate, and use SSH keypairs for intra-cluster communication. You’ll use an AWS CloudFormation template to launch a cluster and configure Secrets Manager. Then we’ll show you how to use Secrets Manager to deliver the keypair to the cluster and use it for management operations, such as securely copying a file between nodes. Finally, we’ll use Secrets Manager to seamlessly rotate the keypair used by the cluster without any changes or outages. In this post, we’ve highlighted compute clusters, but you can use Secrets Manager to apply this solution directly to any SSH-based use case.

Solution overview

The following architecture diagram presents an overview of the solution:
 

Figure 1: Solution architecture


The sample architecture created by CloudFormation includes one master node, three worker nodes, AWS Secrets Manager (which uses a rotation AWS Lambda function), and AWS Systems Manager. Setting up the cluster is out of scope for this post; in our walkthrough, we’ll focus on the keypair rotation architecture.

Secrets Manager uses staging labels to identify different versions of a secret during rotation. A staging label is a text string. For example, by default, AWSCURRENT is attached to the current version of the secret, while AWSPENDING will be attached to new versions of the secret before they have been verified and deployed to corresponding resources.
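To check which staging labels are attached to each version of a secret, a minimal sketch along the following lines may help. It assumes you pass in a Boto3 Secrets Manager client and it reads the `VersionIdsToStages` map returned by `describe_secret`. The helper names `stages_by_version` and `version_with_label` are illustrative and are not part of the sample repository.

```python
from typing import Dict, List, Optional


def stages_by_version(client, secret_id: str) -> Dict[str, List[str]]:
    """Return {version_id: [staging labels]} for a secret.

    `client` is assumed to be a Boto3 Secrets Manager client, for example
    boto3.client("secretsmanager").
    """
    response = client.describe_secret(SecretId=secret_id)
    return response.get("VersionIdsToStages", {})


def version_with_label(stages: Dict[str, List[str]], label: str) -> Optional[str]:
    """Return the version ID that carries `label`, or None if no version does."""
    for version_id, labels in stages.items():
        if label in labels:
            return version_id
    return None
```

For example, `version_with_label(stages_by_version(client, "/dev/ssh"), "AWSCURRENT")` would return the ID of the version currently in use, while looking up AWSPENDING outside of a rotation would be expected to return `None`.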

As shown in the diagram:

  1. A secret is created in AWS Secrets Manager. The secret holds the SSH keypair that the master node will use to connect to the other nodes in the cluster. Upon keypair rotation, Secrets Manager will invoke a Lambda function (labeled 1.a in the diagram). The Lambda function will perform four steps:
    • 1.b: createSecret – create a new SSH keypair and store the private key as a new version of the secret.
    • 1.c: setSecret – label the newly created secret version with the label AWSPENDING and copy the public key to the worker nodes with AWS Systems Manager Run Command.

    The Lambda function will also perform two steps not shown in the diagram:

    • testSecret – verify that the new SSH keypair has been successfully deployed by invoking a test SSH connection.
    • finishSecret – set the staging label AWSCURRENT to the new secret version and remove the old keys from the worker nodes. This will also set the staging label AWSPREVIOUS to the old secret, allowing your administrator to have the ‘last known password’ if something goes wrong.

    An overview of the rotation Lambda function is available in the AWS Secrets Manager user guide. You have full control over the rotation function so that you can customize it to your needs. Note that no key is installed on the master node. Instead, the function will retrieve the private key from Secrets Manager only when it needs to securely communicate with the worker nodes. That private key is not saved on the master node’s filesystem but rather in volatile memory (per best practice, the private key variable is overwritten after successful authentication and deleted before the script exits); details about keeping secret data in volatile memory will follow later in this post.

  2. When the master node needs to communicate with any worker node, it will use an AWS SDK (Python Boto3) to read the SSH private key from Secrets Manager (2.a) and use the private key to establish an SSH tunnel with the worker nodes (2.b). The master node is authorized to read the private key from Secrets Manager because an AWS Identity and Access Management (IAM) role with a policy that allows it to access the secret is attached to the master node. The corresponding public key was deployed to each of the worker nodes during the rotation process in step one above.
  3. The secrets in Secrets Manager are encrypted with AWS Key Management Service (AWS KMS), and every version of the secret is encrypted with a unique data encryption key. The SSH key pair in the cluster will periodically rotate based on a configurable rotation interval, which you’ll configure from the Secrets Manager console later in this post. Each rotation repeats the process described in steps 1-2, resulting in a new version of the secret. Each new version will be encrypted using a new KMS data key, which provides an extra layer of security.
  4. The AWS Systems Manager Run Command will use the Amazon Elastic Compute Cloud (EC2) tag RotateSSHKeys with a value of True to identify the cluster’s worker node instances. Note that if you rely on tags as a security control, you must have clear governance and control over which users are able to change the tags and tag values on your EC2 instances.
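The four rotation steps described in step 1 arrive at the Lambda function as separate invocations, each carrying the step name in the event. The following is a minimal sketch of how such a function might route each invocation. It assumes the standard rotation event fields `SecretId`, `ClientRequestToken`, and `Step`; the `dispatch_rotation_step` helper and the shape of the handler table are illustrative and are not taken from the sample repository.

```python
from typing import Callable, Dict

# The four step names that Secrets Manager passes to a rotation function.
ROTATION_STEPS = ("createSecret", "setSecret", "testSecret", "finishSecret")

# A handler receives the secret identifier and the request token, which
# identifies the new secret version being rotated in.
StepHandler = Callable[[str, str], None]


def dispatch_rotation_step(event: dict, handlers: Dict[str, StepHandler]) -> str:
    """Route one rotation invocation to the handler for its step.

    `handlers` is a hypothetical mapping from step name to a function that
    does the real work (generate a keypair, push the public key, and so on).
    """
    secret_id = event["SecretId"]
    token = event["ClientRequestToken"]
    step = event["Step"]

    if step not in ROTATION_STEPS:
        raise ValueError(f"Unknown rotation step: {step}")

    handlers[step](secret_id, token)
    return step
```

A Lambda entry point could then be as small as `def lambda_handler(event, context): return dispatch_rotation_step(event, HANDLERS)`, where `HANDLERS` is your own table of step functions. Keeping the routing separate from the step logic makes each step easier to test on its own.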

Solution cost

Today, this solution deployed in the N. Virginia Region will cost $0.0914 an hour for the four t2.micro EC2 instances and NAT Gateway that comprise the sample cluster. Secrets Manager has a 30-day trial period, after which one secret will cost $0.40 per month and $0.05 per 10,000 API calls. There is no additional charge for AWS Systems Manager.
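To turn these rates into a rough monthly figure, you could use a small calculation like the one below. It is only a sketch: the rates are the ones quoted above, while the default of 730 hours per month and the function name `estimate_monthly_cost` are assumptions made for illustration. Actual charges depend on current pricing and your usage.

```python
EC2_AND_NAT_PER_HOUR = 0.0914   # four t2.micro instances plus NAT Gateway
SECRET_PER_MONTH = 0.40         # one secret, after the trial period
API_PER_10K_CALLS = 0.05        # Secrets Manager API calls


def estimate_monthly_cost(hours: float = 730, api_calls: int = 0) -> float:
    """Estimate the monthly cost in USD, rounded to cents.

    `hours` defaults to 730, an assumed average number of hours in a month.
    """
    compute = EC2_AND_NAT_PER_HOUR * hours
    api = API_PER_10K_CALLS * (api_calls / 10_000)
    return round(compute + SECRET_PER_MONTH + api, 2)
```

Because the hourly charge dominates, deleting the stacks when you finish the walkthrough has far more effect on cost than the number of API calls.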

Deploying the sample solution

In this section, you’ll deploy a test stack that demonstrates the entire solution. After deployment, you’ll log in to the master node and securely copy a file to one of the worker nodes. Finally, you’ll use Secrets Manager to rotate and deploy a new SSH keypair. The CloudFormation templates and secret rotation code are available in the AWS GitHub repository.

Set up the sample deployment by selecting the AWS CloudFormation Launch Stack button below; by default, the stack will be deployed in the us-east-1 (N. Virginia) Region.
 
Select this image to open a link that starts building the CloudFormation stack

The template creates an Amazon Virtual Private Cloud (VPC), private and public subnets, EC2 instances (master node and mock cluster), and the IAM role and policies used for the EC2 instances.

  1. Select your EC2 SSH key pair and input your IP range as stack parameters. In the YourIPRange field, enter the CIDR of your machine or network only, as this ensures only hosts from your network can access the master server. You may leave all other parameters as default. This CloudFormation template launches four t2.micro instances in a new VPC. One instance will be tagged as MasterServer and the rest will be tagged WorkerServer1-3.

    Note: The SSH keypair referenced here will be used to connect from your local computer to the master node. It is distinct from the SSH keypair used by the master node to connect to the worker nodes.

     

    Figure 2: Enter the CIDR of your machine or network


    Important: For simplicity, the master node you’ll create in this walkthrough will be in a public subnet, making it accessible from the CIDR you provided in step 1. However, this is not the most secure approach possible. Follow the guidance in the Amazon EC2 VPC documentation to securely configure your cluster in a private subnet following the “defense in depth” principle.

  2. Monitor the status of the stack. When the status is CREATE_COMPLETE, the deployment is ready. Select the Outputs tab to find information about the newly created resources, and write down the master node’s public DNS and a worker node IP address. You’ll need both later in this post.
  3. Select the Launch Stack button to launch the AWS CloudFormation template that will deploy the Lambda function used by Secrets Manager. Accept the default values for the parameters. This template is designed for reusability; it can be applied to any SSH rotation use case.
     
    Select this image to open a link that starts building the CloudFormation stack

Next, create and configure a new secret from the Secrets Manager console to store the cluster communication SSH keypair.

Configuring a secret in AWS Secrets Manager

The CloudFormation template did not deploy a secret, so follow these steps to create a secret from the console and configure its rotation function. To create a new secret:

  1. Open the AWS Secrets Manager console and select Store New Secret.
  2. Select Other type of secrets, then select the Plaintext tab.
  3. As shown in Figure 3, enter {} to create an empty JSON value with no properties. This value will be initially populated with a keypair by the rotation Lambda function.
     
    Figure 3: Create an empty JSON value with no properties


  4. Keep the default encryption key and select Next. We’re keeping the default encryption key for the sake of simplicity in this example, but security best practices suggest using an AWS KMS key (KMS key) that you’ve created.
  5. In Step 2: Name and description, name the secret /dev/ssh. The path of a secret can be used in the secret’s IAM policy to restrict users and roles to a secret or hierarchy of secrets. For example, the IAM policy could include /dev/* or /prod/* to control access to secrets in development or production, respectively.
  6. Add a description, then select Next.
     
    Figure 4: Add a description


  7. In Step 3: Configure rotation, choose Enable automatic rotation and select a rotation interval of your choice from the rotation interval dropdown list.
  8. Select the Choose an AWS Lambda function drop-down and choose RotateSSH. This is the Lambda function that was deployed by the CloudFormation template.
  9. Select Next, then review your configuration and select Store. When the new secrets configuration is stored, the rotation Lambda function is immediately invoked, populating the value of the secret.
     
    Figure 5: Configure the rotation
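If you prefer to script the console steps above, a sketch like the following could create the empty secret and attach the rotation function. It assumes a Boto3 Secrets Manager client and uses the `create_secret` and `rotate_secret` calls. The function name `create_rotating_ssh_secret`, the description text, and the 30-day default are illustrative assumptions, and you would need to supply the ARN of the RotateSSH Lambda function from your own account.

```python
def create_rotating_ssh_secret(client, rotation_lambda_arn: str,
                               name: str = "/dev/ssh",
                               rotation_days: int = 30) -> str:
    """Create an empty JSON secret and enable automatic rotation for it.

    `client` is assumed to be boto3.client("secretsmanager"), and
    `rotation_lambda_arn` the ARN of the deployed rotation function.
    Returns the ARN of the new secret.
    """
    created = client.create_secret(
        Name=name,
        Description="SSH keypair for cluster communication",
        SecretString="{}",  # empty JSON; the rotation function fills it in
    )
    client.rotate_secret(
        SecretId=created["ARN"],
        RotationLambdaARN=rotation_lambda_arn,
        RotationRules={"AutomaticallyAfterDays": rotation_days},
    )
    return created["ARN"]
```

As in the console flow, enabling rotation is expected to trigger an initial rotation that populates the secret with a keypair. This sketch keeps the default encryption key; to use your own KMS key you would pass a `KmsKeyId` to `create_secret`.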

Testing the sample solution

With the secret configuration completed and the instances up and running, you’re now going to securely copy a file from the master node to one of the worker nodes, using the SSH key stored in Secrets Manager to test the solution.

  1. Log in to the master node via SSH, using the EC2 key that you specified in the CloudFormation template.
  2. Once connected, securely copy a file from the master node to the worker node using SCP (secure copy protocol) by entering the command below. Replace <private-ip-of-worker> with the worker node IP address that you wrote down in step 2 of the deployment:
    
                python copy_file.py ec2-user <private-ip-of-worker>
            

Figure 6 shows the ssh login to the master node and the copy_file.py command that copies a file to the worker node.
 

Figure 6: The ssh login to master node, and the copy_file.py command

During execution, the Python script will use the Secrets Manager get_secret_value API to retrieve the secret, which includes the private key. It will then use this key to establish a secure SSH connection with the worker nodes, without saving the private key on any master node storage.

You can review the copy_file.py on the master node or on GitHub. In the get_private_key() function, you can read the secret value, which includes the private key:


    get_secret_value_response = client.get_secret_value(
        SecretId=secret_name)
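For context, a fuller version of this lookup might look like the sketch below. It is not the code from copy_file.py. It assumes the secret value is a JSON document and that the private key is stored under a field named `PrivateKey`; that field name is an assumption, so check the secret in your own account and adjust `key_field` if it differs.

```python
import json


def get_private_key(client, secret_name: str, key_field: str = "PrivateKey") -> str:
    """Fetch the secret and return the private key held in its JSON value.

    `client` is assumed to be boto3.client("secretsmanager"). The default
    `key_field` is a guess at the JSON property name, not a confirmed value.
    """
    response = client.get_secret_value(SecretId=secret_name)
    secret = json.loads(response["SecretString"])
    if key_field not in secret:
        raise KeyError(f"Secret {secret_name!r} has no field {key_field!r}")
    return secret[key_field]
```

The key is returned as a string and is never written to disk, which matches the approach of keeping the private key only in memory.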

In the copy_file function, create a secured SSH tunnel to copy a file using the private key from memory, using Paramiko, a Python implementation of SSHv2.


    import io
    import paramiko

    # Write the private key to an in-memory file, then rewind it for reading
    private_key_str = io.StringIO()
    private_key_str.write(private_key)
    private_key_str.seek(0)

    # Create key object
    key = paramiko.RSAKey.from_private_key(private_key_str)

    # Open a channel and authenticate (Transport expects a (host, port) tuple)
    trans = paramiko.Transport((ip, 22))
    trans.start_client()
    trans.auth_publickey(user, key)
    del key

To demonstrate the rotation of the SSH keypair, you’ll now manually invoke the rotation function:

  1. Return to the Secrets Manager console, select your /dev/ssh secret, and choose Retrieve Secret Value to see the key pair.
  2. Select Rotate secret immediately. In the pop-up window, confirm your choice by selecting Rotate.
     
    Figure 7: Set the "Secret value" and "Rotation configuration"


  3. Choose Rotate again to complete the rotation.
     
    Figure 8: Select "Rotate"


  4. Select the Close button to refresh the view, and then choose Retrieve Secret Value again.
  5. Once the rotation has completed, you can inspect the new keypair via the Secrets Manager console. Go back to the terminal and run the same Python script to copy a file using SCP. Replace <private-ip-of-worker> with your own worker node IP address:
    
                    python copy_file.py ec2-user <private-ip-of-worker>
            

The file has now been transferred successfully using a new key pair, with no updates required.
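The manual rotation above can also be triggered from code. The following sketch calls `rotate_secret` and then polls `describe_secret` until the version ID returned by the rotation request carries the AWSCURRENT label. It assumes a Boto3 Secrets Manager client; the function name `rotate_and_wait`, the timeout values, and the injectable `sleep` parameter are illustrative choices rather than anything from the sample repository.

```python
import time
from typing import Callable


def rotate_and_wait(client, secret_id: str,
                    timeout_seconds: float = 300,
                    poll_seconds: float = 5,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Start a rotation and wait until the new version becomes AWSCURRENT.

    Returns True when the new version is current, or False on timeout.
    `client` is assumed to be boto3.client("secretsmanager").
    """
    # The response includes the ID of the version being rotated in.
    new_version = client.rotate_secret(SecretId=secret_id)["VersionId"]

    waited = 0.0
    while waited <= timeout_seconds:
        stages = client.describe_secret(SecretId=secret_id).get(
            "VersionIdsToStages", {})
        if "AWSCURRENT" in stages.get(new_version, []):
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

A `False` result only means the label had not moved within the timeout. In that case, the rotation function’s logs are the place to look for the step that failed.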

Auditing and monitoring

You can monitor and audit all APIs used to create and rotate your keys in Secrets Manager via AWS CloudTrail. To view CloudTrail events, follow these steps:

  1. Open the CloudTrail console and select Event history.
  2. From the Filter dropdown field, select Event source, enter secret in the filter field, then select secretsmanager.amazonaws.com from the dropdown menu.
  3. From here, you can review Secrets Manager’s events, such as GetSecretValue, PutSecretValue, UpdateSecretVersionStage (which modifies the staging labels attached to a version of a secret), and RotationSucceeded, in the CloudTrail event history. These event logs help to audit secrets configuration, rotation, and access.
     
    Figure 9: The "Event history" window


Additionally, Secrets Manager can work with CloudWatch Events to trigger alerts when administrator-specified operations occur in an organization (for example, to notify you of a secret deletion attempt).
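The CloudTrail console filter described in the steps above has a programmatic counterpart. As a rough sketch, the following uses the CloudTrail `lookup_events` call with the `EventSource` lookup attribute and tallies the returned events by name. It assumes a Boto3 CloudTrail client; the function name `count_secrets_manager_events` is illustrative, and only a single page of results is read.

```python
from collections import Counter


def count_secrets_manager_events(cloudtrail_client, max_results: int = 50) -> Counter:
    """Count recent Secrets Manager events in CloudTrail by event name.

    `cloudtrail_client` is assumed to be boto3.client("cloudtrail").
    Only the first page of results (up to `max_results`) is examined.
    """
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            {
                "AttributeKey": "EventSource",
                "AttributeValue": "secretsmanager.amazonaws.com",
            }
        ],
        MaxResults=max_results,
    )
    return Counter(event["EventName"] for event in response.get("Events", []))
```

After a rotation you would expect names such as GetSecretValue, PutSecretValue, and UpdateSecretVersionStage to appear in the result. To cover a longer history, you would follow the `NextToken` values in the response.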

Cleaning up the CloudFormation Stack

To delete the entire CloudFormation stack:

  1. Select the stack named RotateSSH from the CloudFormation console.
  2. Select Actions, and then Delete Stack. This will delete all AWS resources created by the stack.
  3. Repeat the steps above to delete the stack named MasterWorkers.
  4. From the AWS Secrets Manager console, delete the secret /dev/ssh. Read more about Deleting and Restoring a Secret in the AWS Secrets Manager User Guide.
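The cleanup steps above could be scripted roughly as follows. This sketch assumes Boto3 CloudFormation and Secrets Manager clients and uses the stack and secret names from this post. The function name `clean_up` and the 7-day recovery window are illustrative choices. Note that `delete_stack` only starts the deletion; the sketch does not wait for it to finish.

```python
def clean_up(cloudformation_client, secrets_client,
             stack_names=("RotateSSH", "MasterWorkers"),
             secret_id: str = "/dev/ssh",
             recovery_window_days: int = 7):
    """Request deletion of the sample stacks and schedule secret deletion.

    The clients are assumed to be boto3.client("cloudformation") and
    boto3.client("secretsmanager"). Returns the scheduled deletion date
    reported by Secrets Manager.
    """
    for stack_name in stack_names:
        # Asynchronous: CloudFormation continues deleting after this returns.
        cloudformation_client.delete_stack(StackName=stack_name)

    response = secrets_client.delete_secret(
        SecretId=secret_id,
        RecoveryWindowInDays=recovery_window_days,
    )
    return response["DeletionDate"]
```

During the recovery window the secret can still be restored, which is the behavior described in the Deleting and Restoring a Secret topic.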

Conclusion

In this post, we demonstrated how you can use AWS Secrets Manager to store, rotate, and deliver SSH keypairs in order to secure communication within a compute cluster. Keys are securely encrypted and stored in AWS Secrets Manager, which will also rotate the keys and install public keys on all nodes for you. By using this method, you won’t have to manually deploy SSH keys on the various EC2 instances or manually rotate them. APIs associated with secrets management and rotation are logged in CloudTrail for auditing and monitoring. This key rotation solution is serverless. It does not require any servers to maintain and can scale rapidly.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

Author

Assaf Namer

Assaf is a Senior Solutions Architect. He likes coding and hackathons, and enjoys helping customers build reliable and secure cloud solutions. Outside of work, Assaf enjoys spinning and tennis.

Author

Maitreya Ranganath

Maitreya is a Solutions Architect with the Enterprise team. He has a focus on Security and Compliance and enjoys helping customers architect secure, scalable, and cost-effective solutions on AWS.