AWS Cloud Operations & Migrations Blog

Implementing AWS Session Manager logging guardrails in a multi-account environment

Raiffeisen Bank International (RBI), a prominent Austrian banking group, maintains a multi-account AWS environment that allows product teams to build and test new customer features at speed, but within the limits of central security guardrails. One of these guardrails requires central logging of all sessions established to Amazon Elastic Compute Cloud (Amazon EC2) instances across the organization. This enables the RBI security team to analyze organization-wide trends in user activity as well as apply consistent controls around threat detection, incident response, forensics, and log archival.

RBI engineers leverage Session Manager, a capability of AWS Systems Manager, to initiate user sessions to EC2 instances. This has several security advantages over establishing traditional Secure Shell (SSH) sessions. First, access to instances can be managed through existing AWS Identity and Access Management (IAM) roles and policies. Furthermore, with Session Manager, EC2 instances do not require inbound network-level access, which would otherwise need to be allow-listed in security groups or network ACLs. Lastly, Session Manager captures session logs and writes them to Amazon Simple Storage Service (Amazon S3) buckets or Amazon CloudWatch log groups. To enforce centralized session logging in a multi-account environment, however, RBI needed to account for some additional caveats:

  • Logging isn’t available for Session Manager sessions that connect through port forwarding or SSH. This is because SSH encrypts all session data, and Session Manager only serves as a tunnel for SSH connections. As a result, RBI decided to stop using session documents of the “Port” type in all accounts.
  • Since the session document defines the target S3 bucket for session logging, RBI needed to make sure that only documents pointing to a dedicated bucket in the central log archive account could be used to launch sessions.
  • Due to the agent-based nature of Systems Manager, the ability to write log data to a specific S3 bucket is dependent on the permissions assigned to the EC2 instance profile. As such, RBI security needed to account for the event of a rogue actor circumventing logging through manipulating the role assigned to the EC2 instance.

All RBI AWS accounts are managed by AWS Organizations. Member account administrators at RBI are free to manage their own IAM permissions, as security is enforced through a standardized baseline setup of the account as well as service control policies (SCPs). Due to syntax constraints, SCPs did not provide a sufficiently granular permission model to enforce all the above-mentioned security requirements in a preventive manner. In light of this, RBI, in concert with AWS Enterprise Support, decided to prepare a bespoke solution based on AWS Step Functions, AWS Lambda, and Amazon Simple Notification Service (Amazon SNS), which would implement continuous monitoring of running sessions for non-compliant configurations.

Solution overview

S3, KMS and SNS resources are deployed in the log archive account residing in a core OU of the AWS Organization. Member accounts in their dedicated OUs contain the resources required for monitoring AWS Session Manager session compliance: AWS Step Functions, AWS Lambda, Amazon SNS and Amazon CloudWatch.

Figure 1: High-level architecture of the solution

To establish a session to an EC2 instance, the following conditions must apply:

  • The EC2 instance must meet all prerequisites for Session Manager, such as having the SSM Agent pre-installed on the operating system (in the case of RBI this is part of a security-hardened “golden AMI”).
  • A predefined IAM Policy needs to be attached to the EC2 instance profile. The policy grants access to the S3 bucket residing in the log archive account, as well as the AWS Key Management Service (AWS KMS) key used for encrypting session data.
  • The default SSM-SessionManagerRunShell session document must be used for launching the session. This document is applied to all sessions initiated through the AWS Management Console as well as sessions started via the AWS CLI command without specifying the session preferences document:

aws ssm start-session --target instance-id

Please note that the SSM-SessionManagerRunShell document itself cannot be modified by member account admins, as this action is restricted by means of an SCP.
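The document requirement can be illustrated with a small check over the CloudTrail record of a StartSession call. This is a minimal sketch, not the solution's actual Lambda code: the function name is hypothetical, and it assumes that the record's requestParameters carry an optional documentName field that is absent when the caller relies on the default document.

```python
DEFAULT_DOCUMENT = "SSM-SessionManagerRunShell"


def uses_default_document(cloudtrail_detail: dict) -> bool:
    """Return True when the session was started with the default document.

    Assumption: a missing documentName means the caller did not override
    the session preferences document.
    """
    params = cloudtrail_detail.get("requestParameters") or {}
    document = params.get("documentName", DEFAULT_DOCUMENT)
    return document == DEFAULT_DOCUMENT
```

A session started with a port forwarding document would carry an explicit documentName and therefore fail this check.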

At the heart of the solution lies an AWS Step Functions state machine responsible for validating that all the above requirements hold true throughout the lifecycle of a session:


Figure 2: AWS Step Functions state machine for continuous session validation

When a user initiates a Session Manager session with an EC2 instance, the following happens:

  1. An Amazon EventBridge rule detects that event and triggers the AWS Step Functions workflow.
  2. The workflow analyzes the event and checks if the user established the session using the SSM-SessionManagerRunShell document. If the answer is negative, it terminates the session.
  3. Next, the state machine proceeds to run a continuous check to validate that the instance profile attached to the EC2 Instance has not been tampered with. It terminates the session otherwise.
  4. Once the user concludes the session, the workflow will perform a final check to validate that session logs have been stored in the log archive account’s S3 bucket. This is a detective “safety net” if an attacker successfully circumvents all preventive measures.
  5. At any stage of the workflow, if the state machine detects session non-compliance, it will publish the following operational signals:
    • a CloudWatch custom metric
    • a notification to an SNS topic in the member account
    • a notification to an SNS topic in the log archive account
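The control flow of the steps above can be approximated in plain Python. This is a simplified sketch rather than the state machine definition shipped with the solution: the individual checks are passed in as callables (all names here are hypothetical), and the returned list stands in for the actions the workflow would take.

```python
from typing import Callable, List


def validate_session(
    used_default_document: bool,
    profile_is_compliant: Callable[[], bool],
    session_is_active: Callable[[], bool],
    logs_exist: Callable[[], bool],
    wait: Callable[[], None] = lambda: None,
) -> List[str]:
    """Walk through the validation steps and return the actions taken."""
    # Step 2: sessions started with any other document are terminated at once.
    if not used_default_document:
        return ["terminate", "notify"]

    # Step 3: re-check the instance profile for as long as the session runs.
    while session_is_active():
        if not profile_is_compliant():
            return ["terminate", "notify"]
        wait()  # stands in for the wait between two consecutive checks

    # Step 4: detective safety net once the session has ended.
    actions: List[str] = []
    if not logs_exist():
        actions.append("notify")
    return actions
```

A compliant session returns an empty list, while a session whose logs never reach the central bucket yields a notification only, since there is no longer a session to terminate.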

If the number of terminated sessions for a member account breaches a predefined threshold, a CloudWatch alarm will fire an additional SNS notification to the log archive account, pointing at a possible security threat. You can leverage all these signals for real-time integration with your local SIEM system, including automated security response. In RBI’s case, an SCP is temporarily applied to the member account to block Session Manager usage.
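The threshold logic behind this alarm can be pictured as a count over a sliding time window. The sketch below is an illustration only: in the solution the evaluation is performed by a CloudWatch alarm on the custom metric, and the strict "greater than" comparison used here is an assumption.

```python
from datetime import datetime, timedelta
from typing import Iterable


def threshold_breached(
    terminations: Iterable[datetime],
    now: datetime,
    threshold: int,
    period: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True when more than `threshold` sessions were terminated
    within the last `period` (assumed comparison: strictly greater than)."""
    window_start = now - period
    recent = sum(1 for moment in terminations if window_start < moment <= now)
    return recent > threshold
```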

Solution walkthrough

Prerequisites

To deploy the solution, you will need an organization in AWS Organizations with at least two member accounts. One of these member accounts will need to be designated as the log archive account, where the central S3 bucket, KMS key and SNS topic will reside. For both of these accounts, you will need an IAM user or role with administrative privileges. Additionally, AWS CloudTrail needs to be configured. You can either set up an organizational trail or create a trail in the member account and region you wish to protect.

Before proceeding to the deployment steps, please make sure to prepare your workstation by:

  1. Installing a Git client, such as GitHub Desktop
  2. Installing the AWS Command Line Interface (AWS CLI)
  3. Installing Python3 and the boto3 library

Steps to deploy

To get the solution up and running, five steps are required:

  1. Deploy an AWS CloudFormation template to the log archive account
  2. Deploy an AWS CloudFormation template to the member account
  3. Configure a Service Control Policy (SCP) for the member account
  4. Configure an EC2 instance for Session Manager in the member account
  5. Attach a ready-made IAM policy to the EC2 Instance profile

Step 1: Deploy AWS CloudFormation template to the log archive account

The solution comes with a ready-made script to deploy the CloudFormation template to a region of your choice. Before running the below instructions make sure to configure your default AWS CLI profile with the security credentials of a designated IAM principal from the log archive account:

# Clone the repository
git clone https://github.com/aws-samples/ssm-monitoring-logging-guardrails-multiaccount.git

# Switch directories
cd ssm-monitoring-logging-guardrails-multiaccount

# Run deployment script (insert AWS Region without quotes)
make deploy-log-archive-stack AWSRegion=<AWSRegion>

Once the CloudFormation stack has been created, please make sure to write down the generated outputs:

  • CentralSSMSessionLoggingS3BucketName – the name of the S3 bucket for session log aggregation
  • CentralSSMSessionMonitoringSecurityComplianceSNSTopicArn – the ARN of the SNS topic for alert notification
  • CentralSSMSessionMonitoringKMSKeyArn – the ARN of the AWS KMS key for encrypting the Systems Manager session channel, SNS topics and S3 objects

You can also use the below script to retrieve the outputs at a later time:

# Get stack outputs from your log archive account
make describe-log-archive-stack-outputs AWSRegion=<AWSRegion>

Before moving to the next step consider setting up an e-mail subscription to the created alerting SNS topic to allow testing of security notifications published by the solution.
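If you prefer to script the subscription, the following sketch shows one way to do it with boto3. It is an example under stated assumptions: the helper name and the e-mail address are hypothetical, and the client must be created with credentials for the log archive account in the region where the stack was deployed.

```python
def subscribe_email(sns_client, topic_arn: str, address: str) -> str:
    """Subscribe an e-mail address to the alerting topic.

    `sns_client` is expected to be a boto3 SNS client, for example
    boto3.client("sns", region_name="<AWSRegion>").
    """
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=address,
        # Return the ARN even while the subscription awaits confirmation.
        ReturnSubscriptionArn=True,
    )
    return response["SubscriptionArn"]
```

The recipient has to confirm the subscription through the e-mail sent by Amazon SNS before any notifications are delivered.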

Step 2: Deploy AWS CloudFormation template to the member account

Pass the values obtained in Step 1 to the script responsible for deploying resources to the member account. Before running the below instructions make sure to configure your default AWS CLI profile with the security credentials of a designated IAM principal from that member account:

# Run deployment script (insert all values without quotes)
make deploy-member-account-stack AWSRegion=<AWSRegion> \
CentralSSMSessionLoggingS3BucketName=<CentralSSMSessionLoggingS3BucketName> \
CentralSSMSessionMonitoringKMSKeyArn=<CentralSSMSessionMonitoringKMSKeyArn> \
CentralSSMSessionMonitoringSecurityComplianceSNSTopicArn=<CentralSSMSessionMonitoringSecurityComplianceSNSTopicArn>
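To avoid copying the three values by hand, you could derive the command arguments from the stack outputs. The sketch below assumes a response shaped like the one returned by the CloudFormation DescribeStacks API (for example via boto3's describe_stacks); both helper names are hypothetical and not part of the repository.

```python
OUTPUT_KEYS = [
    "CentralSSMSessionLoggingS3BucketName",
    "CentralSSMSessionMonitoringKMSKeyArn",
    "CentralSSMSessionMonitoringSecurityComplianceSNSTopicArn",
]


def stack_outputs(describe_stacks_response: dict) -> dict:
    """Flatten the outputs of the first stack into a key/value mapping."""
    stack = describe_stacks_response["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def member_make_arguments(outputs: dict, region: str) -> list:
    """Build the argument list for the member account deployment command."""
    arguments = ["make", "deploy-member-account-stack", f"AWSRegion={region}"]
    # A missing output raises KeyError instead of deploying with a gap.
    arguments += [f"{key}={outputs[key]}" for key in OUTPUT_KEYS]
    return arguments
```

The resulting list can be handed to subprocess.run once your AWS CLI profile points at the member account.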

Once the script completes, you can optionally log into the member account and update the aws-ssm-guardrails-org-member-account stack with additional input parameters for the template, including:

  • SessionManagerIdleSessionTimeout – the inactive SSM Session timeout threshold, set in minutes
  • StepFunctionStateSleepSeconds – the frequency at which the AWS Step Functions state machine validates session compliance
  • TerminatedSessionsAlertTreshold – the acceptable number of terminated sessions within a 5-minute period before an alert is sent to the central SNS topic

Step 3: Configure Service Control Policy (SCP) for the member account

Log into the AWS Organizations management account or Organizations delegated administrator account. Then navigate to the AWS Organizations console and attach the below SCP to the member account from Step 2:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyModifyDocument",
      "Effect": "Deny",
      "Action": [
        "ssm:UpdateDocument",
        "ssm:CreateDocument",
        "ssm:DeleteDocument"
      ],
      "Resource": [
        "arn:aws:ssm:*:*:document/SSM-SessionManagerRunShell"
      ]
    },
    {
      "Sid": "ProtectPolicy",
      "Effect": "Deny",
      "Action": [
        "iam:Create*",
        "iam:Delete*"
      ],
      "Resource": [
        "arn:aws:iam::*:policy/aws-ssm-guardrails-mandatory-policy-*"
      ]
    },
    {
      "Sid": "ProtectAlarm",
      "Effect": "Deny",
      "Action": [
        "cloudwatch:PutMetricAlarm",
        "cloudwatch:Disable*",
        "cloudwatch:Delete*"
      ],
      "Resource": [
        "arn:aws:cloudwatch:*:*:alarm:aws-ssm-monitoring-logging-guardrails-multiaccount"
      ]
    },
    {
      "Sid": "ProtectSNSTopicByArn",
      "Effect": "Deny",
      "Action": [
        "sns:AddPermission",
        "sns:RemovePermission",
        "sns:Create*",
        "sns:Delete*"
      ],
      "Resource": [
        "arn:aws:sns:*:*:aws-ssm-monitoring-logging-guardrails-multiaccount"
      ]
    },
    {
      "Sid": "ProtectStateMachineByArn",
      "Effect": "Deny",
      "Action": [
        "states:DeleteStateMachine",
        "states:CreateStateMachine",
        "states:UpdateStateMachine",
        "states:StopExecution"
      ],
      "Resource": [
        "arn:aws:states:*:*:stateMachine:aws-ssm-monitoring-logging-guardrails-multiaccount-statemachine",
        "arn:aws:states:*:*:execution:aws-ssm-monitoring-logging-guardrails-multiaccount-statemachine:*"
      ]
    },
    {
      "Sid": "ProtectOtherSolutionResourcesByArn",
      "Effect": "Deny",
      "Action": [
        "iam:*",
        "lambda:Delete*",
        "lambda:Put*",
        "lambda:Remove*",
        "lambda:Publish*",
        "lambda:Update*",
        "lambda:AddPermission",
        "lambda:RemovePermission",
        "events:*",
        "cloudformation:*"
      ],
      "Resource": [
        "arn:aws:iam::*:role/solution/*",
        "arn:aws:lambda:*:*:function:check-ssm-session-target-iam-role-compliance-function",
        "arn:aws:lambda:*:*:function:check-ssm-session-status-function",
        "arn:aws:lambda:*:*:function:check-ssm-session-s3-log-existence-function",
        "arn:aws:events:*:*:rule/aws-ssm-monitoring-logging-guardrails-multiaccount*",
        "arn:aws:cloudformation:*:*:stack/aws-ssm-guardrails-org-member-account/*"
      ]
    }
  ]
}

The above can also be achieved with AWS CloudFormation, provided you are currently governing your AWS Organizations resources as code.
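As a further alternative to the console, the policy could be created and attached with a short script. The sketch below uses the AWS Organizations API through boto3 rather than CloudFormation; the helper name, the default policy name and the description are hypothetical, and the client must use credentials for the management account or a delegated administrator account.

```python
import json

# Documented maximum size of a service control policy document.
SCP_MAX_CHARACTERS = 5120


def create_and_attach_scp(
    org_client,
    scp_document: dict,
    account_id: str,
    name: str = "ssm-session-logging-guardrails",  # hypothetical policy name
) -> str:
    """Create the SCP and attach it to one member account.

    `org_client` is expected to be a boto3 Organizations client,
    for example boto3.client("organizations").
    """
    # Compact serialization keeps the document as small as possible.
    content = json.dumps(scp_document, separators=(",", ":"))
    if len(content) > SCP_MAX_CHARACTERS:
        raise ValueError(f"SCP is {len(content)} characters, limit is {SCP_MAX_CHARACTERS}")

    policy = org_client.create_policy(
        Content=content,
        Description="Protects Session Manager logging guardrail resources",
        Name=name,
        Type="SERVICE_CONTROL_POLICY",
    )
    policy_id = policy["Policy"]["PolicySummary"]["Id"]
    org_client.attach_policy(PolicyId=policy_id, TargetId=account_id)
    return policy_id
```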

Step 4: Configure EC2 instance for Session Manager in the member account

In the same region and member account, onboard an EC2 instance as a Systems Manager managed node, based on the following list of prerequisites. Make sure an IAM policy with Session Manager permissions, like AmazonSSMManagedInstanceCore, is attached to the instance profile. If needed, create an IAM user or role with appropriate permissions for starting a session.

As a means of further improving your security posture, please also consider setting up a VPC endpoint for Session Manager, as well as S3 and KMS.

Step 5: Attach IAM policy to the EC2 instance profile

Upon completing Step 2, the following IAM policy should have been created in the protected member account as part of the CloudFormation stack:

arn:aws:iam::${AWS::AccountId}:policy/aws-ssm-guardrails-mandatory-policy-${AWS::Region}

The policy will have the following content:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": [
        "arn:aws:kms:${AWS::Region}:{LogArchiveAccountId}:key/{KeyId}"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:GetEncryptionConfiguration"
      ],
      "Resource": [
        "arn:aws:s3:::central-log-ssm-audit-${AWS::Region}-{OrganizationId}/{MemberAccountId}/*",
        "arn:aws:s3:::central-log-ssm-audit-${AWS::Region}-{OrganizationId}"
      ],
      "Effect": "Allow"
    }
  ]
}

This IAM policy needs to be attached to your EC2 instance profile. Once this is completed, you can test the solution by initiating a session with Session Manager and inspecting the content of the logs generated in the S3 bucket of the log archive account:

central-log-ssm-audit-${AWS::Region}-${OrganizationId}

Additionally, you can simulate malicious activity by removing the above-mentioned policy from the instance profile. Navigate to the appropriate role definition in the IAM console and detach the policy through the “Permissions” tab, as shown in the image below:

The image shows a screenshot of the IAM console, specifically the “Permissions” tab for the role you used as part of your EC2 instance profile. The image highlights the actions you need to take to detach the mandatory solution policy from that role: select the policy and then click “Remove”.

Figure 3: Simulation of attacker activity

This action should result in session termination and a security notification being pushed to your inbox (provided you have configured a subscription to the central SNS alerting topic beforehand).
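The kind of check that triggers this termination can be approximated as follows. This is a simplified sketch and not the code of the solution's Lambda function: it only inspects managed policies, taking a response shaped like the output of the IAM ListAttachedRolePolicies API (for example boto3's list_attached_role_policies), and the helper names are hypothetical.

```python
def mandatory_policy_arn(account_id: str, region: str, partition: str = "aws") -> str:
    """Build the ARN of the mandatory policy for an account and region."""
    return (
        f"arn:{partition}:iam::{account_id}:policy/"
        f"aws-ssm-guardrails-mandatory-policy-{region}"
    )


def role_has_mandatory_policy(
    attached_policies_response: dict, account_id: str, region: str
) -> bool:
    """Return True when the mandatory policy is attached to the role."""
    expected = mandatory_policy_arn(account_id, region)
    attached = attached_policies_response.get("AttachedPolicies", [])
    return any(policy.get("PolicyArn") == expected for policy in attached)
```

Detaching the policy, as in the simulation above, makes this function return False on the next check.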

Clean Up

Step 1: Detach and remove Service Control Policy (SCP)

In order to clean up resources you first need to detach the Service Control Policy (SCP) created in Step 3 of the deployment procedure from the respective member account and remove it. You can do so by logging into the Organization’s management account or a delegated administrator account.

Step 2: Remove resources from the member account

In the next step, clean up resources from the member account using the below script:

# Delete aws-ssm-guardrails-org-member-account stack
make delete-member-account-stack AWSRegion=<AWSRegion>

Step 3: Remove resources from the log archive account

Once the solution has been deleted from all member accounts, run the below command for the log archive account (remember to update AWS CLI credentials to reflect an appropriate IAM principal in that account):

# Delete aws-ssm-guardrails-log-archive-account stack
make delete-log-archive-stack AWSRegion=<AWSRegion>

By default, the cleanup script does not retain any resources upon completion. It empties and deletes both S3 buckets (session logs, access logs) and schedules the KMS key for removal. However, if you set the template parameter “IsProductionDeployment” to “true” during Step 2 of the deployment procedure, data in the S3 buckets is preserved and both S3 and KMS resources are protected by means of a deletion policy.

Additional considerations

Blocking direct SSH access

Please note that the solution relies on additional guardrails to block direct SSH access to EC2 instances. AWS Firewall Manager content audit security group policies are one viable option to achieve this. The details, however, are beyond the scope of this blog post.

Network misconfiguration

A possible additional attack vector involves network misconfiguration of security groups or network ACLs, which could render it impossible for the SSM Agent to reach the central S3 and KMS resources. In such a case, the solution will fire a security incident notification once the session completes, prompting a further investigation by the security team. If required, customers can add more advanced detective or preventive measures to the solution in order to react to these kinds of risks more quickly (through using VPC Reachability Analyzer, for example).

Solution costs

You can balance the cost of AWS resources and security risk exposure by tuning the StepFunctionStateSleepSeconds template parameter value. This parameter determines the frequency at which the solution checks for ongoing session compliance. The cost of monitoring a 10-minute session with the default StepFunctionStateSleepSeconds setting of 15 seconds amounts to around $0.008 (pricing based on the us-east-1 region). Adjusting the parameter value brings per-session costs up or down in proportion to the check frequency.
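For a rough feel of this trade-off, the estimate can be scaled from the figure quoted above. The sketch assumes that cost grows linearly with the number of compliance checks and ignores any fixed per-session overhead, so treat the result as an order-of-magnitude guide rather than a price quote.

```python
# Baseline from the text: 10-minute session, 15-second interval, us-east-1.
BASELINE_COST_USD = 0.008
BASELINE_SESSION_MINUTES = 10
BASELINE_SLEEP_SECONDS = 15


def estimate_session_cost(session_minutes: float, sleep_seconds: float) -> float:
    """Scale the baseline cost by the relative number of compliance checks.

    Assumption: cost is proportional to session length and inversely
    proportional to the sleep interval between checks.
    """
    if session_minutes < 0 or sleep_seconds <= 0:
        raise ValueError("session_minutes must be >= 0 and sleep_seconds > 0")
    checks_ratio = (session_minutes / BASELINE_SESSION_MINUTES) * (
        BASELINE_SLEEP_SECONDS / sleep_seconds
    )
    return BASELINE_COST_USD * checks_ratio
```

Under these assumptions, doubling the interval to 30 seconds halves the estimate for the same session, at the price of a longer window in which a tampered instance profile goes unnoticed.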

CloudTrail integration

The solution relies on AWS CloudTrail and Amazon EventBridge to notify it of a user initiating a new session. Please note that events from CloudTrail are delivered by EventBridge on a best-effort basis. This means that, while CloudTrail attempts to send all events to EventBridge, in some very rare cases an event might not be delivered.

Log retention

For cost efficiency, the code included with this blog post configures lifecycle policies for the central S3 logging bucket, allowing you to set a finite retention period for log objects. However, if your particular security requirements warrant it, consider turning off lifecycle policies and enforcing additional protection with MFA delete.
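To illustrate what a finite retention period looks like at the API level, the sketch below builds an expiration rule in the shape accepted by the S3 PutBucketLifecycleConfiguration API. It is an illustration only: the solution configures its lifecycle policies through CloudFormation, and the helper name and rule ID here are hypothetical.

```python
def expiration_lifecycle(days: int, rule_id: str = "expire-session-logs") -> dict:
    """Build a lifecycle configuration that expires all objects after `days`."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "Expiration": {"Days": days},
            }
        ]
    }
```

With boto3, such a dictionary would be passed as the LifecycleConfiguration argument of put_bucket_lifecycle_configuration.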

Log streaming

By default, the solution leverages an S3 target for logging session data. You can re-configure it though (via a CloudFormation parameter) to make use of delivery to CloudWatch Logs, which you can then centralize for the purpose of near real-time SIEM integration. Please keep in mind the additional log ingestion and storage costs for CloudWatch, in case you decide to leverage this option.

Cloud9 support

You can leverage Session Manager to establish a session to your AWS Cloud9 environment. Please note, however, that this connection requires port forwarding and, as such, will be terminated by the solution discussed above.

Conclusion

Session Manager allows you to access EC2 instances without opening inbound ports, while using IAM roles. In this post, we have shown how to apply additional guardrails to Session Manager to make sure all session activity is logged in a central account for the purposes of threat detection, incident response, forensics, and log archival. Feel free to explore the solution code on GitHub, including additional parametrization options that will help you adapt it to your particular use case.

Mladen Trampic

Mladen is a Senior Technical Account Manager for AWS Enterprise Support EMEA. Mladen helps AWS customers architect for reliability and cost efficiency, striving to improve operational excellence. When not working, Mladen enjoys building computers and learning new things in his home lab together with his daughter.

Alan Pilawa

Alan is a Senior Technical Account Manager working primarily within the Financial Services sector. When he is not helping customers with AWS security, reliability, and operations concerns, he enjoys creating electronic music in his little home studio.