How can I determine which SageMaker notebook instance made a particular API call if all the instances use the same IAM role?

Last updated: 2021-03-17

I have multiple Amazon SageMaker notebook instances. They all use the same AWS Identity and Access Management (IAM) role. The AWS CloudTrail event for each API action shows the same principalId (session name), no matter which notebook instance performed the action. How can I tell which notebook instance performed which API actions?

Short description

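When multiple SageMaker notebook instances share the same IAM role, the CloudTrail event alone doesn't show which notebook instance performed a particular API action, because every instance uses the same session name (SageMaker). To tell the instances apart, have each notebook instance assume the execution role again with its own name as the session name.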

Example:

{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AAAAAAAAAAAAAAAAAA:SageMaker",
        "arn": "arn:aws:sts::111122223333:assumed-role/AmazonSageMaker-ExecutionRole/SageMaker",

Resolution

1.    Create an IAM execution role for the SageMaker notebook instance. Or, use an existing execution role. In the following steps, the Amazon Resource Name (ARN) for the execution role is arn:aws:iam::111122223333:role/service-role/AmazonSageMaker-ExecutionRole.

2.    Attach an IAM policy that includes sts:AssumeRole to the execution role. The sts:AssumeRole action allows the execution role to assume itself using a different session name. Example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::111122223333:role/service-role/AmazonSageMaker-ExecutionRole"
        }
    ]
}
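
The policy above can also be attached from the command line. The following is a minimal sketch, assuming that you add the statement as an inline policy with aws iam put-role-policy. The policy name AssumeSelfWithSessionName and the two function names are hypothetical and are not part of the original procedure.

```shell
#!/bin/bash
# Sketch: build the sts:AssumeRole policy document for a role and attach it inline.
# The policy name and function names are illustrative assumptions.

role_arn="arn:aws:iam::111122223333:role/service-role/AmazonSageMaker-ExecutionRole"

# Print a policy document that lets the role call sts:AssumeRole on itself.
build_self_assume_policy() {
  local arn="$1"
  cat <<JSON
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "$arn"
        }
    ]
}
JSON
}

# Attach the document as an inline policy. This calls IAM, so it is not run automatically.
attach_self_assume_policy() {
  local arn="$1"
  local role_name="${arn##*/}"   # the role name is the last path segment of the ARN
  aws iam put-role-policy \
    --role-name "$role_name" \
    --policy-name "AssumeSelfWithSessionName" \
    --policy-document "$(build_self_assume_policy "$arn")"
}

# Example (requires IAM permissions):
#   attach_self_assume_policy "$role_arn"
```

If the sts:AssumeRole call is still denied after you attach the policy, check whether the trust policy of the role also needs to list the role itself as a trusted principal.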

3.    Create a Start notebook lifecycle configuration script similar to the following. This example script retrieves the notebook instance name and then uses the name as the session name.

#!/bin/bash
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

# Create a new bash script file using the cat command
# The new bash script will be used to set up the cron job at the end of this script
cat >/home/ec2-user/scriptAssumeRole.sh <<'EOF'
#!/bin/bash

set -e

# Obtain the name of the notebook instance
nbname=$(jq -r '.ResourceName' /opt/ml/metadata/resource-metadata.json)

# Use the AWS Command Line Interface (AWS CLI) to obtain the Amazon Resource Name (ARN) of the IAM execution role
nbinfo=$(aws sagemaker describe-notebook-instance --notebook-instance-name "$nbname")
nbrole=$(jq -r '.RoleArn' <<< "$nbinfo")

# Use the AWS CLI to get the new credentials and create a session name based on the notebook instance name
cred=$(aws sts assume-role --role-arn "$nbrole" --role-session-name "$nbname")

# Initialize variables
AccessKeyId=""
SecretAccessKey=""
SessionToken=""

# Obtain individual values from credentials
AccessKeyId=$(jq -r '.Credentials.AccessKeyId' <<< "$cred")
SecretAccessKey=$(jq -r '.Credentials.SecretAccessKey' <<< "$cred")
SessionToken=$(jq -r '.Credentials.SessionToken' <<< "$cred")

# Obtain the Region of the notebook instance
nbregion=$(aws configure get region)

# Obtain the length of each variable for conditional testing
len1=${#AccessKeyId}
len2=${#SecretAccessKey}
len3=${#SessionToken}

# Write credentials to a new config file
cat > /home/ec2-user/.aws/config.new <<EOF1
[default]
region=$nbregion
aws_access_key_id=$AccessKeyId
aws_secret_access_key=$SecretAccessKey
aws_session_token=$SessionToken
EOF1

if [[ ($len1 -gt 0) && ($len2 -gt 0) && ($len3 -gt 0) ]];
then
  # Overwrite the config with a new config file only if credentials are obtained
  echo "Credentials obtained."
  sudo mv /home/ec2-user/.aws/config.new /home/ec2-user/.aws/config
else
  echo "No credentials are available."
fi
EOF
chmod +x /home/ec2-user/scriptAssumeRole.sh

# Now run the script:
echo "Running Assume Role Script"
/home/ec2-user/scriptAssumeRole.sh

echo "Setting up cron job every 15 minutes"
(crontab -l 2>/dev/null; echo "*/15 * * * * /home/ec2-user/scriptAssumeRole.sh") | crontab -
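
After the lifecycle configuration script has run, you can confirm that the new session name is in effect. The following is a minimal sketch, assuming that you run it in a terminal on the notebook instance as ec2-user. The function names are hypothetical; aws sts get-caller-identity returns the ARN of the identity that the AWS CLI is currently using.

```shell
#!/bin/bash
# Sketch: confirm that AWS CLI calls run under a session named after the notebook instance.

# Extract the session name (the last path segment) from an assumed-role ARN.
session_name_from_arn() {
  local arn="$1"
  echo "${arn##*/}"
}

# Compare the active session name with the notebook instance name.
# This calls AWS STS and reads notebook metadata, so it is not run automatically.
check_session_name() {
  local nbname arn
  nbname=$(jq -r '.ResourceName' /opt/ml/metadata/resource-metadata.json)
  arn=$(aws sts get-caller-identity --query 'Arn' --output text)
  if [[ "$(session_name_from_arn "$arn")" == "$nbname" ]]; then
    echo "Session name matches notebook instance: $nbname"
  else
    echo "Session name does not match; current identity is $arn"
    return 1
  fi
}

# Example with the ARN format shown in this article (prints test-2):
session_name_from_arn "arn:aws:sts::111122223333:assumed-role/AmazonSageMaker-ExecutionRole/test-2"
```

On the notebook instance, call check_session_name to run the comparison against the live credentials.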

4.    Create a SageMaker notebook instance and attach the lifecycle configuration script that you created in the previous step. For this example, assume that the notebook instance is named test-2.
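
If you prefer the AWS CLI to the console for this step, the following sketch shows one possible approach. The function names, the lifecycle configuration name that you pass in, and the ml.t3.medium instance type are assumptions for illustration only. The script content must be base64-encoded before it is passed to the lifecycle configuration API.

```shell
#!/bin/bash
# Sketch: register the start script as a lifecycle configuration and create a
# notebook instance that uses it. Names and instance type are illustrative.

# Base64-encode a script file on a single line.
encode_script() {
  base64 < "$1" | tr -d '\n'
}

# Create the lifecycle configuration, then the notebook instance.
# This calls SageMaker, so it is not run automatically.
create_notebook_with_lifecycle_config() {
  local script_file="$1" config_name="$2" notebook_name="$3" role_arn="$4"
  local encoded
  encoded="$(encode_script "$script_file")"

  aws sagemaker create-notebook-instance-lifecycle-config \
    --notebook-instance-lifecycle-config-name "$config_name" \
    --on-start "[{\"Content\":\"$encoded\"}]"

  aws sagemaker create-notebook-instance \
    --notebook-instance-name "$notebook_name" \
    --instance-type "ml.t3.medium" \
    --role-arn "$role_arn" \
    --lifecycle-config-name "$config_name"
}

# Example (requires SageMaker permissions; on-start.sh holds the script from step 3):
#   create_notebook_with_lifecycle_config on-start.sh assume-role-config test-2 \
#     "arn:aws:iam::111122223333:role/service-role/AmazonSageMaker-ExecutionRole"
```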

5.    To identify the notebook instance that performed an API action, check the CloudTrail event. Under the userIdentity object, the principalId and arn show the notebook instance name. For example, the following event detail shows that the SageMaker notebook instance named test-2 made the API call.

{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AAAAAAAAAAAAAAAAAAAA:test-2",
        "arn": "arn:aws:sts::111122223333:assumed-role/AmazonSageMaker-ExecutionRole/test-2",
        "accountId": "111122223333",
        "accessKeyId": "AAAAAAAAAAAAAAAAAAAA",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "AAAAAAAAAAAAAAAAAAAA",
                "arn": "arn:aws:iam::111122223333:role/service-role/AmazonSageMaker-ExecutionRole",
                "accountId": "111122223333",
                "userName": "AmazonSageMaker-ExecutionRole"
            },
            "webIdFederationData": {},
            "attributes": {
                "mfaAuthenticated": "false",
                "creationDate": "2020-09-12T00:45:04Z"
            }
        },
        "invokedBy": "im.amazonaws.com"
    },
    "eventTime": "2020-09-12T00:49:04Z",
    "eventSource": "sagemaker.amazonaws.com",
    "eventName": "CreateEndpoint",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "im.amazonaws.com",
    "userAgent": "im.amazonaws.com",
    "requestParameters": {
        "endpointName": "sagemaker-mxnet-ep",
        "endpointConfigName": "sagemaker-mxnet-epc",
        "tags": []
    },
    "responseElements": {
        "endpointArn": "arn:aws:sagemaker:us-east-1:111122223333:endpoint/sagemaker-mxnet-ep"
    },
    "requestID": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "eventID": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "eventType": "AwsApiCall",
    "recipientAccountId": "111122223333"
}
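
To review many events at once, you can extract the session name from each event instead of reading the events one by one. The following is a minimal sketch, assuming that you saved the output of aws cloudtrail lookup-events to a file and that jq is installed. The function names and the events.json file name are hypothetical.

```shell
#!/bin/bash
# Sketch: map CloudTrail identities back to notebook instance names.

# The session name is the part of principalId after the colon.
session_from_principal_id() {
  local principal_id="$1"
  echo "${principal_id#*:}"
}

# Print "<eventName> <session name>" for each event in a file that holds the
# output of `aws cloudtrail lookup-events`. Requires jq.
list_callers() {
  local events_file="$1"
  jq -r '.Events[].CloudTrailEvent | fromjson
         | [.eventName, .userIdentity.principalId] | @tsv' "$events_file" |
  while IFS=$'\t' read -r event_name principal_id; do
    echo "$event_name $(session_from_principal_id "$principal_id")"
  done
}

# Example (requires CloudTrail permissions):
#   aws cloudtrail lookup-events \
#     --lookup-attributes AttributeKey=EventName,AttributeValue=CreateEndpoint > events.json
#   list_callers events.json
```

For the event shown above, session_from_principal_id returns test-2, which is the notebook instance name.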
