Viewing permission issues with service-linked roles

Each AWS service requires explicit access to resources, endpoints, and objects that reside in the domain of another service. This is referred to as the permission boundary. Services like AWS Config, Amazon Macie, and Amazon GuardDuty require an AWS Identity and Access Management (IAM) role that grants access to resources outside of their control. Understanding the actions that an IAM role grants (or restricts) to other objects in your AWS environment is crucial to maintaining your security posture and healthy operations.

For many customers, creating a service-linked role with the default permission set is adequate. Typically, a service-linked role is created when you initialize that service for the first time. However, customers who operate in heavily regulated industries, such as financial services or law enforcement, often have more granular permissions applied to their resources. In some use cases, a customer might have configured resources in their account so that these service-linked roles are denied access to them. This prevents the affected services from functioning as intended. A mechanism to detect this sort of misconfiguration (whether performed deliberately or by accident) can be very useful for customers with the most stringent security requirements.

In this post, I will show you how to detect resources that cannot be accessed by service-linked roles and create a proactive mechanism to alert administrators when they are discovered. For a complete list of AWS services that use service-linked roles, see AWS services that work with IAM in the AWS Identity and Access Management User Guide.

Solution overview

When an AWS service that uses a service-linked role attempts to access resources that belong to another service (such as Amazon Simple Storage Service (Amazon S3) buckets or Amazon Elastic Compute Cloud (Amazon EC2) instances), the attempt is recorded in AWS CloudTrail. The details of this log entry include the Amazon Resource Name (ARN) of the calling role, the action attempted, and the error code raised, if any.

CloudTrail can emit its log data into Amazon CloudWatch Logs, and once consumed, these logs can be converted into a metric using a metric filter. With this metric in place, you can treat events counted by CloudWatch Logs just like any other metric. You can also create CloudWatch alarms that notify administrators if these services cannot access resources. You can even extend this approach to include executing AWS Lambda functions or sophisticated remediation actions.

Later in this post, I discuss CloudWatch Logs Insights, a query engine that you can use to inspect, aggregate, and analyze your logging data.

Figure 1 shows the services used in this solution. Amazon EC2, Amazon S3, and IAM are included here as examples.

Common AWS services include Amazon EC2, Amazon S3, and IAM. Services like CloudTrail and CloudWatch are used to monitor for access errors by service-linked roles.

Figure 1: Solution overview

This solution does not resolve the issue of misconfigured resources independently. Instead, it presents a detective control that enables awareness of resources being blocked, counts access errors related to them, and guides how to investigate them when detected.

Prerequisite: CloudTrail setup

You need a CloudTrail trail that is delivering data to a CloudWatch log group.

To see if you have a trail set up in your environment:

Go to the AWS CloudTrail console, and choose Trails.
If you have no trails, follow the steps in the next procedure to create one. Otherwise, go to Configure an existing trail to deliver to CloudWatch Logs.

To help you make decisions about AWS Key Management Service encryption, organizational trails, log file validation, server-side encryption, and Amazon Simple Notification Service delivery for your trail, see the CloudTrail tutorial in the AWS CloudTrail User Guide.

Dashboard in the CloudTrail console lists trails, event history, and Contributor Insights.

Figure 2: AWS CloudTrail dashboard

Trails page in the CloudTrail console displays columns for trail name, home Region, multi-Region trail, insights, organization trail, S3 bucket, log file prefix, log group, and status.

Figure 3: List of trails in the AWS CloudTrail console

To create a trail:

In the AWS CloudTrail console, choose Create Trail.
Under CloudWatch Logs, select Enabled.
For Log group name, use the default.
Under IAM role, enter a name, and then choose Next.

The CloudWatch Logs section includes an option to enable CloudWatch Logs, use a new or existing log group, use a new or existing IAM role, and enter tags for resources.

Figure 4: Enabling CloudWatch Logs in the AWS CloudTrail console

On the Choose log events page, under Management events, select Read and Exclude AWS KMS events.
Choose Next, and then choose Create trail.

Choose log events provides options to select event type (management, data, insight) and choose the API activity to log (read, write, exclude KMS events).

Figure 5: Configuring log events in the AWS CloudTrail console

Configure an existing trail to deliver to CloudWatch Logs

To see if you have enabled delivery to CloudWatch Logs:

In the CloudTrail console, choose Trails.
In the list of trails, check the CloudWatch Logs log group column to see if delivery to CloudWatch is enabled.
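
The same check can be sketched in code. The following is a minimal Python example, assuming the AWS SDK for Python (boto3) is installed and credentials are configured; the helper names trails_missing_log_group and fetch_trails are hypothetical and not part of any AWS tooling. It relies on the CloudWatchLogsLogGroupArn field that the CloudTrail DescribeTrails API returns for trails that deliver to CloudWatch Logs.

```python
from typing import Any


def trails_missing_log_group(trails: list[dict[str, Any]]) -> list[str]:
    """Return the names of trails that have no CloudWatch Logs log group set."""
    return [
        trail.get("Name", "<unnamed>")
        for trail in trails
        if not trail.get("CloudWatchLogsLogGroupArn")
    ]


def fetch_trails() -> list[dict[str, Any]]:
    """Fetch trail descriptions for the current account and Region."""
    # Imported lazily so the helper above can be used without the SDK installed.
    import boto3

    return boto3.client("cloudtrail").describe_trails()["trailList"]
```

Calling trails_missing_log_group(fetch_trails()) would then list the trails that still need the configuration described in the next procedure.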

Trails page of the CloudTrail console includes a column for CloudWatch Logs log group.

Figure 6: Trails page of the CloudTrail console

If your trail does not have a log group present, follow these steps:

In the console, choose the trail name.
In the CloudWatch Logs section, choose Edit.

The General details page for a trail includes information about trail name, trail log location, SNS notification delivery, last log file delivered, and more. The CloudWatch Logs section includes an Edit button.

Figure 7: Enabling CloudWatch Logs for a trail

Under CloudWatch Logs, select Enabled.
For Log group name, use the default.
Under IAM Role, enter a name for the IAM role.

Trail page displays a CloudWatch Logs section with options to use a new or existing log group and a new or existing IAM role.

Figure 8: Enable CloudWatch Logs in AWS CloudTrail

Step 1: Create the metric filter

Now I’ll create a metric filter that counts each occurrence of a failed request by an AWS service using a service-linked role.

In the CloudWatch console, choose Log groups.

Figure 9 shows the log group created in the previous step. The console page might look different to you, depending on the number of log groups in your environment.

Log groups page includes columns for retention, metric filters, Contributor Insights, and more.

Figure 9: List of CloudWatch log groups

Open the log group that matches the one in your CloudTrail configuration.

Log group details include retention (Never expire), creation time (10 minutes ago), stored bytes, ARN, KMS key ID, metric filters, and more.

Figure 10: CloudWatch log streams

To create a metric:

Choose Metric filters, and then choose Create metric filter.

Define pattern page includes Create filter pattern and Test pattern sections.

Figure 11: Define pattern page in the Amazon CloudWatch console

For Filter pattern, enter the following, and then choose Next.

{ $.eventName = Get* && $.errorCode = AccessDenied && $.userIdentity.sessionContext.sessionIssuer.arn = "arn:aws:iam::*:role/aws-service-role/*" }
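
To reason about which events this pattern selects, it can help to approximate it locally. The sketch below is only an emulation written for illustration, not how CloudWatch Logs evaluates patterns: it uses Python's fnmatch module to stand in for the * wildcard, and the function names get_path and matches_filter are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Any

SERVICE_ROLE_ARN_GLOB = "arn:aws:iam::*:role/aws-service-role/*"


def get_path(event: dict[str, Any], dotted: str) -> Any:
    """Walk a dotted path such as 'userIdentity.sessionContext.sessionIssuer.arn'."""
    node: Any = event
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def matches_filter(event: dict[str, Any]) -> bool:
    """Approximate the three conditions joined by && in the filter pattern."""
    name = get_path(event, "eventName")
    code = get_path(event, "errorCode")
    arn = get_path(event, "userIdentity.sessionContext.sessionIssuer.arn")
    return bool(
        isinstance(name, str)
        and fnmatchcase(name, "Get*")  # $.eventName = Get*
        and code == "AccessDenied"  # $.errorCode = AccessDenied
        and isinstance(arn, str)
        and fnmatchcase(arn, SERVICE_ROLE_ARN_GLOB)  # issuer is a service-linked role
    )
```

Note that the pattern only counts Get* calls, so a denied List* or Describe* call would not match it. The Test pattern section of the console remains the authoritative way to validate the pattern against real log data.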

Assign metric page provides fields for filter name and pattern and a section for metric details.

Figure 12: Assigning metrics in the Amazon CloudWatch console

For Metric namespace, I use Local, which gives me a convenient view for all of my locally created metrics (as opposed to AWS-managed metrics).
For Metric name, enter a name.
For Metric value, enter 1.
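For Default value, enter 0. (This is the value reported for periods with no matching log events, so a nonzero default would keep the alarm in Step 2 permanently in breach.)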
Choose Next, and then on the review page, create the metric filter.
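
If you prefer to script this step, the following is a rough sketch using the PutMetricFilter API through boto3, assuming the SDK is installed and credentials are configured. The function names and the filter and metric names passed in are placeholders of my choosing. The default value is set to 0 here so that periods with no matching events report zero rather than counting as failures.

```python
from typing import Any

# The same filter pattern entered in the console above.
FILTER_PATTERN = (
    "{ $.eventName = Get* && $.errorCode = AccessDenied && "
    "$.userIdentity.sessionContext.sessionIssuer.arn = "
    '"arn:aws:iam::*:role/aws-service-role/*" }'
)


def metric_filter_params(
    log_group_name: str,
    filter_name: str,
    metric_name: str,
    namespace: str = "Local",
    default_value: float = 0.0,
) -> dict[str, Any]:
    """Build the keyword arguments for the CloudWatch Logs PutMetricFilter call."""
    return {
        "logGroupName": log_group_name,
        "filterName": filter_name,
        "filterPattern": FILTER_PATTERN,
        "metricTransformations": [
            {
                "metricName": metric_name,
                "metricNamespace": namespace,
                "metricValue": "1",  # each matching log event counts as 1
                "defaultValue": default_value,  # reported when nothing matches
            }
        ],
    }


def create_metric_filter(params: dict[str, Any]) -> None:
    """Create (or overwrite) the metric filter on the log group."""
    import boto3  # lazy import: only needed when actually calling AWS

    boto3.client("logs").put_metric_filter(**params)
```

Keeping the parameters in a separate function makes it easy to review exactly what will be sent before anything is created in your account.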

Step 2: Create a CloudWatch alarm

Now that you have created a metric filter, every occurrence of a failed API call by a service-linked role will result in an incremental datapoint delivered to CloudWatch metrics.

To view this metric and create an alarm:

In the left navigation pane, choose CloudWatch, and then choose Metrics.
Choose the namespace you used (in my example, Local) and then create a CloudWatch alarm by choosing the Create alarm action on the right side of the page.

View of the CloudWatch dashboard displaying the metric we created

Figure 13: Creating a CloudWatch alarm

Enter these parameters for your alarm:

Statistic: Sum
Period: 1 hour
Threshold type: Static
Whenever [metric name] is: Greater/Equal
Than: 1

Expand Additional configuration and use these parameters:

Datapoints to alarm: 1 out of 1
Missing data treatment: Treat missing data as good (not breaching threshold)

Figure 14 shows the completed page:

Conditions section displays options for threshold type (static and anomaly detection) and fields for defining the alarm condition.

Figure 14: Setting conditions for CloudWatch alarms

Choose Next, and on the Configure actions page, choose the actions that your alarm will perform.

The simplest example is to send messages to Amazon SNS (for example, to deliver email to an administrator). I am leaving the precise delivery details to your imagination, but if you are not familiar with the options on this page, I suggest delivering to an Amazon SNS topic.

Click through to the Preview and Create pages, and then choose Create alarm.
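
The alarm settings above can also be expressed as parameters to the CloudWatch PutMetricAlarm API. This is a minimal sketch, assuming boto3 is available; the function names are hypothetical, and the alarm name, metric name, and Amazon SNS topic ARN are values you would supply yourself.

```python
from typing import Any, Optional


def alarm_params(
    alarm_name: str,
    metric_name: str,
    namespace: str = "Local",
    sns_topic_arns: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build keyword arguments for PutMetricAlarm that mirror the console settings."""
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": 3600,  # 1 hour, in seconds
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,  # 1 out of 1
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",  # Greater/Equal
        "TreatMissingData": "notBreaching",  # treat missing data as good
        "AlarmActions": list(sns_topic_arns or []),
    }


def create_alarm(params: dict[str, Any]) -> None:
    """Create (or update) the alarm in the current account and Region."""
    import boto3  # lazy import: only needed when actually calling AWS

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

With these values, a single denied request within an hour is enough to move the alarm into the ALARM state and trigger the configured actions.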

Step 3: Search for failed requests

Now that you have a metric filter recording every occurrence of an API error from service-linked roles, and an Amazon CloudWatch alarm alerting you to these events, you can actively search for them using Amazon CloudWatch Logs Insights.

To search for failed requests:

In the Amazon CloudWatch console, choose Insights. From here, you can query multiple log groups directly by using a search syntax that will target the log entries you need.
Choose the log group or groups that you need to query, and then enter the following:

filter errorCode = 'AccessDenied'
| filter userIdentity.sessionContext.sessionIssuer.arn like 'aws-service-role'
| fields @timestamp, eventSource, eventName, userIdentity.sessionContext.sessionIssuer.arn, @message
| sort @timestamp desc

Insights page displays results that include timestamp, event source, event name, and more.

Figure 15: Search with Amazon CloudWatch Logs Insights

You can now see every occurrence of an API error code and expand the details of those requests to evaluate the root cause. You can even export these results or add them to a dashboard.
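
The same query can be run outside the console. The sketch below assumes boto3 is installed and uses the CloudWatch Logs StartQuery and GetQueryResults APIs; the function names, the 24-hour default window, and the polling interval are illustrative choices of mine rather than recommendations.

```python
import time
from typing import Any

QUERY = "\n".join(
    [
        "filter errorCode = 'AccessDenied'",
        "| filter userIdentity.sessionContext.sessionIssuer.arn like 'aws-service-role'",
        "| fields @timestamp, eventSource, eventName, "
        "userIdentity.sessionContext.sessionIssuer.arn, @message",
        "| sort @timestamp desc",
    ]
)


def rows_to_dicts(results: list[list[dict[str, str]]]) -> list[dict[str, str]]:
    """Flatten result rows of {'field': ..., 'value': ...} cells into plain dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(log_group_name: str, hours: int = 24, poll_seconds: float = 1.0) -> list[dict[str, str]]:
    """Start the query, poll until it finishes, and return the rows as dicts."""
    import boto3  # lazy import: only needed when actually calling AWS

    logs = boto3.client("logs")
    end = int(time.time())
    query_id = logs.start_query(
        logGroupName=log_group_name,
        startTime=end - hours * 3600,
        endTime=end,
        queryString=QUERY,
    )["queryId"]

    while True:
        response: dict[str, Any] = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"Query ended with status {response['status']}")
    return rows_to_dicts(response["results"])
```

Each returned dictionary is keyed by the field names from the query, which makes the results straightforward to export or post-process.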

Here is an example of a CloudTrail event for a blocked S3 bucket evaluation by AWS Config. Although your events will be similar, the details will vary based on the service and event.

{
  "eventVersion": "1.05",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "REMOVED:AWSConfig-Describe",
    "arn": "arn:aws:sts::REMOVED:assumed-role/AWSServiceRoleForConfig/AWSConfig-Describe",
    "accountId": "REMOVED",
    "accessKeyId": "REMOVED",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "principalId": "REMOVED",
        "arn": "arn:aws:iam::REMOVED:role/aws-service-role/config.amazonaws.com/AWSServiceRoleForConfig",
        "accountId": "REMOVED",
        "userName": "AWSServiceRoleForConfig"
      },
      "webIdFederationData": {},
      "attributes": {
        "mfaAuthenticated": "false",
        "creationDate": "2020-11-05T01:11:44Z"
      }
    },
    "invokedBy": "AWS Internal"
  },
  "eventTime": "2020-11-05T01:11:46Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "GetBucketLocation",
  "awsRegion": "us-west-2",
  "sourceIPAddress": "AWS Internal",
  "userAgent": "AWS Internal",
  "errorCode": "AccessDenied",
  "errorMessage": "Access Denied",
  "requestParameters": {
    "bucketName": "REMOVED",
    "location": "",
    "Host": "REMOVED.s3.us-west-2.amazonaws.com"
  },
  "responseElements": null,
  "additionalEventData": {
    "SignatureVersion": "SigV4",
    "CipherSuite": "ECDHE-RSA-AES128-SHA",
    "bytesTransferredIn": 0,
    "AuthenticationMethod": "AuthHeader",
    "x-amz-id-2": "REMOVED",
    "bytesTransferredOut": 243
  },
  "requestID": "1A2668605F69DE1F",
  "eventID": "7cb234a0-e370-4039-aee8-b473a5945031",
  "readOnly": true,
  "resources": [
    {
      "accountId": "REMOVED",
      "type": "AWS::S3::Bucket",
      "ARN": "arn:aws:s3:::REMOVED"
    }
  ],
  "eventType": "AwsApiCall",
  "recipientAccountId": "REMOVED",
  "vpcEndpointId": "vpce-REMOVED"
}
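
When triaging events like this one, the fields that usually matter are the service that owns the role, the API call that was denied, and the resource involved. The following is a small illustrative helper (the name summarize_denied_event is hypothetical) that pulls those out of an event shaped like the example above. It assumes the role ARN follows the aws-service-role/<service principal>/<role name> layout seen in that example.

```python
from typing import Any


def summarize_denied_event(event: dict[str, Any]) -> dict[str, Any]:
    """Extract the calling service, role, API call, error, and resource ARNs."""
    issuer_arn = event["userIdentity"]["sessionContext"]["sessionIssuer"]["arn"]
    # Expected layout: arn:...:role / aws-service-role / <service principal> / <role name>
    parts = issuer_arn.split("/")
    is_service_linked = len(parts) >= 4 and parts[1] == "aws-service-role"
    return {
        "service": parts[2] if is_service_linked else None,
        "role": parts[-1],
        "api_call": f'{event["eventSource"]}:{event["eventName"]}',
        "error": event.get("errorCode"),
        "resources": [r["ARN"] for r in event.get("resources") or [] if "ARN" in r],
    }
```

For the event above, this would report config.amazonaws.com as the service, s3.amazonaws.com:GetBucketLocation as the API call, and the S3 bucket ARN as the blocked resource.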

Cleaning up

To clean up, delete the resources created in this blog post in the following order:

If you created a new CloudTrail trail, delete it from the CloudTrail console. Note that it is our best practice to always retain at least one trail of management events, so be careful not to leave your account without any trails. See Security Best Practices in AWS CloudTrail for more information.
Using the CloudWatch console, delete the CloudWatch Logs metric filter.
Delete the CloudWatch alarm created from this metric filter.
Delete the CloudWatch Logs log group you created above.

There is no need to delete the metric created by the metric filter, because metrics like this do not incur ongoing charges and will be cleaned up automatically.
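
For completeness, here is a sketch of the CloudWatch portion of the cleanup in Python, assuming boto3 is installed; the function names and the resource names you pass in are placeholders. It deliberately leaves the trail alone, given the caution above about always retaining at least one trail.

```python
from typing import Any

Step = tuple[str, str, dict[str, Any]]


def cleanup_plan(log_group_name: str, filter_name: str, alarm_name: str) -> list[Step]:
    """Return (service, method, kwargs) steps in the order listed above."""
    return [
        ("logs", "delete_metric_filter", {"logGroupName": log_group_name, "filterName": filter_name}),
        ("cloudwatch", "delete_alarms", {"AlarmNames": [alarm_name]}),
        ("logs", "delete_log_group", {"logGroupName": log_group_name}),
    ]


def run_cleanup(plan: list[Step]) -> None:
    """Execute each step against the current account and Region."""
    import boto3  # lazy import: only needed when actually calling AWS

    clients: dict[str, Any] = {}
    for service, method, kwargs in plan:
        if service not in clients:
            clients[service] = boto3.client(service)
        getattr(clients[service], method)(**kwargs)
```

Printing the plan before calling run_cleanup gives you a chance to confirm the names, because deleting a log group also deletes the log data stored in it.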

Conclusion

In this post, I have shown how to detect, alert on, and search for attempts by a service-linked role to access AWS resources that it has been blocked from. This solution will be of particular interest to customers that use services such as Macie, GuardDuty, AWS Config, or other services that rely on a service-linked role to perform their function. This solution is extensible and can become part of a broader compliance approach.

About the author

Rich McDonough is a Solutions Architect for Amazon Web Services based in Toronto. His primary focus is on Management and Governance, helping customers scale their use of AWS safely and securely. Before joining AWS in 2018, he specialized in helping migrate customers into the cloud. Rich loves helping customers learn about AWS CloudFormation, AWS Config, and AWS Control Tower.

AWS Cloud Operations & Migrations Blog