How to create custom alerts with Amazon Macie

June 15, 2020: This blog is out of date. Please refer here for the updated info: https://aws.amazon.com/blogs/aws/new-enhanced-amazon-macie-now-available/

Amazon Macie is a security service that makes it easy for you to discover, classify, and protect sensitive data in Amazon Simple Storage Service (Amazon S3). Macie collects AWS CloudTrail events and Amazon S3 metadata such as permissions and content classification. In this post, I’ll show you how to use Amazon Macie to create custom alerts for those data sets to notify you of events and objects of interest. I’ll go through the various types of data you can find in Macie, and talk about how to identify fields that are relevant to a given security use case. What you’ll learn is a method of investigation and alerting that you’ll be able to apply to your own situation.

If you’re totally new to Macie, make sure you first head over to the AWS Blog launch post for instructions on how to get started.

Understand the types of data in Macie

There are three sources of data found in Macie: CloudTrail events, S3 Bucket metadata, and S3 Object metadata. Each data source is stored separately in its own index. Therefore, when writing queries it’s important to keep fields from different sources in separate queries.

Within each data source there are two types of fields: Extracted and Generated. Extracted fields come directly from pre-existing AWS fields associated with that data source. For example, Extracted fields from the CloudTrail data source come directly from the events recorded in the CloudTrail logs that correspond to the actions taken by an IAM user, role, or an AWS service. See these CloudTrail log file examples for more details.

Generated fields, on the other hand, are created by Macie. These fields provide additional security value and context for creating more powerful queries. For example, the Internet Service Provider (ISP) field is populated by taking the IP addresses from a CloudTrail event and matching them against a global service provider list. This enables searching for the ISP associated with an API call.

In the sections below, I’ll explore each of the three data sources in more detail. There are reference guides in the Macie documentation dedicated to each source that provide an extensive list of fields and descriptions. I’ll use these lists to discover what fields are appropriate for investigating a particular security use case.

Choose a security use case to instrument alerts

For the purposes of this post, I’ll choose one particular security use case to focus on. The process of discovering relevant data fields collected by Macie and turning those into custom alerts will be the same for any use case you wish to investigate. With that in mind, let’s choose the theme of sensitive or critical data stored in S3 to explore because it can affect many AWS customers.

When beginning to design alerts, the first step is to think about all of the resources, attributes, actions, and identities related to the subject. In this case, I’m looking at sensitive or critical data stored in S3, so the following are some potentially useful fields of data to consider:

S3 bucket and object resources
S3 configuration and security attributes
Read, write, and delete actions on S3
IAM users, roles, and access policies associated with the S3 resources

I’ll use this list to guide my search for relevant fields in the Macie reference documentation. First, I’ll dive into the CloudTrail data.

Explore CloudTrail data

The best way to build an understanding of what activities are happening in your AWS environment is by using the CloudTrail data source because it contains your AWS API calls and the identities that made them. If you’re unfamiliar with CloudTrail, head over to the AWS CloudTrail documentation for more information.

Ok, I have a critical S3 bucket and I want to be notified each time an object write is attempted to it outside of the typical access pattern used in my corporate environment. In this case, write actions to the bucket are normally controlled by my organization’s own identity system via federated access. So that means I want to write a query that searches for object write attempts by a non-federated principal. If you’re unfamiliar with what federation is, that’s ok, you can still follow along. If you’re curious about federation, see the IAM Identity Providers and Federation documentation.

I begin by opening the Macie CloudTrail Reference documentation and looking at the section titled “CloudTrail Data Fields Extracted by Macie.” Skimming over the list, I see that the objectsWritten.key description matches my criteria for investigating actions on an S3 resource: “A list of S3 objects’ ARNs that were part of a PutObject, CopyObject, or CompleteMultipartUpload API calls.”

Next, I take a look at the CloudTrail field names that are extracted from the userIdentity object in an event. The userIdentity.type field looks promising, but I’m unsure what values this field can accept. Using the CloudTrail userIdentity Element documentation as a reference, I look up the type field and see that FederatedUser is listed as one of the values.

Great! I have all of the information necessary to write my first custom query. I’ll search for all “user sessions” (a 5-minute aggregation of CloudTrail events corresponding to a user) that have both an attempted write to my S3 bucket and any API call with a userIdentity type which is not FederatedUser. It’s important to note that I’m searching groups of events and not individual events because we’re looking for events related to a specific API call rather than all events related to all API calls. To match values which do not equal FederatedUser, I use a regular expression and place FederatedUser inside parenthesis with a tilde character (~) at the beginning. Here’s what I came up with using the corresponding Macie field names and example search queries as a formatting guide:

objectsWritten.key:/arn:aws:s3:::my_sensitive_bucket.*/ AND userIdentityType.key:/(~FederatedUser)/

Now, it’s time to test this query on the Research page and turn it into a custom alert.

Create custom alerts

The first step to create a custom alert is to run the proposed query from the Research page. I do this so I can verify that the results match my expectations and to get an idea of how often the alert will occur. Once I’m happy with the query, setting up a custom alert is only a few clicks away.

In the Macie navigation pane, select Research.
Select the data source matching the query from the options in the drop-down menu. In this case, I select CloudTrail data.

Figure 1: Select the data source on the Research page
To run the search, I copy my custom query, paste it in the query box, and then select Enter.objectsWritten.key:/arn:aws:s3:::my_sensitive_bucket.*/ AND userIdentityType.key:/(~FederatedUser)/
At this point, I verify that the results match my expectations and make any necessary modifications to the query. To learn more about how the features of the Research page can help with verifying and modifying queries, see Using the Macie Research Tab.
Once I’m confident in the results, I select the Save query as alert icon.

Figure 2: Select the “Save query as alert” icon
Now, I fill in the remaining fields: Alert title, Description, Min number of matches, and Severity. For more information about each of these fields, see Macie Adding New Custom Basic Alerts.

Figure 3: Fill in the remaining “Alert title,” “Description,” “Min number of matches,” and “Severity” fields
In the Whitelisted users field, I add any users which appeared in my Research page results that I would like to exclude from the alert. For more details on this feature, see Whitelisting Users or Buckets for Basic Alerts.
Finally, I save the alert.

That’s it! I now have a custom basic alert that will alert me every time there’s an attempt to write to my bucket in the same user session as an API call with a userIdentityType other than FederatedUser. Now, I’ll use the S3 object and S3 bucket data sources to look for some more useful fields.

Use S3 object data

The S3 object data source contains fields extracted from S3 API metadata, as well as fields populated by Macie relating to content classification. To expand my search and alerts for sensitive or critical data stored in S3, I’ll look at the S3 Object Data Reference documentation and look for fields that are promising candidates.

I see that S3 Server Side Encryption Settings metadata could be useful because I know that all of the objects in a certain bucket should be encrypted using AES256, and I’d like to be notified every time an object is uploaded that doesn’t match that attribute.

To create the query, I combine the server_encryption field and the bucket field to match on all S3 objects within the specified bucket. Note the forward slashes and “.*” that make this a regular expression search. This allows me to match all buckets that share my project name, even when the full bucket name is different.

filesystem_metadata.bucket:/.*my_sensitive_project.*/ AND NOT filesystem_metadata.server_encryption:”AES256″

Next, I have a different bucket that I know should never have any objects which contain personally identifiable information (PII). This includes such information as names, email addresses, and credit card numbers. For the full list of what Macie considers PII, see Classifying Data with Amazon Macie. I’d like to set up an alert that notifies me every time an object containing any type of PII is added to this bucket. Since this is a data field that’s provided by Macie, I look under the Generated Fields heading and find the field pii_impact. I’m looking for all levels of PII impact, so my query will search for any value which isn’t equal to none.

As before, I’ll combine this with the bucket field to include all S3 objects matching the bucket name.

filesystem_metadata.bucket:”my_logs_bucket” AND NOT pii_impact:”none”

Use S3 bucket data

The S3 bucket data source contains information extracted from S3 bucket APIs, as well as fields that Macie generates by processing bucket metadata and access control lists (ACLs). Following the same method as before, I head over to the S3 Bucket Data Reference documentation and look for fields that will help me create useful alerts.

There are plenty of fields which could be useful here, depending on how much information I know in advance about my bucket and which type of security vector I want to protect against. To narrow my search, I decide to add some protection against accidental or unauthorized data destruction.

In the S3 Versioning section, I see that the Multi-Factor Authentication Delete settings are one of the available fields. Since I have this bucket locked down to only allow MFA delete actions, I can create an alert to notify me every time this delete action is disabled.

bucket:”my_critical_bucket” AND versioning.MFADelete:”disabled”

Another method of potential data destruction is through the bucket lifecycle expiration settings automatically removing data after a period of time. I’d like to know if someone changes this to a low number so I can make a modification and prevent losing any recent data.

bucket:”my_critical_bucket” AND lifecycle_configuration.Rules.Expiration.Days:<3

Now that I have gathered another set of potential alert queries, I can walk through the same steps I used for CloudTrail data to turn them into custom alerts. Once saved, I’ll begin to receive notifications in the Macie console whenever a match is found. I can view the alerts I’ve received by selecting Alerts from the Macie navigation pane.

Summary

I described the various types and sources of data available in Macie. After demonstrating how to take a security use case and discover relevant fields, I stepped through the process of creating queries and turning them into custom alerts. My goal has been to show you how to build alerts that are tailored to your specific environment and solve your own individual needs.

If you have comments about this post, submit them in the Comments section below. If you have questions about how to use Macie, or you’d like to request new fields and data sources, start a new thread on the Macie forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

AWS Security Blog

How to create custom alerts with Amazon Macie

Understand the types of data in Macie

Choose a security use case to instrument alerts

Explore CloudTrail data

Create custom alerts

Use S3 object data

Use S3 bucket data

Summary

Resources

Follow

Learn

Resources

Developers

Help