AWS Cloud Operations & Migrations Blog

Announcing AWS CloudTrail Lake integration with AWS Config

Organizations managing cloud infrastructure in AWS need effective mechanisms to audit operations in their AWS accounts for security and compliance. Earlier this year, we announced the availability of AWS CloudTrail Lake, a managed data lake that lets organizations aggregate, immutably store, and query events recorded by CloudTrail for auditing, security investigation, and operational troubleshooting. CloudTrail Lake can also aggregate events from an organization in AWS Organizations, including events from multiple regions and accounts, in a CloudTrail Lake event data store.

To gather complete context during an investigation, security and compliance teams need to answer four questions: “Who” made the change, “When” the change took place, “Which” resource was changed, and finally “What” changed on the resource during the event.

AWS CloudTrail Lake can help you identify the “Who”, “When”, and “Which” when a security or compliance event has occurred. However, one challenge was identifying “What” has changed on the affected resource during an event. To gather this information, customers can use AWS Config, which allows you to assess, audit, and evaluate the configurations of your AWS resources. Until now, when identifying the root cause of a security incident or conducting a compliance investigation, it has been challenging to gather the “Who”, “When”, and “Which resource” together with the “What has changed on the resource” in one central location.

Today, we are excited to announce a new CloudTrail Lake feature that allows storage of configuration items from AWS Config. This includes configuration details as well as compliance history for resources that AWS Config supports. With this feature, you can run queries against an event data store that is configured to store configuration items from AWS Config. You can also now create SQL-based queries that join events between two event data stores. For example, you can create a query that shows “What” changes were made to an Amazon S3 bucket using your event data store for AWS Config, and then join it with the event data store for CloudTrail to provide the “Who” and “When”. Please note that, depending on the scenario, you may not be able to find a one-to-one match between configuration items from AWS Config and CloudTrail events. In that case, you will still be able to find audit data in CloudTrail for any operations performed. For more information on this type of scenario, please see the AWS CloudTrail FAQs page.

In this blog post, we will show you how to get started using CloudTrail Lake to store configuration items from AWS Config and how to run a few sample queries.

Prerequisites

To use CloudTrail Lake to store configuration items from AWS Config, the AWS Config recorder must be set up in every account and region from which you want AWS CloudTrail to deliver configuration items. You can use Quick Setup, a capability of AWS Systems Manager, to help set up the AWS Config recorder.
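As a quick check of this prerequisite, the sketch below reports whether an AWS Config recorder is currently recording in one region. It assumes you pass in a boto3 AWS Config client (for example `boto3.client("config", region_name="us-east-1")`) and relies on that client's `describe_configuration_recorder_status` call. The helper name `recorder_is_recording` is hypothetical, and you would need to repeat the check for each account and region you care about.

```python
from typing import Any


def recorder_is_recording(config_client: Any) -> bool:
    """Return True if at least one AWS Config recorder in the client's region is recording.

    `config_client` is assumed to be a boto3 AWS Config client, e.g.
    boto3.client("config", region_name="us-east-1").
    """
    response = config_client.describe_configuration_recorder_status()
    statuses = response.get("ConfigurationRecordersStatus", [])
    # Each status entry carries a boolean "recording" flag.
    return any(status.get("recording", False) for status in statuses)
```

A result of `False` suggests the recorder still needs to be set up or started in that region before CloudTrail Lake can receive configuration items from it.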

Creating an Event Data Store for configuration items from AWS Config

To get started with CloudTrail Lake to store configuration items from AWS Config, we first have to create an event data store. The event data store is an immutable storage location within CloudTrail that can include either CloudTrail events or configuration items from AWS Config.

  1. Navigate to the CloudTrail console.
  2. In the left-hand navigation menu, choose Lake.
  3. Choose the Event Data Store tab.
  4. Choose Create event data store.
  5. Enter a name for the event data store. For the purpose of this blog, we are using “aws-config-eds” as the name of the event data store.
  6. Within the Configure event data store screen, you can configure the following settings:
    1. Retention period: You can specify a retention period for the event data store. The default is set to store them for seven years.
    2. Encryption: By default, your data is encrypted with a KMS key that AWS owns and manages for you. This is an optional setting that will allow you to use your own Customer Managed KMS Keys (CMK) to encrypt the activity logs stored in CloudTrail Lake.
    3. Tags: This is an optional setting where you can create tags for your event data store. Tags allow you to identify, sort, and control access to your event data store, for example by using IAM policies that prevent the creation and deletion of an event data store based on tags.
  7. For the purpose of this post, leave these fields with the default settings and choose Next.
A “configure event data store” screen with options to enter the name, retention period and optional Tags.

Figure 1: AWS CloudTrail configure event data

  8. On the Choose events screen, you can choose the type of events you want to capture within your event data store.
  9. Choose Configuration items under the section Specify the type of AWS events.
  10. You will need to have AWS Config recording turned on for the current region and any other regions you want CloudTrail Lake to deliver configuration items from.
  11. By default, configuration items from AWS Config are stored at the account level.
      1. (Optional) Under the Account and region settings section, you can select “Include only the current region” to store configuration items from AWS Config for the current region only.
      2. (Optional) Under the Account and region settings section, you can select “Enable for all accounts in my organization” from the management account or from a delegated administrator account to store configuration items from AWS Config for your entire organization.
  12. Choose Next.
A “specify event types” screen allowing the user to select between CloudTrail events or configuration items from AWS Config, with the Configuration items radio button selected.

Figure 2. AWS CloudTrail specify type of AWS events for event data store

  13. On the review page, verify that your event data store configuration is correct and then choose Create event data store.
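The console steps above can also be sketched programmatically. The example below is a minimal sketch, not a definitive implementation: it assumes a boto3 CloudTrail client is passed in and uses its `create_event_data_store` call with an advanced event selector on `eventCategory` equal to `ConfigurationItem`. The helper names `config_eds_request` and `create_config_eds` are hypothetical, and the 2555-day retention value is an assumption chosen to match the seven-year default described above.

```python
from typing import Any, Dict


def config_eds_request(name: str = "aws-config-eds",
                       retention_days: int = 2555,
                       multi_region: bool = True,
                       organization: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for the CloudTrail create_event_data_store call."""
    return {
        "Name": name,
        # Select AWS Config configuration items rather than CloudTrail events.
        "AdvancedEventSelectors": [
            {
                "Name": "AWS Config configuration items",
                "FieldSelectors": [
                    {"Field": "eventCategory", "Equals": ["ConfigurationItem"]}
                ],
            }
        ],
        # Assumption: 2555 days (7 x 365) stands in for the seven-year default.
        "RetentionPeriod": retention_days,
        "MultiRegionEnabled": multi_region,
        "OrganizationEnabled": organization,
    }


def create_config_eds(cloudtrail_client: Any, **overrides: Any) -> str:
    """Create the event data store and return its ARN.

    `cloudtrail_client` is assumed to be boto3.client("cloudtrail").
    """
    response = cloudtrail_client.create_event_data_store(**config_eds_request(**overrides))
    return response["EventDataStoreArn"]
```

Passing `organization=True` corresponds to the “Enable for all accounts in my organization” option and, as in the console, would need to be run from the management account or a delegated administrator account.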

Sample queries

CloudTrail Lake provides various sample queries to help you get started querying the AWS Config configuration items captured by CloudTrail Lake. To use one of these sample queries, use the following steps:

  1. Navigate to Sample queries.
  2. For this example, choose AWS Config resource creation time from the list of sample queries. This query finds the resource creation time for all AWS Config configuration items.
  3. The following sample query is automatically populated into the Query editor (you must replace $CONFIG_EDS_ID with the ID of the event data store you created to store configuration items from AWS Config):
SELECT
    eventData.configuration, eventData.accountId, eventData.awsRegion, eventData.resourceId,
    eventData.resourceName, eventData.resourceType, eventData.availabilityZone, eventData.resourceCreationTime
FROM
    $CONFIG_EDS_ID
WHERE
    eventTime > '2022-11-21 00:00:00' AND eventTime < '2022-11-22 00:00:00'
ORDER BY
    eventData.resourceCreationTime DESC
LIMIT 10;
  4. Select Run to display the results of the query.
A screenshot displaying sample query results that list 10 configuration items along with their resource creation times.

Figure 3.  AWS CloudTrail sample query screen
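If you generate this sample query from a script rather than editing it by hand, the substitution of $CONFIG_EDS_ID and the time window can be sketched as follows. This is an illustrative helper only: the name `resource_creation_query` is hypothetical, and the UUID-shaped check on the event data store ID is an assumption about the ID format rather than a documented rule.

```python
import re
from datetime import datetime

# Assumption: event data store IDs are UUID-shaped (the last segment of the ARN).
_EDS_ID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")
_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def resource_creation_query(eds_id: str, start: datetime, end: datetime, limit: int = 10) -> str:
    """Fill in the sample query for one event data store and one time window."""
    if not _EDS_ID.match(eds_id):
        raise ValueError(f"not a valid event data store ID: {eds_id!r}")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return (
        "SELECT\n"
        "    eventData.configuration, eventData.accountId, eventData.awsRegion, eventData.resourceId,\n"
        "    eventData.resourceName, eventData.resourceType, eventData.availabilityZone, eventData.resourceCreationTime\n"
        "FROM\n"
        f"    {eds_id}\n"
        "WHERE\n"
        f"    eventTime > '{start.strftime(_TIME_FORMAT)}' AND eventTime < '{end.strftime(_TIME_FORMAT)}'\n"
        "ORDER BY\n"
        "    eventData.resourceCreationTime DESC\n"
        f"LIMIT {int(limit)}"
    )
```

Validating the ID before interpolating it keeps an unreplaced placeholder such as `$CONFIG_EDS_ID` from reaching the query editor or the API.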

Creating a CloudTrail Lake query that joins two event data stores

CloudTrail Lake now also supports the SQL JOIN clause, which allows you to join data across multiple event data stores. This lets you create a single query that displays information from CloudTrail events together with information from AWS Config configuration items.

The following sample uses the JOIN syntax to display information related to a PutBucketEncryption CloudTrail event for an S3 bucket. It takes the “Who” and “When” from the event data store used to capture CloudTrail events, and “What” has changed about the resource from the event data store used to record configuration items from AWS Config. In other words, we will find the user who made the change (“Who”), within a specific time period (“When”), and the resulting changes to our AWS resources (“What”).

  1. Navigate to Query in the Lake console.
  2. Select + to create a new query.
  3. Paste the contents of the query below into the editor window (you must replace $CLOUDTRAIL_EDS_ID and $CONFIG_EDS_ID with the IDs of your event data stores containing CloudTrail event data and AWS Config configuration items):
SELECT
    config.eventData.resourceId,
    config.eventData.resourceType,
    config.eventData.configurationItemCaptureTime,
    cloudtrail.recipientAccountId,
    cloudtrail.awsRegion,
    cloudtrail.sourceIPAddress,
    cloudtrail.userAgent,
    cloudtrail.userIdentity.arn,
    cloudtrail.userIdentity.type,
    cloudtrail.userIdentity.sessionContext.sessionIssuer.userName,
    element_at(config.eventData.supplementaryConfiguration, 'ServerSideEncryptionConfiguration') AS ServerSideEncryptionConfiguration
FROM
    $CLOUDTRAIL_EDS_ID AS cloudtrail
JOIN
    $CONFIG_EDS_ID AS config
    ON config.eventData.resourceId = element_at(cloudtrail.requestParameters, 'bucketName')
WHERE
    cloudtrail.eventName = 'PutBucketEncryption'
    AND cloudtrail.eventTime >= '2022-11-17 00:00:00'
    AND cloudtrail.eventTime <= '2022-11-19 00:00:00'
    AND element_at(config.eventData.supplementaryConfiguration, 'ServerSideEncryptionConfiguration') IS NOT NULL
    AND config.eventData.resourceType = 'AWS::S3::Bucket'
ORDER BY
    config.eventData.configurationItemCaptureTime DESC;
  4. Next, replace the time range in the query with the time range you want to search. The date string specified after eventTime >= is the earliest event timestamp that will be included, while the date string specified after eventTime <= is the latest event timestamp that will be included. (Note: using >= and <= makes the range inclusive of the date/time provided. For a full list of the supported operators, please see the CloudTrail Lake documentation.)
  5. Select Run to display the results of the query.

The query will then display information related to the CloudTrail userIdentity fields, along with the ServerSideEncryptionConfiguration field from the AWS Config configuration item, combined using the JOIN syntax. For more information on SQL constraints, see the CloudTrail Lake SQL constraints documentation.
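The same query can also be run outside the console. The sketch below is one possible approach, assuming a boto3 CloudTrail client is passed in: it starts the query with `start_query`, polls `describe_query` until the query reaches a terminal status, and then pages through `get_query_results`. The helper name `run_lake_query` and the polling defaults are hypothetical choices for illustration.

```python
import time
from typing import Any, Dict, List


def run_lake_query(cloudtrail_client: Any, statement: str,
                   poll_seconds: float = 2.0, max_polls: int = 150) -> List[Dict[str, str]]:
    """Start a CloudTrail Lake query, wait for it to finish, and return all rows.

    `cloudtrail_client` is assumed to be boto3.client("cloudtrail").
    """
    query_id = cloudtrail_client.start_query(QueryStatement=statement)["QueryId"]

    # Poll until the query reaches a terminal state.
    for _ in range(max_polls):
        description = cloudtrail_client.describe_query(QueryId=query_id)
        status = description["QueryStatus"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "CANCELLED", "TIMED_OUT"):
            message = description.get("ErrorMessage", "no error message")
            raise RuntimeError(f"query {query_id} ended with status {status}: {message}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")

    # Page through the results; each row arrives as a list of one-key dicts,
    # which is flattened here into a single dict per row.
    rows: List[Dict[str, str]] = []
    kwargs: Dict[str, str] = {"QueryId": query_id}
    while True:
        page = cloudtrail_client.get_query_results(**kwargs)
        for row in page.get("QueryResultRows", []):
            rows.append({key: value for cell in row for key, value in cell.items()})
        token = page.get("NextToken")
        if not token:
            return rows
        kwargs["NextToken"] = token
```

Each returned dict would then carry the selected columns for one joined record, such as the resource ID alongside the userIdentity fields.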

Cleanup

If you no longer want to use the CloudTrail Lake event data store for configuration items from AWS Config, make sure to delete the event data store. To do this, follow these steps:

  1. Choose the Event data stores tab in the Lake console.
  2. Select the event data store from the list.
  3. From the Actions menu, select Change termination protection.
  4. In the Change termination protection pop-up, select “Disabled” and choose Save.
  5. From the Actions menu, select Delete. Confirm that you want to delete the event data store by entering its name, and then choose Delete. This places your event data store in the pending deletion state.
  6. After seven days, the event data store is deleted permanently.
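The cleanup steps above can be sketched in code as well. This minimal example assumes a boto3 CloudTrail client and the ARN of the event data store; it uses `update_event_data_store` to turn off termination protection and `delete_event_data_store` to schedule deletion. The helper name `remove_event_data_store` is hypothetical.

```python
from typing import Any


def remove_event_data_store(cloudtrail_client: Any, eds_arn: str) -> None:
    """Disable termination protection, then schedule the event data store for deletion.

    `cloudtrail_client` is assumed to be boto3.client("cloudtrail").
    """
    # Deletion is rejected while termination protection is still enabled.
    cloudtrail_client.update_event_data_store(
        EventDataStore=eds_arn,
        TerminationProtectionEnabled=False,
    )
    # This moves the event data store into the pending deletion state.
    cloudtrail_client.delete_event_data_store(EventDataStore=eds_arn)
```

As with the console flow, the event data store is not removed immediately; it stays in the pending deletion state for the seven-day period described above.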

Conclusion

In this blog post, we demonstrated how to start using the new CloudTrail Lake feature that stores configuration items from AWS Config, and how to use the JOIN syntax within a query to join multiple event data stores. Together, these provide the “Who”, “When”, and “What” information you need for your security or compliance investigations. Please refer to the CloudTrail Lake user guide to explore the new features.

About the authors:

Isaiah Salinas

Isaiah Salinas is a Senior Specialist Solutions Architect with the Cloud Operations team. With over 10 years of experience working with AWS technology, Isaiah works with customers to design, implement, and support complex cloud infrastructures. He also enjoys talking with others about how to use AWS services to provide solutions to their problems.

Ania Develter

Ania Develter is a Senior Specialist Solutions Architect in the Cloud Operations team. Ania works with customers from all industries and helps them with their observability, compliance and centralized operations management challenges. She loves talking about Observability, CloudOps and DevOps.