Consolidate and query AWS CloudTrail data across accounts and regions using AWS CloudTrail Lake
AWS CloudTrail tracks user and API activity across your AWS infrastructure. CloudTrail best practices recommend setting up separate trails for different use cases, such as operational troubleshooting, auditing, and security monitoring. Once a use case has been served, customers might permanently delete some of those trails but retain the Amazon S3 buckets that store their CloudTrail events, either to meet compliance or auditing requirements or to support future deep dives.
Customers use AWS CloudTrail Lake to give their teams a fully managed, central query mechanism for CloudTrail events across accounts and Regions. CloudTrail Lake starts recording events from the time you create an event data store, with the option of importing existing logs. Customers who choose not to import historical CloudTrail data at creation might later need it for investigations. They may also want to import only a small subset of past data rather than every event log in the S3 bucket.
In this blog post, I demonstrate how to set up a centralized CloudTrail Lake that consolidates historical CloudTrail event logs with new CloudTrail events. The walkthrough relies on CloudTrail Lake's ability to import data directly from S3 buckets for a chosen time range and add it to the data already residing in the Lake. Once you have created a consolidated event data store in CloudTrail Lake, you can use it to run queries on all your logs, including events brought over from your S3 buckets.
Prerequisites
- Access to the management account of your organization.
- An existing Amazon S3 bucket that contains CloudTrail logs delivered by an AWS CloudTrail trail.
Step 1: Create a multi-account, multi-Region event data store
a. Navigate to the CloudTrail console and choose Lake in the left navigation pane. On the Lake page, open the Event data stores tab and choose Create event data store. On the Configure event data store page, under General details, enter a name for the event data store (for example, import-existing-logs-lake). Keep the defaults for the remaining settings and select Next.
b. On the Choose events page, under CloudTrail events, check Enable for all accounts in my organization. Keep the remaining options at their defaults and select Next. On the Review and create page, select Create event data store.
c. The event data store status shows as Creation in progress and soon changes to Enabled. Choose the event data store you just created and note its Event data store ARN.
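The console steps above can also be sketched programmatically. The following is a minimal sketch, not part of the original walkthrough, that assumes you pass in a boto3 CloudTrail client created with management-account credentials. The function names `create_org_event_data_store` and `event_data_store_id` are hypothetical helpers introduced only for illustration.

```python
def create_org_event_data_store(cloudtrail, name="import-existing-logs-lake"):
    """Create a multi-Region, organization-wide event data store; return its ARN.

    Assumption: `cloudtrail` is a boto3 CloudTrail client, for example
    boto3.client("cloudtrail"), using management-account credentials.
    """
    response = cloudtrail.create_event_data_store(
        Name=name,
        MultiRegionEnabled=True,   # collect events from all Regions
        OrganizationEnabled=True,  # collect events from all accounts in the organization
    )
    return response["EventDataStoreArn"]


def event_data_store_id(arn):
    """Return the ID portion of an event data store ARN (the text after the last '/')."""
    return arn.rsplit("/", 1)[-1]
```

The ID returned by `event_data_store_id` is the value that queries reference in their FROM clause, so it is worth keeping alongside the ARN.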
e. To query your event data store, navigate to the CloudTrail console, choose Lake in the left navigation pane, and then select Query. In the right-hand panel, choose the Editor tab. The query below lists all the S3 buckets that I created. Be sure to replace ENTEREVENTDATASTOREID with your actual event data store ID.
Note that the results contain only S3 buckets created after the event data store was created.
Let's import data for the last 120 days from an existing S3 bucket.
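For readers who want to run a comparable query outside the console, here is a hedged sketch. The SQL is an illustrative query for `CreateBucket` events and is not necessarily the exact query used in this post. The helper names `build_created_buckets_query` and `run_query` are hypothetical, and `cloudtrail` is assumed to be a boto3 CloudTrail client.

```python
import time


def build_created_buckets_query(event_data_store_id):
    """Build an illustrative CloudTrail Lake query listing S3 CreateBucket events."""
    return (
        "SELECT eventTime, awsRegion, "
        "element_at(requestParameters, 'bucketName') AS bucketName "
        f"FROM {event_data_store_id} "
        "WHERE eventSource = 's3.amazonaws.com' "
        "AND eventName = 'CreateBucket' "
        "ORDER BY eventTime DESC"
    )


def run_query(cloudtrail, statement, poll_seconds=2):
    """Start a query, poll until it reaches a terminal state, return the first page of rows."""
    query_id = cloudtrail.start_query(QueryStatement=statement)["QueryId"]
    while True:
        status = cloudtrail.describe_query(QueryId=query_id)["QueryStatus"]
        if status in ("FINISHED", "FAILED", "CANCELLED", "TIMED_OUT"):
            break
        time.sleep(poll_seconds)
    if status != "FINISHED":
        raise RuntimeError(f"Query {query_id} ended with status {status}")
    # Only the first page is returned here; follow NextToken for larger result sets.
    return cloudtrail.get_query_results(QueryId=query_id)["QueryResultRows"]
```

Building the statement separately from running it keeps the SQL easy to inspect or paste into the console editor before any API call is made.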
Step 2: Import existing CloudTrail logs from S3 into CloudTrail Lake
Before proceeding with this section, refer to the Considerations section in the Working with CloudTrail Lake documentation. Also note that with AWS CloudTrail Lake you pay for ingestion and storage together, while queries are billed as you run them. You can find more details on the AWS CloudTrail pricing page by navigating to Paid Tier in the left-hand panel and then choosing the Lake tab.
CloudTrail Lake needs the right permissions to copy existing trail events from the S3 bucket to the destination event data store. To grant them, follow the steps below:
a. Set up an IAM role with the trust policy below. Replace the values for aws:SourceArn and aws:SourceAccount with your event data store ARN and your account ID.
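As an illustration of what such a trust policy can look like, the sketch below builds one as a Python dictionary and prints it as JSON. The policy shape follows my understanding of the AWS documentation on copying trail events and should be checked against the current documentation before use. The ARN and account ID are placeholders, and `build_trust_policy` is a hypothetical helper name.

```python
import json


def build_trust_policy(event_data_store_arn, account_id):
    """Build a trust policy that lets CloudTrail assume the role for one event data store."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "cloudtrail.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single event data store and account.
                        "aws:SourceArn": event_data_store_arn,
                        "aws:SourceAccount": account_id,
                    }
                },
            }
        ],
    }


# Placeholder values; replace with your own ARN and account ID.
policy = build_trust_policy(
    "arn:aws:cloudtrail:us-east-1:111122223333:eventdatastore/EXAMPLE-ID",
    "111122223333",
)
print(json.dumps(policy, indent=2))
```

The two condition keys scope the role so that CloudTrail can assume it only on behalf of the named event data store in the named account.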
c. Navigate to the CloudTrail console and choose Lake in the left-hand panel. Select the Event data stores tab and select the event data store that you created earlier. In the top right-hand corner, open the Actions drop-down and select Copy trail events.
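The same copy can be started through the API. The sketch below is a minimal example, assuming a boto3 CloudTrail client and the IAM role from the earlier step, that computes a 120-day window and passes it to `start_import`. The helper names, the `now` parameter (included so the window can be tested deterministically), and all argument values are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_import_request(event_data_store_arn, s3_uri, bucket_region, role_arn,
                         days=120, now=None):
    """Build start_import arguments covering the last `days` days of trail events."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return {
        "Destinations": [event_data_store_arn],
        "ImportSource": {
            "S3": {
                "S3LocationUri": s3_uri,           # S3 URI of the existing trail logs
                "S3BucketRegion": bucket_region,   # Region of the source bucket
                "S3BucketAccessRoleArn": role_arn, # role CloudTrail assumes to read the bucket
            }
        },
        "StartEventTime": start,
        "EndEventTime": end,
    }


def start_trail_import(cloudtrail, request):
    """Start the import and return its ID. `cloudtrail` is assumed to be a boto3 client."""
    return cloudtrail.start_import(**request)["ImportId"]
```

Narrowing `StartEventTime` and `EndEventTime` is what lets you bring in only a subset of past data rather than everything in the bucket.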