AWS Cloud Operations & Migrations Blog

Gain Insights with Natural Language Query into your AWS environment using AWS CloudTrail and Amazon Q in QuickSight

AWS CloudTrail tracks user and API activities across your AWS environments for governance and auditing purposes. Large enterprises typically use multiple AWS accounts, and many of those accounts might need access to a data lake managed by a single AWS account. By using the AWS Lake Formation integration with CloudTrail Lake, you can securely aggregate data across multiple AWS accounts and centralize it within a single account using an event data store for auditing purposes. More information about cross-account data sharing in Lake Formation is available in this public documentation.

Amazon QuickSight is a unified Business Intelligence service providing modern interactive dashboards, natural language querying, paginated reports, machine learning (ML) insights, and embedded analytics at scale. Powered by ML, Amazon Q uses natural language processing (NLP) to answer your business questions quickly. Q uses the same QuickSight datasets you use for your dashboards and reports, so your data is governed and secured. Just as data is prepared visually using dashboards and reports, it can be readied for language-based interactions using a topic. Topics are collections of one or more datasets that represent a subject area that your business users can ask questions about. To learn how to create a topic, refer to Creating Amazon QuickSight Q topics.

In this blog post, we explain how you can enable users in your organization to ask and answer common questions about CloudTrail logs in everyday language by using the Amazon QuickSight natural language query function, Amazon Q in QuickSight. Customers typically query AWS CloudTrail logs for various purposes, including security, operational analysis, compliance, and auditing. Using QuickSight dashboards to visualize data and extract frequently used information makes it readily available to users while eliminating the need for direct access to CloudTrail logs or CloudWatch to query this data for insights. Here are a few common scenarios where you can gain insights using CloudTrail logs and Amazon Q in QuickSight:

• Security Incident Investigation: If there is a suspected security incident, such as unauthorized access attempts, data breaches, or suspicious activities, customers can use CloudTrail logs to investigate the events leading up to the incident, identify the source, and take appropriate remedial actions.
• Compliance and Auditing: Many organizations operate in regulated industries and are required to maintain audit trails for compliance purposes. CloudTrail logs can provide a detailed record of API calls and resource changes, enabling customers to demonstrate compliance with industry standards and regulatory requirements.
• User Activity Monitoring: By analyzing CloudTrail logs, customers can monitor user activities within their AWS environment, including actions performed by specific users or roles. This can help identify potential misuse or unauthorized access attempts.
• Resource Monitoring and Tracking: CloudTrail logs can be used to monitor and track changes made to AWS resources, such as launching or terminating instances, creating or modifying security groups, or modifying IAM policies. This information can be valuable for operational analysis, capacity planning, and troubleshooting.
• Cost Investigation and Optimization: CloudTrail logs provide visibility into AWS resource usage, enabling organizations to identify underutilized or idle resources in combination with other AWS services such as AWS Config.

In this blog post, we walk you through the following examples for a security and compliance investigation:

1. How many events have suspicious source IP address grouped by event source?
2. How many event types are using TLS v 1.2?

Solution Overview:

For this solution, you will:
1. Create an Athena table from a CloudTrail Lake event data store and convert it into a view with the relevant CloudTrail log data.
2. Enable Amazon Q in QuickSight on the CloudTrail Lake dataset.
3. Configure Amazon Q Topics to answer questions.
4. Use Amazon Q Topics to investigate some common CloudTrail log use cases through questions and answers.

Prerequisites:

• An AWS account with permissions to enable CloudTrail
• For visualization, you should have Amazon QuickSight set up. If you do not, follow the steps in this documentation to set up Amazon QuickSight.

Figure 1: Solution architecture diagram

Step 1: Create an event data store with lake query federation and Athena view

In this section, you will create an event data store with CloudTrail Lake query federation option enabled to collect and store management events. This event data store will record the API calls.
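For readers who prefer scripting, the console steps that follow can be approximated with the AWS SDK for Python. This sketch is not part of the original walkthrough. It assumes an object that behaves like `boto3.client("cloudtrail")` and an IAM role for federation that already exists (the console option "Create and use a new role" has no direct equivalent in this sketch). The helper name, the role ARN, and the retention value are hypothetical.

```python
from typing import Any


def create_federated_event_data_store(
    cloudtrail: Any,
    name: str,
    federation_role_arn: str,
    retention_days: int,
) -> str:
    """Create an event data store for management events, then enable
    Lake query federation on it.

    `cloudtrail` is expected to behave like boto3.client("cloudtrail").
    Returns the ARN of the new event data store.
    """
    created = cloudtrail.create_event_data_store(
        Name=name,
        RetentionPeriod=retention_days,
        # Explicitly select management events, matching the walkthrough
        AdvancedEventSelectors=[
            {
                "Name": "Management events",
                "FieldSelectors": [
                    {"Field": "eventCategory", "Equals": ["Management"]}
                ],
            }
        ],
    )
    store_arn = created["EventDataStoreArn"]

    # Enabling federation is what leads CloudTrail to create the managed
    # database and table in the AWS Glue Data Catalog.
    cloudtrail.enable_federation(
        EventDataStore=store_arn,
        FederationRoleArn=federation_role_arn,
    )
    return store_arn
```

With boto3 installed and credentials configured, a call might look like `create_federated_event_data_store(boto3.client("cloudtrail"), "demo-cloudtrail-lake", role_arn, 366)`, where `role_arn` and `366` are placeholders to replace with your own values.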

  1. Navigate to the CloudTrail console. From the navigation pane, under Lake, choose Event data stores.
  2. From the right panel, choose Create event data store.
  3. On the Configure event data store page, in General details, enter a name such as “demo-cloudtrail-lake”.
  4. Specify the Pricing option as per your requirement and choose Retention period.
  5. Under Lake query federation, choose Enabled. Under Choose IAM role, select Create and use a new role.
  6. Keep the rest as default. Select Next.
  7. On the Choose Events page, keep defaults. Select Next. On the Review and create page, choose Create event data store.
  With CloudTrail Lake query federation enabled, AWS CloudTrail automatically creates a managed database and table in the AWS Glue Data Catalog.

    Figure 2: Console showing how to enable lake query federation and choose IAM role

  8. Select the Glue resources and navigate to the Glue tables. You can also view the table in Athena. By default, the table is created under the “aws:cloudtrail” database.

    Figure 3: To view glue resources

  9. Run the following query against the “default” database to create a view from the above table that extracts the relevant fields to be used in the QuickSight dataset.
    -- Expose the CloudTrail fields used by the QuickSight dataset under friendly column names
    CREATE VIEW vw_quicksight_Ctlogs
    AS
    SELECT eventid AS "Event ID",
    eventtime AS "Event Time",
    eventname AS "Event Name",
    eventtype AS "Event Type",
    awsregion AS "AWS Region",
    sessioncredentialfromconsole AS "Session Credentials Console",
    CAST(useridentity.accountid AS varchar) AS "Account ID",
    -- resources is an array; [1] selects its first element (Athena arrays are 1-indexed)
    CAST(resources[1].arn AS varchar) AS "Resource ID",
    CAST(resources[1].accountid AS varchar) AS "Resource Account ID",
    eventsource AS "Event Source",
    errorcode AS "Error Code",
    errormessage AS "Error Message",
    CAST(useridentity.principalid AS varchar) AS "Principal Id",
    sourceipaddress AS "Source IP Address",
    recipientaccountid AS "Recipient Account Id",
    vpcendpointid AS "VPC Endpoint",
    CAST(tlsdetails.tlsversion AS varchar) AS "TLS version",
    CAST(tlsdetails.ciphersuite AS varchar) AS "Cipher Suite"
    FROM "aws:cloudtrail"."xxxxxxxxxxxxxxxxxx"

    Note: Replace the table name before you run the query.
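If you script this step, the placeholder substitution is easy to get wrong by hand. The following is a minimal sketch, assuming the view definition is held as a string and the table name is read from the Glue table list; the function name and the example table name are hypothetical, and the SQL shown is an abbreviated stand-in for the full statement.

```python
import re

# Placeholder used for the table name in the view definition
PLACEHOLDER = "xxxxxxxxxxxxxxxxxx"


def fill_table_name(view_sql: str, table_name: str) -> str:
    """Return view_sql with the placeholder replaced by table_name."""
    # Accept only characters that are safe inside a double-quoted identifier
    if not re.fullmatch(r"[A-Za-z0-9_-]+", table_name):
        raise ValueError(f"unexpected characters in table name: {table_name!r}")
    # Refuse to guess if the placeholder is missing or appears more than once
    if view_sql.count(PLACEHOLDER) != 1:
        raise ValueError("expected exactly one placeholder in the SQL text")
    return view_sql.replace(PLACEHOLDER, table_name)


# Abbreviated stand-in for the full CREATE VIEW statement
example_sql = 'SELECT eventid FROM "aws:cloudtrail"."xxxxxxxxxxxxxxxxxx"'
print(fill_table_name(example_sql, "my-table-id"))
```

The returned string can then be pasted into the Athena query editor or submitted through whichever client you already use.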

Step 2: Import Athena View to QuickSight

To begin visualizing CloudTrail logs in QuickSight, create a dataset using the Athena view created by the query in the previous section.

Figure 4: Page showing how to import Athena view to QuickSight

Note: Wait until the import is complete. The time required depends on the volume of data being imported.

Step 3: Create Amazon Q Topic for CloudTrail Logs

To start using Amazon Q in QuickSight, make sure that Q is enabled for QuickSight. For detailed instructions, refer to Getting started with Amazon QuickSight Q.

For users to access the Amazon Q Generative AI feature, they must be upgraded to Reader Pro, Author Pro or Admin Pro roles.

Figure 5: Page showing information on how to enable Generative AI features upgrade user roles

  1. Create an Amazon Q topic in QuickSight. In this case, we create a topic named “CloudTrail Logs Q”.

    Figure 6: Page showing how to create a new topic and select generative Q&A experience

  2. Select the “Use new generative Q&A experience” checkbox, and then select the dataset in the next step.

    Figure 7: Page showing how to select a dataset for topic

Additionally, as an author, when you create a new topic you can:

  • Add friendly names, synonyms, and descriptions to datasets and columns to improve Amazon Q’s answers.
  • Share the topic with your users so they can ask questions about it.
  • See the questions your users are asking and how Amazon Q answered them, and improve the answers.

Figure 8: Page showing an example Q topic

Amazon Q identifies relevant synonyms for each field. You can also define your own synonyms for the fields to give Amazon Q more context.

You can also add calculated fields to enable Q to answer some common questions.

You can create named entities to group related fields that Amazon Q considers together when providing answers. For more details, refer to “Creating Named Entities in QuickSight Q.” In this case, we create the following named entities:

  • Event: Event ID, Event Name, Event Type, Event Time and Account ID
  • Resource: Resource ID, Resource Account ID
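The named entities above can also be written down as data, which makes them easier to review and to check against the view's columns. The sketch below is illustrative only. The output shape (`EntityName` with a `Definition` list of `FieldName` entries) is modeled on our understanding of the named-entity structure in the QuickSight topic APIs and should be treated as an assumption to confirm against the API reference; the function names are hypothetical.

```python
from typing import Dict, List

# Friendly column names exposed by the vw_quicksight_Ctlogs view
VIEW_COLUMNS = [
    "Event ID", "Event Time", "Event Name", "Event Type", "AWS Region",
    "Session Credentials Console", "Account ID", "Resource ID",
    "Resource Account ID", "Event Source", "Error Code", "Error Message",
    "Principal Id", "Source IP Address", "Recipient Account Id",
    "VPC Endpoint", "TLS version", "Cipher Suite",
]

# Named entities from this post, keyed by entity name
NAMED_ENTITIES: Dict[str, List[str]] = {
    "Event": ["Event ID", "Event Name", "Event Type", "Event Time", "Account ID"],
    "Resource": ["Resource ID", "Resource Account ID"],
}


def to_topic_named_entities(entities: Dict[str, List[str]]) -> List[dict]:
    """Convert {entity: [fields]} into a list of named-entity dictionaries."""
    return [
        {
            "EntityName": entity,
            "Definition": [{"FieldName": field} for field in fields],
        }
        for entity, fields in entities.items()
    ]


def unknown_fields(entities: Dict[str, List[str]], columns: List[str]) -> List[str]:
    """Return fields referenced by an entity that are not dataset columns."""
    known = set(columns)
    return [f for fields in entities.values() for f in fields if f not in known]
```

Running `unknown_fields(NAMED_ENTITIES, VIEW_COLUMNS)` before configuring the topic is a cheap way to catch a renamed or misspelled column.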

Now, we are ready to move forward and use the topic we created to start asking questions about the data.

Step 4: Use Amazon Q Topic in QuickSight for Q&A

    To add the CloudTrail Logs Q topic to your dashboard:

  1. Navigate to Analysis to visualize the CT-Logs-Dataset.
  2. Select Build Visual in the top blue bar.
  3. Link the topic to the analysis.

    Figure 9: Page showing how to link a topic to analysis

    Once the topic is linked to your analysis, you can use Amazon Q in QuickSight to add visuals to your dashboard by asking questions about relevant data points. We built the following dashboard by using the guidance here. Publish the dashboard so your users can start visualizing the data and asking more specific questions. Select Ask in the top right corner of the screen to start using Amazon Q in QuickSight to ask specific questions about your dataset.

    Figure 10: Page showing a sample published dashboard

    Scenario 1: How many events have suspicious source IP address grouped by event source?

    In this case, the IP address we are investigating is “52.94.133.132” as this is not a known IP address. So, we frame our question as:

    How many events where source IP address contains “52.94.133.132” by event source?

    Let’s understand the following output:
    1. We used the Q bar to ask the question about a source IP address that could belong to a bad actor.
    2. The topic owner or users can mark this question as verified to indicate a consistent output based on the search value. The output also shows how the question was interpreted. Users can use the pencil icon to edit any fields or change how the question is interpreted.
    3. Q provides a quick summary of the data in the existing dataset.
    4. The bar chart shows the number of events per event source for the source IP address.
    5. Because “Event” is a defined named entity, the associated event details are also shown as a secondary visual when the question is asked.

    Figure 11: Page showing Insights from Amazon Q for scenario 1
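    To make the interpretation concrete, the question amounts to a filter on “Source IP Address” followed by a count grouped by “Event Source”. The sketch below reproduces that logic over a handful of made-up rows; it is not output from Amazon Q, and the sample values and function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def events_by_source(
    rows: Iterable[Mapping[str, Optional[str]]], ip_fragment: str
) -> Counter:
    """Count events per event source where the source IP contains ip_fragment."""
    counts: Counter = Counter()
    for row in rows:
        # Some events carry no source IP, so treat a missing value as empty
        source_ip = row.get("Source IP Address") or ""
        if ip_fragment in source_ip:
            counts[row["Event Source"]] += 1
    return counts


# Made-up rows using the view's friendly column names
sample_rows = [
    {"Event Source": "s3.amazonaws.com", "Source IP Address": "52.94.133.132"},
    {"Event Source": "s3.amazonaws.com", "Source IP Address": "52.94.133.132"},
    {"Event Source": "iam.amazonaws.com", "Source IP Address": "52.94.133.132"},
    {"Event Source": "ec2.amazonaws.com", "Source IP Address": "198.51.100.7"},
    {"Event Source": "sts.amazonaws.com", "Source IP Address": None},
]

for source, count in events_by_source(sample_rows, "52.94.133.132").most_common():
    print(source, count)
```

    Comparing a hand-computed result like this with the bar chart is a reasonable way to decide whether a question deserves to be marked as verified.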

    Scenario 2: How many event types are using TLS v 1.2?

    As with the previous question, the output for this question shows relevant visuals based on the data, such as:
    • A summary of the dataset.
    • The unique number of events.
    • The number of events and trends by day.
    • The list of events considered to derive the answer.

    Figure 12: Insights from Amazon Q for scenario 2
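    The figures in this answer can be cross-checked by hand. The sketch below computes the distinct event types, the total number of matching events, and the per-day counts for a given TLS version. It assumes “Event Time” is available as an ISO 8601 string and that the TLS version is recorded as `TLSv1.2`; the rows are made up and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Optional


def tls_summary(
    rows: Iterable[Mapping[str, Optional[str]]], tls_version: str = "TLSv1.2"
) -> Dict[str, object]:
    """Summarize the events that were made with the given TLS version."""
    matching = [r for r in rows if r.get("TLS version") == tls_version]
    per_day: Dict[str, int] = defaultdict(int)
    for row in matching:
        # The first 10 characters of an ISO 8601 timestamp are the date
        per_day[row["Event Time"][:10]] += 1
    return {
        "event_types": sorted({r["Event Type"] for r in matching}),
        "events": len(matching),
        "events_per_day": dict(per_day),
    }


# Made-up rows using the view's friendly column names
sample_rows_tls = [
    {"Event Type": "AwsApiCall", "Event Time": "2024-05-01T09:15:00Z", "TLS version": "TLSv1.2"},
    {"Event Type": "AwsApiCall", "Event Time": "2024-05-01T11:40:00Z", "TLS version": "TLSv1.3"},
    {"Event Type": "AwsConsoleSignIn", "Event Time": "2024-05-02T08:05:00Z", "TLS version": "TLSv1.2"},
    {"Event Type": "AwsApiCall", "Event Time": "2024-05-02T16:30:00Z", "TLS version": "TLSv1.2"},
    {"Event Type": "AwsServiceEvent", "Event Time": "2024-05-02T17:00:00Z", "TLS version": None},
]

print(tls_summary(sample_rows_tls))
```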

    Clean Up

    It’s a best practice to clean up any resources that you do not plan to continue using. This avoids unexpected charges.

    CloudTrail Lake event data stores are charged based on the amount of data ingested. For more details, review how CloudTrail Lake pricing works.

    • Stop ingesting CloudTrail events to avoid unexpected cost.
    • Delete the event data store.
    • Unlink the topic from your analysis, so you are not charged for the capacity required to answer user questions.
    • Unsubscribe from Amazon QuickSight Q.

    Please refer to Amazon Q in QuickSight pricing for more information.
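    The CloudTrail part of the cleanup can be scripted as well. This is a minimal sketch, assuming an object that behaves like `boto3.client("cloudtrail")` and an event data store that is still ingesting, federated, and protected against termination; the function name is hypothetical, and the calls may raise if the store is already in a different state.

```python
from typing import Any


def remove_event_data_store(cloudtrail: Any, store_arn: str) -> None:
    """Stop ingestion, then delete a federated event data store.

    `cloudtrail` is expected to behave like boto3.client("cloudtrail").
    """
    # Stop ingesting new events first so that no further data is ingested
    cloudtrail.stop_event_data_store_ingestion(EventDataStore=store_arn)

    # An event data store cannot be deleted while federation is enabled
    cloudtrail.disable_federation(EventDataStore=store_arn)

    # ...or while termination protection is turned on
    cloudtrail.update_event_data_store(
        EventDataStore=store_arn,
        TerminationProtectionEnabled=False,
    )

    cloudtrail.delete_event_data_store(EventDataStore=store_arn)
```

    The QuickSight steps (unlinking the topic and unsubscribing from Q) are left to the console in this sketch.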

    Conclusion

    In this post, we walked you through the steps to enable users across your organization to query CloudTrail logs using natural language through Amazon Q in QuickSight. We demonstrated how to enable query federation in CloudTrail Lake so that CloudTrail logs can be queried from Athena, and how to build an Amazon Q topic that lets users ask questions through QuickSight dashboards.

    These are just a few examples of how AWS services integrate together to help you derive insights from your data faster. The specific scenarios and use cases may vary depending on the organization’s requirements, industry, and AWS environment complexity. You may want to review Best practices to enable Natural Language Query for users for a more effective outcome.

    About the authors:

    Snehal Nahar

    Snehal Nahar is a Principal Technical Account Manager (Security Specialist) at AWS. She is passionate about building innovative solutions using AWS services to help customers achieve their business objectives. She enjoys spending time with family and friends, playing board games and watching TV.

    Subha Kalia

    Subha is an Enterprise Support Lead (TAM) at AWS in North Carolina. She has over 17 years of experience in technology across various roles. She is passionate about problem solving on behalf of our customers to reduce operational challenges and friction. Her focus area is AI/ML and Healthcare Life Sciences. Outside work she enjoys traveling with her family, learning about different cultures and trying different cuisines.