Automate Amazon Athena queries for PCI DSS log review using AWS Lambda
In this post, I will show you how to use AWS Lambda to automate PCI DSS (v3.2.1) evidence generation and daily log review to assist with your ongoing PCI DSS activities. We will look specifically at AWS CloudTrail logs stored centrally in Amazon Simple Storage Service (Amazon S3), which is also a Well-Architected Security Pillar best practice, and use Amazon Athena to query them.
This post assumes familiarity with creating a database in Athena. If you’re new to Athena, please take a look at the Athena getting started guide and create a database before continuing. Take note of the bucket you chose for the output of Athena query results; we will use it later in this post.
In this post, we walk through:
- Creating a partitioned table for your AWS CloudTrail logs. In order to reduce costs and time to query results in Athena, we’ll show you how to partition your data. If you’re not already familiar with partitioning, you can learn about it in the Athena user guide.
- Constructing SQL queries to search for PCI DSS audit log evidence. The SQL queries that are provided in this post are directly related to PCI DSS requirement 10. Customizing these queries to meet your responsibilities may be able to assist you in preparing for a PCI DSS assessment.
- Creating an AWS Lambda function to automate running these SQL queries daily, in order to help address the PCI DSS daily log review requirement 10.6.1.
Create and partition a table
The following code will create and partition a table for CloudTrail logs. Before you execute this query, be sure to replace the variable placeholders with the information from your database. They are:
- <YOUR_TABLE> – the name of your Athena table
- LOCATION – the path to your CloudTrail logs in Amazon S3. An example is included in the following code. It includes the variable placeholders:
  - <AWS_ACCOUNT_NUMBER> – your AWS account number. If using organizational CloudTrail, use the following format throughout the post for this variable: o-<orgID>/<ACCOUNT_NUMBER>
  - <LOG_BUCKET> – the bucket name where the CloudTrail logs to be queried reside
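The statement itself is not reproduced in this copy of the post, so the following is only a minimal sketch of what such a statement could look like, wrapped in a small Python helper (Python being the Lambda runtime used later). The column list and SerDe classes are taken from the CloudTrail example in the Athena documentation as best recalled and should be checked against the current documentation. The helper name `build_create_table_ddl` is hypothetical, and the partition columns are assumed to be region, year, month, and day to match the values used for partitioning below.

```python
def build_create_table_ddl(table, log_bucket, account):
    """Return a CREATE EXTERNAL TABLE statement for partitioned CloudTrail logs.

    Assumption: columns and SerDe follow the Athena documentation's CloudTrail
    example; this is a sketch, not the statement from the original post.
    """
    return f"""CREATE EXTERNAL TABLE {table} (
    eventversion STRING,
    useridentity STRUCT<
        type:STRING, principalid:STRING, arn:STRING, accountid:STRING,
        invokedby:STRING, accesskeyid:STRING, username:STRING,
        sessioncontext:STRUCT<
            attributes:STRUCT<mfaauthenticated:STRING, creationdate:STRING>,
            sessionissuer:STRUCT<type:STRING, principalid:STRING, arn:STRING,
                                 accountid:STRING, username:STRING>>>,
    eventtime STRING,
    eventsource STRING,
    eventname STRING,
    awsregion STRING,
    sourceipaddress STRING,
    useragent STRING,
    errorcode STRING,
    errormessage STRING,
    requestparameters STRING,
    responseelements STRING,
    additionaleventdata STRING,
    requestid STRING,
    eventid STRING,
    resources ARRAY<STRUCT<arn:STRING, accountid:STRING, type:STRING>>,
    eventtype STRING,
    apiversion STRING,
    readonly STRING,
    recipientaccountid STRING,
    serviceeventdetails STRING,
    sharedeventid STRING,
    vpcendpointid STRING
)
PARTITIONED BY (region STRING, year STRING, month STRING, day STRING)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://{log_bucket}/AWSLogs/{account}/CloudTrail/'"""


# Print the statement with the post's placeholders left in place.
print(build_create_table_ddl("<YOUR_TABLE>", "<LOG_BUCKET>", "<AWS_ACCOUNT_NUMBER>"))
```

The printed statement can be pasted into the Athena query editor once the placeholders are replaced.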
Execute the query. You should see a message stating Query successful.
The preceding query creates a CloudTrail table and defines the partition columns in your Athena database. Before you begin running queries to generate evidence, you will need to run ALTER TABLE commands to register the partitions for the Regions and days you want to query.
Be sure to update the following variable placeholders with your information:
- <YOUR_DATABASE> – the name of your Athena database
Provide values for the following variables:
- region – region of the logs to partition
- month – month of the logs to partition
- day – day of the logs to partition
- year – year of the logs to partition
- LOCATION – the path to your CloudTrail logs in Amazon S3 to partition, down to the specific day (should match the preceding values of region, month, day, and year). It includes the same <AWS_ACCOUNT_NUMBER> and <LOG_BUCKET> variable placeholders described earlier.
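As a rough illustration of how these values fit together, the sketch below assembles an ALTER TABLE statement for one Region and one day. The helper name `build_add_partition`, the `IF NOT EXISTS` clause, and the example date are assumptions made for illustration; the S3 path follows the standard CloudTrail layout of Region/year/month/day.

```python
import datetime


def build_add_partition(database, table, log_bucket, account, region, day):
    """Return an ALTER TABLE statement that registers one Region/day partition."""
    # CloudTrail writes logs under .../CloudTrail/<region>/<yyyy>/<mm>/<dd>/
    location = (
        f"s3://{log_bucket}/AWSLogs/{account}/CloudTrail/"
        f"{region}/{day:%Y}/{day:%m}/{day:%d}/"
    )
    return (
        f"ALTER TABLE {database}.{table} ADD IF NOT EXISTS "
        f"PARTITION (region='{region}', year='{day:%Y}', "
        f"month='{day:%m}', day='{day:%d}') "
        f"LOCATION '{location}'"
    )


# Hypothetical example: partition us-east-1 logs for February 28, 2020.
print(build_add_partition(
    "<YOUR_DATABASE>", "<YOUR_TABLE>", "<LOG_BUCKET>", "<AWS_ACCOUNT_NUMBER>",
    "us-east-1", datetime.date(2020, 2, 28),
))
```

Passing a `datetime.date` keeps the month and day zero-padded, so the partition values match the folder names in the bucket.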
After the partition has been configured, you can query logs from the date and region that was partitioned. Here’s an example for PCI DSS requirement 10.2.4 (all relevant PCI DSS requirements are described later in this post).
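The example query is not reproduced in this copy, so here is one possible reading of requirement 10.2.4 (invalid logical access attempts) as a sketch: failed console sign-ins, limited to a single partition. The helper name, the selected columns, and the choice of `ConsoleLogin` failures as the evidence are assumptions; your environment may call for other events as well.

```python
def build_failed_login_query(database, table, region, year, month, day):
    """Return a query for failed console sign-ins within one partition."""
    return (
        "SELECT eventtime, useridentity.arn, sourceipaddress, "
        "errormessage, responseelements "
        f"FROM {database}.{table} "
        # CloudTrail records a failed sign-in as {"ConsoleLogin": "Failure"}.
        "WHERE eventname = 'ConsoleLogin' "
        "AND responseelements LIKE '%Failure%' "
        # Filtering on the partition columns limits the data Athena scans.
        f"AND region = '{region}' AND year = '{year}' "
        f"AND month = '{month}' AND day = '{day}'"
    )


# Hypothetical partition values for illustration.
print(build_failed_login_query(
    "<YOUR_DATABASE>", "<YOUR_TABLE>", "us-east-1", "2020", "02", "28"
))
```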
Create a Lambda function to save time
As you can see, the process above can involve a lot of manual steps as you set up partitioning for each region and then query for each day or region. Let’s simplify the process by putting these steps into a Lambda function.
Use the Lambda console to create a function
To create the Lambda function:
- Open the Lambda console and choose Create function, and select the option to Author from scratch.
- Enter Athena_log_query as the function name, and select Python 3.8 as the runtime.
- Under Choose or create an execution role, select Create new role with basic Lambda permissions.
- Choose Create function.
- Once the function is created, select the Permissions tab at the top of the page and select the Execution role to view in the IAM console. It will look similar to the following figure.
Update the IAM Role to allow Lambda permissions to relevant services
- In the IAM console, select the policy name. Choose Edit policy, then select the JSON tab and paste the following code into the window, replacing the following variables and placeholders:
- us-east-1 – This is the region where your resources are located. Change only if necessary.
- <OUTPUT_LOG_BUCKET> – bucket name you chose to store the query results when setting up Athena.
Note: Depending on the environment, this policy might not be restrictive enough and should be limited to only users needing access to the cardholder data environment and audit logs. More information about restricting IAM policies can be found in IAM JSON Policy Elements: Condition Operators.
- Choose Review policy and then Save changes.
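The policy document is not reproduced in this copy of the post. As a hedged sketch only, the snippet below generates a policy that covers what the function in this post appears to need: starting Athena queries, reading and updating the table metadata, reading the CloudTrail logs, writing query results, and writing function logs. The action list, the resource ARNs, and the `log_bucket` parameter are assumptions, not the original policy; review and tighten them for your environment before use.

```python
import json


def build_policy(output_bucket, log_bucket, region="us-east-1"):
    """Return an IAM policy document (as a dict) for the query function.

    Assumption: this permission set is an illustrative guess, not the
    policy from the original post.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Start queries and poll their status.
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": ["athena:StartQueryExecution", "athena:GetQueryExecution"],
                "Resource": f"arn:aws:athena:{region}:*:workgroup/*",
            },
            {   # Athena reads table metadata and adds partitions through Glue.
                "Sid": "ReadAndPartitionTable",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase", "glue:GetTable", "glue:GetPartitions",
                    "glue:CreatePartition", "glue:BatchCreatePartition",
                ],
                "Resource": f"arn:aws:glue:{region}:*:*",
            },
            {   # Read the CloudTrail logs that are being queried.
                "Sid": "ReadCloudTrailLogs",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{log_bucket}", f"arn:aws:s3:::{log_bucket}/*"],
            },
            {   # Write query results to the Athena output bucket.
                "Sid": "WriteQueryResults",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket",
                           "s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}", f"arn:aws:s3:::{output_bucket}/*"],
            },
            {   # Basic Lambda logging.
                "Sid": "WriteFunctionLogs",
                "Effect": "Allow",
                "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": f"arn:aws:logs:{region}:*:*",
            },
        ],
    }


print(json.dumps(build_policy("<OUTPUT_LOG_BUCKET>", "<LOG_BUCKET>"), indent=2))
```

The printed JSON can be pasted into the JSON tab of the policy editor after the bucket placeholders are replaced.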
Customize the Lambda Function
- On the Lambda dashboard, choose the Configuration tab. In Basic settings, increase the function timeout to 5 minutes to ensure that the function always has time to finish running your queries, and then select Save. Best Practices for Developing on AWS Lambda has more tips for using Lambda.
- Paste the following code into the function editor on the Configuration tab, replacing the existing text. The code includes eight example queries to run and can be customized as needed.
The first query adds partitions for your logs in Amazon S3 to the table so that the following seven queries run quickly and cost-effectively.
This code combines the partitioning statement with example Athena queries that assist in meeting PCI DSS logging requirements, which are explained in more detail below:
Replace these values in the code that follows:
- REGION1 – first region to partition
- REGION2 – second region to partition
Note: More regions can be added if you have additional regions to partition. The ADD PARTITION statement can be copied and pasted to add additional regions as needed. Additionally, you can hard-code the regions for your environment into the statements.
- Choose Save in the top right.
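The function code is not reproduced in this copy, so the following is a shortened sketch of how such a handler could be structured with the boto3 Athena client, not the original function. It runs the partition statement first, waits for it to finish, and then starts the evidence queries. Only two of the evidence queries are included to keep it short. The helper names, the optional `athena` argument (added so the handler can be exercised with a stand-in client), the `IF NOT EXISTS` clause, and the choice to query the previous UTC day are all assumptions.

```python
import datetime
import time

# Placeholders from this post; replace them with your own values.
DATABASE = "<YOUR_DATABASE>"
TABLE = "<YOUR_TABLE>"
LOG_BUCKET = "<LOG_BUCKET>"
ACCOUNT = "<AWS_ACCOUNT_NUMBER>"
OUTPUT_LOCATION = "s3://<OUTPUT_LOG_BUCKET>/"
REGIONS = ["REGION1", "REGION2"]


def partition_query(day):
    """Build one ALTER TABLE statement with a PARTITION clause per Region."""
    clauses = [
        f"PARTITION (region='{region}', year='{day:%Y}', "
        f"month='{day:%m}', day='{day:%d}') "
        f"LOCATION 's3://{LOG_BUCKET}/AWSLogs/{ACCOUNT}/CloudTrail/"
        f"{region}/{day:%Y/%m/%d}/'"
        for region in REGIONS
    ]
    return f"ALTER TABLE {DATABASE}.{TABLE} ADD IF NOT EXISTS " + " ".join(clauses)


def evidence_queries(day):
    """Return example evidence queries keyed by PCI DSS requirement."""
    source = f"FROM {DATABASE}.{TABLE} WHERE "
    date_filter = f" AND year='{day:%Y}' AND month='{day:%m}' AND day='{day:%d}'"
    return {
        "10.2.2": "SELECT * " + source + "useridentity.type = 'Root'" + date_filter,
        "10.2.4": "SELECT * " + source + "eventname = 'ConsoleLogin' "
                  "AND responseelements LIKE '%Failure%'" + date_filter,
    }


def run_query(athena, sql):
    """Start a query and return its execution ID (Athena runs it asynchronously)."""
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
    )
    return response["QueryExecutionId"]


def wait_for(athena, execution_id, poll_seconds=2):
    """Poll until the query reaches a terminal state and return that state."""
    while True:
        status = athena.get_query_execution(QueryExecutionId=execution_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(poll_seconds)


def lambda_handler(event, context, athena=None):
    if athena is None:
        import boto3  # available by default in the Lambda Python runtime
        athena = boto3.client("athena")
    # Assumption: review the previous UTC day, whose logs should be complete.
    today = datetime.datetime.now(datetime.timezone.utc).date()
    day = today - datetime.timedelta(days=1)
    # The partitions must exist before the evidence queries can see the data.
    state = wait_for(athena, run_query(athena, partition_query(day)))
    if state != "SUCCEEDED":
        raise RuntimeError(f"Partition query ended in state {state}")
    return {req: run_query(athena, sql) for req, sql in evidence_queries(day).items()}
```

Athena accepts one statement per query execution, which is why the sketch adds every Region in a single ALTER TABLE statement rather than sending several statements together.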
Athena Queries used to collect evidence
The queries used to gather evidence for PCI DSS are broken down from the Lambda function we created, using the partitioned date example from above. They are listed with their respective requirement.
Note: AWS owns the security OF the cloud, providing high levels of security in alignment with our numerous compliance programs. The customer is responsible for the security of their resources IN the cloud, keeping its content secure and compliant. The queries below are meant to be a proof of concept and should be tailored to your environment.
10.2.1/10.2.3 – Implement automated audit trails for all system components to reconstruct access to either or both cardholder data and audit trails:
10.2.2 – Implement automated audit trails for all system components to reconstruct all actions taken by anyone using root or administrative privileges:
10.2.4 – Implement automated audit trails for all system components to reconstruct invalid logical access attempts:
10.2.5.b – Verify all elevation of privileges is logged:
10.2.5.c – Verify all changes, additions, or deletions to any account with root or administrative privileges are logged:
10.2.6 – Implement automated audit trails for all system components to reconstruct the initialization, stopping, or pausing of the audit logs:
10.6 – Review logs and security events for all system components to identify anomalies or suspicious activity:
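The individual queries are not reproduced in this copy. The sketch below shows one plausible filter per requirement listed above, restricted to a single partition. Every filter is an illustrative interpretation rather than the original query, the helper name is hypothetical, and the 10.2.1/10.2.3 filter assumes that S3 data events are enabled on the trail.

```python
def build_evidence_queries(database, table, region, year, month, day):
    """Return {requirement: SQL} for one partitioned Region and day."""
    columns = (
        "eventtime, useridentity.arn, sourceipaddress, eventsource, "
        "eventname, errorcode"
    )
    partition = (
        f"region = '{region}' AND year = '{year}' "
        f"AND month = '{month}' AND day = '{day}'"
    )
    # Assumed filters; tailor the events to your cardholder data environment.
    conditions = {
        "10.2.1/10.2.3": "eventsource = 's3.amazonaws.com' AND eventname IN "
                         "('GetObject', 'PutObject', 'DeleteObject')",
        "10.2.2": "useridentity.type = 'Root'",
        "10.2.4": "eventname = 'ConsoleLogin' AND responseelements LIKE '%Failure%'",
        "10.2.5.b": "eventname = 'AssumeRole'",
        "10.2.5.c": "eventsource = 'iam.amazonaws.com' AND eventname IN "
                    "('CreateUser', 'DeleteUser', 'AttachUserPolicy', "
                    "'DetachUserPolicy', 'PutUserPolicy', 'AddUserToGroup', "
                    "'CreateAccessKey')",
        "10.2.6": "eventsource = 'cloudtrail.amazonaws.com' AND eventname IN "
                  "('StartLogging', 'StopLogging', 'UpdateTrail', 'DeleteTrail')",
        "10.6": "errorcode IN ('AccessDenied', 'Client.UnauthorizedOperation')",
    }
    return {
        requirement: (
            f"SELECT {columns} FROM {database}.{table} "
            f"WHERE ({condition}) AND {partition}"
        )
        for requirement, condition in conditions.items()
    }


# Hypothetical partition values for illustration.
queries = build_evidence_queries(
    "<YOUR_DATABASE>", "<YOUR_TABLE>", "us-east-1", "2020", "02", "28"
)
for requirement, sql in queries.items():
    print(requirement, "->", sql)
```

Keeping the partition filter in one place means every query scans only the Region and day that were partitioned.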
You can use the AWS Command Line Interface (AWS CLI) to invoke the Lambda function using the following command, replacing <YOUR_FUNCTION> with the name of the Lambda function you created:
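The command is not reproduced in this copy. As a sketch, the snippet below builds an `aws lambda invoke` command and optionally runs it from Python. It assumes the AWS CLI is installed and configured; the output file name `response.json` and the helper names are hypothetical.

```python
import shlex
import subprocess


def build_invoke_command(function_name, outfile="response.json"):
    """Return the AWS CLI argument list that invokes a Lambda function."""
    # `aws lambda invoke` requires an output file for the function's response.
    return ["aws", "lambda", "invoke", "--function-name", function_name, outfile]


def invoke(function_name, outfile="response.json"):
    """Run the command; raises CalledProcessError if the CLI reports a failure."""
    completed = subprocess.run(
        build_invoke_command(function_name, outfile),
        check=True, capture_output=True, text=True,
    )
    return completed.stdout


# Show the command for the function created earlier in this post.
print(shlex.join(build_invoke_command("Athena_log_query")))
```

Calling `invoke("Athena_log_query")` runs the same command and returns the CLI's output as text.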
The AWS Lambda API Reference has more information on using Lambda with AWS CLI.
Note: The query results are written to the Amazon S3 location specified by the OUTPUT_LOCATION variable within the Lambda function.
Use Amazon CloudWatch to run your Lambda function
You can create a rule in CloudWatch to have this function run automatically on a set schedule.
Create a CloudWatch rule
- From the CloudWatch dashboard, under Events, select Rules, then Create rule.
- Under Event Source, select the radio button for Schedule and choose a fixed rate or enter a custom cron expression.
- Finally, in the Targets section, choose Lambda function and find your Lambda function in the dropdown list.
The example screenshot shows a CloudWatch rule configured to invoke the Lambda function daily:
- Once the schedule is configured, choose Configure details to move to the next screen.
- Enter a name for your rule, make sure that State is enabled, and choose Create rule.
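The console steps above can also be scripted. The sketch below uses the boto3 `events` and `lambda` clients to create a daily schedule, allow the rule to invoke the function, and attach the function as the target. The rule name, statement ID, and helper name are hypothetical, and `rate(1 day)` is just one way to express a daily schedule.

```python
def schedule_daily(events, lambda_client, function_name, function_arn,
                   rule_name="athena-log-query-daily"):
    """Create a daily schedule rule that invokes the given Lambda function.

    `events` and `lambda_client` are expected to be boto3 clients, for example
    boto3.client("events") and boto3.client("lambda").
    """
    # Create (or update) the rule with a fixed daily rate and enable it.
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )["RuleArn"]
    # The console adds this permission for you; a script has to do it explicitly.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Point the rule at the function.
    events.put_targets(Rule=rule_name, Targets=[{"Id": "1", "Arn": function_arn}])
    return rule_arn


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   schedule_daily(boto3.client("events"), boto3.client("lambda"),
#                  "Athena_log_query", "<YOUR_FUNCTION_ARN>")
```

A cron expression such as `cron(0 6 * * ? *)` could replace the rate expression if the review should run at a fixed time each day.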
Check that your function is running
You can then navigate to your Lambda function’s CloudWatch log group to see if the function is running as intended.
To locate the appropriate CloudWatch group, from your Lambda function within the console, select the Monitoring tab, then choose View logs in CloudWatch.
You can take this a step further and set up an Amazon Simple Notification Service (Amazon SNS) notification to email you when the function is triggered.
In this post, we walked through partitioning an Athena table, which helps reduce the time and cost of running queries on data in your S3 buckets. We then constructed example SQL queries related to PCI DSS requirement 10, to assist in audit preparation. Finally, we created a Lambda function to automate running daily queries to pull PCI DSS audit log evidence from Amazon S3, to assist with the PCI DSS daily log review requirement. I encourage you to customize, add, or remove the SQL queries to best fit your needs and compliance requirements.