AWS Partner Network (APN) Blog
Filter and Stream Logs from Amazon S3 Logging Buckets into Splunk Using AWS Lambda
By Ameya Paldhikar, Partner Solutions Architect – AWS
By Marc Luescher, Sr. Solutions Architect – AWS
Amazon Web Services (AWS) customers of all sizes, from growing startups to large enterprises, manage multiple AWS accounts. Following the prescriptive guidance from AWS for multi-account management, customers typically centralize AWS log sources (AWS CloudTrail logs, VPC flow logs, AWS Config logs) from their AWS accounts into Amazon Simple Storage Service (Amazon S3) buckets in a dedicated log archive account.
The volume of logs stored in these centralized S3 buckets can be extremely high (multiple TBs per day) depending on the number of AWS accounts and the size of the workloads. To ingest the logs from S3 buckets into Splunk, customers normally use the Splunk Add-on for AWS, which is deployed on Splunk Heavy Forwarders that act as dedicated pollers to pull the data from S3.
These servers also need the ability to scale horizontally as the data ingestion volume increases in order to support near real-time ingestion of logs. This approach involves the additional overhead of managing the deployment, as well as increased cost for running the dedicated infrastructure.
Consider another use case where you want to optimize ingest license costs in Splunk by filtering and forwarding only a subset of logs from the S3 buckets to Splunk. An example of this is ingesting only the rejected traffic within the VPC flow logs where the field “action” == “REJECT”. The pull-based log ingestion approach currently does not offer a way to achieve that.
This post showcases a way to filter and stream logs from centralized Amazon S3 logging buckets to Splunk using a push mechanism leveraging AWS Lambda. The push mechanism offers benefits such as lower operational overhead, lower costs, and automated scaling. We'll provide instructions and a sample Lambda function that filters virtual private cloud (VPC) flow logs with the "action" flag set to "REJECT" and pushes them to Splunk via a Splunk HTTP Event Collector (HEC) endpoint.
Splunk is an AWS Specialization Partner and AWS Marketplace Seller with Competencies in Cloud Operations, Data and Analytics, DevOps, and more. Leading organizations use Splunk’s unified security and observability platform to keep their digital systems secure and reliable.
Architecture
The architecture diagram in Figure 1 illustrates the process for ingesting the VPC flow logs into Splunk using AWS Lambda.
Figure 1 – Architecture for Splunk ingestion using AWS Lambda.
- VPC flow logs for one or multiple AWS accounts are centralized in a logging S3 bucket within the log archive AWS account.
- The S3 bucket sends an “object create” event notification to an Amazon Simple Queue Service (SQS) queue for every object stored in the bucket.
- A Lambda function is created with Amazon SQS as the event source. The function polls messages from SQS in batches, reads the contents of each event notification, and identifies the object key and corresponding S3 bucket name.
- The function then makes a “GetObject” call to the S3 bucket and retrieves the object. The Lambda function filters out the events that do not have the “action” flag as “REJECT”.
- The Lambda function streams the filtered VPC flow logs to Splunk HTTP Event Collector.
- VPC flow logs are ingested and are available for searching within Splunk.
- If Splunk is unavailable, or if any error occurs while forwarding logs, the Lambda function forwards those events to a backsplash S3 bucket.
Prerequisites
At a minimum, the following prerequisites must be in place:
- Publish VPC flow logs to Amazon S3 – Configure VPC flow logs to be published to an S3 bucket within your AWS account.
- Create an index in Splunk to ingest the VPC flow logs.
Step 1: Splunk HTTP Event Collector (HEC) Configuration
To get started, we need to set up Splunk HEC to receive the data before we can configure the AWS services to forward the data to Splunk. From Splunk Web, navigate to Settings > Data inputs.
Figure 2 – Splunk data inputs.
- Select HTTP Event Collector and choose New Token.
- Configure the new token as per the details shown in Figure 3 below and click Submit. Verify the Source Type is set as aws:cloudwatchlogs:vpcflow.
Figure 3 – Splunk HEC token configuration.
- Once the Token has been created, choose Global Settings, verify that All Tokens have been enabled, and click Save.
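Optionally, you can confirm the token accepts data before configuring anything on the AWS side. The Python sketch below posts a sample flow log record to the raw HEC endpoint; the host name and token value are placeholders for your environment, and you may need to relax TLS verification if your Splunk instance uses a self-signed certificate.

```python
import urllib.request

# Placeholders - replace with your Splunk host and the HEC token created above
SPLUNK_HEC_URL = "https://splunk.example.com:8088/services/collector/raw"
SPLUNK_HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

# A single sample VPC flow log record in the default space-delimited format
sample_record = (
    "2 123456789012 eni-0123456789abcdef0 10.0.0.5 10.0.1.7 "
    "443 49152 6 10 840 1690000000 1690000060 REJECT OK"
)

request = urllib.request.Request(
    SPLUNK_HEC_URL,
    data=sample_record.encode("utf-8"),
    headers={"Authorization": f"Splunk {SPLUNK_HEC_TOKEN}"},
)

# If Splunk uses a self-signed certificate, pass an ssl context via the
# context argument of urlopen that skips verification (test environments only).
with urllib.request.urlopen(request) as response:
    print(response.status, response.read().decode())
```

A successful request returns `{"text":"Success","code":0}`, and the event should appear in the index you associated with the token.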
Step 2: Splunk Configurations
Next, we need to add configurations within the Splunk server under props.conf to ensure that line breaking, timestamp recognition, and field extractions are configured correctly. Copy the contents below into props.conf in $SPLUNK_HOME/etc/system/local/. For more information regarding these configurations, refer to the Splunk props.conf documentation.
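The exact stanza depends on how your flow logs are formatted. The example below is a minimal sketch for the default space-delimited VPC flow log format (fields version through log-status, with the epoch start timestamp as the eleventh field); adjust the regexes and field names if you publish a custom format.

```
[aws:cloudwatchlogs:vpcflow]
# Each flow log record is a single line
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# Use the "start" field (11th field, epoch seconds) as the event timestamp
TIME_FORMAT = %s
TIME_PREFIX = ^(?:[^ ]+ ){10}
MAX_TIMESTAMP_LOOKAHEAD = 10
# Extract the default flow log fields by position
EXTRACT-vpcflow = ^(?<version>\S+)\s+(?<account_id>\S+)\s+(?<interface_id>\S+)\s+(?<src_ip>\S+)\s+(?<dest_ip>\S+)\s+(?<src_port>\S+)\s+(?<dest_port>\S+)\s+(?<protocol>\S+)\s+(?<packets>\S+)\s+(?<bytes>\S+)\s+(?<start_time>\S+)\s+(?<end_time>\S+)\s+(?<action>\S+)\s+(?<log_status>\S+)
```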
Step 3: Create SQS to Queue Event Notifications
Whenever a new object (log file) is stored in an Amazon S3 bucket, an event notification is forwarded to an SQS queue. Follow the steps below to create the SQS queue and configure a log centralization S3 bucket to forward event notifications.
- Access the Amazon SQS console in your AWS account and choose Create Queue.
- Select Standard type and choose a Queue name.
- Within Configurations, increase the Visibility timeout to 5 minutes, and the Message retention period to 14 days. Refer to the screenshot below for these configurations.
Figure 4 – Amazon SQS configuration.
- Enable Encryption for at-rest encryption for your queue.
- Configure the Access policy to provide the S3 bucket with permissions to send messages to this SQS queue (a sample policy follows this list). Replace the placeholders in <> with the specific values for your environment.
- Enable Dead-letter queue so that messages that aren’t processed from this queue will be forwarded to the dead-letter queue for further inspection.
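The sample access policy below follows the standard pattern for S3 event notifications: it allows the Amazon S3 service principal to send messages to the queue, but only on behalf of your logging bucket and account. Replace the placeholders in <> before saving.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ToSendEventNotifications",
      "Effect": "Allow",
      "Principal": { "Service": "s3.amazonaws.com" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>",
      "Condition": {
        "ArnLike": { "aws:SourceArn": "arn:aws:s3:::<vpc-flow-logs-bucket-name>" },
        "StringEquals": { "aws:SourceAccount": "<account-id>" }
      }
    }
  ]
}
```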
Step 4: Forward Amazon S3 Event Notifications to SQS
Now that the SQS queue has been created, follow the steps below to configure the VPC flow log S3 bucket to forward the event notifications for all object create events to the queue.
- From the Amazon S3 console, access the centralized S3 bucket for VPC flow logs.
- Select the Properties tab, scroll down to Event notifications, and choose Create event notifications.
- Within General configurations, provide an appropriate Event name. Under Event types, select All object create events. Under Destination, choose SQS queue and select the SQS queue we created in the previous step. Click on Save changes and the configuration should look like this:
Figure 5 – Amazon S3 event notifications.
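If you prefer to script this step instead of using the console, the minimal boto3 sketch below applies the same notification configuration; the bucket name and queue ARN are placeholders for your environment.

```python
import boto3

s3 = boto3.client("s3")

# Placeholders - replace with your logging bucket name and the ARN of the queue from Step 3
s3.put_bucket_notification_configuration(
    Bucket="<vpc-flow-logs-bucket-name>",
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "Id": "vpc-flow-logs-to-sqs",
                "QueueArn": "arn:aws:sqs:<region>:<account-id>:<queue-name>",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```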
Step 5: Create a Backsplash Amazon S3 bucket
Now, let's create a backsplash S3 bucket to ensure that filtered data is not lost in case the AWS Lambda function is unable to deliver data to Splunk. The Lambda function sends the filtered logs to this bucket whenever the delivery to Splunk fails. Please follow the steps in this documentation to create an S3 bucket.
Step 6: Create an AWS IAM Role for the Lambda Function
- From the AWS IAM console, access the Policies menu and select Create Policy.
- Select JSON as the policy editor option and paste a policy like the sample shown after this list. Replace the placeholders in <> with specific values for your environment. Once done, click Next.
- Enter a name and description for the policy, and select Create Policy.
- From the IAM console, access Roles and select Create role.
- Under Use Case, select Lambda and click on Next.
- On the Add permissions page, select the AWS managed AWSLambdaBasicExecutionRole policy and the custom policy we just created. Choose Next once both policies are selected.
- Enter an appropriate role name and then choose Create role.
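A sample of the customer managed policy is shown below. It is a sketch under the assumption that the function only needs to read flow log objects, write to the backsplash bucket, and consume messages from the SQS queue; replace the placeholder bucket names, Region, account ID, and queue name with your own values.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFlowLogObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::<vpc-flow-logs-bucket-name>/*"
    },
    {
      "Sid": "WriteToBacksplashBucket",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::<backsplash-bucket-name>/*"
    },
    {
      "Sid": "ConsumeSqsMessages",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>"
    }
  ]
}
```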
Step 7: Create Lambda Function to Filter and Push Logs to Splunk
- Access the AWS Lambda console and choose Create function.
- Under Basic information, enter an appropriate Function name and under Runtime choose the latest supported runtime for Python.
- Expand the Change default execution role option, select Use an existing role, and select the role we created in the previous section.
- Keep all other settings as default and select Create function.
- Once the function is created, select the Configuration tab within the function and edit the General configuration. Change the Timeout value to 5 minutes and click Save.
- Edit the Environment variables and add these key-value pairs. Make sure you replace the placeholders in <> with the appropriate values based on your environment. Once the environment variables are added, click Save:
- backup_s3 = <backsplash_S3_bucket_name_created_in_the_earlier_section>
- splunk_hec_token = <your_splunk_hec_token>
- splunk_hec_url = <your_splunk_url>:8088/services/collector/raw
- Select the Code tab within your function and update lambda_function.py with your handler code; a simplified sketch follows this list. You can also access the complete Python code from the lambda_splunk_function.py file within this GitHub repository.
- Select the Configuration tab within the function and choose Triggers.
- Click Add trigger and select SQS.
- Under the SQS queue drop-down, select the SQS queue we configured to store the S3 object-create event notifications.
- Select Activate trigger. Keep all other settings as default and select Add.
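The handler below is a minimal sketch of the filter-and-forward logic, not the full lambda_splunk_function.py from the repository. It assumes the environment variables defined above, gzip-compressed flow log files in the default space-delimited format, and a token configured against the raw HEC endpoint; adapt the parsing and error handling for your environment.

```python
import gzip
import json
import os
import urllib.parse
import urllib.request

import boto3

s3 = boto3.client("s3")

# Environment variables configured on the function (Step 7)
SPLUNK_HEC_URL = os.environ["splunk_hec_url"]      # e.g. https://<host>:8088/services/collector/raw
SPLUNK_HEC_TOKEN = os.environ["splunk_hec_token"]
BACKUP_BUCKET = os.environ["backup_s3"]


def is_rejected(line):
    """Return True for default-format flow log records whose action field is REJECT."""
    fields = line.split()
    return len(fields) >= 13 and fields[12] == "REJECT"


def send_to_splunk(payload):
    """POST newline-delimited flow log records to the raw HEC endpoint."""
    request = urllib.request.Request(
        SPLUNK_HEC_URL,
        data=payload,
        headers={"Authorization": f"Splunk {SPLUNK_HEC_TOKEN}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()


def lambda_handler(event, context):
    # Each SQS record wraps an S3 "object created" event notification
    for sqs_record in event["Records"]:
        s3_event = json.loads(sqs_record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(s3_record["s3"]["object"]["key"])

            # Retrieve the flow log object and decompress it if needed
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            if key.endswith(".gz"):
                body = gzip.decompress(body)
            lines = body.decode("utf-8").splitlines()

            # Keep only REJECT records, skipping the header line
            rejected = [l for l in lines if not l.startswith("version") and is_rejected(l)]
            if not rejected:
                continue

            payload = "\n".join(rejected).encode("utf-8")
            try:
                send_to_splunk(payload)
            except Exception:
                # Preserve the filtered events in the backsplash bucket if delivery fails
                s3.put_object(Bucket=BACKUP_BUCKET, Key=key, Body=payload)

    return {"statusCode": 200}
```

In this sketch, filtered records are written to the backsplash bucket on any delivery failure so the SQS message can still be processed successfully; if you prefer to rely on SQS retries and the dead-letter queue instead, re-raise the exception so the message returns to the queue.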