Ingest VPC flow logs into Splunk using Amazon Kinesis Data Firehose
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more.
December 2023: This post was reviewed and updated to remove the dependency on the AWS Lambda function according to the latest version in Splunk AWS Add-on (7.3.0).
In September 2017, during the annual Splunk.conf, Splunk and AWS jointly announced Amazon Kinesis Data Firehose integration to support Splunk Enterprise and Splunk Cloud as a delivery destination. This native integration between Splunk Enterprise, Splunk Cloud, and Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant delivery mechanism. We want to enable you to monitor and analyze machine data from any source and use it to deliver operational intelligence and optimize IT, security, and business performance.
Kinesis Data Firehose provides a fully managed, reliable, and scalable way to stream data to Splunk. In September 2022, AWS announced a new Amazon Virtual Private Cloud (Amazon VPC) feature that enables you to send VPC flow log data directly to Kinesis Data Firehose as a destination. Previously, you had to send VPC flow logs to either Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3) before they could be ingested by other AWS or Partner tools. In this post, we show you how to use this feature to set up VPC flow logs for ingestion into Splunk using Kinesis Data Firehose.
Overview of solution
We deploy the following architecture to ingest data into Splunk.
We create a VPC flow log in an existing VPC to send the flow log data to a Kinesis Data Firehose delivery stream. This delivery stream does not require an AWS Lambda function for data transformation; its destination settings point to the Splunk endpoint along with an HTTP Event Collector (HEC) token.
Prerequisites
- AWS account – If you don’t have an AWS account, you can create one. For more information, see Setting Up for Amazon Kinesis Data Firehose.
- Splunk AWS Add-on (version 7.3.0 or above) – Ensure you install the latest Splunk AWS Add-on app from Splunkbase in your Splunk deployment. This app provides the source type and event type mappings required for AWS machine data. With version 7.3.0, Splunk added support for transforming VPC flow logs ingested both as vended logs and from CloudWatch Logs.
- HEC token – In your Splunk deployment, set up an HEC token with the source type aws:cloudwatchlogs:vpcflow.
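If you prefer to script this prerequisite, HEC tokens can be created through Splunk's REST API for HTTP inputs. The following is a minimal sketch, assuming a Splunk deployment reachable on the default management port 8089 and admin credentials; the host, password, and token name vpc-flow-hec are hypothetical placeholders.

```python
import requests

# Hypothetical values -- replace with your Splunk host and credentials.
SPLUNK_MGMT = "https://splunk.example.com:8089"
AUTH = ("admin", "your-password")

# Create an HEC token with the source type expected by the Splunk AWS Add-on.
resp = requests.post(
    f"{SPLUNK_MGMT}/services/data/inputs/http",
    auth=AUTH,
    data={
        "name": "vpc-flow-hec",                      # hypothetical token name
        "sourcetype": "aws:cloudwatchlogs:vpcflow",  # source type required for VPC flow logs
    },
    verify=False,  # assumes a self-signed certificate; drop this for a trusted CA
)
resp.raise_for_status()
print(resp.text)  # the response includes the generated token value
```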
Create a Kinesis Data Firehose delivery stream
In this step, you create a Kinesis Data Firehose delivery stream to receive the VPC flow log data and deliver that data to Splunk.
- On the Kinesis Data Firehose console, create a new delivery stream.
- For Source, choose Direct PUT.
- For Destination, choose Splunk.
- For Delivery stream name, enter a name (for example, VPCtoSplunkStream).
- In the Destination settings section, for Splunk cluster endpoint, enter your endpoint. If you’re using a Splunk Cloud endpoint, refer to Configure Amazon Kinesis Firehose to send data to the Splunk platform for different Splunk cluster endpoint values.
- For Splunk endpoint type, select Raw endpoint.
- For Authentication token, enter the value of the Splunk HEC token that you created as a prerequisite.
- In the Backup settings section, for Source record backup in Amazon S3, select Failed events only so you only save the data that fails to be ingested into Splunk.
- For S3 backup bucket, enter the path to an S3 bucket.
- Complete creating your delivery stream.
The creation process may take a few minutes to complete.
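If you prefer to create the delivery stream programmatically, the console steps above map to a single API call. Here is a minimal boto3 sketch; the Splunk endpoint, HEC token, IAM role ARN, and backup bucket ARN are placeholders you must replace with your own values.

```python
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="VPCtoSplunkStream",
    DeliveryStreamType="DirectPut",          # corresponds to the Direct PUT source
    SplunkDestinationConfiguration={
        "HECEndpoint": "https://http-inputs-firehose-example.splunkcloud.com:443",  # placeholder
        "HECEndpointType": "Raw",            # corresponds to the Raw endpoint setting
        "HECToken": "YOUR-HEC-TOKEN",        # placeholder HEC token value
        "S3BackupMode": "FailedEventsOnly",  # back up only events that fail delivery
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-backup-role",  # placeholder
            "BucketARN": "arn:aws:s3:::your-backup-bucket",                    # placeholder
        },
    },
)
```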
Create a VPC flow log
In this final step, you create a VPC flow log with Kinesis Data Firehose as the destination type.
- On the Amazon VPC console, choose Your VPCs.
- Select the VPC for which to create the flow log.
- On the Actions menu, choose Create flow log.
- Provide the required settings for Filter:
- If you want to filter the flow logs, select Accept traffic or Reject traffic.
- Select All if you need all the information sent to Splunk.
- For Maximum aggregation interval, select a suitable interval for your use case. Select the minimum setting of 1 minute if you need the flow log data to be available for near-real-time analysis in Splunk.
- For Destination, select Send to Kinesis Firehose in the same account if the delivery stream is set up in the same account where you create the VPC flow logs. If you want to send the data to a different account, refer to Publish flow logs to Kinesis Data Firehose.
- For Log record format, if you keep the AWS default format, the flow logs are sent in the version 2 format. Alternatively, you can specify which fields you want captured and sent to Splunk. For more information on the log format and available fields, refer to Flow log records.
- Review all the parameters and create the flow log. Within a few minutes, you should be able to see the data in Splunk.
- Open your Splunk console and navigate to the Search tab of the Search & Reporting app.
- Run an SPL query like the following to look at sample VPC flow log records (a minimal example that searches the source type configured in the prerequisites; the head limit just keeps the sample small):
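```
sourcetype="aws:cloudwatchlogs:vpcflow"
| head 10
```

You should see individual flow log records, with fields extracted according to the mappings in the Splunk AWS Add-on.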
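As with the delivery stream, the flow log itself can also be created programmatically instead of through the console. Below is a minimal boto3 sketch matching the settings chosen above; the VPC ID, account ID, and Region in the delivery stream ARN are placeholders.

```python
import boto3

ec2 = boto3.client("ec2")

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],   # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",                       # send accepted and rejected traffic
    LogDestinationType="kinesis-data-firehose",
    LogDestination="arn:aws:firehose:us-east-1:123456789012:deliverystream/VPCtoSplunkStream",  # placeholder ARN
    MaxAggregationInterval=60,               # 1-minute interval for near-real-time analysis
)
```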
Clean up
To avoid incurring future charges, delete the resources you created in the following order:
- Delete the VPC flow log.
- Delete the Kinesis Data Firehose delivery stream.
- If you created a new VPC and new resources in the VPC, then delete the resources and VPC.
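If you created the resources with the boto3 sketches above, the equivalent programmatic cleanup is shown below; the flow log ID is a placeholder for the ID returned by create_flow_logs.

```python
import boto3

boto3.client("ec2").delete_flow_logs(FlowLogIds=["fl-0123456789abcdef0"])  # placeholder ID
boto3.client("firehose").delete_delivery_stream(DeliveryStreamName="VPCtoSplunkStream")
```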
Conclusion
You can use VPC flow log data in multiple Splunk solutions, like the Splunk App for AWS Security Dashboards for traffic analysis or Splunk Security Essentials, which uses the data to provide deeper insights into the security posture of your AWS environment. Using Kinesis Data Firehose to send VPC flow log data into Splunk provides many benefits. This managed service can automatically scale to meet the data demand and provide near-real-time data analysis. Try out this new quick and hassle-free way of sending your VPC flow logs to Splunk Enterprise or Splunk Cloud Platform using Kinesis Data Firehose.
You can deploy this solution today in your AWS account by following the Kinesis Data Firehose Immersion Day Lab for Splunk.
About the authors
Ranjit Kalidasan is a Senior Solutions Architect with Amazon Web Services based in Boston, Massachusetts. He is a Partner Solutions Architect helping security ISV partners co-build and co-market solutions with AWS. He brings over 20 years of experience in information technology, helping global customers implement complex solutions for security and analytics. You can connect with Ranjit on LinkedIn.