How do I push VPC flow logs to Splunk using Amazon Kinesis Data Firehose?

Last updated: 2021-07-23

I'm installing Splunk heavy forwarders to analyze my Amazon Virtual Private Cloud (Amazon VPC) data. I'm pushing data from AWS sources to Splunk clusters for processing, but it takes multiple steps. How can I better integrate my AWS data with Splunk?

Short description

Instead of using heavy forwarders, you can use Splunk's HTTP Event Collector (HEC) and Amazon Kinesis Data Firehose to send data to Splunk clusters.

To send data and application events to Splunk clusters, perform the following:

1.    Create a Kinesis Data Firehose delivery stream.

2.    Configure AWS Lambda for record transformation.

3.    Configure VPC Flow Logs.

4.    Create an Amazon CloudWatch Logs subscription to your stream.

Note: If you place a load balancer in front of your Splunk cluster, use a Classic Load Balancer. Kinesis Data Firehose doesn't support Application Load Balancers or Network Load Balancers. Also, make sure to allow duration-based sticky sessions with cookie expiration deactivated. For more information about troubleshooting data stream issues with Splunk endpoints, see Data not delivered to Splunk.

Resolution

Prerequisites

Before you begin, create a Splunk HEC endpoint and its authentication token. You enter both when you finish creating the delivery stream.

Start creating the Kinesis Data Firehose delivery stream

1.     Create your delivery stream. For Source, choose Direct PUT or other sources.

2.     Choose Next.

Configure record transformation with AWS Lambda

1.     Configure record transformation.
Note: Make sure to choose Enabled for Record transformation under Transform source records with AWS Lambda. You must enable this option because CloudWatch sends the logs as gzip-compressed data. The Lambda function must decompress this data before Splunk can use it.

2.     For Lambda function, select Create new.

3.     On the Choose Lambda blueprint window, for Lambda blueprint, choose Kinesis Firehose CloudWatch Logs Processor.

4.     Select the new tab that opens in your browser to create the new Lambda function.
For Name, enter a name for the Lambda function.
For Role, select Create a custom role.

5.     Select the new tab that opens in your browser to create a new AWS Identity and Access Management (IAM) role.
For Role Name, be sure that the name is lambda_basic_execution.

6.     Choose Allow to create the role and return to the Lambda function configuration page.

7.     Choose Create function, and then wait for the function to be created.

8.     Increase the Timeout to 1 minute from the default 3 seconds to prevent the function from timing out.

9.     Choose Save.
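
To make the reason for the transformation step concrete, the following Python sketch shows only the decompression idea from the note in step 1. It is not the code of the Kinesis Firehose CloudWatch Logs Processor blueprint. The field names (messageType, logEvents, message) follow the CloudWatch Logs subscription payload format. The function names and the sample values are hypothetical and exist only for illustration.

```python
import base64
import gzip
import json


def decode_cloudwatch_record(data_b64):
    """Base64-decode and gunzip one record, then return its log messages."""
    payload = json.loads(gzip.decompress(base64.b64decode(data_b64)))
    # Control messages carry no log events, so only DATA_MESSAGE is unpacked.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [event["message"] for event in payload.get("logEvents", [])]


def encode_sample(payload):
    """Hypothetical helper: build test input shaped like an incoming record."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


# Made-up sample payload; a real one comes from CloudWatch Logs.
sample = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/vpc/flowlog/FirehoseSplunk",
    "logEvents": [
        {"id": "1", "timestamp": 1418530010000, "message": "first flow log line"},
        {"id": "2", "timestamp": 1418530070000, "message": "second flow log line"},
    ],
}

messages = decode_cloudwatch_record(encode_sample(sample))
print(messages)
```

Without this step, Splunk would receive the compressed bytes rather than the individual log lines.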

Finish creating the Kinesis Data Firehose delivery stream

1.     Open the Amazon Kinesis console.

2.     In the navigation pane, select Data Firehose.

3.     For your delivery stream, choose Lambda function.
Choose the name of your new AWS Lambda function from the drop-down.
For Destination, choose Splunk.
Enter the Splunk HEC details, including the Splunk HEC endpoint that you created before. The Splunk HEC endpoint must be terminated with a valid SSL certificate. Use the matching DNS hostname to connect to your HEC endpoint. The format for the cluster endpoint is https://YOUR-ENDPOINT.splunk.com:8088.
For Splunk endpoint type, choose Raw endpoint, and then enter the authentication token.

4.     Choose Next.

5.     (Optional) Create an Amazon Simple Storage Service (Amazon S3) backup for failed events or all events by choosing an existing bucket or creating a new bucket. Make sure to configure Amazon S3-related settings such as buffer conditions, compression and encryption settings, and error logging options in the delivery stream wizard.

6.     Under IAM role, choose Create New.

7.     In the tab that opens, enter a Role name, and then choose Allow.

8.     Choose Next.

9.     Choose Create delivery stream.
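
Before you enter the Splunk HEC endpoint in step 3, you might want to check that it has the shape described there (https, a DNS hostname, and a port). The following Python sketch is one way to do that. The function name is hypothetical, and the checks reflect only the endpoint format given in this article. They are an assumption, not the validation that Kinesis Data Firehose itself performs, and they don't verify the SSL certificate.

```python
from urllib.parse import urlsplit


def check_hec_endpoint(url):
    """Return a list of problems with the endpoint's shape (empty if none)."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("scheme must be https")
    if not parts.hostname:
        problems.append("missing DNS hostname")
    try:
        port = parts.port
    except ValueError:
        problems.append("port is not a valid number")
    else:
        # Assumption: the article's format always includes an explicit port.
        if port is None:
            problems.append("missing port")
    return problems


print(check_hec_endpoint("https://YOUR-ENDPOINT.splunk.com:8088"))
print(check_hec_endpoint("http://YOUR-ENDPOINT.splunk.com"))
```

An empty list means that the URL matches the expected format. It doesn't confirm that the endpoint is reachable or that the token is valid.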

Configure VPC Flow Logs

If you already have a VPC flow log that you want to use, skip to the next section.

1.     Open the CloudWatch console.

2.     In the navigation pane, choose Logs.

3.     For Actions, select Create log group.

4.     Enter a Log Group Name.

5.     Choose Create log group.

6.     Open the Amazon VPC console.

7.     In the navigation pane under Virtual Private Cloud, choose Your VPCs.

8.     In the content pane, select your VPC.

9.     Choose the Flow logs view.

10.    Choose Create flow log.
For Filter, select All.
For Destination log group, select the log group you just created.
For IAM role, select an IAM role that allows your VPC to publish logs to CloudWatch.
Note:
If you don't have an appropriate IAM role, choose Set Up Permissions under IAM role. Choose Create a new IAM role. Leave the default settings selected. Choose Allow to create and associate the role VPCFlowLogs with the destination log group.

11.    Choose Create to create your VPC flow log.

12.    Establish a real-time feed from the log group to your delivery stream.
For AWS Lambda instructions, see Accessing Amazon CloudWatch Logs for AWS Lambda. For Amazon OpenSearch Service instructions, see Streaming CloudWatch Logs data to Amazon OpenSearch Service.
For Kinesis Data Firehose, create a CloudWatch Logs subscription in the AWS Command Line Interface (AWS CLI) using the following instructions.

Note: If you receive errors when running AWS CLI commands, make sure that you're using the most recent AWS CLI version.
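
For orientation, each event that the flow log writes to the log group is a single space-separated line. The following Python sketch splits one line into named fields. It assumes the default flow log format (version 2 fields) and would need changes for a custom format. The function name and the sample record are illustrative only.

```python
# Field order of the default VPC flow log format (an assumption of this sketch).
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log_line(line):
    """Split one default-format flow log line into a dict of string fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    # Values stay as strings because some records use "-" for missing data.
    return dict(zip(FIELDS, values))


sample_line = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
record = parse_flow_log_line(sample_line)
print(record["srcaddr"], record["dstport"], record["action"])
```

These are the lines that reach Splunk after the Lambda function decompresses the CloudWatch Logs data.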

Create a CloudWatch Logs subscription

1.     Grant CloudWatch access to publish to your Kinesis Data Firehose stream with the correct role permissions.

2.     Open the AWS CLI.

3.     Create your trust policy (such as TrustPolicyForCWLToFireHose.json) using the following example JSON file. Make sure to replace YOUR-RESOURCE-REGION with your resource's AWS Region.

{
  "Statement": {
    "Effect": "Allow",
    "Principal": { "Service": "logs.YOUR-RESOURCE-REGION.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }
}

4.     Create the role with permissions from the trust policy using the following example command:

$ aws iam create-role --role-name CWLtoKinesisFirehoseRole --assume-role-policy-document file://TrustPolicyForCWLToFireHose.json

5.     Create your IAM policy (such as PermissionPolicyForCWLToFireHose.json) using the following example JSON file. Replace YOUR-AWS-ACCT-NUM with your AWS account number, YOUR-RESOURCE-REGION with your resource's Region, and FirehoseSplunkDeliveryStream with your stream's name.

{
    "Statement":[
      {
        "Effect":"Allow",
        "Action":["firehose:*"],
        "Resource":["arn:aws:firehose:YOUR-RESOURCE-REGION:YOUR-AWS-ACCT-NUM:deliverystream/FirehoseSplunkDeliveryStream"]
      },
      {
        "Effect":"Allow",
        "Action":["iam:PassRole"],
        "Resource":["arn:aws:iam::YOUR-AWS-ACCT-NUM:role/CWLtoKinesisFirehoseRole"]
      }
    ]
}

6.     Attach the IAM policy to the newly created role using the following example command:

$ aws iam put-role-policy \
    --role-name CWLtoKinesisFirehoseRole \
    --policy-name Permissions-Policy-For-CWL \
    --policy-document file://PermissionPolicyForCWLToFireHose.json

7.     Create a subscription filter using the following example command. Replace YOUR-AWS-ACCT-NUM with your AWS account number, YOUR-RESOURCE-REGION with your resource's Region, FirehoseSplunkDeliveryStream with your stream's name, and /vpc/flowlog/FirehoseSplunk with your log group's name.

$ aws logs put-subscription-filter \
   --log-group-name "/vpc/flowlog/FirehoseSplunk" \
   --filter-name "Destination" \
   --filter-pattern "" \
   --destination-arn "arn:aws:firehose:YOUR-RESOURCE-REGION:YOUR-AWS-ACCT-NUM:deliverystream/FirehoseSplunkDeliveryStream" \
   --role-arn "arn:aws:iam::YOUR-AWS-ACCT-NUM:role/CWLtoKinesisFirehoseRole"
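
The same Region, account number, and stream name appear in several places in the preceding steps, so a typo in one of them is easy to miss. As an optional aid, the following Python sketch fills in both policy documents and both ARNs from a single set of values. The function name is hypothetical, and the Region and account number shown are placeholders. The policy contents mirror the example JSON files in steps 3 and 5.

```python
import json


def build_policies(region, account_id, stream_name,
                   role_name="CWLtoKinesisFirehoseRole"):
    """Return (trust policy, permissions policy, stream ARN, role ARN)."""
    stream_arn = (
        f"arn:aws:firehose:{region}:{account_id}:deliverystream/{stream_name}"
    )
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    # Trust policy: lets CloudWatch Logs in this Region assume the role.
    trust = {
        "Statement": {
            "Effect": "Allow",
            "Principal": {"Service": f"logs.{region}.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    }
    # Permissions policy: access to the stream, plus passing the role.
    permissions = {
        "Statement": [
            {"Effect": "Allow", "Action": ["firehose:*"],
             "Resource": [stream_arn]},
            {"Effect": "Allow", "Action": ["iam:PassRole"],
             "Resource": [role_arn]},
        ]
    }
    return trust, permissions, stream_arn, role_arn


# Placeholder values; replace them with your own.
trust, permissions, stream_arn, role_arn = build_policies(
    "us-east-1", "123456789012", "FirehoseSplunkDeliveryStream"
)
print(json.dumps(trust, indent=2))
print(json.dumps(permissions, indent=2))
print(stream_arn)
print(role_arn)
```

You can save the two printed JSON documents as TrustPolicyForCWLToFireHose.json and PermissionPolicyForCWLToFireHose.json, and then use the two ARNs in the put-subscription-filter command.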