Introducing Amazon VPC Flow Logs to Kinesis Data Firehose

Amazon Virtual Private Cloud (Amazon VPC) Flow Logs helps you understand network traffic patterns on AWS by providing network telemetry data about the IP traffic flowing to and from the elastic network interfaces (ENIs) in your VPC. It lets you perform numerous analytics tasks, such as diagnosing overly restrictive security group rules, monitoring traffic that is reaching an instance, or determining traffic direction to and from the network interfaces. Previously, VPC Flow Logs could be sent to either Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3) before being ingested by other AWS or Partner tools. Today, we are adding the ability for you to publish VPC Flow Logs to Amazon Kinesis Data Firehose as a new destination.

Kinesis Data Firehose is a fully managed service for delivering real-time streaming data to log management, analytics, and Security Information and Event Management (SIEM) systems. This new feature simplifies log delivery to log analytics tools that are natively integrated with Kinesis Data Firehose, which lowers both operational overhead and total cost of ownership.

In this post, you will learn how to create a VPC Flow Log subscription, publish it to a Kinesis Data Firehose data stream, and send the VPC Flow Logs to a supported destination. You should already be familiar with VPC Flow Logs and Kinesis Data Firehose.

Industry leading partners

We’re excited to be working with many of our existing VPC Flow Log partners at the launch of VPC Flow Logs to Kinesis Data Firehose. Before diving into the details of how this works, we want to highlight the partners that have integrated with this new feature at launch:

Solution overview

The following diagram shows a high-level overview of VPC Flow Logs delivery from a source VPC to one of the many destination types that Kinesis Data Firehose supports. You can configure a VPC Flow Log to publish directly to a Kinesis Data Firehose delivery stream. Optionally, you can configure the delivery stream to transform the source logs and then load them into various destinations, including Amazon S3, Amazon Redshift, Amazon OpenSearch Service, and any HTTP endpoint that is owned by you or one of the partner solutions.

Figure 1 – Flow Logs to Kinesis Firehose

Walkthrough

Let’s get started by creating a new VPC Flow Log that publishes to Kinesis Data Firehose. In our example, Kinesis Data Firehose sends the VPC Flow Logs to a third-party service provider. However, the destination can be any supported destination, including Amazon OpenSearch Service and Amazon Redshift.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • You have created a new VPC in any Region, or you have chosen an existing VPC to use.
  • You have created a destination to send the Kinesis Data Firehose data stream to, whether utilizing a third-party partner or an AWS service, such as Amazon Redshift or Amazon OpenSearch Service.
  • You have instances or resources deployed in the VPC that are sending traffic. No flow logs will be created without traffic flowing.

Create a Kinesis Data Firehose Data Stream

  1. Navigate to the Kinesis Data Firehose Data Stream console, and create a Kinesis Data Firehose data stream.
  2. Choose “Direct PUT” as the stream source.
  3. Choose a destination from the list.
Figure 2 – Create a Kinesis Data Firehose data stream

  4. Enter a name for the Kinesis Data Firehose data stream.
  5. Optionally, if you want to perform any data manipulation or change the record format of the stream, then select the appropriate options in the “Transform and convert records” section.
  6. Depending on the destination provider, Destination settings can contain several required or optional fields. In this example, enter the HTTP endpoint URL and API key that were provided by the partner.
Figure 3 – Choose the destination settings for the third-party partner.

Each partner provides different capabilities and you can find out more on their websites (listed at the beginning of this blog).
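If you enable the optional “Transform and convert records” step with an AWS Lambda function, the function might look something like the following minimal sketch. It converts each flow log record into a JSON document keyed by the default flow log field names. Several things here are assumptions rather than details from this post: that the flow log uses the default record format, that each Firehose record carries a single flow log line (either as plain text or wrapped in a JSON object under a `message` key), and the helper name `_extract_line`, which is purely illustrative. Check a sample of your own records before relying on any of these.

```python
import base64
import json

# Field names of the default VPC Flow Log record format, in order.
# Assumption: the flow log was created with the default format.
DEFAULT_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]


def _extract_line(payload: str) -> str:
    """Return the raw flow log line from a decoded Firehose payload.

    Assumption: the payload is either the space-separated line itself or a
    JSON object that carries the line under a "message" key.
    """
    try:
        parsed = json.loads(payload)
    except ValueError:
        return payload.strip()
    if isinstance(parsed, dict) and "message" in parsed:
        return str(parsed["message"]).strip()
    return payload.strip()


def lambda_handler(event, context):
    """Firehose transformation handler: flow log line in, JSON line out."""
    output = []
    for record in event["records"]:
        payload = base64.b64decode(record["data"]).decode("utf-8")
        values = _extract_line(payload).split()

        if len(values) != len(DEFAULT_FIELDS):
            # Unexpected shape: hand the record back unchanged and flag it.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        document = dict(zip(DEFAULT_FIELDS, values))
        encoded = base64.b64encode(
            (json.dumps(document) + "\n").encode("utf-8")
        ).decode("ascii")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": encoded,
        })
    return {"records": output}
```

Records flagged as `ProcessingFailed` are treated by Kinesis Data Firehose as unsuccessfully processed, which is one reason to configure the S3 backup bucket.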

  7. Select the S3 backup bucket so that data can be recovered if the record processing transformation does not produce the desired results.
Figure 4 – Backup settings

  8. After creating the Kinesis Data Firehose data stream, copy the ARN from the details page, as this will be needed when creating the VPC Flow Logs.
Figure 5 – Copy the ARN of the Kinesis Data Firehose data stream
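The console steps above can also be approximated with the AWS SDK for Python (boto3). The following is a rough sketch, not the procedure used in this post. It assembles the parameters for the Firehose `create_delivery_stream` call with a Direct PUT source, an HTTP endpoint destination, and an S3 backup bucket. The function names, the endpoint label `partner-endpoint`, and the choice to reuse one IAM role for both delivery and backup are all assumptions made for illustration.

```python
from typing import Any, Dict


def build_delivery_stream_request(
    name: str,
    endpoint_url: str,
    api_key: str,
    firehose_role_arn: str,
    backup_bucket_arn: str,
) -> Dict[str, Any]:
    """Assemble keyword arguments for Firehose create_delivery_stream."""
    return {
        "DeliveryStreamName": name,
        # "Direct PUT" in the console corresponds to DirectPut in the API.
        "DeliveryStreamType": "DirectPut",
        "HttpEndpointDestinationConfiguration": {
            "EndpointConfiguration": {
                "Url": endpoint_url,
                "Name": "partner-endpoint",  # hypothetical display name
                "AccessKey": api_key,
            },
            "RoleARN": firehose_role_arn,
            # Back up only records that could not be delivered.
            "S3BackupMode": "FailedDataOnly",
            "S3Configuration": {
                # Assumption: the same role may write to the backup bucket.
                "RoleARN": firehose_role_arn,
                "BucketARN": backup_bucket_arn,
            },
        },
    }


def create_delivery_stream(firehose_client: Any, request: Dict[str, Any]) -> str:
    """Create the stream and return its ARN for use with the flow log."""
    response = firehose_client.create_delivery_stream(**request)
    return response["DeliveryStreamARN"]
```

With boto3 installed and credentials configured, `boto3.client("firehose")` could be passed as `firehose_client`. The returned ARN is the value that the flow log needs in the next section.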

Create a VPC Flow Log

You can create a flow log for a VPC, a subnet, or a network interface. If you create a flow log for a subnet or VPC, then each network interface in that subnet or VPC is monitored. For this example, we will create a flow log for the VPC.

  1. Navigate to the Amazon VPC console and create a new flow log.
  2. Enter a name for the VPC Flow Log.
  3. Select the type of traffic to capture in the flow log.
  4. Select Kinesis Data Firehose as the destination, selecting the ARN of the Kinesis Data Firehose data stream that you created.
Figure 6 – Create a VPC Flow Log
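As a programmatic counterpart to the console steps, the following sketch builds the parameters for the Amazon EC2 `create_flow_logs` call with Kinesis Data Firehose as the destination. It is an illustration under assumptions rather than the method used in this post: the function names are hypothetical, and the flow log name is applied as a `Name` tag.

```python
from typing import Any, Dict, List

VALID_TRAFFIC_TYPES = ("ACCEPT", "REJECT", "ALL")


def build_flow_log_request(
    vpc_id: str,
    delivery_stream_arn: str,
    name: str,
    traffic_type: str = "ALL",
) -> Dict[str, Any]:
    """Assemble keyword arguments for EC2 create_flow_logs on a VPC."""
    if traffic_type not in VALID_TRAFFIC_TYPES:
        raise ValueError(f"traffic_type must be one of {VALID_TRAFFIC_TYPES}")
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": traffic_type,
        "LogDestinationType": "kinesis-data-firehose",
        # ARN copied from the delivery stream details page.
        "LogDestination": delivery_stream_arn,
        "TagSpecifications": [
            {
                "ResourceType": "vpc-flow-log",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ],
    }


def create_flow_log(ec2_client: Any, request: Dict[str, Any]) -> List[str]:
    """Create the flow log and return the new flow log IDs."""
    response = ec2_client.create_flow_logs(**request)
    if response.get("Unsuccessful"):
        raise RuntimeError(f"Flow log creation failed: {response['Unsuccessful']}")
    return response["FlowLogIds"]
```

With boto3 installed and credentials configured, `boto3.client("ec2")` could be passed as `ec2_client`. Changing `ResourceType` to `Subnet` or `NetworkInterface` would cover the other two scopes.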

After you create a VPC Flow Log, it can take several minutes to begin collecting and publishing data to the Kinesis Data Firehose data stream. Flow logs do not capture real-time log streams for your network interfaces. Once Kinesis Data Firehose starts receiving data, it aggregates the data and sends it to the partner endpoint, where you can view it in the partner’s solution.

Cleaning up

To avoid incurring future charges, delete the following resources:

  1. Delete the VPC Flow Log.
  2. Delete the Kinesis Data Firehose data stream.
  3. Delete the contents in the partner’s solution.
  4. If you created a new VPC and new resources in the VPC, then delete the resources and VPC.
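The first two cleanup steps can be scripted as well. The sketch below assumes that you know the flow log ID and the delivery stream name. The function name is hypothetical, and removing data from the partner’s solution and deleting the VPC are left out because they depend on your setup.

```python
from typing import Any


def delete_flow_log_and_stream(
    ec2_client: Any,
    firehose_client: Any,
    flow_log_id: str,
    delivery_stream_name: str,
) -> None:
    """Delete the flow log first, then the delivery stream it published to."""
    result = ec2_client.delete_flow_logs(FlowLogIds=[flow_log_id])
    if result.get("Unsuccessful"):
        # Keep the stream in place if the flow log could not be removed.
        raise RuntimeError(f"Flow log deletion failed: {result['Unsuccessful']}")
    firehose_client.delete_delivery_stream(
        DeliveryStreamName=delivery_stream_name
    )
```

Deleting the flow log before the stream follows the order of the list above, so that nothing is still publishing to the stream when it is removed.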

Things to know

  • For pricing, see the Kinesis Data Firehose pricing page.
  • Data ingestion and archival charges for vended logs will apply to the Kinesis Data Firehose destinations.
  • For VPC Flow Log limits, see the Amazon VPC documentation.

Conclusion

Amazon VPC Flow Logs to Kinesis Data Firehose simplifies log delivery to your log analytics tools that are natively integrated with Kinesis Data Firehose. This provides lower operational overhead and a lower total cost of ownership. You can start using it today. To learn more, visit the VPC Flow Logs documentation.

Riggs Goodman III

Riggs Goodman is the Senior Global Tech Lead for the Networking Partner Segment at Amazon Web Services (AWS). Based in Atlanta, Georgia, Riggs has over 16 years of experience designing and architecting networking solutions for both partners and customers.

Faisal Pias

Faisal Pias is a Partner Solutions Architect at AWS. He works with networking and security ISV partners to build solutions that help accelerate the cloud adoption journey for AWS customers. Faisal has been with AWS for 8 years and is passionate about solving and simplifying complex customer problems.