AWS Big Data Blog

Stream VPC flow logs to Amazon OpenSearch Service via Amazon Kinesis Data Firehose

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more.

Amazon Virtual Private Cloud (Amazon VPC) flow logs enable you to track the IP traffic going to and from the network interfaces in your VPC for your workloads. Analyzing VPC logs helps you understand how your applications are communicating over your VPC network with log records and acts as a main source of information to the network in your VPC. After collecting the flow logs, the next step is performing log analysis to understand user or application behavior and patterns to make informed decisions. You can analyze logs using log analytics tools such as Amazon OpenSearch Service.

Amazon Kinesis Data Firehose is a fully managed service for delivering near real-time streaming data to various destinations for storage and performing near real-time analytics. With its extensible data transformation capabilities, you can also streamline log processing and log delivery pipelines into a single Firehose delivery stream.

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. Amazon OpenSearch is an open source, distributed search and analytics suite. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), as well as visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions). Amazon OpenSearch Service currently has tens of thousands of active customers with hundreds of thousands of clusters under management processing trillions of requests per month.

In this post, you will learn how to ingest VPC flow logs with Kinesis Data Firehose and deliver them to an Amazon OpenSearch Service for analysis using OpenSearch Service Dashboards.

Overview of solution

This solution uses native integration of VPC flow logs streaming to Kinesis Data Firehose. We use a Firehose delivery stream to buffer the streamed VPC flow logs, and deliver those to an OpenSearch Service destination endpoint. We use Amazon OpenSearch Service Dashboards to create an index pattern for the VPC flow logs to analyze and visualize the logs in a near-real time. The following diagram illustrates this architecture.

Solution Architecture

We walk you through the following high-level steps:

  1. Create an OpenSearch Service domain for storing and analyzing the VPC flow logs.
  2. Create a Firehose delivery stream to deliver the flow logs to the OpenSearch Service domain.
  3. Create a VPC flow log subscription to the delivery stream.
  4. Explore VPC flow logs in OpenSearch Service Dashboards
    • Create role mapping with an OpenSearch Service user to the Kinesis Data Firehose service role. Because we’re using a public access domain for OpenSearch Service, we have to map the delivery stream AWS Identity and Access Management (IAM) role to the OpenSearch Service primary user to deliver logs in bulk to the OpenSearch Service domain.
    • Create an index pattern in OpenSearch Service Dashboards to enable analysis and visualization of VPC logs.

Prerequisites

As a prerequisite, you need to create an Amazon Simple Storage Service (Amazon S3) bucket to store the Firehose delivery stream backups and failed logs.

Create an Amazon OpenSearch Service domain

For demonstration purposes, and to limit the costs, we create an OpenSearch Service domain with the Development and testing deployment type and public access to the dashboard. For instructions, refer to Create an Amazon OpenSearch Service domain. Note that we select Public access only for demo purposes. For production, we recommend using VPC access for security reasons.

When it’s complete, the OpenSearch Service domain shows as Active.

OpenSearch Domain

Create a Kinesis Data Firehose delivery stream

Now that your Amazon OpenSearch Service domain is active, you can create a Firehose delivery stream where VPC flow logs are streamed.

  1. On the Amazon Kinesis console, choose Kinesis Data Firehose in the navigation pane, then choose Create delivery stream.
  2. Choose Direct PUT as the source and set the destination as Amazon OpenSearch Service.
  3. For Delivery stream name, enter PUT-OPENSEARCH-STREAM-DEMO.Kinesis Delivery Stream
  4. In the Destination settings section, choose Browse and choose the previously created Amazon OpenSearch Service domain.
  5. For Index name, enter vpcflowlogs.
  6. For Index rotation, choose Every day.
  7. For this post, we set Buffer size to 5 and Buffer interval to 900.You can modify these settings to optimize ingestion throughput and near-real-time behavior.
    Kinesis Stream Destination setting
  1. In the Backup settings section, for Source record backup in Amazon S3, select Failed events only so you only save the data that fails to deliver to Amazon OpenSearch Service.
  2. For S3 bucket, choose Browse and choose the S3 bucket you created to store failed logs and backups.
  3. Optionally, you can input a prefix for backup files and error files.
  4. Select GZIP for Compression for data records.
  5. For Encryption for data records, select Disabled.Kinesis Stream - Backup Setting
  6. Expand Advanced settings, and for Amazon CloudWatch error logging, select Enabled.
  7. Choose Create delivery stream.Kinesis Stream - Advance Setting

When the delivery stream is active, proceed to the next step.

Create a VPC flow logs subscription

Now you create a VPC flow logs subscription for the Firehose delivery stream you created in the previous step.

  1. On the Amazon VPC console, choose Your VPCs.
  2. Select the VPC for which to create the flow log.
  3. On the Actions menu, choose Create flow log.VPC Flow Log
  4. Select All to send all flow log records to Amazon OpenSearch Service.

If you want to filter the flow logs, you can select either Accept or Reject.

  1. For Maximum aggregation interval, select 10 minutes or the minimum setting of 1 minute if you need the flow log data to be available for near-real-time analysis in Amazon OpenSearch Service.
  2. For Destination, select Send to Kinesis Firehose in the same account if the delivery stream is set up on the same account where you create the VPC flow logs.
  3. For Log record format, if you leave it at AWS default format, the flow logs are sent as version 2 format.

Alternatively, you can specify which fields you need the flow logs to capture and send to an Amazon OpenSearch Service. For more information on log format and available fields, refer to Flow log records.

  1. Choose Create flow log.Create VPC Flow Logs

Now let’s explore the VPC flow logs in Amazon OpenSearch Service.

Explore VPC flow logs in Amazon OpenSearch Service Dashboards

In the final step, we set up OpenSearch Service Dashboards to explore the VPC flow logs.

  1. On the OpenSearch Service console, choose Domains in the navigation pane.
  2. Choose the domain you created.
  3. Under OpenSearch Dashboards URL, choose the link to open a new tab.OpenSearch Dashboard
  4. Log in with the user you created during OpenSearch Service domain setup.OpenSearch Service Dashboard
  5. Select Private for Select your tenant, then choose Confirm.OpenSearch Service Dashboard Tenant

Because we used a public access domain for OpenSearch Service, you need to map the role created for the Firehose delivery stream to the OpenSearch Service Dashboards user, so that the delivery stream can deliver logs in bulk to the OpenSearch Service domain.

  1. On the menu icon, choose Security.
  2. Choose Roles.
  3. Choose the all_access role.OpenSearch Service All Access Role
  4. On the Mapped users tab, choose Manage mapping.OpenSearch Service Dashboard map role
  5. For Backend roles, enter the IAM role ARN created for the Firehose delivery stream.
  6. Choose Map.OpenSearch Service Dashboard Map role arn
  7. Now that mapping is complete, choose the menu icon, then choose Stack management.OpenSearch Service Dashboard Stack Management
  8. Choose Index Patterns, then choose Create index pattern.
  9. For Index pattern name, enter vpcflowlogs*.
  10. Choose Next step.OpenSearch Service Dashboard Create Index
  11. Navigate to the Discover menu option.You can see the VPC flow logs from your VPC in this dashboard. Now you can search and visualize the flow logs that are being streamed in near-real time to the OpenSearch Service domain.
    OpenSearch Service Dashboard Discover

Clean up

After you test out this solution, remember to delete all the resources you created to avoid incurring future charges:

  1. Delete your Amazon OpenSearch Service domain.
  2. Delete the VPC flow logs subscription.
  3. Delete the Firehose delivery stream.
  4. Delete the S3 bucket for the VPC flow logs backup and failed logs.
  5. If you created a new VPC and new resources in the VPC, delete the resources and VPC.

Conclusion

In this post, we walked through a solution of how integrate VPC flow logs with a Kinesis Data Firehose delivery stream and deliver it to an Amazon OpenSearch Service destination with no code and visualize it in OpenSearch Service Dashboards.

Try this new quick and hassle-free way of sending your VPC flow logs to an Amazon OpenSearch Service using Kinesis Data Firehose.


About the Author

Chaitanya Shah is a Sr. Technical Account Manager with AWS, based out of New York. He has over 22 years of experience working with enterprise customers. He loves to code and actively contributes to the AWS solutions labs to help customers solve complex problems. He provides guidance to AWS customers on best practices for their AWS Cloud migrations. He is also specialized in AWS data transfer and the data and analytics domain.