How to Visualize and Refine Your Network’s Security by Adding Security Group IDs to Your VPC Flow Logs
September 9, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
August 31, 2020: The directions in this blog post for how to create an Amazon ES cluster have been updated.
February 28, 2019: The features and services described in this post have changed since the post was published and the procedures described might be out of date and no longer accurate. If we update this post or create a replacement, we’ll add a notification about it here.
July 11, 2017: In response to readers’ feedback, the author of this blog post has updated this post’s example code to provide more reliable handling of error scenarios, particularly in which the geographical lookup fails. Additionally, the author has added details about testing the example code by using Amazon Kinesis Data Generator. If you already deployed the example code, follow the “Deploy Lambda Functions” section of the README file (you can skip creating the Amazon S3 bucket); if you have not deployed the example code, follow all the instructions in the README file. Functionally, the example code will not change, but it will be more reliable in delivering records to Amazon Elasticsearch Service (Amazon ES).
Many organizations begin their cloud journey to AWS by moving a few applications to demonstrate the power and flexibility of AWS. This initial application architecture includes building security groups that control the network ports, protocols, and IP addresses that govern access and traffic to their Amazon Virtual Private Cloud (Amazon VPC). When the architecture process is complete and an application is fully functional, some organizations forget to revisit their security groups to optimize rules and help ensure the appropriate level of governance and compliance. Security groups that are not optimized can weaken security: ports that are no longer needed may stay open, and source IP ranges may be broader than required.
Last year, I published an AWS Security Blog post that showed how to optimize and visualize your security groups. Today’s post continues in the vein of that post by using Amazon Kinesis Firehose and AWS Lambda to enrich the VPC Flow Logs dataset and enhance your ability to optimize security groups. The capabilities in this post’s solution are based on the Lambda functions available in this VPC Flow Log Appender GitHub repository.
Removing unused rules or limiting source IP addresses requires either an in-depth knowledge of an application’s active ports on Amazon EC2 instances or analysis of active network traffic. In this blog post, I discuss a method to:
- Use VPC Flow Logs to capture information about the IP traffic in an Amazon VPC.
- Enrich the VPC Flow Logs dataset with security group IDs by using Firehose and Lambda.
- Demonstrate how to visualize and analyze network traffic from VPC Flow Logs by using Amazon ES.
Using this approach can help you restrict security group rules to only the necessary source IPs, ports, and nested security groups, which helps improve the security of your AWS resources while minimizing the potential risk to production environments.
As illustrated in the preceding diagram, this is how the data flows in this model:
- The VPC posts its flow log data to Amazon CloudWatch Logs.
- The Lambda ingestor function passes the data to Firehose.
- Firehose then passes the data to the Lambda decorator function.
- The Lambda decorator function performs a number of lookups for each record and returns the data to Firehose with additional fields.
- Firehose then posts the enhanced dataset to the Amazon ES endpoint and any errors to Amazon S3.
Step 1: Set up your Amazon ES cluster and VPC Flow Logs
Create an Amazon ES cluster
The first step in this solution is to create an Amazon ES cluster. Do this first because it takes some time for the cluster to become available. If you are new to Amazon ES, you can learn more about it in the Amazon ES documentation.
To create an Amazon ES cluster:
- In the AWS Management Console, choose Elasticsearch Service under Analytics.
- Choose Create a new domain.
- Select the deployment type based on your requirements. If you use this domain for production purposes, then select Production. If you use this domain for development/testing, then select Development & Testing.
- Select the latest version and select Next.
- Type es-flowlogs for the Elasticsearch domain name.
- For Availability Zones, choose 1-AZ, 2-AZ, or 3-AZ. For more information, see Configuring a Multi-AZ Domain.
- Under Data nodes, set the number of nodes to 2, and accept the defaults for the rest of the page.
- [Optional] If you use this domain for production purposes, I recommend using dedicated master nodes. Select the Enable dedicated master nodes checkbox, and leave the default value for Instance Type and Number of Master nodes.
- Choose Next.
- Under Network Configuration, select the VPC, Subnet, and Security Groups where you want to deploy this service. If you select public access, then I would recommend specifying the list of valid IPv4 addresses under Access Policy to allow access to this domain from specific IPv4 addresses. Accept the defaults for the rest of the fields.
- For more information about enabling access for specific AWS Identity and Access Management (IAM) users or roles, see Configuring Access Policies. Also, see How to Control Access to Your Amazon Elasticsearch Service Domain for an in-depth treatment of security with Amazon ES. See Set Access Control for Amazon Elasticsearch Service and Secure Your Elasticsearch Development Domain Using Amazon WorkSpaces for lighter treatments focused on getting started quickly.
- Choose Next.
- On the next page, choose Confirm and create.
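If you choose public access, the IP-based access policy mentioned in the steps above might look like the following sketch. The Region, account ID, and IP range are placeholders that I chose for illustration, and the `build_ip_access_policy` helper is hypothetical; adjust the values to match your own domain.

```python
import json


def build_ip_access_policy(region, account_id, domain_name, allowed_cidrs):
    """Return a resource policy that allows HTTP calls to the domain
    only from the given IPv4 ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
                # Requests from any other source IP are not matched by this statement.
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            }
        ],
    }


# Placeholder values for illustration only.
policy = build_ip_access_policy(
    "us-east-1", "123456789012", "es-flowlogs", ["203.0.113.0/24"]
)
print(json.dumps(policy, indent=2))
```

You can paste the printed JSON into the access policy editor and replace the placeholders with your own values.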
It will take a few minutes for the cluster to be available. In the meantime, you can begin enabling VPC Flow Logs.
Enable VPC Flow Logs
VPC Flow Logs is a feature that lets you capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data is stored using Amazon CloudWatch Logs. For more information about VPC Flow Logs, see VPC Flow Logs and CloudWatch Logs.
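To make the dataset concrete, here is a minimal sketch that parses one flow log record in the default format (14 space-separated fields) into a dictionary. The dictionary key names and the `parse_flow_log_line` helper are my own choices for illustration and are not taken from the example code in this post.

```python
# Field order of the default VPC Flow Logs record format.
FLOW_LOG_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]
NUMERIC_FIELDS = {
    "version", "srcport", "dstport", "protocol",
    "packets", "bytes", "start", "end",
}


def parse_flow_log_line(line):
    """Split one default-format flow log record into a dict."""
    values = line.split()
    if len(values) != len(FLOW_LOG_FIELDS):
        raise ValueError(
            f"expected {len(FLOW_LOG_FIELDS)} fields, got {len(values)}"
        )
    record = dict(zip(FLOW_LOG_FIELDS, values))
    for name in NUMERIC_FIELDS:
        # NODATA and SKIPDATA records carry "-" instead of a number.
        if record[name] != "-":
            record[name] = int(record[name])
    return record


sample = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
record = parse_flow_log_line(sample)
print(record["srcaddr"], record["dstport"], record["action"])
```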
To enable VPC Flow Logs:
- In the AWS Management Console, choose CloudWatch under Management Tools.
- Click Logs in the navigation pane.
- From the Actions drop-down list, choose Create log group.
- Type Flowlogs as the Log Group Name.
- In the AWS Management Console, choose VPC under Networking & Content Delivery.
- Choose Your VPCs in the navigation pane, and select the VPC you would like to analyze. (You can also enable VPC Flow Logs on only a subnet if you do not want to enable it on the entire VPC.)
- Choose the Flow Logs tab in the bottom pane, and then choose Create Flow Log.
- In the text beneath the Role box, choose Set Up Permissions (this will open an IAM management page).
- Choose Allow on the IAM management page. Return to the VPC Flow Logs setup page.
- Choose All from the Filter drop-down list.
- Choose flowlogsRole from the Role drop-down list (you created this role earlier in this procedure when you chose Set Up Permissions and then Allow).
- Choose Flowlogs from the Destination Log Group drop-down list.
- Choose Create Flow Log.
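If you prefer to script this procedure, the console steps above correspond roughly to one `create_flow_logs` call in the AWS SDK for Python (boto3). The following is a sketch, not part of the example code for this post: the VPC ID and role ARN are placeholders, both helper functions are hypothetical, and it assumes that the Flowlogs log group and the IAM role already exist.

```python
def build_create_flow_logs_request(vpc_id, role_arn, log_group_name="Flowlogs"):
    """Build the keyword arguments for the EC2 create_flow_logs call."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",      # use "Subnet" to log only one subnet
        "TrafficType": "ALL",       # matches the All filter in the console
        "LogGroupName": log_group_name,
        "DeliverLogsPermissionArn": role_arn,
    }


def create_flow_log(vpc_id, role_arn):
    """Create the flow log. Requires boto3 and AWS credentials."""
    # Imported lazily so the request builder works without the SDK installed.
    import boto3

    ec2 = boto3.client("ec2")
    return ec2.create_flow_logs(**build_create_flow_logs_request(vpc_id, role_arn))


# Placeholder identifiers for illustration only.
request = build_create_flow_logs_request(
    "vpc-0123456789abcdef0",
    "arn:aws:iam::123456789012:role/flowlogsRole",
)
print(request)
```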
Step 2: Set up AWS Lambda to enrich the VPC Flow Logs dataset with security group IDs
If you completed Step 1, VPC Flow Logs data is now streaming to CloudWatch Logs. Next, you will deploy two Lambda functions. The first, the ingestor function, moves the data into Firehose, and the second, the decorator function, adds three new fields to the VPC Flow Logs dataset and returns records to Firehose for delivery to Amazon ES.
The new fields added by the decorator function are:
- Direction – By comparing the primary IP address of the elastic network interface (ENI) with the destination IP address, you can set the direction for the IP connection.
- Security group IDs – Each ENI can be associated with as many as five security groups. The security group IDs are added as an array in the record.
- Source – This includes a number of fields that result from looking up srcaddr from a free service for geographical lookups.
- The Source includes source-location, latitude, and longitude.
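The direction and security group lookups described above can be sketched as follows. This is a simplified illustration, not the code from the GitHub repository: the `ENI_DETAILS` table stands in for a lookup against the EC2 API, the field names are assumptions, and the geographical lookup is omitted.

```python
# Hypothetical in-memory lookup table: ENI ID -> primary private IP address
# and attached security group IDs. A real function would query EC2 instead.
ENI_DETAILS = {
    "eni-0123456789abcdef0": {
        "private_ip": "172.31.16.21",
        "security_group_ids": ["sg-11111111", "sg-22222222"],
    }
}


def decorate(record, eni_details=ENI_DETAILS):
    """Return a copy of the record with direction and security group IDs added."""
    enriched = dict(record)
    eni = eni_details.get(record["interface_id"])
    if eni is None:
        # Unknown ENI: pass the record through unchanged.
        return enriched
    # Traffic addressed to the ENI's own IP address is inbound;
    # anything else is treated as outbound.
    if record["dstaddr"] == eni["private_ip"]:
        enriched["direction"] = "inbound"
    else:
        enriched["direction"] = "outbound"
    enriched["security_group_ids"] = list(eni["security_group_ids"])
    return enriched


inbound = decorate({
    "interface_id": "eni-0123456789abcdef0",
    "srcaddr": "198.51.100.7",
    "dstaddr": "172.31.16.21",
})
print(inbound["direction"], inbound["security_group_ids"])
```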
Follow the instructions in this GitHub repository to deploy the two Lambda functions and the associated permissions that are required.
Step 3: Set up Firehose
Firehose is a fully managed service that allows you to transform flow log data and stream it into Amazon ES. The service scales automatically with load, and you only pay for the data transmitted through the service.
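When Firehose transforms records with Lambda, it sends the function a batch of base64-encoded records and expects each one back with the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and the re-encoded `data`. The following minimal sketch shows that contract. The `enrich` function is a hypothetical placeholder for the decorator logic, and the sketch assumes each record is a JSON document.

```python
import base64
import json


def enrich(payload):
    """Hypothetical placeholder for the real enrichment logic."""
    payload["enriched"] = True
    return payload


def handler(event, context=None):
    """Transform a batch of Firehose records and report per-record status."""
    output = []
    for rec in event["records"]:
        try:
            payload = enrich(json.loads(base64.b64decode(rec["data"])))
            data = base64.b64encode(json.dumps(payload).encode("utf-8"))
            output.append({
                "recordId": rec["recordId"],
                "result": "Ok",
                "data": data.decode("ascii"),
            })
        except ValueError:
            # Invalid base64 or JSON: return the record as failed so that
            # Firehose can deliver it to the S3 error location.
            output.append({
                "recordId": rec["recordId"],
                "result": "ProcessingFailed",
                "data": rec["data"],
            })
    return {"records": output}


def _encode(obj):
    return base64.b64encode(json.dumps(obj).encode("utf-8")).decode("ascii")


event = {"records": [
    {"recordId": "1", "data": _encode({"dstport": 22})},
    {"recordId": "2", "data": base64.b64encode(b"not json").decode("ascii")},
]}
result = handler(event)
print([r["result"] for r in result["records"]])
```

Returning a failed record with its original data, instead of raising an exception, keeps one bad record from failing the whole batch.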
To create a Firehose delivery stream:
- In the AWS Management Console, choose Kinesis under Analytics.
- Choose Go to Firehose and then choose Create Delivery Stream.
Step 3.1: Define the destination
- For Delivery stream name, type VPCFlowLogsToElasticSearch (the name must match the default environment variable in the ingestion Lambda function). Choose Next.
- For Transform source records, choose Enabled.
- Choose vpc-flow-log-appender-dev-FlowLogDecoratorFunction-xxxxx from the Lambda function drop-down list (make sure you select the Decorator function). Choose Next.
- Choose Amazon Elasticsearch Service from Destination.
- Choose es-flowlogs from the Elasticsearch domain drop-down list. (The Amazon ES cluster configuration state must be Active for es-flowlogs to be available in the drop-down list.)
- For Index, type cwl.
- Choose Every day from the Index rotation drop-down list.
- For Type, type log.
- For S3 Backup, choose Failed Documents Only.
- For Backup S3 bucket, choose S3 bucket name from the drop-down list, or choose Create S3 bucket. Choose Next.
- Under IAM role, choose Create new, or Choose.
- Choose Allow. This takes you back to the Firehose Configuration.
- Choose Next, and then choose Create Delivery Stream.
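With the index set to cwl and daily rotation, Firehose appends the UTC arrival date to the index name, which is why the indexes appear in Amazon ES with a cwl- prefix. The following sketch reproduces that naming so you can predict which index a record should land in. The `daily_index_name` helper is hypothetical, and the exact suffix format is my understanding of the Firehose behavior, so verify it against your own cluster.

```python
from datetime import datetime, timezone
from fnmatch import fnmatch


def daily_index_name(prefix, arrival_time):
    """Return the index name for a record under daily index rotation."""
    # Rotation is based on the arrival timestamp in UTC.
    utc_time = arrival_time.astimezone(timezone.utc)
    return f"{prefix}-{utc_time:%Y-%m-%d}"


index = daily_index_name("cwl", datetime(2017, 7, 11, 23, 30, tzinfo=timezone.utc))
print(index)
print(fnmatch(index, "cwl-*"))
```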
Step 4: Stream data to Firehose
The next step is to enable the data to stream from CloudWatch Logs to Firehose. You will use the Lambda ingestion function you deployed earlier: vpc-flow-log-appender-dev-FlowLogIngestionFunction-xxxxxxx.
- In the AWS Management Console, choose CloudWatch under Management Tools.
- Choose Logs in the navigation pane, and select the check box next to Flowlogs under Log Groups.
- From the Actions menu, choose Stream to AWS Lambda. Choose vpc-flow-log-appender-dev-FlowLogIngestionFunction-xxxxxxx (select the Ingestion function). Choose Next.
- Choose Amazon VPC Flow Logs from the Log Format drop-down list. Choose Next.
- Choose Start Streaming.
VPC Flow Logs will now be forwarded to Firehose, capturing information about the IP traffic going to and from network interfaces in your VPC. Firehose invokes the decorator function to append the additional data fields and then forwards the enriched data to your Amazon ES cluster.
Data is now flowing to your Amazon ES cluster, but be patient because it can take up to 30 minutes for the data to begin appearing in your Amazon ES cluster.
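For context, CloudWatch Logs delivers each batch of log events to a subscribed Lambda function as a base64-encoded, gzip-compressed JSON document under `event["awslogs"]["data"]`. The following sketch shows how a function could unpack that payload before passing the messages on. It is an illustration of the payload format, not the ingestion function from the GitHub repository, and both helper functions are hypothetical.

```python
import base64
import gzip
import json


def decode_cloudwatch_logs_event(event):
    """Return the raw log messages contained in a CloudWatch Logs event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    # Control messages only check that the destination is reachable.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [log_event["message"] for log_event in payload["logEvents"]]


def make_sample_event(messages):
    """Build a test event in the same envelope that CloudWatch Logs uses."""
    payload = {
        "messageType": "DATA_MESSAGE",
        "logGroup": "Flowlogs",
        "logEvents": [
            {"id": str(i), "timestamp": 0, "message": m}
            for i, m in enumerate(messages)
        ],
    }
    data = base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))
    return {"awslogs": {"data": data.decode("ascii")}}


sample_line = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
messages = decode_cloudwatch_logs_event(make_sample_event([sample_line]))
print(messages)
```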
Step 5: Verify that the flow log data is streaming through Firehose to the Amazon ES cluster
You should see VPC Flow Logs with ENI IDs under Log Streams (see the following screenshot) and Stored Bytes greater than zero in the CloudWatch log group.
Do you have logs from the Lambda ingestion function in the CloudWatch log group? As shown in the following screenshot, you should see START, END and REPORT records. These show that the ingestion function is running and streaming data to Firehose.
Do you have logs from the Lambda decorator function in the CloudWatch log group? You should see START, END, and REPORT records as well as entries similar to: “Processing completed. Successful records XXX, Failed records 0.”
Do you have cwl-* indexes in the Amazon ES dashboard, as shown in the following screenshot? If you do, you are successfully streaming through Firehose and populating the Amazon ES cluster, and you are ready to proceed to Step 6. Remember, it can take up to 30 minutes for the flow logs from your workloads to begin flowing to the Amazon ES cluster.
Step 6: Using the SGDashboard to analyze VPC network traffic
You now need to set up a Kibana dashboard to monitor the traffic in your VPC.
To find the Kibana URL:
- In the AWS Management Console, click Elasticsearch Service under Analytics.
- Choose es-flowlogs under Elasticsearch domain name.
- Click the link next to Kibana, as shown in the following screenshot.
The first time you access Kibana, you will be asked to set the default index. To set the default index in the Amazon ES cluster:
- Set the Index name or pattern to cwl-*.
- For Time-field name, type @timestamp.
- Choose Create.
Load the SGDashboard:
- Download this JSON file and save it to your computer. The file includes a dashboard and visualizations I created for this blog post’s purposes.
- In Kibana, choose Management in the navigation pane, choose Saved Objects, and then import the file you just downloaded.
- Choose Dashboard and Open to load the SGDashboard you just imported. (You might have to press Enter in the top search box to have the dashboard load the first time.)
The following screenshot shows the SGDashboard after it has loaded.
The SGDashboard is composed of a set of visualizations. Each visualization contains a view or summary of the underlying data contained in the Amazon ES cluster, as shown in the preceding screenshot. You can control the timeframe for the dashboard in the upper right corner. When you click the timeframe, the dashboard exposes alternative timeframes that you can select.
The SGDashboard includes a list of security groups, destination ports, source IP addresses, actions, protocols, and connection directions as well as raw VPC Flow Log records. This information is useful because you can compare this to your security group configurations. Ports might be open in the security group but have no network traffic flowing to the instances on those ports, which means the corresponding rules can probably be removed. Also, by evaluating IP ranges in use, you can narrow the ranges to only those IP addresses required for the application. The following screenshot on the left shows a view of the SGDashboard for a specific security group. By comparing its accepted inbound IP addresses with the security group rules in the following screenshot on the right, you can ensure the source IP ranges are sufficiently restrictive.
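The same comparison of source IP addresses against rule ranges can be done in code. The following sketch uses hypothetical helper functions and sample addresses: it groups observed source IPs under each rule's CIDR range and suggests the smallest single IPv4 network that covers them. Treat the suggested range as a starting point for review, not as a rule to apply blindly.

```python
import ipaddress


def observed_sources_per_rule(rule_cidrs, observed_ips):
    """Map each rule CIDR to the observed source IPs that fall inside it."""
    matches = {cidr: [] for cidr in rule_cidrs}
    for ip in observed_ips:
        address = ipaddress.ip_address(ip)
        for cidr in rule_cidrs:
            if address in ipaddress.ip_network(cidr):
                matches[cidr].append(ip)
    return matches


def smallest_covering_network(ips):
    """Return the smallest single IPv4 network containing every address."""
    # IPv4Address raises an error for IPv6 input or malformed addresses.
    addresses = [ipaddress.IPv4Address(ip) for ip in ips]
    if not addresses:
        raise ValueError("at least one IP address is required")
    network = ipaddress.ip_network(f"{addresses[0]}/32")
    # Widen the prefix one bit at a time until every address fits.
    while not all(address in network for address in addresses):
        network = network.supernet()
    return network


# Sample data for illustration only.
observed = ["198.51.100.7", "198.51.100.9"]
matches = observed_sources_per_rule(["0.0.0.0/0", "10.0.0.0/8"], observed)
suggested = smallest_covering_network(observed)
print(matches)
print(suggested)
```

In this sample, a rule that allows 0.0.0.0/0 is only ever used by two addresses, and a rule for 10.0.0.0/8 sees no traffic at all.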
Analyze VPC Flow Logs data
Amazon ES allows you to quickly view and filter VPC Flow Logs data to determine what network traffic is flowing in your VPC. This analysis requires an understanding of security groups and elastic network interfaces (ENIs). Let’s say you have two security groups associated with the same ENI. Flow log records are captured per ENI, not per security group, so any traffic that the first security group allows is registered for both groups, and you will still see that traffic listed under the second security group. Therefore, when you click a security group that you want to filter, additional groups might still be on the list because they are included in the same VPC Flow Logs records.
The following screenshot on the left is a view of the SGDashboard with a security group selected (sg-978414e8). Even though a filter is applied for that security group, two additional security groups remain in the dashboard. The following screenshot on the right shows the raw log data, where each record contains all three security groups, which demonstrates that all three security groups share a common set of flow log records.
Also, note that security groups are stateful, so if the instance itself initiates traffic to a different location, the return traffic is displayed in the Kibana dashboard. The best example of this is Network Time Protocol (NTP) traffic on port 123. You can easily remove this type of traffic from the display by choosing the port on the right side of the dashboard and then reversing the filter, as shown in the following screenshot. Reversing a filter excludes the matching data from the view.
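A reversed filter in Kibana corresponds to a `must_not` clause in the Elasticsearch query DSL. As a rough sketch, a query that excludes NTP traffic could be built as follows. The `exclude_port_query` helper is hypothetical, and the field name `dstport` is an assumption about how the records are indexed, so check the field names in your own cwl-* indexes.

```python
import json


def exclude_port_query(port, field="dstport"):
    """Build a query body that matches every record except the given port."""
    return {
        "query": {
            "bool": {
                # must_not is the query DSL equivalent of a reversed filter.
                "must_not": [{"term": {field: port}}]
            }
        }
    }


ntp_excluded = exclude_port_query(123)
print(json.dumps(ntp_excluded, indent=2))
```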
Example: Unused security groups
Let’s say that some security groups are no longer in use. First, I change the time range by clicking the current time range in the top right corner of the dashboard, as shown in the following screenshot. I select Week to date.
As the following screenshot shows, the dashboard has identified five security groups that have had traffic during the week to date.
As you can see in the following screenshot, I have many security groups in my test account that are not in use. Any security groups not in the SGDashboard are candidates for removal.
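The comparison in this example is a set difference: every security group in the account, minus the groups that appear in at least one flow log record for the selected time range. A minimal sketch follows, using a hypothetical helper, a hypothetical `security_group_ids` field name, and placeholder group IDs.

```python
def unused_security_groups(all_group_ids, flow_records):
    """Return the security group IDs that never appear in the flow records."""
    seen = set()
    for record in flow_records:
        seen.update(record.get("security_group_ids", []))
    return sorted(set(all_group_ids) - seen)


# Placeholder data for illustration only.
account_groups = ["sg-aaaa1111", "sg-bbbb2222", "sg-cccc3333"]
records = [
    {"dstport": 3389, "security_group_ids": ["sg-aaaa1111"]},
    {"dstport": 443, "security_group_ids": ["sg-aaaa1111", "sg-bbbb2222"]},
]
candidates = unused_security_groups(account_groups, records)
print(candidates)
```

Before you delete a candidate, confirm that the time range is long enough to cover infrequent workloads, such as monthly batch jobs.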
Example: Unused inbound rules
Let’s take a look at security group sg-63ed8c1c from the preceding screenshot. When I click sg-63ed8c1c (the security group ID) in the dashboard, a filter is applied that reduces the security groups displayed to only the records with that security group included. We can compare the traffic associated with this security group in the SGDashboard (shown in the following screenshot) to the security group rules in the EC2 console.
As the following screenshot of the EC2 console shows, this security group has only two inbound rules: one for HTTP on port 80 and one for RDP. The SGDashboard shows that traffic is not flowing on port 80, so I can safely remove that rule from the security group.
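The same check can be expressed in code: take the ports that the inbound rules allow and subtract the destination ports that actually received accepted inbound traffic. The following sketch uses a hypothetical helper, hypothetical field names, and sample records. It assumes RDP runs on its default port, 3389.

```python
def unused_inbound_ports(rule_ports, flow_records, group_id):
    """Return rule ports with no accepted inbound traffic for the group."""
    observed = {
        record["dstport"]
        for record in flow_records
        if group_id in record["security_group_ids"]
        and record["direction"] == "inbound"
        and record["action"] == "ACCEPT"
    }
    return sorted(set(rule_ports) - observed)


# Sample records for illustration only.
sample_records = [
    {"dstport": 3389, "direction": "inbound", "action": "ACCEPT",
     "security_group_ids": ["sg-63ed8c1c"]},
    {"dstport": 80, "direction": "inbound", "action": "REJECT",
     "security_group_ids": ["sg-63ed8c1c"]},
    {"dstport": 80, "direction": "inbound", "action": "ACCEPT",
     "security_group_ids": ["sg-00000000"]},
]
unused = unused_inbound_ports([80, 3389], sample_records, "sg-63ed8c1c")
print(unused)
```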
It can be challenging to help ensure that your AWS Cloud environment allows only intended traffic and is as secure and manageable as possible. In this post, I have shown how to enable VPC Flow Logs. I then showed how to use Firehose and Lambda to add security group IDs, directions, and locations to the VPC Flow Logs dataset. The SGDashboard then enables you to analyze the flow log data and compare it with your security group configurations to improve your cloud security.
If you have comments about this blog post, submit them in the “Comments” section below. If you have implementation or troubleshooting questions about the solution in this post, please start a new thread on the AWS WAF forum.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.