AWS Security Blog
How to Optimize and Visualize Your Security Groups
September 9, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
May 3, 2017: We published a related blog post also written by Guy Denney, How to Visualize and Refine Your Network’s Security by Adding Security Group IDs to Your VPC Flow Logs.
Many organizations start their journey with AWS by experimenting with existing applications. Those experiments may include trying to move an application to the cloud. To move an application successfully, you need to know the network ports, protocols, and IP addresses necessary for it to function. Although you can use AWS security groups to restrict access to ports and protocols in your Amazon Virtual Private Cloud (Amazon VPC), many developers determine these rules via trial and error, often resulting in overly permissive security groups.
When the experiment is complete and an application is finally functional, some organizations do not go back to narrow their security group rules to include only the necessary network ports, protocols, and IP addresses. This creates a less-than-optimal security posture.
In this blog post, I will present a method that uses network data to optimize and visualize your security groups.
Overview
Removing unused rules or limiting source IP addresses requires either an in-depth knowledge of an application’s active ports on the instances, or analysis of active network traffic. The method described in this post can help you remediate security groups to only necessary source IPs, ports, and nested security groups. This can improve the security stance of your AWS resources while minimizing the potential impact to production instances. Here are the basic steps:
- Use VPC Flow Logs and Amazon Elasticsearch Service (Amazon ES) to capture information about the IP traffic in an Amazon VPC.
- Associate the network traffic with an elastic network interface (ENI), instances, and security groups.
- Demonstrate how to visualize and analyze network traffic from VPC Flow Logs by using Amazon ES.
Step 1: The setup
Create an Amazon ES cluster
The first step in the process is to create an Amazon ES cluster. Create the cluster first because it will take time for it to be available. If you are new to Amazon ES, you can learn more about it in the Amazon ES documentation.
To create an Amazon ES cluster:
- In the AWS Management Console, click Elasticsearch Service under Analytics.
- Click Create a new domain. Type flowlogs for the Elasticsearch domain name.
- Set Instance count to 2 and select the Enable zone awareness check box. (This ensures cluster stability if an Availability Zone outage occurs.) Accept the defaults for the rest of the page. Click Next.
- From the drop-down on the next page, select Allow access to the domain from specific IP(s).
- In the dialog box, type or paste the comma-separated list of valid IPv4 addresses or CIDR blocks you would like to be able to access the Amazon ES domain. For more information, see Configuring Access Policies. Click Next.
- On the next page click Confirm and create.
The cluster will be available in a few minutes. In the meantime, you can start the next step of the process, which is to enable VPC Flow Logs.
Enable VPC Flow Logs
VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data is stored using Amazon CloudWatch Logs. For more information about VPC Flow Logs, see VPC Flow Logs.
To enable VPC Flow Logs:
- In the AWS Management Console, click VPC under Networking.
- Click Your VPCs (as shown in the following screenshot), and select the VPC you would like to analyze. (You can also enable VPC Flow Logs on only a subnet, if you do not want to enable it on the entire VPC.)
- Click the Flow Logs tab in the bottom pane.
- Click Create Flow Log. If this is the first time you have set up VPC Flow Logs in this account, you must click Set Up Permissions. This will open a new tab in your browser.
- For IAM Role, choose Create a new IAM Role.
- To establish the Role Name, type flowlogsRole.
- Click Allow. Close the tab and navigate back to the Create Flow Log dialog box from Step 4.
- Now you can select the Role flowlogsRole and set the Destination Log Group to FlowLogs. Click Create Flow Logs.
The VPC Flow Logs data is now streaming to CloudWatch Logs. The next step is to enable the data to stream from CloudWatch Logs to the Amazon ES cluster. You can accomplish this through a built-in Lambda function.
To flow data to your Amazon ES cluster:
- In the AWS Management Console, select CloudWatch under Management Tools.
- Click Logs in the left pane and select the check box next to FlowLogs under Log Groups.
- From the Actions menu at the top of the page, select Stream to Amazon Elasticsearch Service.
- Select the Amazon ES Cluster name flowlogs from the drop-down.
- For Lambda IAM Execution Role, select Create new IAM role.
- In the dialog box, click Allow. Then click Next.
- For Log Format, select Amazon VPC Flow Logs from the drop-down. Click Next.
- Click Next again, and then click Start Streaming.
VPC Flow Logs will now capture information about the IP traffic going to and from network interfaces in your VPC and stream that information to your Amazon ES cluster. However, the Amazon ES cluster is making some assumptions about the format of the data. The next step of the process is to give Amazon ES more explicit formatting information, and then remove any data in the Amazon ES cluster that is not in the correct format.
Format data in the ES cluster
A flow log record is a space-separated string that has the following format.
version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
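To make the record layout concrete, here is a minimal Python sketch that splits one space-separated record into named fields. The function name `parse_flow_log_record` is hypothetical, and the underscore spellings `account_id` and `log_status` are an assumption made by analogy with the `interface_id` field name used later in this post. The sample values in the usage note are illustrative only.

```python
# Field names in the order given by the flow log record format above.
# Underscores replace dashes (assumption, mirroring "interface_id").
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)

# Fields that hold integers when data is present.
NUMERIC_FIELDS = {"srcport", "dstport", "protocol", "packets", "bytes", "start", "end"}


def parse_flow_log_record(line):
    """Split one space-separated flow log record into a dict of named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = {}
    for name, value in zip(FIELDS, values):
        if value == "-":
            # A dash means the field has no data for this record.
            record[name] = None
        elif name in NUMERIC_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record
```

For example, `parse_flow_log_record("2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")` yields a dict in which `interface_id` is `"eni-abc123de"` and `dstport` is the integer `22`.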
By default, Amazon ES assumes that dashes and periods in fields are separators. This causes results to be returned twice, which clutters the dashboard with partial results. To correct this behavior, we must first set the interface_id, srcaddr, and dstaddr fields (interface-id, srcaddr, and dstaddr in the flow log record) to not_analyzed by running the curl command that follows from a shell prompt. Before accessing your Amazon ES cluster, you should review your access policy and security approach. For more information, see Securing Your Elasticsearch Cluster.
The curl command is available on macOS, the Amazon Linux AMI, and Windows. Be sure to replace the placeholder value with your Amazon ES domain endpoint here and elsewhere in this post. For more information about how to run commands on your Amazon ES cluster, see Talking to Elasticsearch.
curl -XPUT "http://YOUR_ES_DOMAIN_ENDPOINT/_template/template_1" -d'
{
  "template": "cwl-*",
  "mappings": {
    "FlowLogs": {
      "properties": {
        "interface_id": { "type": "string", "index": "not_analyzed" },
        "srcaddr": { "type": "string", "index": "not_analyzed" },
        "dstaddr": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'
After running the preceding command, remove the old data from the cluster to clear data that was indexed incorrectly. Do this by executing the following delete command.
curl -XDELETE 'http://YOUR_ES_DOMAIN_ENDPOINT/cwl*/'
Import dashboards and visualizations
Now, network traffic for your VPC is flowing into your Amazon ES cluster. To visualize and search the data, I will use a tool built into Amazon ES called Kibana. I have created a dashboard that you can import into your Amazon ES cluster to simplify and speed up your implementation.
You import dashboards and visualizations by using the curl command against the Amazon ES cluster endpoint. However, some customers find it simpler to use a handy tool to manage the saving and copying of data with Amazon ES, such as elasticdump. (If you don’t already have npm installed, you must install the npm package manager. For more information, go to What is npm?)
After you have installed elasticdump, run the following command. (Again, be sure to replace the placeholder value with your Amazon ES domain endpoint.)
elasticdump --input=SGDashboard.json --output=http://YOUR_ES_DOMAIN_ENDPOINT/.kibana-4
You now have a dashboard to monitor the traffic in your VPC.
To find the Kibana URL:
- In the AWS Management Console, click Elasticsearch Service under Analytics.
- Click flowlogs under My Elasticsearch domains.
- Click the link next to Kibana, as shown in the following screenshot.
- Click the Dashboards tab and open FlowLogDash (as shown in the following screenshot).
You will see the Kibana FlowLogDash (as shown in the following screenshot).
Step 2: Associating ENIs with security groups
The remediation of security groups, though, is more complicated. The VPC Flow Logs only capture the ENI for the traffic, so you must associate the ENIs with their related security groups.
This is where API script “magic” comes in. You must have the AWS Command Line Interface (AWS CLI) and jq installed on the same computer to run this Bash script (create a separate subdirectory for downloading and running the script). The script queries the AWS APIs to discover the associations between ENIs and security groups. It then builds the list of security groups with links to the Kibana dashboard, which filters the results.
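The association step can be illustrated with a short Python sketch. This is not the sgremediate.sh script itself; it is a minimal illustration that assumes you have already saved the JSON output of `aws ec2 describe-network-interfaces` as a string. The function name `group_enis_by_security_group` and the output keys `eni` and `private_ip` are hypothetical names chosen for this example.

```python
import json
from collections import defaultdict


def group_enis_by_security_group(describe_output, vpc_id):
    """Map each security group ID to the ENIs (and their primary IPs) that use it.

    describe_output: JSON text as returned by
        `aws ec2 describe-network-interfaces`
    vpc_id: only ENIs in this VPC are included
    """
    data = json.loads(describe_output)
    by_group = defaultdict(list)
    for eni in data.get("NetworkInterfaces", []):
        if eni.get("VpcId") != vpc_id:
            continue
        # An ENI can belong to several security groups, so it may
        # appear under more than one group ID in the result.
        for group in eni.get("Groups", []):
            by_group[group["GroupId"]].append({
                "eni": eni["NetworkInterfaceId"],
                "private_ip": eni.get("PrivateIpAddress"),
            })
    return dict(by_group)
```

The result is a plain dictionary keyed by security group ID, which is a convenient shape for generating one dashboard link per security group.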
Use the following command to change the file permissions, so you can execute the script.
chmod 744 sgremediate.sh
Edit the script to add your VPC ID and Kibana endpoint (the following screenshot shows placeholders for both values).
Now you can run the Bash script and send the output to an HTML file by using the following command.
./sgremediate.sh > index.html
An example of the resulting file is shown in the following screenshot. The file will be a list of your security groups with links to the Kibana dashboard. The links contain the information necessary to filter the dashboards to the traffic that is associated with the security group and flowing to the underlying instances.
If you click the links in the index.html file, you will return to the Kibana dashboard and see only information relevant to the security group under review. Let’s first review the dashboard and how to interpret its information.
Step 3: Using the FlowLogDash dashboard
The FlowLogDash dashboard is composed of a set of visualizations. Each visualization contains a view or summarization of the underlying data in the Amazon ES cluster, as shown in the preceding screenshot. You can control the time frame for the dashboard in the upper right corner (see the following screenshot). When you click the time frame, the dashboard exposes alternative time frames that you can select. If you click the small arrow at the bottom of the page, you will collapse the time frame view.
On the FlowLogDash dashboard, the left side is divided into three sections. The top section is a list of the ENIs, a count of records, and the sum of bytes. The middle pie chart shows the percentages of accept and reject actions. The bottom pie chart shows relative percentages of protocols for the flow log data.
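As a rough illustration of what the ENI list and the accept/reject pie chart summarize, the following Python sketch computes the same kind of aggregates from a list of parsed records. It assumes each record is a dict with `interface_id`, `bytes`, and `action` keys; the function name `summarize_records` is hypothetical, and the dashboard itself computes these values inside Amazon ES rather than with code like this.

```python
from collections import Counter, defaultdict


def summarize_records(records):
    """Return (per-ENI record count and byte sum, percentage of each action)."""
    per_eni = defaultdict(lambda: {"records": 0, "bytes": 0})
    actions = Counter()
    for record in records:
        stats = per_eni[record["interface_id"]]
        stats["records"] += 1
        # Records without data carry no byte count; treat them as zero.
        stats["bytes"] += record.get("bytes") or 0
        if record.get("action"):
            actions[record["action"]] += 1
    total = sum(actions.values())
    percentages = {
        action: 100.0 * count / total for action, count in actions.items()
    } if total else {}
    return dict(per_eni), percentages
```

With three ACCEPT records and one REJECT record as input, the second return value is `{"ACCEPT": 75.0, "REJECT": 25.0}`.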
In the middle pane of the dashboard is a large pie chart that displays the source IP address, protocol, destination port, and destination IP address of the network traffic flowing in the VPC. These fields map to the security group’s Inbound rules tab in the AWS Management Console.
On the right side of the FlowLogDash dashboard is a list of destination ports and below it are the raw VPC Flow Log records. This information is useful because ports can be open in the security group but have no network traffic flowing to the instances on those ports. The corresponding rules probably can be removed.
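The comparison between open ports and observed traffic can be sketched in Python. This is a minimal illustration, assuming the inbound rules are dicts shaped like the `IpPermissions` entries that the EC2 API returns (`IpProtocol`, `FromPort`, `ToPort`) and that the records are parsed flow log dicts already limited to traffic destined for the instances. The function name `find_rules_without_traffic` is hypothetical. A rule with no traffic in the selected time frame is only a candidate for removal, not proof that the rule is unused.

```python
# IANA protocol numbers as they appear in flow log records.
PROTOCOL_NUMBERS = {"tcp": 6, "udp": 17}


def find_rules_without_traffic(rules, records):
    """Return the inbound rules that match no accepted flow log record."""
    seen = {
        (record["protocol"], record["dstport"])
        for record in records
        if record.get("action") == "ACCEPT"
    }
    unused = []
    for rule in rules:
        protocol = PROTOCOL_NUMBERS.get(rule["IpProtocol"])
        if protocol is None:
            # Skip rules this sketch cannot compare, such as "-1" (all traffic).
            continue
        matched = any(
            seen_protocol == protocol
            and rule["FromPort"] <= seen_port <= rule["ToPort"]
            for seen_protocol, seen_port in seen
        )
        if not matched:
            unused.append(rule)
    return unused
```

This sketch ignores source addresses, so a rule counts as used if any source reached its port range. Narrowing source IP ranges would need a similar comparison on `srcaddr`.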
Visualize and analyze VPC network traffic
Amazon ES allows you to view and filter VPC Flow Log data to determine what network traffic is flowing inside your VPC. Amazon ES can assist in narrowing ports or IP source addresses in security groups to improve your organization’s security stance.
The sgremediate.sh script I mentioned previously queries the AWS APIs, produces a list of security groups, and builds a link to the Kibana FlowLogDash dashboard, which automatically filters the results for all ENIs associated with a security group. Because VPC Flow Logs record traffic in both directions, the script also excludes the primary private IP from the results to clean up the dashboard clutter. After you click the link in the index.html file, you can see the filtered results in the search window, indicated by the red arrow in the following screenshot. You can remove or edit the text in the search box to customize the query.
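To show the kind of filter such a link carries, here is a minimal Python sketch that builds a Lucene-style query string for the Kibana search box. It is not taken from sgremediate.sh. The function name `build_dashboard_query` and the input keys `eni` and `private_ip` are hypothetical, and excluding the primary private IPs on the `srcaddr` field (to hide traffic that originates from the instances) is an assumption about how the exclusion is applied.

```python
def build_dashboard_query(enis):
    """Build a query matching the given ENIs, minus flows sent from their own IPs.

    enis: list of dicts with an "eni" ID and an optional "private_ip"
    """
    eni_ids = sorted({entry["eni"] for entry in enis})
    if not eni_ids:
        raise ValueError("at least one ENI is required")
    private_ips = sorted(
        {entry["private_ip"] for entry in enis if entry.get("private_ip")}
    )
    # Quote each value so dashes and periods are treated literally.
    query = "interface_id:(" + " OR ".join(f'"{i}"' for i in eni_ids) + ")"
    if private_ips:
        # Assumption: the exclusion applies to the source address.
        query += " AND NOT srcaddr:(" + " OR ".join(f'"{ip}"' for ip in private_ips) + ")"
    return query
```

For a single ENI `eni-aaa` with primary IP `10.0.0.5`, the function returns `interface_id:("eni-aaa") AND NOT srcaddr:("10.0.0.5")`, which you could paste into the search box and edit further.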
Keep in mind that an ENI may be associated with two or more security groups. Let’s say you have two security groups associated with the same ENI. If one of the security groups has traffic, that traffic will register for both groups. You will still see traffic to the ENI listed under the second security group, because the dashboard link filters on the ENIs associated with a security group rather than on the group's individual rules.
Also, keep in mind that security groups are stateful, so if the instance itself is initiating traffic to a different location, the Kibana dashboard will display the return traffic. The best example of this is NTP traffic on port 123. To remove this traffic from the display, select the port on the right side of the dashboard, and then reverse the filter, as shown in the following screenshot. By reversing the filter, you can exclude data from the view.
Summary
Ensuring that your AWS cloud environment is secure, maintainable, and allows only intended traffic can be a challenging task. By using VPC Flow Logs and Amazon ES together with Kibana dashboards, you can visualize and better optimize control over your security groups and your cloud security.
If you have comments about this blog post, please submit them in the “Comments” section below. If you have questions about this solution or its implementation, contact your AWS account support team or start a new thread on the AWS WAF forum.
– Guy