How can I find the top talkers or contributors to traffic through the NAT gateway in my VPC?

Last updated: 2022-04-01

I noticed higher than usual costs in my AWS bill for a NAT gateway in my Amazon Virtual Private Cloud (Amazon VPC). How can I find the top contributors to traffic through the NAT gateway in my VPC?

Resolution

Note: In each of the following commands, replace x.x.x.x with the private IP address of your NAT gateway. Replace y.y. with the first two octets of your VPC CIDR range (for example, 10.0. for the 10.0.0.0/16 range).
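As a rough illustration of the y.y. placeholder in the note above, the following Python sketch derives the prefix from a VPC CIDR range with the standard ipaddress module. The function name vpc_prefix and the sample CIDR are hypothetical, and the sketch assumes an IPv4 range with a prefix length of /16 or longer, so that the first two octets are the same for every address in the range.

```python
import ipaddress


def vpc_prefix(cidr: str) -> str:
    """Return the first two octets of an IPv4 CIDR range, with a trailing dot."""
    network = ipaddress.ip_network(cidr, strict=True)
    # With a prefix shorter than /16, the second octet varies inside the range,
    # so a single two-octet prefix would not cover it.
    if network.version != 4 or network.prefixlen < 16:
        raise ValueError("expected an IPv4 CIDR with a prefix length of /16 or longer")
    first, second = str(network.network_address).split(".")[:2]
    return f"{first}.{second}."


print(vpc_prefix("10.0.0.0/16"))  # prints 10.0.
```

The returned string can be pasted wherever y.y. appears in the queries that follow.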

Confirm that you have VPC Flow Logs turned on for your VPC or for the elastic network interface of your NAT gateway. If necessary, create a flow log to turn on VPC Flow Logs. You can publish flow log data to Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3).
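The confirmation step above can also be scripted. The following is a minimal sketch, assuming the boto3 package and valid AWS credentials; the function name active_flow_logs and the resource ID in the usage comment are hypothetical. It calls the EC2 DescribeFlowLogs API with the resource-id filter and reads only the first page of results.

```python
def active_flow_logs(ec2_client, resource_id: str) -> list:
    """Return (flow log ID, destination type) pairs for ACTIVE flow logs on a resource.

    ec2_client is expected to behave like boto3.client("ec2").
    resource_id is a VPC, subnet, or network interface ID.
    """
    response = ec2_client.describe_flow_logs(
        Filters=[{"Name": "resource-id", "Values": [resource_id]}]
    )
    # Only the first page is read; follow NextToken if you have many flow logs.
    return [
        (log["FlowLogId"], log.get("LogDestinationType"))
        for log in response.get("FlowLogs", [])
        if log.get("FlowLogStatus") == "ACTIVE"
    ]


# Example usage (hypothetical resource ID):
#   import boto3
#   print(active_flow_logs(boto3.client("ec2"), "vpc-0123456789abcdef0"))
```

An empty list suggests that no active flow log is attached to that resource, in which case you would create one before you run the queries.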

To query in CloudWatch Logs

1.    Open the CloudWatch console.

2.    In the navigation pane, choose Logs Insights.

3.    From the dropdown list, select the log group for your NAT gateway.

4.    To find the instances that are sending the most traffic through your NAT gateway, run the following query:

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') 
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

5.    To find traffic going to and from the instances, run the following query:

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') or (srcAddr like 'x.x.x.x' and dstAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

6.    To find the internet destinations that the instances in your VPC exchange the most traffic with, run the following queries.

For uploads:

filter (srcAddr like 'x.x.x.x' and dstAddr not like 'y.y.') 
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

For downloads:

filter (dstAddr like 'x.x.x.x' and srcAddr not like 'y.y.') 
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
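If you prefer to run these queries outside the console, the steps above can be sketched in Python. This is a minimal sketch, assuming the boto3 package and valid AWS credentials; the function names, the timeout handling, and the log group name in the usage comment are hypothetical. It builds the query from step 4 and uses the CloudWatch Logs StartQuery and GetQueryResults APIs.

```python
import time


def top_talkers_query(nat_ip: str, vpc_prefix: str) -> str:
    """Build the step 4 query: bytes sent by VPC instances to the NAT gateway."""
    return (
        f"filter (dstAddr like '{nat_ip}' and srcAddr like '{vpc_prefix}')\n"
        "| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr\n"
        "| sort bytesTransferred desc\n"
        "| limit 10"
    )


def run_insights_query(logs_client, log_group, query, start, end,
                       poll_seconds=1.0, timeout_seconds=60.0):
    """Start a Logs Insights query and poll until it reaches a terminal status.

    logs_client is expected to behave like boto3.client("logs").
    start and end are epoch timestamps in seconds.
    """
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"query {query_id} did not finish in time")
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    # Each result row is a list of {"field": ..., "value": ...} dictionaries.
    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]


# Example usage (hypothetical log group name and addresses):
#   import boto3
#   now = int(time.time())
#   query = top_talkers_query("10.0.0.10", "10.0.")
#   rows = run_insights_query(boto3.client("logs"), "nat-gateway-flow-logs",
#                             query, now - 3600, now)
```

The other queries in this section can be passed to run_insights_query as plain strings in the same way.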

To query logs in an S3 bucket using Athena

Use either the Amazon VPC console or the Amazon Athena console to create a table. In the following examples, default is the database and vpc_flow_logs is the table.

1.    To find the instances that are sending the most traffic through your NAT gateway, run the following query:

SELECT srcaddr, dstaddr, sum(bytes)
FROM "default"."vpc_flow_logs"
WHERE srcaddr LIKE 'y.y.%' AND dstaddr LIKE 'x.x.x.x'
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 10;

2.    To find traffic going to and from the instances, run the following query:

SELECT srcaddr, dstaddr, sum(bytes)
FROM "default"."vpc_flow_logs"
WHERE (srcaddr LIKE 'y.y.%' AND dstaddr LIKE 'x.x.x.x') OR (srcaddr LIKE 'x.x.x.x' AND dstaddr LIKE 'y.y.%')
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 10;

3.    To find the internet destinations that the instances in your VPC exchange the most traffic with, run the following queries.

For uploads:

SELECT srcaddr, dstaddr, sum(bytes)
FROM "default"."vpc_flow_logs"
WHERE srcaddr LIKE 'x.x.x.x' AND dstaddr NOT LIKE 'y.y.%'
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 10;

For downloads:

SELECT srcaddr, dstaddr, sum(bytes)
FROM "default"."vpc_flow_logs"
WHERE srcaddr NOT LIKE 'y.y.%' AND dstaddr LIKE 'x.x.x.x'
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 10;
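The Athena queries above can also be run from a script. The following is a minimal sketch, assuming the boto3 package and valid AWS credentials; the function name run_athena_query, the timeout handling, and the S3 output location in the usage comment are hypothetical. It uses the Athena StartQueryExecution, GetQueryExecution, and GetQueryResults APIs, reads only the first page of results, and assumes that the first returned row is the header row.

```python
import time


def run_athena_query(athena_client, sql, database, output_location,
                     poll_seconds=1.0, timeout_seconds=120.0):
    """Run a SQL query in Athena and return the rows as dictionaries.

    athena_client is expected to behave like boto3.client("athena").
    output_location is the S3 URI where Athena writes query results.
    """
    execution_id = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=execution_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"query {execution_id} did not finish in time")
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"query ended in state {state}")
    result = athena_client.get_query_results(QueryExecutionId=execution_id)
    table = [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
    if not table:
        return []
    header, data = table[0], table[1:]  # assumes the first row holds column names
    return [dict(zip(header, values)) for values in data]


# Example usage (hypothetical bucket name):
#   import boto3
#   sql = """SELECT srcaddr, dstaddr, sum(bytes) FROM "default"."vpc_flow_logs"
#            WHERE srcaddr LIKE '10.0.%' AND dstaddr LIKE '10.0.0.10'
#            GROUP BY 1, 2 ORDER BY 3 DESC LIMIT 10"""
#   rows = run_athena_query(boto3.client("athena"), sql, "default",
#                           "s3://example-athena-results/")
```

Because sum(bytes) has no alias in these queries, Athena assigns the column a generated name; adding an alias in the SELECT list makes the returned dictionaries easier to read.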