
Using AWS Transit Gateway Flow Logs to chargeback data processing costs in a multi-account environment

Many AWS customers use consolidated billing and often need to allocate costs across their internal business units or accounts. This can be challenging for services that are shared by all accounts. For general chargebacks, some customers use cost allocation tags. However, at the time of writing this post, there is no native way to tag network traffic. You can add tagging metadata to VPC Flow Logs as described in this blog post on enriching flow logs with resource tags, but that approach requires some overhead and is not integrated with AWS Cost Explorer or AWS Cost and Usage Reports (AWS CUR) for chargebacks.

In addition to using tags, many customers make use of multiple accounts to simplify how they allocate AWS costs. They use the accounts to identify which projects or services are responsible for AWS charges. For such customers, in the How-to chargeback shared services: An AWS Transit Gateway example blog post, we looked at how to allocate costs for a transit gateway using Amazon CloudWatch metrics and AWS CUR. This post builds on that post and shows you how to create a proportional cost allocation model for the AWS Transit Gateway data processing charges incurred in a networking account using Transit Gateway Flow Logs data in conjunction with AWS CUR data.

In particular, we look at a scenario where a shared networking services Amazon Virtual Private Cloud (VPC), belonging to a networking account, is attached to the transit gateway. After network inspection, data is sent from the centralized networking account to spoke VPCs in the same or different accounts that have attachments to the transit gateway. The data processing charges incurred by the centralized networking account need to be charged back to the spoke accounts. The same method can be used to charge back data processing charges for any attachment to the transit gateway that is shared by multiple accounts, for instance, a shared AWS Direct Connect or AWS Site-to-Site VPN attachment.

Figure 1 shows an architecture diagram of a multi-account environment using a centralized VPC in a networking account that handles traffic inspection and egress to the internet.

Figure 1: Centralized architecture with a collapsed inspection/egress VPC

The AWS Transit Gateway pricing model

At the time of writing this post, the pricing model for Transit Gateway consists of two components: the number of connections that you attach to the transit gateway per hour and the amount of data that is processed through the transit gateway. Data processing is charged to the account that owns the attachment and sends the traffic to the transit gateway.

Note: When referring to the resource, this post uses lowercase “transit gateway.” We capitalize “Transit Gateway” when referring to the service.

There are several scenarios in centralized architecture designs where a chargeback model is needed.

Scenario 1: Centralized Egress VPC for north-south traffic from spoke VPCs to the internet

In this scenario, all egress traffic from the spoke VPCs is routed through a centralized egress VPC. Traffic sent from each spoke VPC and processed by the transit gateway is charged to the spoke account. However, return traffic sent from the centralized egress VPC and processed by the transit gateway is charged to the networking account.

Scenario 2: Centralized inspection VPC for east-west traffic from spoke VPCs

In this scenario, return traffic from the centralized inspection VPC that is processed by the transit gateway before being routed to the destination VPC is charged to the networking account.

Scenario 3: Centralized inspection/egress VPC for east-west and north-south traffic from spoke VPCs

This scenario combines the inspection and egress into a single collapsed VPC that carries out both functions. As such, the same principle applies that return traffic processed by the transit gateway before being routed to the destination VPC is charged to the networking account.

Scenario 4: Ingress traffic on shared Direct Connect or Site-to-Site VPN attachments

While there is no data transfer-in cost for inbound traffic on either Direct Connect or VPN connections, Transit Gateway does charge the networking account for processing the inbound traffic over these attachments before it is routed to the spoke accounts.

These scenarios are all use cases where the cost may need to be charged back to the spoke VPC account instead of the centralized networking account.

Prerequisites

In order to calculate and allocate costs, you must:

  1. Enable Transit Gateway Flow Logs

Transit Gateway Flow Logs provide the data used to estimate how much network traffic should be allocated to each account. You can follow the steps in the AWS Transit Gateway documentation to enable flow logs on your transit gateways. When executing these steps, here are some things to note:

  • In this post, the flow logs are published to an Amazon Simple Storage Service (Amazon S3) bucket, since we will query the data using Amazon Athena. Hence, set the destination for the logs to Amazon S3. You can use Amazon CloudWatch Logs or Amazon Kinesis Data Firehose instead if you prefer to analyze the logs using other methods.
  • This post assumes you have a single transit gateway. Suppose you have multiple transit gateways in the centralized account and in the same or different AWS Regions. In that case, you must enable the flow logs for each transit gateway and send them to a single S3 bucket to consolidate all the data there for the analysis described in this post.
  • A common practice when enabling the flow logs for multiple transit gateways is to specify a prefix within your S3 bucket ARN to match your transit gateway ID. While this is good, we recommend against using any S3 prefixes for the chargeback process described in this post. This is because you will run the calculations against your total cost billed to the centralized account rather than the cost per transit gateway. Therefore, the Athena table that will be created for the transit gateway logs would be for the entire bucket and not any specific prefix, unless you prefer to do the chargeback for each individual transit gateway.
  • For the log record format, use the default format. You can choose the custom format if you only want specific flow log fields or prefer a different order for the fields. If you do, you must adapt the Athena table definition in the example that follows to match your fields.
  • Set the format to Parquet because this reduces storage space consumed in Amazon S3 and also improves query times.
  • If the amount of traffic processed by the transit gateway is large, partition your flow logs per hour.

Here’s a sample query for creating the Athena table for the transit gateway flow logs in the Amazon Athena console query editor:

CREATE EXTERNAL TABLE `tgwflowlogs`(
    `version` int,
    `resource_type` string,
    `account_id` string,
    `tgw_id` string,
    `tgw_attachment_id` string,
    `tgw_src_vpc_account_id` string,
    `tgw_dst_vpc_account_id` string,
    `tgw_src_vpc_id` string,
    `tgw_dst_vpc_id` string,
    `tgw_src_subnet_id` string,
    `tgw_dst_subnet_id` string,
    `tgw_src_eni` string,
    `tgw_dst_eni` string,
    `tgw_src_az_id` string,
    `tgw_dst_az_id` string,
    `tgw_pair_attachment_id` string,
    `srcaddr` string,
    `dstaddr` string,
    `srcport` int,
    `dstport` int,
    `protocol` bigint,
    `packets` bigint,
    `bytes` bigint,
    `start` bigint,
    `end` bigint,
    `log_status` string,
    `type` string,
    `packets_lost_no_route` bigint,
    `packets_lost_blackhole` bigint,
    `packets_lost_mtu_exceeded` bigint,
    `packets_lost_ttl_expired` bigint,
    `tcp_flags` int,
    `aws_region` string,
    `flow_direction` string,
    `pkt_src_aws_service` string,
    `pkt_dst_aws_service` string
)
PARTITIONED BY (`region` string, `day` string) 
STORED AS PARQUET 
LOCATION 's3://DOC-EXAMPLE-BUCKET/prefix/AWSLogs/account_id/vpcflowlogs/' 
TBLPROPERTIES ( 
    'projection.day.format' = 'yyyy/MM/dd',
    'projection.day.range' = '2021/01/01,NOW',
    'projection.day.type' = 'date',
    'projection.enabled' = 'true',
    'projection.region.type' = 'enum',
    'projection.region.values' = 'us-east-1,us-west-2,ap-south-1,eu-west-1',
    'skip.header.line.count' = '1',
    'storage.location.template' = 's3://DOC-EXAMPLE-BUCKET/prefix/AWSLogs/account_id/vpcflowlogs/${region}/${day}'
)

Be sure to perform the following tasks:

  • Replace ‘tgwflowlogs’ in the query with the name of your table.
  • Modify the location parameter in the query to point to the S3 bucket that contains your log data without specifying any prefix. Replace DOC-EXAMPLE-BUCKET/prefix with the name of the bucket you created. Replace account_id with the account ID that owns the S3 bucket that stores the transit gateway flow logs. Make the same replacements in the storage.location.template property so that both paths match.

To learn more about using Athena for querying flow logs, you can refer to our explanation in the Knowledge Center that describes how to query VPC flow logs using Athena. For more information on the Transit Gateway flow log fields, which differ from VPC flow logs, please see the AWS Transit Gateway documentation.

  2. Set up your Cost and Usage Report

While you may want to calculate the number of bytes from Transit Gateway and multiply by the cost per gigabyte (GB), remember that on rare occasions, Transit Gateway flow logs may skip data if there are capacity constraints or errors. For a more accurate chargeback strategy, we recommend that you allocate costs that appear in your Cost and Usage Report (CUR). To get this, you can create your Cost and Usage Reports and set up Athena to utilize your Cost and Usage Reports.

Note: It can take up to 24 hours for AWS to start delivering the CUR reports to your S3 bucket. You may request a backfill of your cost data for previous months, as explained in our documentation.

Steps to get the AWS Transit Gateway data processing charge per account

To explain the steps, we’ll present an example of a multi-account environment with multiple spoke attachments to a transit gateway. The transit gateway also has an attachment to a centralized VPC, which offers network traffic inspection as a service to the workload accounts. The transit gateway and centralized VPC are in a networking or shared-services account (ending in *5157), so that account incurs the data processing charges for traffic sent from the centralized VPC to VPCs in the spoke accounts receiving the processed data.

For the queries in the steps below, note the following placeholders:

<<Database>> will be the name of the Athena Database. This is important if you choose to create multiple databases, and create the tables for the Transit Gateway and the CUR in separate databases.
<<TGW Table_name>> will be the name of the Transit Gateway table that you created in the “Enable Transit Gateway Flow Logs” section above
<<CUR Table_name>> will be the name of the CUR table that you created during the “Set up your Cost and Usage Report” section above
<<Start_Date>> will be the first day of the month under consideration, in the format ‘yyyy/MM/dd’ (matching the day partition format of the flow logs table)
<<End_Date>> will be the last day of the month under consideration, in the format ‘yyyy/MM/dd’
<<egress_vpc_id>> will be the VPC ID of the Egress VPC
<<centralized_vpc_account_id>> will be the account ID that owns the centralized VPC
<<MM>> will be the month of the billing period under consideration, that aligns with the Start and End dates
<<YYYY>> will be the year of the billing period under consideration, that aligns with the Start and End dates
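As an illustration of how the date placeholders line up, here is a minimal Python sketch that derives <<Start_Date>>, <<End_Date>>, <<MM>>, and <<YYYY>> for one calendar month. The helper name `billing_month_placeholders` is hypothetical and not part of any AWS tooling; it only assumes that the `day` partition uses the zero-padded `yyyy/MM/dd` layout from the table definition above.

```python
import calendar
from datetime import date


def billing_month_placeholders(year: int, month: int) -> dict:
    """Return placeholder values for one calendar month (hypothetical helper)."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return {
        # Zero-padded dates, so string comparison on the `day` partition
        # orders the same way as the calendar does.
        "Start_Date": date(year, month, 1).strftime("%Y/%m/%d"),
        "End_Date": date(year, month, last_day).strftime("%Y/%m/%d"),
        "MM": month,
        "YYYY": year,
    }


placeholders = billing_month_placeholders(2023, 2)
print(placeholders)
```

Keeping the start date, end date, month, and year derived from a single input helps ensure that the flow log range and the CUR billing period refer to the same month.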

Here are the steps:

1) Decide on a cost allocation strategy

In order to perform the correct calculations, you must decide how you want to allocate the costs. One way is to split the costs incurred by the networking account equally among the spoke accounts. Alternatively, you can allocate them proportionally, so that accounts are charged for what they use. In this post, we distribute the cost proportionally, based on the percentage of data received by each of the spoke accounts from the networking account. This way, the spoke accounts are billed for the traffic they consume from the transit gateway, and accounts that consume a larger share of the traffic are billed a correspondingly larger share of the cost.
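To make the difference between the two strategies concrete, the following Python sketch compares an equal split with a proportional split. The account IDs, traffic volumes, and total cost are made-up values for illustration only, and the function names are hypothetical.

```python
from decimal import Decimal


def equal_split(total_cost, accounts):
    """Charge every account the same amount, regardless of usage."""
    share = total_cost / len(accounts)
    return {account: share for account in accounts}


def proportional_split(total_cost, gb_by_account):
    """Charge each account in proportion to the data it received."""
    total_gb = sum(gb_by_account.values())
    return {
        account: total_cost * gb / total_gb
        for account, gb in gb_by_account.items()
    }


# Hypothetical inputs: total cost billed to the networking account and
# the data (in GB) received by each spoke account.
total_cost = Decimal("100.00")
gb_received = {"111111111111": 600, "222222222222": 300, "333333333333": 100}

equal = equal_split(total_cost, list(gb_received))
proportional = proportional_split(total_cost, gb_received)

for account in gb_received:
    print(account, round(equal[account], 2), round(proportional[account], 2))
```

With these sample values, the equal split charges every account about a third of the cost, while the proportional split charges the heaviest consumer 60% of it.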

2) Calculate the total network traffic and percentage allocation per account

Using Athena, we can query the transit gateway flow logs to get a breakdown of the egress traffic from the networking account to the different spoke accounts.

The following Athena query calculates the total Transit Gateway data processing bytes from the centralized VPC in the networking account, along with the amount of data in GB (and percentage) that was sent to each spoke account.

select     /* This query is fetching centralized account ID,spoke account Id,data processed/sent to each spoke account and percentage wise allocation of total data processed */
      tgw_src_vpc_account_id, 
      tgw_dst_vpc_account_id,
      cast(sum(bytes)/power(1024,3) as decimal(38,3)) as "account_specific_data_processed(GB)", /*To convert per account bytes to GB*/
      cast(sum(bytes)as decimal(38,4))/cast(
                    (select sum(bytes) as total_bytes  /* Inner query to fetch total data processed(in bytes) from centralized account */
                    FROM "<<Database>>"."<<TGW Table_name>>"
                    WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>' and flow_direction= 'egress' 
                    ) as decimal(38,4))*100 as "account_specific_data_processed_percentage(%)" /*To calculate % wise allocation of data processed for each spoke account */
FROM "<<Database>>"."<<TGW Table_name>>"
WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>' AND flow_direction= 'egress'/*filter records by log status and flow direction egress*/
GROUP BY tgw_src_vpc_account_id, tgw_dst_vpc_account_id

Figure 2 shows an example output for the data processed by the transit gateway and sent from the networking account to each spoke account, along with each account's percentage of the total.

Figure 2: Data processed and percentage allocation from Transit Gateway Flow Logs

Note that if you use a shared VPC across multiple accounts, only the spoke account that owns the transit gateway attachment will be reflected in the flow logs as the receiving account.
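For readers who want to check the logic of the Step 2 query on a small data set, the following Python sketch applies the same filters (log status, flow direction, source account) and the same grouping to a handful of in-memory records. The records and account IDs are invented for illustration, including the networking account ID, which is only assumed to end in 5157. This is a sketch of the calculation, not a replacement for running the query in Athena.

```python
from collections import defaultdict

NETWORKING_ACCOUNT = "000000005157"  # hypothetical ID ending in 5157

# Hypothetical flow log records, reduced to the fields the query uses.
GIB = 1024 ** 3
records = [
    {"tgw_src_vpc_account_id": NETWORKING_ACCOUNT, "tgw_dst_vpc_account_id": "111111111111",
     "bytes": 3 * GIB, "log_status": "OK", "flow_direction": "egress"},
    {"tgw_src_vpc_account_id": NETWORKING_ACCOUNT, "tgw_dst_vpc_account_id": "111111111111",
     "bytes": 3 * GIB, "log_status": "OK", "flow_direction": "egress"},
    {"tgw_src_vpc_account_id": NETWORKING_ACCOUNT, "tgw_dst_vpc_account_id": "222222222222",
     "bytes": 2 * GIB, "log_status": "OK", "flow_direction": "egress"},
    # Excluded: the log status is not OK.
    {"tgw_src_vpc_account_id": NETWORKING_ACCOUNT, "tgw_dst_vpc_account_id": "222222222222",
     "bytes": 9 * GIB, "log_status": "SKIPDATA", "flow_direction": "egress"},
    # Excluded: the traffic was not sent by the networking account.
    {"tgw_src_vpc_account_id": "111111111111", "tgw_dst_vpc_account_id": NETWORKING_ACCOUNT,
     "bytes": 5 * GIB, "log_status": "OK", "flow_direction": "egress"},
]


def allocation_by_destination(records, source_account):
    """Sum bytes per destination account and compute each account's percentage."""
    totals = defaultdict(int)
    for r in records:
        if (r["log_status"] == "OK"
                and r["flow_direction"] == "egress"
                and r["tgw_src_vpc_account_id"] == source_account):
            totals[r["tgw_dst_vpc_account_id"]] += r["bytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        dst: {"gb": b / GIB, "percent": 100 * b / grand_total}
        for dst, b in totals.items()
    }


allocation = allocation_by_destination(records, NETWORKING_ACCOUNT)
for account, row in allocation.items():
    print(account, row["gb"], row["percent"])
```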

3) Calculate the total cost of the shared service

From your Cost and Usage Report, you can calculate the total data processing costs that must be charged back. The following is a sample Athena query to calculate the total cost of AWS Transit Gateway data processing costs charged to the networking account.

select CAST(sum(line_item_unblended_cost) as DECIMAL(38,3)) AS tgw_total_unblended_cost /* Fetch Transit Gateway data processing charges from CUR. The filter line_item_usage_account_id=bill_payer_account_id assumes the networking account is also the payer account. This total is used to calculate the chargeback cost for spoke accounts */
        FROM 
        "<<Database>>"."<<CUR Table_name>>"
        WHERE month(bill_billing_period_start_date) = <<MM>> AND year(bill_billing_period_start_date) = <<YYYY>> AND product_group = 'AWSTransitGateway' AND line_item_line_item_type  IN ('DiscountedUsage', 'Usage', 'SavingsPlanCoveredUsage')
        AND pricing_unit = 'GigaBytes' AND line_item_usage_account_id=bill_payer_account_id
        GROUP BY
        bill_payer_account_id

Figure 3 shows an example output of the total data processing costs incurred by the networking account as calculated from the Cost and Usage Report.

Figure 3: Total data processing costs for networking account from CUR

4) Estimate cost per account

Using the results from Steps 2 and 3, we can now estimate how much of the total data processing cost should be charged back to each spoke account. We take the percentage of network traffic per spoke account calculated in Step 2 and multiply it by the total cost calculated in Step 3:

% of network traffic usage x total cost from CUR = chargeback cost per usage account

Alternatively, the following Athena query combines the actions in Step 2 to Step 4 into a single query.

select *,
CAST((select CAST(sum(line_item_unblended_cost) as DECIMAL(38,3)) AS sum_line_item_unblended_cost /*Inner query to fetch transitgateway charges for centralized account from CUR. This will be used to calculate chargeback cost for spoke accounts */
        FROM 
        "<<Database>>"."<<CUR Table_name>>"
        WHERE month(bill_billing_period_start_date) = <<MM>> AND year(bill_billing_period_start_date) = <<YYYY>> AND product_group = 'AWSTransitGateway' AND line_item_line_item_type  IN ('DiscountedUsage', 'Usage', 'SavingsPlanCoveredUsage')
        AND pricing_unit = 'GigaBytes' AND line_item_usage_account_id=bill_payer_account_id
        GROUP BY
        bill_payer_account_id)*"account_specific_data_processed_percentage(%)"/100 as decimal(38,3)) as "chargeback_cost_per_account($)"
FROM (select     /* This inner query is fetching centralized account ID,spoke account Id,data processed/sent to each spoke account and percentage wise allocation of total data processed */
      tgw_src_vpc_account_id, 
      tgw_dst_vpc_account_id,
      cast(sum(bytes)/power(1024,3) as decimal(38,3)) as "account_specific_data_processed(GB)", /*To convert per account bytes to GB*/
      cast(sum(bytes)as decimal(38,4))/cast(
                    (select sum(bytes) as total_bytes  /* Inner query to fetch total data processed(in bytes) from centralized account */
                    FROM "<<Database>>"."<<TGW Table_name>>"
                    WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>' and flow_direction= 'egress' 
                    ) as decimal(38,4))*100 as "account_specific_data_processed_percentage(%)" /*To calculate % wise allocation of data processed for each spoke account */
FROM "<<Database>>"."<<TGW Table_name>>"
WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>'  AND flow_direction= 'egress' /*filter records by log status and flow direction egress*/
GROUP BY tgw_src_vpc_account_id, tgw_dst_vpc_account_id)

Figure 4 shows an example output of the data processed by the transit gateway and sent from the networking account to each spoke account. It also shows the percentage of the total data processed and the final chargeback amounts for each spoke account.

Figure 4: Chargeback costs per account

We recommend you make use of the cost from the CUR and split it by the byte percentage as shown in the previous image, rather than multiplying the price per GB of data processed (in USD in our example) by the total number of egress bytes to get the chargeback cost. This is because Transit Gateway flow logs may have records that are skipped during the aggregation interval because of an internal capacity constraint or an internal error. The cost from the CUR will be the most accurate data because the billing systems will not have such constraints. For more information, refer to the AWS Transit Gateway documentation on available fields in the flow logs.
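The following Python sketch illustrates the difference between the two approaches with made-up numbers. It assumes a hypothetical price of 0.02 USD per GB, a CUR total that corresponds to 1,000 GB of billed traffic, and flow logs that captured only 950 GB of it because some records were skipped. None of these figures come from a real bill.

```python
from decimal import Decimal

# Assumed values for illustration only.
PRICE_PER_GB = Decimal("0.02")     # hypothetical data processing price
cur_total_cost = Decimal("20.00")  # hypothetical CUR total (1,000 GB billed)

# Data per spoke account as seen in the flow logs: 950 GB in total,
# that is, 50 GB of billed traffic is missing from the logs.
logged_gb = {"111111111111": Decimal("570"), "222222222222": Decimal("380")}

# Approach A: multiply logged GB by the price per GB.
price_based = {acct: gb * PRICE_PER_GB for acct, gb in logged_gb.items()}

# Approach B: split the CUR total by each account's share of logged bytes.
total_logged = sum(logged_gb.values())
cur_based = {
    acct: cur_total_cost * gb / total_logged for acct, gb in logged_gb.items()
}

print("price-based total:", sum(price_based.values()))
print("CUR-based total:  ", sum(cur_based.values()))
```

In this example, the price-based approach recovers less than the billed amount because of the skipped records, while the CUR-based split always adds up to the amount actually billed.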

You can go one step further and use Amazon QuickSight to build a dashboard that visualizes these results.

The queries in this blog post are samples. You can modify them or create new ones to carry out chargebacks in line with your infrastructure and business needs. For example, if you have a separate egress VPC from your inspection VPC (a variation of the scenario we considered in step 4), the networking account will incur data processing costs for inspected spoke traffic that leaves your inspection VPC for the egress VPC. You may opt to exclude the traffic to the egress VPC from the calculation so that those charges are not allocated to your networking account. To do so, exclude the VPC ID of the egress VPC in your query, as shown in the following code snippet.

select *,
CAST((select CAST(sum(line_item_unblended_cost) as DECIMAL(38,3)) AS sum_line_item_unblended_cost /*Inner query to fetch transitgateway charges for centralized account from CUR. This will be used to calculate chargeback cost for spoke accounts */
        FROM 
        "<<Database>>"."<<CUR Table_name>>"
        WHERE month(bill_billing_period_start_date) = <<MM>> AND year(bill_billing_period_start_date) = <<YYYY>> AND product_group = 'AWSTransitGateway' AND line_item_line_item_type  IN ('DiscountedUsage', 'Usage', 'SavingsPlanCoveredUsage')
        AND pricing_unit = 'GigaBytes' AND line_item_usage_account_id=bill_payer_account_id
        GROUP BY
        bill_payer_account_id)*"account_specific_data_processed_percentage(%)"/100 as decimal(38,3)) as "chargeback_cost_per_account($)" 
FROM (select     /* This query is fetching centralized account ID,spoke account Id,data processed/sent to each spoke account and percentage wise allocation of total data processed */
      tgw_src_vpc_account_id, 
      tgw_dst_vpc_account_id,
      cast(sum(bytes)/power(1024,3) as decimal(38,3)) as account_specific_data_processed_gb, /*To convert per account bytes to GB*/
      cast(sum(bytes)as decimal(38,4))/cast(
                    (select sum(bytes) as total_bytes  /* Inner query to fetch total data processed(in bytes) from centralized account */
                    FROM "<<Database>>"."<<TGW Table_name>>"
                    WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>' AND flow_direction= 'egress' AND tgw_src_vpc_id <> '<<egress_vpc_id>>' AND tgw_dst_vpc_id <> '<<egress_vpc_id>>') as decimal(38,4))*100 as "account_specific_data_processed_percentage(%)" /*To calculate % wise allocation of data processed for each spoke account */
FROM "<<Database>>"."<<TGW Table_name>>"
WHERE day >= '<<Start_Date>>' AND day <= '<<End_Date>>' AND "log_status" = 'OK' AND tgw_src_vpc_account_id='<<centralized_vpc_account_id>>' AND flow_direction= 'egress' AND tgw_src_vpc_id <> '<<egress_vpc_id>>' AND tgw_dst_vpc_id <> '<<egress_vpc_id>>' /*filter records by log status,flow direction egress, egress vpc*/
GROUP BY tgw_src_vpc_account_id, tgw_dst_vpc_account_id)

The following screenshot (Figure 5) shows an example using the same data sets but excluding the data sent to a separate egress VPC.

Figure 5: Chargeback cost per account (minus egress VPC)

As you can see, the cost charged to the networking account ending in *5157 is reduced slightly, and the allocation to the other accounts is increased.

Extending calculations for other shared network services

When you have a centralized inspection, egress, or ingress VPC, you may have other shared services in that VPC, such as NAT Gateway, Gateway Load Balancer, or VPC endpoints, which may have associated data processing costs. You can use the same approach to estimate each spoke account's contribution to the total number of bytes processed inside the VPC. You can create a SQL query to sum up all ingress and egress traffic on the transit gateway, group it appropriately, and apply the same steps in this post. You would calculate the percentage byte allocation from the transit gateway flow logs and the cost of the shared services from your Cost and Usage Report, then multiply the two to determine how much to charge the individual spoke accounts.
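As a rough sketch of that idea, the following Python example attributes traffic in both directions to the spoke account at the other end of each flow and then splits a shared service cost by the resulting shares. The record layout, account IDs, and cost figure are all hypothetical. How you group the traffic, and whether you need to filter on flow_direction to avoid counting a flow twice, depends on your own flow log data.

```python
from collections import defaultdict
from decimal import Decimal

NETWORKING_ACCOUNT = "000000005157"  # hypothetical ID ending in 5157


def spoke_shares(records, networking_account):
    """Return each spoke account's share of bytes exchanged with the networking account."""
    totals = defaultdict(int)
    for r in records:
        src, dst = r["src_account"], r["dst_account"]
        if src == networking_account and dst != networking_account:
            totals[dst] += r["bytes"]  # traffic sent to the spoke
        elif dst == networking_account and src != networking_account:
            totals[src] += r["bytes"]  # traffic received from the spoke
        # Spoke-to-spoke and networking-internal flows are ignored here.
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {acct: Decimal(b) / Decimal(grand_total) for acct, b in totals.items()}


# Hypothetical, simplified records (bytes per flow).
records = [
    {"src_account": "111111111111", "dst_account": NETWORKING_ACCOUNT, "bytes": 300},
    {"src_account": NETWORKING_ACCOUNT, "dst_account": "111111111111", "bytes": 500},
    {"src_account": "222222222222", "dst_account": NETWORKING_ACCOUNT, "bytes": 100},
    {"src_account": NETWORKING_ACCOUNT, "dst_account": "222222222222", "bytes": 100},
    {"src_account": "111111111111", "dst_account": "222222222222", "bytes": 999},
]

shared_service_cost = Decimal("50.00")  # hypothetical cost taken from CUR
shares = spoke_shares(records, NETWORKING_ACCOUNT)
chargeback = {acct: shared_service_cost * s for acct, s in shares.items()}
print(chargeback)
```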

Cleanup

To prevent unwanted charges to your AWS account after completing the steps in this post, delete any resources you created that you no longer need. You will incur charges for the logs stored in Amazon S3. To eliminate this cost, you can delete the Transit Gateway flow logs, empty the S3 bucket, and delete the bucket.

Conclusion

In this post, we outlined an approach to calculate chargebacks for data processing charges incurred by your centralized networking account for a shared transit gateway attachment. Specifically, the approach we used here allocates data processing charges incurred by the shared transit gateway attachment to the spoke accounts that received the processed data. However, the same method can be used to charge back data processing costs for any attachment to the transit gateway that is shared by multiple accounts.


Opeoluwa Victor Babasanmi

Victor is a Sr. Technical Account Manager at AWS. He focuses on providing customers with technical guidance on planning and building solutions using best practices, and proactively keeps their AWS environments operationally healthy. When he is not helping customers, you may find him playing soccer, working out, or looking for a new adventure somewhere.


Rashmiman Ray

Rashmiman is a Technical Account Manager at AWS, based out of New Jersey. He works with AWS Enterprise customers, providing technical guidance and best practice recommendations to help them succeed in the cloud. Outside of work, he enjoys hiking on trails, playing cricket, and cooking Indian delicacies.


Tega Odjegba

Tega is a Technical Account Manager at AWS based out of Stockholm, Sweden. Prior to this role, he worked in multiple roles on Network Infrastructure design and operations across different countries in Africa. He’s passionate about technology, and helping customers architect, build, and operate their solutions efficiently on AWS. Outside of work, you’ll find him training for long-distance races, watching movies or traveling to visit friends and family.