AWS Cost Management

Cost Allocation Blog Series #4: Visualize Data Transfer Costs with Cost and Usage Reports, Athena, and QuickSight

The blog post is contributed by Daniel Yu, Senior Technical Account Manager of AWS Enterprise Support, and Vishal Srivastava, Associate Technical Account Manager

Organizations tend to overlook shared costs, such as, data transfer costs, as it can be difficult to identify owners and enforce necessary actions.  However, as we recommended in the blog “Cost Allocation Basics That You Need to Know”, there should be an internal alignment on how to allocate these costs, so teams are aware of and accountable for these types of shared costs.  The right level of visibility into the shared costs allows organizations to understand the main drivers for these costs and provides insights into cost optimization opportunities. AWS Cost and Usage Reports generate and deliver comprehensive cost and usage information to the Amazon S3 Bucket of your choice.  You can integrate the data with Amazon Athena and conduct quick queries.  You can also visualize the cost and usage information in Amazon QuickSight.

In this blog post, Daniel Yu, Senior Technical Account Manager of AWS Enterprise Support, and Vishal Srivastava, Associate Technical Account Manager will share tips on how you can visualize data transfer cost details available in AWS Cost and Usage Reports (AWS CUR) for analysis using Amazon Athena and Amazon QuickSight.  AWS data transfer charges are incurred when data is transferred out from AWS services to the Internet, or between AWS regions or Availability Zones.  Visualization of the AWS CUR data can help you monitor and quickly identify trends and top contributors for the data transfer costs.

Solution Overview

We will be using Amazon QuickSight to visualize the Amazon Athena analysis of the AWS Cost and Usage Reports.

Walk Through

Step 1: Create an Amazon Athena view for the data transfer cost details available in AWS Cost and Usage Reports

Step 2: Create three pieces of Amazon QuickSight visuals for data transfer cost analysis

Prerequisites

You must set up AWS Cost and Usage Reports integration with Amazon Athena.  See the documentation for help on setting up Amazon Athena integration.  The AWS CloudFormation template will create AWS Identity and Access Management (IAM) roles, AWS Glue database, AWS Glue crawler, AWS Lambda functions, and Amazon Simple Storage Service (Amazon S3) notification as components for AWS CUR integration with Amazon Athena.  The blog Query your AWS Cost and Usage Report using Amazon Athena have details on the CloudFormation template.  When creating the AWS Cost and Usage Reports, the following options must be selected:

  • Include resource IDs
  • Enable report data integration for Athena: parquet compression

You must configure Amazon QuickSight access to Amazon Athena and Amazon S3 bucket storage of AWS Cost and Usage Reports.  See the QuickSight documentation for help on Setting up Amazon QuickSight for an existing AWS user and Using Other AWS Services: Scoping Down Access.

Create an Amazon Athena view for the data transfer cost details in AWS Cost and Usage Reports

After the AWS Cost and Usage Reports integration with Amazon Athena, you can begin running SQL queries against the AWS CUR data.  You will create an Amazon Athena view for analyzing AWS data transfer costs details.  See the Amazon Athena documentation for help on Working with Views.

Run the following query to create the data_transfer_usage view and select cost and usage records specific to data transfer in AWS CUR, replacing aws_billing_report.my_cur_report with your Athena table name:

CREATE OR REPLACE VIEW data_transfer_usage AS 
SELECT DISTINCT
  bill_billing_period_start_date
, line_item_usage_account_id
, line_item_product_code
, line_item_line_item_type
, line_item_operation
, line_item_line_item_description
, line_item_usage_type
, line_item_resource_id
, product_product_name
, product_region
, product_from_location
, product_to_location
, sum((line_item_usage_amount) / 1024) "TB"
, sum(line_item_blended_cost) "cost"
FROM
  aws_billing_report.my_cur_report
WHERE ((((line_item_usage_type LIKE '%Bytes%') AND ((((line_item_usage_type LIKE '%In%') OR (line_item_usage_type LIKE '%Out%')) OR (line_item_usage_type LIKE 'Nat%')) OR (line_item_usage_type LIKE '%Regional%'))) AND ((product_from_location = '') OR (product_from_location LIKE '%(%'))))
GROUP BY  bill_billing_period_start_date, line_item_usage_account_id, line_item_product_code, line_item_line_item_type, line_item_operation, line_item_line_item_description, line_item_usage_type, line_item_resource_id, product_product_name, product_region, product_from_location, product_to_location

For more information on the AWS Cost and Usage Reports fields, see the AWS Cost and Usage Report Data Dictionary documentation.

Create three pieces of Amazon QuickSight visuals for data transfer cost analysis

After you successfully create the Amazon Athena data_transfer_usage view, you must create an Amazon QuickSight dataset for access to the Athena data_transfer_usage view.

To create a Dataset, data_transfer_usage, in the Amazon QuickSight, select DataSets followed by “New Data DataSet”.

It will redirect you to the “Create a Data Set” section. Under this section, click on the ‘”Athena”, enter the Data source name as ‘data_transfer_usage’ and click on ‘Create Data Source’.

Then, you will see a pop-up screen, where you can select the required “Database” and “Tables”.  For “Tables”, choose the view “data_transfer_usage” and click on “Select”. Finally, you may select either “Import to SPICE for quicker analytics” or “Directly query your data”, and then click on “Visualize”.  SPICE (Super-fast, Parallel, In-Memory Calculation Engine) is a fully managed data store that enables fast visualizations.  See Import Data into SPICE documentation for more information.

See the Amazon QuickSight documentation for help on Creating a Dataset Using Amazon Athena.

You will use this QuickSight dataset to create the following three pieces of QuickSight visuals:

1.  Monthly Data Transfer Cost and Usage You can use this visual to monitor the monthly trend for data transfer cost and usage.  The cost bar groups data transfer usage types for the month into different colors.  The line shows the total amount of data transfer in terabytes (TB) for each month.

2.  Top 10 Resource Data Transfer Cost and Usage Type  You can use this visual to identify the top 10 resources driving the data transfer costs. The example visual below greyed out the resource identifications to protect customer privacy.  The resource identifications can be specific Amazon EC2 resource IDs, Amazon S3 bucket name, or Amazon Resource Names (ARNs) of AWS resources.  Each resource bar groups the data transfer usage type for that resource into different colors.

3. Data Transfer Cost From and To Regions  You can use this visual to identify the top source and destination regions of the data transfer costs.  The top data transfer charges between AWS regions, between Availability Zones, to the Internet (External), or NAT Gateway data transfer costs are highlighted by the colors in the heat map.  In the example below, the highest data transfer cost (close to $240K), in red, is from “US East (N. Virginia)” region to “US East (N. Virginia)” region.  This will be data transfer cost between Availability Zones in the same region.  The 2nd highest data transfer cost (close to $120K), in orange, is between “empty” and “empty”.  This will be the data transfer costs for NAT Gateway, which are not between regions.


The table below shows the parameters to create each Amazon QuickSight visual.  You will use the parameters with the steps below to create each visual:

1.   Select a Visual type, as outlined in the Amazon QuickSight documentation for Creating an Amazon QuickSight Visual.

2.   Select the fields from the Fields lists pan at left for use in the field wells for the visual type.  Apply function and format to the fields below for clarity.  See the Amazon QuickSight documentation for help on Changing Field Aggregation and Customizing a Field Format.

  • bill_billing_period_start_date (Aggregate by Month)
  • cost (Shown as: Currency)
  • TB (Format: 1,234)

3.   Create the necessary filters and sorting particular for the visual.  See the Amazon QuickSight documentation for help on Filtering Visual Data in Amazon QuickSight and Sorting Visual Data in Amazon QuickSight.

a.   To set a date range for the visual, create a filter on the field bill_billing_period_start_date with filter type of Time Range.

4.   Set display options for your data visualization as outlined in the Amazon QuickSight documentation for Formatting a Visual in Amazon QuickSight.

You can use this Amazon QuickSight dataset to create additional visuals for analysis.

Conclusion

In this walk-through, you successfully created Amazon QuickSight visuals and an Amazon Athena query for analysis of the AWS data transfer cost details in the AWS Cost and Usage Reports.  You are now ready to create additional Amazon QuickSight visuals to dive deeper into your AWS data transfer costs.