Desktop and Application Streaming
Build an enterprise cost and usage dashboard for Amazon WorkSpaces
Customers often ask how they can gain insight into their Amazon WorkSpaces cost and usage. They’d like to query and visualize their cost and usage data for many reasons:
- Understand how many hours per month each individual Amazon WorkSpace is actually used. WorkSpaces with low or no usage at all can then be terminated based on this data.
- A near real-time visual representation of the usage data helps identify usage trends and forecast future usage. It also helps selecting the most appropriate Running Mode of WorkSpaces (AlwaysOn / AutoStop).
- Querying cost and usage data using ANSI SQL allows to extract the data most valuable for the business. For example, a business providing training to external users may want to report the number of usage hours outside of business hours.
Overview
In this blog, you use AWS Cost and Usage Reports, AWS Glue, Amazon Athena, and Amazon QuickSight to perform data integration, query, and visualize the AWS Cost and Usage Report data. You review sample queries against Amazon Athena and build an Amazon QuickSight Dashboard to visualize the data.
Time to read | 20 minutes |
Time to complete | 45 minutes |
Cost to complete (estimated) |
$5, plus an Amazon QuickSight subscription |
Learning level | 300 |
Services used |
Walkthrough
In this blog, you will complete the following tasks:
- Setup AWS Cost and Usage Reports
- Configure Amazon Athena and run the AWS Glue Crawler
- Use Amazon Athena to query your AWS Cost and Usage Reports for Amazon WorkSpaces Cost and Usage data
- Create an Amazon QuickSight dashboard to visualize your Cost and Usage data
Prerequisites
For this walkthrough, you should have the following prerequisites:
- An AWS account
- Permissions to the following services:
- AWS Cost and Usage Reports, Amazon S3, AWS CloudFormation, AWS Glue, Amazon Athena, Amazon QuickSight
- Amazon WorkSpaces in one or more AWS Regions
Step 1: Setting up AWS Cost and Usage Reports
To create a Cost and Usage Report, follow the steps in the AWS Cost and Usage Reports User Guide.
Use the following parameters:
- For the Report Name, enter MyWorkSpacesUsage.
- In the Additional report details section, select Include resource IDs. It contains the WorkSpace ID for Amazon WorkSpaces related line items.
- For S3 bucket, enter a new bucket name and choose the Region you want to deploy this solution.
- For Report path prefix, enter MyCUR_Athena.
- For Time granularity, choose Hourly.
- Under Enable report data integration for, choose Amazon Athena.
After enabling AWS Cost and Usage Reports it takes up to 24 hours before the first data becomes available. You can return to the Cost and Usage Reports section of your billing console to see when your reports were last updated. Once the first report has been delivered to your Amazon S3 bucket, you can continue to step 2.
Step 2: Configure Amazon Athena and run the AWS Glue Crawler
In step 1, you enabled report data integration for Amazon Athena. This has created an AWS CloudFormation template in the Amazon S3 bucket you chose. In this step, you use the template to create an AWS Glue crawler, an AWS Glue database, and an AWS Lambda event.
Setting up Amazon Athena
To set up Athena using the AWS CloudFormation template, follow the steps in the AWS Cost and Usage Reports User Guide. Use the S3 bucket, report name and, report path prefix you chose in step 1.
Once the CloudFormation stack has been created, you’ll be ready to run the AWS Glue crawler to populate the AWS Glue database.
Configuring and running the AWS Glue Crawler
The AWS Glue crawler is not configured to run by default. In this step, you configure the AWS Glue Crawler to run daily.
- Navigate to the AWS Glue console.
- In the navigation pane, under Data catalog choose Crawlers.
- Select your crawler (name starting with AWSCURCrawler-)
- Choose Edit.
- Under Step 4: Set output and scheduling, choose Edit.
- In the Crawler schedule section, choose the following:
- For Frequency select Daily
- Choose Next, then choose
After the first run of the AWS Glue crawler, you can start using Amazon Athena to query your Amazon WorkSpaces Cost and Usage data. You can choose Run crawler and move to step 3.
Step 3: Using Amazon Athena to query your AWS Cost and Usage Reports for Amazon WorkSpaces Cost and Usage data
Amazon Athena allows you to query your AWS Cost and Usage Reports using standard ANSI SQL query language. In this section you review sample queries to get specific Amazon WorkSpaces Cost and Usage data from Athena.
Querying Amazon WorkSpaces Cost and Usage data
Example 1
For the first example, the goal is to understand the total monthly cost per WorkSpace.
Follow these steps to run this query:
- Navigate to the Athena console.
- For Data Source select
- For Database select athenacurcfn_my_work_spaces_usage.
- In this example, query the myworkspacesusage You see two tables on the left:
– The table cost_and_usage_data_status, which you already used for the data status.
– The table myworkspacesusage holds the Cost and Usage data. - Enter the following statement into the Query 1 field and choose Run.
SELECT line_item_usage_account_id AS LinkedAccount, product_region AS Region, product_running_mode AS RunningMode, Year, Month, SPLIT_PART(line_item_resource_id,'/',2) AS WorkSpaceID, -- WorkSpaceID SUM(CAST(line_item_blended_cost AS double)) AS BlendedCost FROM myworkspacesusage -- NOTE: Modify to match your CUR table name WHERE product_product_name = 'Amazon WorkSpaces' AND line_item_line_item_type NOT IN ('Tax','Credit','Refund','EdpDiscount','Fee','RIFee') GROUP BY line_item_usage_account_id, product_region, product_running_mode, Year, Month, line_item_resource_id ORDER BY LinkedAccount, Region, RunningMode, WorkSpaceID, Year, Month
The WHERE clause of this SQL query has two elements that need to match to return a record:
Query | Description |
product_product_name = ‘Amazon WorkSpaces’ | Cost and Usage data on Amazon WorkSpaces only. |
line_item_line_item_type NOT IN (‘Tax’,’Credit’,’Refund’,’EdpDiscount’,’Fee’,’RIFee’) | Exclude certain line item types from the results. |
If your Cost and Usage report contains matching records, you’ll get a table with the following data:
Table column | Description |
LinkedAccount | AWS account in which the usage occurred |
Region | AWS Region in which the WorkSpace was provisioned |
RunningMode | Running mode of the WorkSpace (AutoStop or AlwaysOn) |
Year | Year |
Month | Month |
WorkSpaceID | WorkSpace ID |
BlendedCost | Blended total cost of the WorkSpace in the month and year |
If you have tagged your WorkSpaces with a Name resource tag, your Cost and Usage report will contain a resource_tags_user_name column. You may add this column to the SELECT statement and the GROUP BY clause of your SQL statement above in order to include the Name tag in the results.
Example 2
You want to determine how many hours each of our WorkSpaces ran outside of business hours? In this example, between 06:00 p.m. CET and 07:00 a.m. CET. The goal is for data per day for a specific month and year, and o get some additional detail on the WorkSpace. With AutoStop WorkSpaces, this might be useful if you’re giving external users access and would like to understand if they make use outside of business hours.
Follow these steps to run this query:
- Enter the following into Query 1 field and choose Run.
SELECT line_item_usage_account_id AS LinkedAccount, DATE_FORMAT((line_item_usage_start_date),'%Y-%m-%d') AS Day, SPLIT_PART(line_item_resource_id,'/',2) AS WorkSpaceID, -- WorkSpaceID SPLIT_PART(product_bundle,'-',1) AS ProductBundle, product_operating_system AS OS, product_memory AS Memory, product_storage AS Storage, product_vcpu AS VCPU, product_running_mode AS RunningMode, product_region AS Region, SUM(CAST(line_item_usage_amount AS double)) AS UsageQuantity FROM myworkspacesusage -- NOTE: Modify to match your CUR table name WHERE (year='2021' AND month='11') -- NOTE: Modify these values to fit your needs AND product_product_name = 'Amazon WorkSpaces' AND pricing_unit = 'Hour' AND line_item_line_item_type NOT IN ('Tax','Credit','Refund','EdpDiscount','Fee','RIFee') AND DATE_FORMAT((line_item_usage_start_date),'%H') IN ('17','18','19','20','21','22','23','00','01','02','03','04','05','06') -- times are UTC GROUP BY line_item_usage_account_id, DATE_FORMAT((line_item_usage_start_date),'%Y-%m-%d'), line_item_resource_id, product_bundle, product_operating_system, product_memory, product_storage, product_vcpu, product_running_mode,product_region ORDER BY Day
The WHERE clause of the SQL query has several elements that all need to match for the query to return a record:
Query | Description | |
(year=’2021′ AND month=’11’) | The data for a specific year and month. | |
product_product_name = ‘Amazon WorkSpaces’ | Get CUR data on Amazon WorkSpaces only. | |
pricing_unit = ‘Hour’ | Hourly fees only. | |
line_item_line_item_type NOT IN (‘Tax’,’Credit’,’Refund’,’EdpDiscount’,’Fee’,’RIFee’)E | Exclude certain line item types from the result. | |
DATE_FORMAT((line_item_usage_start_date),’%H’) IN (’17’,’18’,’19’,’20’,’21’,’22’,’23’,’00’,’01’,’02’,’03’,’04’,’05’,’06’) | Show only those records where the hour part of the usage start date falls into the range 18:00 – 07:00 CET. Note the time in the CUR report is UTC. |
If there are matching records, you’ll get a table with the following data:
Table column | Description |
LinkedAccount | AWS account in which the usage occurred |
Day | Day on which the usage occurred |
WorkSpaceID | WorkSpace ID |
ProductBundle | WorkSpaces Bundle the WorkSpace was launched with |
OS | Operating System of the WorkSpace |
Memory | Amount of RAM assigned to the WorkSpace |
Storage | Size of system and user volume of the WorkSpace |
VCPU | Number of vCPUs assigned to the WorkSpace |
RunningMode | Running mode of the WorkSpace (AutoStop or AlwaysOn) |
Region | AWS Region in which the WorkSpace was provisioned |
UsageQuantity | Number of hours the WorkSpace ran in the specified timeframe |
Further reading
Review the Data dictionary section in the AWS Cost and Usage Reports User Guide to learn more about columns available in your Cost and Usage database table. You get some random sample data of Amazon WorkSpaces related records with all columns by running the following query:
SELECT * from "myworkspacesusage" TABLESAMPLE BERNOULLI (50) WHERE product_product_name = 'Amazon WorkSpaces' LIMIT 100
This returns 100 random records, which you can download in CSV format.
Step 4: Create an Amazon QuickSight dashboard to visualize your Cost and Usage data
To visualize your Amazon WorkSpaces Cost and Usage data, you can create an Amazon QuickSight dashboard. Follow these steps to create a simple dashboard showing you the total cost of your WorkSpaces per running mode and month in a stacked bar combo chart.
- If you don’t have an Amazon QuickSight subscription, follow signing up for an Amazon QuickSight subscription in the Amazon QuickSight User Guide.
- Navigate to the Amazon QuickSight console.
- In the navigation pane, select Choose New analysis, then select New dataset.
- Select Athena as data source
- Enter myworkspacesusage as Data source name and select [primary] from the Athena workgroup
- Choose Create data source.
- In the Choose your table dialog, choose Use custom SQL.
- In the Enter custom SQL query dialog, enter myworkspacesusage as name for the query, then copy the below SQL code into the field underneath the query name.
- Choose Confirm query.
WITH num_running_mode AS ( SELECT year, month, line_item_resource_id, COUNT(DISTINCT product_running_mode) AS num_running_mode FROM "athenacurcfn_my_work_spaces_usage"."myworkspacesusage" -- NOTE: Modify this FROM clause to match your cur database and table name WHERE line_item_product_code = 'AmazonWorkSpaces' AND product_resource_type = 'Hardware' GROUP BY year, month, line_item_resource_id ) SELECT mc.*, nrm.num_running_mode FROM "AwsDataCatalog"."athenacurcfn_my_work_spaces_usage"."myworkspacesusage" AS mc -- NOTE: Modify this FROM clause to match your cur database and table name JOIN num_running_mode AS nrm ON (mc.year = nrm.year AND mc.month = nrm.month AND mc.line_item_resource_id = nrm.line_item_resource_id) WHERE mc.line_item_product_code = 'AmazonWorkSpaces' AND product_resource_type = 'Hardware' ORDER BY mc.bill_billing_period_start_date DESC
- In the Finish dataset creation dialog choose Import to SPICE for quicker analysis. Then choose Visualize.
Note: Importing your data into SPICE will require refreshing the dataset on a schedule. - In the Fields list pane, choose line_item_blended_cost, line_item_usage_start_date, and product_running_mode.
Selecting a field selects it and the background color changes. Select it again to unselect it.
Amazon QuickSight uses AutoGraph to create the visual. AutoGraph selects the visual type it considers most appropriate with the selected fields. Here it selects a line chart showing the total cost, by running mode, by day. - Expand the Field wells pane at the top by choosing the expand icon in the upper right corner.
- Choose the X axis field well, choose Aggregate, then select Month.
- Choose the Value field well, choose Show as, then select Currency.
- Finally, in the Visual types pane, change the Visual Type to Stacked bar combo chart.
Further reading
Read new charts, formatting, and layout options in Amazon QuickSight to create more visualizations.
If you want to add parameters and controls to drive interactivity, review using QuickSight parameters and controls to drive interactivity in your dashboards.
Clean up resources
Unless you plan to continue using Amazon Athena or QuickSight to analyze your WorkSpaces usage data, remove the resources to avoid ongoing costs.
To pause the crawler, follow these steps:
- Navigate to the AWS Glue console.
- Select Data catalog, Crawlers.
- Select your crawler starting with AWSCURCrawler-.
- Choose Edit.
- Select Schedule in the menu.
- Use the drop-down menu under Frequency to select Run on demand, then choose
- Choose Next again, then choose
If you want to remove the resources created using the CloudFormation template, follow these steps:
- Navigate to the CloudFormation console.
- Select the CloudFormation stack name that you chose when you created the CloudFormation stack in step 2.
- Choose Delete.
- A warning appears: Deleting this stack will delete all stack resources. Resources will be deleted according to their DeletionPolicy.
- Double-check the stack name to confirm this is the correct stack. Then choose Delete stack.
If you want to delete the AWS Cost and Usage Reports that you created, follow these steps:
- Navigate to the Billing console.
- Choose Cost and Usage Reports.
- Take note of the S3 bucket used for your reports if you want to review or delete the data in there.
- Select the checkbox in front of the report(s) you want to delete.
- Choose Delete.
- A warning dialog: Are you sure you want to delete <YOURREPORTNAME>. Double-check the report name to confirm the report to be deleted. Then chose
- Check the S3 bucket you noted for data you want to keep. If no longer required, you may delete the data and S3 bucket.
To unsubscribe from Amazon QuickSight, navigate to the Amazon QuickSight unsubscribe page and choose unsubscribe. This approach works no matter which AWS Regions you use. It deletes all data, analyses, Amazon QuickSight users, and Amazon QuickSight administrators. You can’t undo this action.
Conclusion
In this post you:
- Setup AWS Cost and Usage Reports
- Configure Amazon Athena and run the AWS Glue Crawler
- Use Amazon Athena to query your AWS Cost and Usage Reports for Amazon WorkSpaces Cost and Usage data
- Create an Amazon QuickSight dashboard to visualize your Cost and Usage data
Review Cost Optimizer for Amazon WorkSpaces, which automatically analyzes your Amazon WorkSpaces usage data. The WorkSpaces can be automatically converted to the most cost-effective billing option (hourly or monthly), depending on their individual usage.
If you want to build more in-depth and granular Dashboards covering a wider range of AWS services, review the Cloud Intelligence Dashboard section on the AWS Well-Architected website.