Analyze your Amazon AppStream 2.0 usage reports using Amazon Athena and Amazon QuickSight

Amazon AppStream 2.0 now lets you subscribe to usage reports that provide detailed information about how users are using the service. The reports include how long users are streaming and which applications they are launching.

Usage reports are stored as separate .csv files in your Amazon S3 bucket, which you can download and analyze using third-party BI tools. However, you may want to analyze your usage data in AWS without downloading your reports or create reports over custom date ranges without concatenating multiple .csv files.

In this post, I show you how to use Amazon Athena and Amazon QuickSight to create custom reports and visualizations of your AppStream 2.0 usage data, all within AWS.

Walkthrough

This post shows you how to perform the following tasks:

Enable AppStream 2.0 usage reports.
Create an AWS Glue crawler.
Create the AWS Glue Data Catalog using the crawler.
Use Amazon Athena to create custom usage reports.
Use Amazon QuickSight to visualize your usage data.

I’ve created an AWS CloudFormation stack to help you create the resources required.

Prerequisites

You must have sufficient AWS Glue, Athena, and IAM permissions to create and run the CloudFormation stack. Ask your AWS account administrator to either perform these steps in your account or grant you the IAM permissions described in Step 7 of Create an AWS Glue crawler.

To use Amazon QuickSight to visualize your usage data, make sure that your account is signed up to use Amazon QuickSight.

Open the Amazon QuickSight console at https://quicksight.aws.amazon.com/.
Choose the AWS Region in which you are going to enable usage reports.
If you are prompted to sign up for Amazon QuickSight, choose Sign up for QuickSight. For information about pricing, see Amazon QuickSight Pricing.

Enable AppStream 2.0 usage reports

To receive usage reports, you must subscribe to them using the AppStream 2.0 console, the AWS CLI create-usage-report-subscription command, or the CreateUsageReportSubscription API operation.

You can start or stop your subscription to usage reports at any time. There is no charge for subscribing to usage reports, but standard S3 changes may apply to reports that are stored in your S3 bucket. For more information, see Amazon S3 Pricing.

To subscribe to usage reports by using the AppStream 2.0 console, perform the following steps.

Open the AppStream 2.0 console at https://console.aws.amazon.com/appstream2.
Choose the AWS Region for which to enable usage reports.
In the navigation pane, choose Usage Reports.
Choose Enabled, Apply.

After you subscribe, for each day that users launch one or more sessions in your AWS account, AppStream 2.0 stores two reports in your Amazon S3 account:

Sessions—Contains a separate record for each streaming session launched during a day. More than 30 fields are included, such as the user session ID, user ID, session duration, stack name, and user IP address.
Applications—Contains a separate record for each application launched during a streaming session.

For more information about the fields in each report, see AppStream 2.0 Usage Reports Fields in the AppStream 2.0 Developer Guide.

If you enabled on-instance session scripts and S3 logging for your session script configuration, AppStream 2.0 created an S3 bucket to store the script output. The bucket is unique to your account and Region. When you enable usage reports in this case, AppStream 2.0 uses the same bucket to store your usage reports. If you haven’t already enabled on-instance session scripts, when you enable usage reports, AppStream 2.0 creates a new S3 bucket in the following location:

appstream-logs-region-code-account-id-without-hyphens-random-identifier

AppStream 2.0 stores each sessions report in a nested folder structure in your S3 account using the following folder path:

[bucket_name]/sessions/schedule=DAILY/year=[YYYY]/month=[MM]/day=[DD]/

AppStream 2.0 stores each applications report in a nested folder structure in your S3 account using the following folder path:

[bucket_name]/applications/schedule=DAILY/year=[YYYY]/month=[MM]/day=[DD]/

If you choose to query your reports using Athena, this nesting structure facilitates partitioning. You can download individual reports by navigating to these locations in the S3 console.

Create an AWS Glue crawler

Amazon Athena is a serverless, interactive query service that you can use to analyze data stored in your S3 buckets by using standard SQL queries. AWS Glue is a fully managed extract, transform, and load (ETL) service. It allows you to create a database (also referred to as the AWS Glue Data Catalog) from your S3 data and to query that database using Athena.

You can manually create a database and tables in Athena for your usage data by following the steps in the Amazon Athena Getting Started Guide and listing all fields described in AppStream 2.0 Usage Reports Fields in the AppStream 2.0 Developer Guide. Alternatively, you can use an AWS Glue crawler to automatically detect the schema of your S3 data and create a corresponding database and tables. This post describes how to use an AWS CloudFormation template to create an AWS Glue crawler.

Completing the steps in the following procedure creates but does not start an AWS Glue crawler. To start the crawler, you must perform the steps in the section, Create the AWS Glue Data Catalog by using the crawler. For more information, see Defining Crawlers.

Open the AppStream 2.0 console at https://console.aws.amazon.com/appstream2.
Choose the AWS Region for which you have subscribed to usage reports.
In the navigation pane, choose Usage Reports, and verify that usage reports logging is enabled.
In the Report Details tab, in the paragraph next to Analytics, choose CloudFormation template.
Choosing the link opens the AWS CloudFormation console, where you can review the parameters of the CloudFormation stack specified by the template before you run it. The template, when run, creates an AWS Glue crawler and several sample Athena queries.
On the Specify Details page, keep the default values, and choose Next.
On the Options page, keep the default values, and choose Next.
On the Review page, select the check box next to “I acknowledge that AWS CloudFormation might create IAM resources with custom names,” and then choose Create.

Scheduling a crawler

The AWS Glue crawler creates the Data Catalog and schema that are mapped to the structure of your sessions and applications reports. Each time a new report is stored in your S3 bucket, you must run the crawler to update your AWS Glue Data Catalog with the data from the new report.

By default, the crawler is configured to run every day, which is important for two reasons. First, each time that a new report is added to your S3 bucket, running the crawler adds that data to your database. If you do not run the crawler after a new report is added, or otherwise update your tables using the MSCK REPAIR TABLE function in Athena, any SQL queries that you run in Athena will not include data from the new report in their results. Second, if AppStream 2.0 adds new fields to usage reports in the future, the crawler will automatically update your tables to include those new fields.

To change the frequency with which the crawler runs, specify a different cron value next to ScheduleExpression on the Specify Details page. For more information, see Cron Expressions.

Create the AWS Glue Data Catalog by using the crawler

Charges may apply each time you run the crawler. For information about AWS Glue crawler fees, see AWS Glue Pricing.

To run the crawler and create the Data Catalog, perform the following steps.

Open the AWS Glue console at https://console.aws.amazon.com/glue.
Choose the AWS Region for which you have subscribed to usage reports.
Select the check box next to the crawler named appstream-usage-sessions-crawler, and choose Run crawler. Repeat this step for the crawler named appstream-usage-apps-crawler.

Screenshot of AWS Glue crawlers created by the CloudFormation template

Performing these steps runs the crawlers and schedules them to run automatically according to the schedule specified in the CloudFormation stack.

After both crawlers finish running, in the navigation pane, choose Databases. The database named appstream-usage represents your usage reports and displays. This database is the AWS Glue Data Catalog that was created when appstream-usage-sessions-crawler and appstream-usage-apps-crawler were run.
To view the tables in the database, choose appstream-usage, Tables. You see two tables, applications and sessions, which represent your usage reports. Choose either table to view its schema.

You can now query these tables in Athena by using SQL.

Use Athena to create custom usage reports

You can use Athena to aggregate your usage reports or to generate other types of custom reports. To query your usage reports by using Athena, perform the following steps.

Open the Amazon Athena console at https://console.aws.amazon.com/athena.
Choose the AWS Region for which you have subscribed to usage reports.
Under Database, choose appstream-usage.
In the query pane, enter a SQL query and choose Run query.

Working with Athena queries

The following are SQL queries that you can run in Athena to analyze the usage reports data in your S3 bucket.

To create a consolidated report of all sessions in a given month, run the following query:

SELECT *
FROM "appstream-usage"."sessions"
WHERE year='four-digit-year'
AND month='two-digit-month'

You can also perform join operations between the applications and sessions tables in your query. For example, to view the distinct users who launched each application in a given month, run the following query:

SELECT DISTINCT apps.application_name, sessions.user_id
FROM "appstream-usage"."applications" apps
INNER JOIN "appstream-usage"."sessions" sessions ON (apps.user_session_id = sessions.user_session_id AND sessions.year=’four-digit-year' AND sessions.month='two-digit-month')
WHERE apps.year='four-digit-year'
AND apps.month='two-digit-month'
ORDER BY 1, 2

Athena query results are stored as .csv files in an S3 bucket in your account that is named aws-athena-query-results-account-id-without-hyphens-region-code. For ease in locating query results, choose Save as and provide a name for your query before you run it.

You can also choose the download icon to download the results of the query as a .csv file from within the Athena console.

Screenshot of an Amazon Athena query

To enhance performance and reduce costs, Athena uses partitioning to reduce the amount of data scanned in queries. Usage reports are partitioned in your S3 bucket by year, month, and day. You can restrict your queries to certain date range partitions using the year, month, and day fields as conditions in your queries. For more information about using partitions when querying your usage reports, see Create Custom Reports and Analyze Usage Data.

If a session spans more than one day, the session and application records appear in the sessions and applications reports, respectively, corresponding to the day in which the session ended. To find records relating to all sessions that were active during a given date range, consider expanding the partition set of your query by the maximum session length that you have configured for your fleets.

For example, to view all sessions that were active for a given fleet during a calendar month, where the fleet had a maximum session duration of 100 hours, expand your partition set by five days.

SELECT *
FROM "appstream-usage"."sessions"
WHERE fleet_name = 'fleet_name'
AND session_start_time BETWEEN '2019-05-01' AND '2019-06-01'
AND year='2019' AND (month='05' OR (month='06' AND day<='05'))
ORDER BY session_start_time

The CloudFormation template that created the AWS Glue crawlers also created and saved several sample queries in your Athena account that you can use to analyze your usage data. These sample queries include the following:

Aggregated monthly session report
Average session length per stack
Number of sessions per day
Total streaming hours per user
Distinct users per app

To use any of these queries, perform the following steps.

Open the Athena console at https://console.aws.amazon.com/athena.
Choose Saved Queries. You should see all of the above queries listed, each of which begins with “AS2”— for example, “AS2_users_per_app_curr_no.”
To run a query, choose the query name rather than the radio button next to the name.
The text of the query appears in the query pane. Choose Run query.

To view these queries in a separate CloudFormation template, see athena-sample-queries-appstream-usage-data_template.yml in the AWS Code Sample Catalog.

Use Amazon QuickSight to visualize your usage data

Create a data set in Amazon QuickSight based on your usage reports data.

Open the Amazon QuickSight console at https://quicksight.aws.amazon.com/
Choose the AWS Region for which you have subscribed to usage reports.
Choose New analysis, New data set, Athena.
For Data source name, enter AppStream Session Reports, and choose Create data source.
Choose Use custom SQL.
In the dialog box, enter Current Month Sessions to name the query, and enter the following SQL query into the text area:

SELECT *,
DATE_FORMAT(FROM_ISO8601_TIMESTAMP(session_start_time), '%m/%d') AS session_launch_day, (CASE WHEN EXTRACT(DAY_OF_WEEK FROM FROM_ISO8601_TIMESTAMP(session_start_time)) <= 6 THEN 'No' ELSE 'Yes' END) AS weekday,
ROUND(CAST(session_duration_in_seconds AS DOUBLE) / 3600, 2) AS session_duration_in_hours
FROM "appstream-usage"."sessions"
WHERE year = DATE_FORMAT(current_date, '%Y')
AND month = DATE_FORMAT(current_date, '%m')
AND session_start_time <= DATE_FORMAT(current_date, '%Y-%m-%d')
ORDER BY session_start_time

Choose Confirm query, Import to SPICE for quicker analytics, Visualize.

You can use this data set to create a number of visuals in Amazon QuickSight, including the visuals that follow.

Sessions by day

Choose Add, Add visual to create a new visual.
Under Visual types, choose the vertical bar chart.
In the Fields list, choose session_launch_day.
In the Fields list, choose weekday to add the field to the Group/Color box.
In the box for X axis, select session_launch_day, Sort by, az.
In the top right of the visual, select the downward-facing arrow, and choose Format visual.
Under Format visual, add appropriate labels, such as Date for the X axis, and Sessions for the Y axis. Select and edit the title in the visual to indicate the nature of the chart (for example, Sessions by day).

Amazon QuickSight bar chart of sessions per calendar day

Streaming hours per user

Choose Add, Add visual to create a new visual.
Under Visual types, choose the horizontal bar chart.
In the Fields list, choose user_id for the Y axis.
In the Fields list, choose session_duration_in_hours for Value.
In the top right of the visual, select the downward-facing arrow, and choose Format visual.
Under Format visual, add appropriate labels, such as User for the Y axis and Streaming Hours for the X axis. Select and edit the title in the visual to indicate the nature of the chart (for example, Streaming Hours per User).

Amazon QuickSight bar chart of streaming hours per user

Cleanup

If you don’t plan to use Athena or Amazon QuickSight to analyze your AppStream 2.0 usage data after completing this walkthrough, remove or pause resources to avoid ongoing costs.

Open the AWS Glue console at https://console.aws.amazon.com/glue.
Choose the AWS Region for which you have subscribed to usage reports.
Select the check box next to the crawler named appstream-usage-sessions-crawler, and then choose Action, Pause schedule. Repeat this step for the crawler named appstream-usage-apps-crawler.
If you have a paid subscription to Amazon QuickSight, you can cancel your subscription by following the steps at Canceling Your Amazon QuickSight Subscription and Closing the Account.

Conclusion

And that’s it! You now have:

The AWS Glue Data Catalog populated with your AppStream 2.0 usage reports data, updated daily, against which you can run SQL queries in Athena
A dozen SQL queries saved to your Athena account for the various types of custom reports that you might want to generate
A data set of your usage reports data from which you can create visuals in Amazon QuickSight

In this post, I showed you how to analyze your AppStream 2.0 usage data in AWS to create custom reports and visualizations. For more information, see Amazon AppStream 2.0 Usage Reports.

Desktop and Application Streaming