Desktop and Application Streaming

Track user processes in Amazon AppStream 2.0 sessions

Introduction

Many customers using Amazon AppStream 2.0 want to track employee usage of specific applications. This data can show how frequently, and for how long, each application is used, which helps optimize licensing costs. The built-in AppStream 2.0 usage reports record only applications launched from the application catalog, not applications launched from desktop shortcuts or from other applications.

This blog walks you through deploying a solution that tracks the name and duration of user-launched applications. The multi-step deployment creates a database of all processes launched by users, which you can visualize using Amazon QuickSight or another BI tool of your choice.

Time to read: 30 minutes
Time to complete: 60 minutes
Cost to complete: $0.10 to run a stream.standard.medium image builder for one hour in the US East (N. Virginia) Region, plus ongoing monthly costs of $0.023 per GB of data stored in Amazon S3 Standard and $0.005 per GB scanned to query the data with Amazon Athena
Learning level: Advanced (300)
Services used: AWS Identity and Access Management (IAM), Amazon AppStream 2.0, Amazon Simple Storage Service (Amazon S3), Amazon Athena

Prerequisites

You should be familiar with the AWS Management Console, creating an Amazon S3 bucket, the AppStream 2.0 service (creating an image builder and configuring AppStream 2.0 in depth), the IAM service (creating IAM roles and trust policies), Windows PowerShell scripting, and the Amazon Athena service.

Solution overview

Architecture diagram: elements of the solution and the data flow inside the streaming instance.

Section 1: Build the solution

  1. Create a new Amazon S3 bucket in the AWS Region where you will deploy your AppStream 2.0 environment. This bucket is used to save the generated files. Record the name of the S3 bucket for use in later scripts. Retain the default security settings of Block Public Access enabled and ACLs disabled. For more information, see Creating a bucket.
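    If you prefer to script this step, here is a minimal sketch using AWS Tools for PowerShell (assuming the AWS.Tools.S3 module is installed and credentials with S3 permissions are configured; substitute your bucket name and Region):

    # Create the bucket in the same Region as your AppStream 2.0 environment.
    New-S3Bucket -BucketName DOC-EXAMPLE-BUCKET -Region us-east-1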
  2. AppStream 2.0 requires an IAM role to write to S3. Create an IAM role by following the steps in How to Create an IAM Role to Use With AppStream 2.0 Streaming Instances. Attach the following IAM policy to the role, replacing DOC-EXAMPLE-BUCKET with your bucket name. Record the name of the IAM role for use later.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [ "s3:PutObject" ],
                "Resource": [
                    "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
                ]
            }
        ]
    }
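    The role must also trust the AppStream 2.0 service so that streaming instances can assume it. The procedure linked above configures this for you; if you create the role another way, the trust policy should look like the following (the standard service trust relationship, shown here for reference):

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": { "Service": "appstream.amazonaws.com" },
                "Action": "sts:AssumeRole"
            }
        ]
    }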
    
  3. From the AppStream 2.0 console, create a new image builder and assign the IAM role created in step 2. For more information about creating an image builder with an IAM role, see step 4 of Launch an Image Builder to Install and Configure Streaming Applications.
  4. Connect to the image builder and install your apps.
  5. Open a command prompt and create a folder by running mkdir C:\user_scripts. Then create the file C:\user_scripts\Get-UserEnvironmentVars.ps1. This script records session variables that are added to the output later. Paste the following lines of code into Get-UserEnvironmentVars.ps1:
    # Record user variables for later use.
    Get-ItemPropertyValue 'HKCU:\Environment' 'AppStream_Stack_Name' > C:\user_scripts\UserVars.txt
    Get-ItemPropertyValue 'HKCU:\Environment' 'AppStream_Session_ID' >> C:\user_scripts\UserVars.txt
    Get-ItemPropertyValue 'HKCU:\Environment' 'AppStream_UserName' >> C:\user_scripts\UserVars.txt
    Get-ItemPropertyValue 'HKCU:\Environment' 'AppStream_Session_Reservation_DateTime' >> C:\user_scripts\UserVars.txt
    Get-ItemPropertyValue 'HKCU:\Environment' 'AppStream_User_Access_Mode' >> C:\user_scripts\UserVars.txt
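
    To spot-check the script during testing, you can stream a session (after the session scripts below are configured) and confirm the file was written. A quick, optional check:

    # Each line should contain one of the AppStream 2.0 session variables.
    Get-Content C:\user_scripts\UserVars.txt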
    
  6. Configure session scripts by placing the following content into the file C:\appstream\sessionscripts\config.json.

    a. The session start scripts enable Windows Event Viewer audit tracking in the “system” context and record the session values in the “user” context.

    b. A session termination script, running in the “system” context, records the list of all relevant launched and terminated processes and exports it to the target S3 bucket.

    c. For more information about session scripts, see Use Session Scripts to Manage Your AppStream 2.0 Users’ Streaming Experience.

    {
        "SessionStart":{
            "executables":[
                {
                    "context":"system",
                    "filename":"C:\\Scripts\\EnableAuditTracking.bat",
                    "arguments":"",
                    "s3LogEnabled":true
                },
                {
                    "context":"user",
                    "filename":"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
                    "arguments":"-File \"C:\\user_scripts\\Get-UserEnvironmentVars.ps1\"",
                    "s3LogEnabled":true
                }
            ],
            "waitingTime":30
        },
        "SessionTermination":{
            "executables":[
                {
                    "context":"system",
                    "filename":"C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
                    "arguments":"-File \"C:\\scripts\\Get-ProcessDetails.ps1\"",
                    "s3LogEnabled":true
                },
                {
                    "context":"user",
                    "filename":"",
                    "arguments":"",
                    "s3LogEnabled":false
                }
            ],
            "waitingTime":60
        }
    }
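
    Because a malformed config.json prevents the session scripts from running, it is worth validating the file on the image builder. A quick, optional sanity check in PowerShell:

    # Throws a descriptive parse error if the JSON is malformed.
    Get-Content C:\appstream\sessionscripts\config.json -Raw | ConvertFrom-Json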
    
  7. Create a script to enable Windows audit logging of process creation events (event ID 4688) and process termination events (event ID 4689). Neither is captured by default.

    a. Open a command prompt and create a folder by running mkdir C:\scripts. Create C:\scripts\EnableAuditTracking.bat and paste the following lines of code into the file.

    auditpol.exe /set /category:"Detailed Tracking" /subcategory:"Process Creation" /success:enable
    auditpol.exe /set /category:"Detailed Tracking" /subcategory:"Process Termination" /success:enable
    REM Allow Get-UserEnvironmentVars.ps1 to write UserVars.txt when it's running in the user context by granting permission to C:\user_scripts.
    c:\windows\system32\icacls c:\user_scripts\* /grant "everyone":(OI)(CI)F
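
    To confirm that auditing is enabled after the script runs, you can query the current policy from the same elevated prompt (an optional check):

    auditpol.exe /get /category:"Detailed Tracking"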
    
  8. Create a script to generate the output values for each process creation and termination.

    a. Create C:\scripts\Get-ProcessDetails.ps1 and paste in the code block that follows note e below.

    i. Each process can produce two rows: the first with event type 4688 and the second with event type 4689.

    ii. If a process is not terminated before the session ends, the 4689 row may be missing. In that case, treat the session termination time as the process termination time; this is captured by the last row in the file, which has pid = -1.

    b. In the code, an “if” condition filters out Windows services and other processes that aren’t launched by the user.

    c. The script records the process IDs of all processes that ran during the session.

    d. The captured content is uploaded to the S3 bucket under the prefix appstream_applications/year=<year>/month=<month>/day=<day>/. Using Apache Hive style partitions improves the efficiency of Amazon Athena queries. For more information, see Partitioning data in Athena. If you need a different partitioning scheme, record the new scheme and update the Athena table definition created later accordingly.

    e. In the last line of the script, replace DOC-EXAMPLE-BUCKET with the actual bucket name.

    $outputPath = 'C:\Scripts\ProcessDetails.csv'
    $Date = Get-Date
    
    $instanceType = [Environment]::GetEnvironmentVariable('AppStream_Instance_Type', 'machine')
    $region = [Environment]::GetEnvironmentVariable('AWS_REGION', 'machine')
    $imageArn = [Environment]::GetEnvironmentVariable('AppStream_Image_Arn', 'machine')
    $fleet = [Environment]::GetEnvironmentVariable('AppStream_Resource_Name', 'machine')
    
    $user_params = (Get-Content -Path C:\user_scripts\UserVars.txt)
    $stack = $user_params[0]
    $sessionId = $user_params[1]
    $userName = $user_params[2]
    $startTime = $user_params[3]
    $authType = $user_params[4]
    
    $4688_var = Get-WinEvent -FilterHashTable @{
     LogName = 'Security'
     Id = 4688
    }| Select-Object TimeCreated,@{name='NewProcessName';expression={ $_.Properties[5].Value }}, @{name='HostName'; expression = {$_.Properties[1].Value}}, @{name='PID'; expression = {$_.Properties[4].Value}}, @{name='UserId' ; expression = { $_.Properties[0].Value}}
    
    # Write CSV header
     Set-Content -Path $outputPath -Encoding UTF8 -Value 'event_type,date,time,process,host,pid,stack,session,username,process_start_time,instance_type,region,image_arn,fleet,auth_type,userid,process_time'
    ForEach ( $i in $4688_var) {
    
     if( $i.NewProcessName -notlike '*wbem*' -And $i.NewProcessName -notlike '*System32*' -And $i.NewProcessName -notlike '*dcv*' -And $i.UserId.Value -like '*S-1-5-21*' -And $i.NewProcessName -notlike '*Amazon*' ) {
    
     $newDate = $i.TimeCreated.ToShortDateString()
     $newTime = $i.TimeCreated.ToString("HH:mm:ss")
     Add-Content -Path $outputPath -Encoding UTF8 -Value (
     (
     '4688',
     $newDate,
     $newTime,
     $i.NewProcessName,
     $i.HostName,
     $i.PID,
     $stack,
     $sessionId,
     $userName,
     $startTime,
     $instanceType,
     $region,
     $imageArn,
     $fleet,
     $authType,
     $i.UserId,
     $i.TimeCreated.ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')
     ) -join ','
     )
     }
    }
    $4689_var = Get-WinEvent -FilterHashTable @{
     LogName = 'Security'
     Id = 4689
    }| Select-Object TimeCreated,@{name='NewProcessName';expression={ $_.Properties[6].Value }}, @{name='HostName'; expression = {$_.Properties[1].Value}}, @{name='PID'; expression = {$_.Properties[5].Value}}, @{name='UserId' ; expression = { $_.Properties[0].Value}}
    
    ForEach ( $i in $4689_var) {
    
     if( $i.NewProcessName -notlike '*wbem*' -And $i.NewProcessName -notlike '*dcv*' -And $i.NewProcessName -notlike '*System32*' -And $i.UserId.Value -like '*S-1-5-21*' ) {
     Add-Content -Path $outputPath -Encoding UTF8 -Value (
     (
     '4689',
     $i.TimeCreated.ToShortDateString(),
     $i.TimeCreated.ToString('HH:mm:ss'),
     $i.NewProcessName,
     $i.HostName,
     $i.PID,
     $stack,
     $sessionId,
     $userName,
     $startTime,
     $instanceType,
     $region,
     $imageArn,
     $fleet,
     $authType,
     $i.UserId,
     $i.TimeCreated.ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')
     ) -join ',')
     }
    }
     # Write session end pseudo-event
    Add-Content -Path $outputPath -Encoding UTF8 -Value (
     (
     '4689',
     $date.ToShortDateString(),
     $date.ToString('HH:mm:ss'),
     'session_end',
     'session_end',
     '-1',
     $stack,
     $sessionId,
     $userName,
     $startTime,
     $instanceType,
     $region,
     $imageArn,
     $fleet,
     $authType,
     'session_end',
     $date.ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')
     ) -join ','
    )
    
    $s3Key = 'appstream_applications/year={0:yyyy}/month={0:MM}/day={0:dd}/{1}.csv' -f $Date, $sessionId
    Write-S3Object -BucketName DOC-EXAMPLE-BUCKET -Key $s3Key -File $outputPath -ProfileName appstream_machine_role
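
    With this scheme, a session’s CSV lands at a key such as s3://DOC-EXAMPLE-BUCKET/appstream_applications/year=2024/month=01/day=15/<session-id>.csv (the date values here are only illustrative). If uploads do not appear, a minimal connectivity check you can run in the system context on a test instance (assuming the IAM role from step 2 is attached) is:

    # Writes a small test object using the machine profile that AppStream 2.0 provides.
    Write-S3Object -BucketName DOC-EXAMPLE-BUCKET -Key 'appstream_applications/connectivity_test.txt' -Content 'test' -ProfileName appstream_machine_role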
  9. Optionally, create C:\scripts\startup.bat with the content below to grant access to the C:\scripts folder for debugging. Leaving this enabled in production is not recommended.
    echo "Running startup.bat"
    c:\windows\system32\icacls c:\scripts\* /grant "everyone":(OI)(CI)F
    

    To invoke it, add the line C:\scripts\startup.bat to the EnableAuditTracking.bat script created in step 7.

  10. To test any of the previous scripts on an image builder, run your command prompt and PowerShell session as an administrator to avoid permission errors. If you do run any of the scripts on the image builder, delete C:\user_scripts\UserVars.txt and C:\scripts\ProcessDetails.csv afterward. If these test files are included in the image, the data won’t be generated correctly for user sessions.
  11. Create your AppStream 2.0 image. For more information refer to Create a Custom AppStream 2.0 Image by Using the AppStream 2.0 Console.
  12. Use the image to create a new fleet and apply the IAM role that was created in step 2.
  13. Associate the fleet with an existing or newly created stack. Each user session that uses this new image generates a separate file at the S3 location built in $s3Key in the step 8 script.

Note that the IAM role and bucket information are not exposed to users. However, if a user learns the bucket name, they may discover that they can upload files to the S3 bucket, because the IAM role is associated with the fleet.

Section 2: Query the data

  1. You can use Amazon Athena to query the data directly in S3.
  2. From the Athena console:
    1. Choose Query editor.
    2. Run the command CREATE DATABASE appstream_usage to create the database.
    3. Run the following command to create a table for querying the data in S3. Ensure that you replace DOC-EXAMPLE-BUCKET with the actual bucket name.
CREATE EXTERNAL TABLE `appstream_usage`.`appstream_applications`(
  `event_type` bigint, 
  `date` string, 
  `time` string, 
  `process` string, 
  `host` string, 
  `pid` bigint, 
  `stack` string, 
  `session` string, 
  `username` string, 
  `process_start_time` string, 
  `instance_type` string, 
  `region` string, 
  `image_arn` string, 
  `fleet` string, 
  `auth_type` string, 
  `userid` string,
  `process_time` string)
PARTITIONED BY ( 
  `year` string, 
  `month` string, 
  `day` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://DOC-EXAMPLE-BUCKET/appstream_applications/'
TBLPROPERTIES (
  'areColumnsQuoted'='false', 
  'classification'='csv', 
  'columnsOrdered'='true', 
  'compressionType'='none', 
  'delimiter'=',',
  'skip.header.line.count'='1', 
  'typeOfData'='file')
  3. From the query editor, run the command MSCK REPAIR TABLE appstream_applications to load the partitions. You may have to run this command daily to load each day’s partitioned data into Athena; this can be automated, as shown in the sketch below. For more information, see MSCK REPAIR TABLE.
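    One way to automate the daily repair is with AWS Tools for PowerShell, for example from a scheduled task. A minimal sketch (assuming the AWS.Tools.Athena module is installed, and substituting your own s3://DOC-EXAMPLE-BUCKET/athena_results/ query output location):

    # Loads newly created daily partitions into the Athena table.
    Start-ATHQueryExecution -QueryString 'MSCK REPAIR TABLE appstream_applications' `
        -QueryExecutionContext_Database 'appstream_usage' `
        -ResultConfiguration_OutputLocation 's3://DOC-EXAMPLE-BUCKET/athena_results/'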
  4. If the session ends before the process is terminated, there may not be a row of event_type 4689 for that process.
  5. Here are some sample queries for analyzing usage patterns:

a. Get all rows for a given day. Replace the ‘?’ with a specific day value to filter your results, and make sure you put the day value within single quotes (‘). All times are recorded in UTC.

SELECT t1.event_type, t1.date, t1.time starttime, t1.pid, t1.process, t1.username, t2.time endtime, t1.year, t1.month, t1.day, t2.event_type t2et, t2.pid t2pid
FROM
 "appstream_usage"."appstream_applications" as t1 left outer join "appstream_usage"."appstream_applications" as t2
on t1.pid = t2.pid and t2.event_type = 4689 and t1.session = t2.session
and t1.year = t2.year and t1.month = t2.month and t1.day = t2.day
where t1.day = '?' and t1.event_type = 4688

b. All processes run and the number of times they were run.

select process, count(process) from "appstream_usage"."appstream_applications" group by process

c. All processes run for a specific user.

Replace ‘?’ with a specific username.

select * from "appstream_usage"."appstream_applications" where userName like '?%'

d. All processes for a specific month

In the following query, replace ‘?’ with the year and ‘??’ with the month.

select * from "appstream_usage"."appstream_applications" where year = '?' and month = '??'

e. Top 10 processes by time consumed

As not all processes are terminated before the session ends, a row with event type 4689 may be missing. To compensate, the script adds a final row of type 4689 with pid = -1 that records when the session ended; the following query uses that row to calculate total_process_time.

with t3 as (
SELECT t1.event_type, t1.process_time t1process_time, t1.pid, t1.process, t1.username, t1.session,
 t1.process_start_time t1_process_time,
 t1.year, t1.month, t1.day,
 t2.event_type t2eventtype, t2.process_time t2process_time
FROM
 "appstream_usage"."appstream_applications" as t1 left outer join "appstream_usage"."appstream_applications" as t2
 on t1.pid = t2.pid and t2.event_type = 4689 and t1.event_type = 4688 and t1.session = t2.session and t1.year = t2.year and t1.month = t2.month and t1.day = t2.day
 where t1.event_type = 4688
)
select t3.*, coalesce(t2process_time, t4.process_time) as endtime,
 date_diff('second', from_iso8601_timestamp(t3.t1process_time),
 from_iso8601_timestamp(coalesce(t2process_time, t4.process_time))) as total_process_time
from t3 left outer join "appstream_usage"."appstream_applications" as t4
on t3.session = t4.session and t4.pid = -1
order by total_process_time desc limit 10

To analyze and visualize your usage reports, you can integrate Athena with a variety of BI tools, such as Amazon QuickSight or third-party tools.

Cleaning Up

Once testing is complete, if you no longer need this solution, delete all of the files from the DOC-EXAMPLE-BUCKET S3 bucket. Delete the AppStream 2.0 stack, fleet, and image builder that you created earlier. Delete the IAM role that you created. Delete the Athena database.

Conclusion

In this post, you deployed an application usage tracking solution for Amazon AppStream 2.0. For more information about AppStream 2.0, see What Is Amazon AppStream 2.0?. For additional information about Amazon Athena, see What is Amazon Athena?.

Author Bio

Vishal Lakhotia is a Senior Solutions Architect at Amazon Web Services focused on accelerating cloud adoption and helping customers achieve business outcomes on the AWS Cloud. He is a subject matter expert on End User Computing services and Edge Services. You can connect with him on LinkedIn.
Dylan Barlett is a Senior End User Computing Solutions Architect at AWS. He works with global commercial customers to deploy and optimize AWS EUC services. In his spare time, he enjoys traveling and home improvement projects.