
Scale your Amazon AppStream 2.0 fleets

AppStream 2.0 customers have told us that they appreciate the ability to scale their fleets based on user demand. In AppStream 2.0, you can scale your application streaming for any number of users across the globe without purchasing, provisioning, and operating on-premises hardware or infrastructure. You pay only for the streaming resources that you use, plus a small monthly fee per authorized user.

This blog post describes the techniques that you can use to scale your AppStream 2.0 fleets. If you’re getting started with AppStream 2.0, we recommend using a Getting Started project before reading further. For more information, see Getting started with Amazon AppStream 2.0.

Using fleets in AppStream 2.0

An AppStream 2.0 fleet contains streaming instances launched with an image, instance type, domain, VPC, and scaling policies. The important points about a fleet:

  • Each streaming instance in a fleet supports a single streaming connection. The number of instances in a fleet at any time maps to the number of users that can be supported.
  • Each instance is based on the same image containing the same app catalog.
  • Instances in a fleet are non-persistent and are terminated after each use. When a user’s streaming connection ends, the instance that hosted it is terminated and replaced by a new instance to match the desired fleet size.
  • Instances that don’t receive a connection for about a day are automatically terminated and replaced.

A fleet can be created in on-demand or always-on modes. In on-demand mode, instances in the fleet are in a stopped state waiting for a connection. Once a streaming request is assigned to an instance, the instance is started. Then, a connection is established with the user making the request. It takes 1–2 minutes for a connection to start when using an on-demand fleet. Instances in a stopped state are charged a stopped fee, and instances to which there are connections are charged a running fee per hour. The stopped fee is the same across all instances in an AWS Region. The running fee changes based on instance type.

In always-on mode, instances in the fleet are in a running state waiting for a connection. Once a streaming connection is assigned to an instance, the connection is immediately started. It usually takes about 10 seconds for a connection to start using an always-on fleet. Instances in an always-on fleet are always charged a running fee per hour. The running fee changes based on instance type. For more information about instance pricing, see Amazon AppStream 2.0 pricing.
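The billing difference between the two modes can be sketched as simple arithmetic. The following is a minimal illustration, not a pricing calculator: the fee values are hypothetical placeholders, and the function names are invented for this example. Actual fees depend on the AWS Region and instance type.

```python
# Hypothetical hourly fees, for illustration only. Real values depend on
# the AWS Region and instance type (see Amazon AppStream 2.0 pricing).
RUNNING_FEE = 0.10   # assumed running fee per instance-hour
STOPPED_FEE = 0.025  # assumed stopped fee per instance-hour


def on_demand_hourly_cost(fleet_size: int, connected: int) -> float:
    """On-demand: connected instances pay the running fee, the rest pay the stopped fee."""
    if not 0 <= connected <= fleet_size:
        raise ValueError("connected must be between 0 and fleet_size")
    stopped = fleet_size - connected
    return connected * RUNNING_FEE + stopped * STOPPED_FEE


def always_on_hourly_cost(fleet_size: int) -> float:
    """Always-on: every instance pays the running fee, connected or not."""
    return fleet_size * RUNNING_FEE
```

With these assumed fees, an on-demand fleet costs less than an always-on fleet of the same size whenever some instances are idle. The trade-off is the longer connection start time.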

Scaling policies

Scaling policies determine the size of a fleet. You can automatically increase or decrease the size of the fleet by using utilization-based and schedule-based scaling policies. Scaling policies use fleet metrics as inputs for making changes to the fleet size. These metrics are collected in Amazon CloudWatch. These are the two key metrics that AppStream 2.0 uses:

  • Capacity Utilization – The percentage of instances in a fleet that are being used. Use this metric to scale your fleet based on usage of the fleet.
  • Available Capacity – The number of instances in your fleet that are available for user sessions. Use this metric to maintain a buffer in your capacity available for users to start streaming sessions.

For more information about the metrics emitted by AppStream 2.0, see Monitoring Amazon AppStream 2.0 resources.
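As a rough illustration of how the two metrics relate, the following sketch computes both from an instance count and a count of instances in use. It assumes that every provisioned instance is either in use or available, which ignores instances that are still being provisioned. The function names are made up for this example.

```python
def capacity_utilization(in_use: int, actual_capacity: int) -> float:
    """Percentage of fleet instances that are currently in use."""
    if actual_capacity == 0:
        return 0.0  # an empty fleet has nothing to utilize
    return 100.0 * in_use / actual_capacity


def available_capacity(in_use: int, actual_capacity: int) -> int:
    """Instances that are idle and ready for a new user session."""
    return actual_capacity - in_use
```

For a fleet of four instances with two in use, this gives a utilization of 50 percent and an available capacity of two.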

Scaling based on usage

You can define policies that scale the fleet capacity out or in, based on fleet usage. You measure usage by using either Capacity Utilization or Available Capacity.

These utilization-based automatic scaling policies operate between two fleet capacity boundaries:

  • Minimum capacity – The minimum size of the fleet. Scaling policies do not scale your fleet below this value. For example, if you specify 2, your fleet never has fewer than two instances available.
  • Maximum capacity – The maximum size of the fleet. Scaling policies do not scale your fleet above this value. For example, if you specify 4, your fleet never has more than four instances available.
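The effect of the two boundaries can be sketched as a clamp on whatever capacity a scaling policy requests. This is an illustrative model only. The function name is hypothetical, and it is not how the service is implemented.

```python
def clamp_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Keep a requested fleet size within the minimum and maximum capacity."""
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    return max(minimum, min(requested, maximum))
```

With a minimum of 2 and a maximum of 4, a request for one instance yields 2, and a request for six instances yields 4.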

You can create utilization-based scaling policies for your fleet by using the AWS Management Console, AWS SDK, or AWS Command Line Interface (AWS CLI). Begin by setting a minimum and maximum capacity for your fleet. You can enter this information in the AWS Management Console, on the Scaling Policies tab under AppStream 2.0 Fleets.

You can then create scale out policies to increase the fleet size when the user demand grows. Likewise, create scale in policies to decrease the fleet size when user demand drops. To create a new scaling policy, choose Add Policy and fill in the details. The following section shows how to create an example scale out policy.

Create an example scale out policy

Create a scale out policy by entering the following information:

  1. Policy name – A unique name for the scaling policy. For this example, it is set to default-scale-out.
  2. If – The fleet metric that triggers the scaling action, either Capacity Utilization or Available Capacity. For this example, the value is Capacity Utilization.
  3. Is – A condition that must be met to trigger a scaling action. For this example, capacity utilization > 75% is set as the trigger.
  4. Then add – The scaling action to be performed. You can either increase or decrease the fleet capacity by an absolute number of instances or as a percentage of the fleet capacity. For this example, two instances are added.
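The four fields above can be modeled in a few lines. The sketch below is a simplified, hypothetical representation of the example policy (utilization above 75 percent adds two instances). The class and function names are invented for illustration, and the maximum capacity is passed in as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleOutPolicy:
    name: str                 # "Policy name"
    threshold_percent: float  # the "Is" condition: utilization above this value
    instances_to_add: int     # the "Then add" action


def apply_scale_out(policy: ScaleOutPolicy, utilization_percent: float,
                    current_capacity: int, max_capacity: int) -> int:
    """Return the new fleet capacity after evaluating the policy once."""
    if utilization_percent > policy.threshold_percent:
        # Scaling never goes past the maximum capacity set for the fleet.
        return min(current_capacity + policy.instances_to_add, max_capacity)
    return current_capacity


default_scale_out = ScaleOutPolicy("default-scale-out", 75.0, 2)
```

For a fleet of two instances at 100 percent utilization and a maximum capacity of four, `apply_scale_out(default_scale_out, 100.0, 2, 4)` returns 4.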

As with the scale out policy, this example also uses a scale in policy, which reduces ActualCapacity when CapacityUtilization is low.

After these policies are set, they act on the AppStream 2.0 fleet as described in the following example. The accompanying charts plot ActualCapacity, the number of streaming instances available, in blue on the left axis. They plot CapacityUtilization, the percentage of capacity in use, in brown on the right axis. Both values change as streaming connections are created and ended.

At 15:06 on November 6, the ActualCapacity is 2 and CapacityUtilization is 0 percent. There are no streaming connections active.

At 15:44 on November 6, the CapacityUtilization has increased to 100 percent. There are two active streaming connections using the entire ActualCapacity (2 instances). This triggers the default-scale-out policy, which adds two more instances.

At 15:57 on November 6, the CapacityUtilization has decreased to 50 percent while the ActualCapacity has increased to four.

At 16:18 on November 6, one of the streaming connections has ended, and ActualCapacity has decreased from 4 to 3. With one active connection on three instances, CapacityUtilization is 33 percent.
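The percentages in this walkthrough follow directly from dividing active connections by ActualCapacity. The short sketch below replays the four observations. The connection counts at 15:57 and 16:18 are inferred from the stated percentages, and the names are illustrative.

```python
# (time, ActualCapacity, active connections) taken from the walkthrough.
timeline = [
    ("15:06", 2, 0),
    ("15:44", 2, 2),
    ("15:57", 4, 2),
    ("16:18", 3, 1),
]


def utilization_percent(connections: int, capacity: int) -> int:
    """CapacityUtilization rounded to the nearest whole percent."""
    return round(100 * connections / capacity) if capacity else 0


def summarize(observations):
    """Pair each timestamp with its computed utilization."""
    return [(time, utilization_percent(conn, cap)) for time, cap, conn in observations]
```

Calling `summarize(timeline)` reproduces the 0, 100, 50, and 33 percent values described above.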

You can set policies to both scale out (increase) and scale in (decrease) the number of instances in a fleet. For more information about utilization-based scale out and scale in policies, see Fleet Auto Scaling for Amazon AppStream 2.0.

Schedule-based scaling

You can create scaling policies that set a desired fleet capacity based on a time-based schedule. You can schedule policies to automatically increase or decrease the fleet size at a particular time of the day, within a given date range, or every set number of hours. Scheduled scaling policies for an AppStream 2.0 fleet can only be created or edited using the AWS SDK or the AWS CLI. For information about installing the AWS CLI, see Installing AWS CLI.

To create a scheduled action:

  1. Register a scalable target with the Application Auto Scaling service by using the register-scalable-target API operation. The scalable target is the AppStream 2.0 fleet resource whose capacity is adjusted. Perform this action only once.
    $>aws application-autoscaling register-scalable-target --service-namespace appstream --resource-id fleet/sample-fleet --scalable-dimension appstream:fleet:DesiredCapacity
  2. Create a scheduled action against the AppStream 2.0 fleet by using the put-scheduled-action API operation.
    $> aws application-autoscaling put-scheduled-action --service-namespace appstream --scheduled-action-name ExamplePolicy --resource-id fleet/sample-fleet --scalable-dimension appstream:fleet:DesiredCapacity --scalable-target-action MinCapacity=2,MaxCapacity=5 --schedule <cron-expression> [--start-time <start-time>] [--end-time <end-time>]

    The following parameters are part of put-scheduled-action:

    1. service-namespace – The name of the service whose resources are scaled. This should be: appstream.
    2. scheduled-action-name – The name of the scheduled action. In this example, it is set to ExamplePolicy.
    3. resource-id – The name of the AppStream 2.0 fleet. This should be set to fleet/sample-fleet. Replace sample-fleet with your fleet name.
    4. scalable-dimension – The AppStream 2.0 fleet capacity that you want to set. The value should be set to appstream:fleet:DesiredCapacity.
    5. scalable-target-action – The scaling action to be performed. Set the desired values for minimum and maximum capacity for the fleet.
    6. schedule – The recurring schedule on which the scaling action runs. This can be a cron expression.
    7. start-time and end-time – The optional date range in which the scheduled action is active.
  3. Review the scheduled actions associated with the fleet by using the describe-scheduled-actions API operation.
    $>aws application-autoscaling describe-scheduled-actions --service-namespace appstream --resource-id fleet/sample-fleet


For more information about these API operations, see Application Auto Scaling CLI Reference or Application Auto Scaling API reference.
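If you generate several scheduled actions, it can help to build the command line programmatically instead of editing long commands by hand. The following sketch only assembles the argument list for put-scheduled-action and prints it. It does not call AWS. The helper name is hypothetical, and the schedule string is passed through unchanged, so you remain responsible for supplying a valid expression.

```python
import shlex
from typing import List


def put_scheduled_action_args(fleet_name: str, action_name: str, schedule: str,
                              min_capacity: int, max_capacity: int) -> List[str]:
    """Assemble the AWS CLI arguments for one scheduled action on a fleet."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity cannot exceed max_capacity")
    return [
        "aws", "application-autoscaling", "put-scheduled-action",
        "--service-namespace", "appstream",
        "--resource-id", f"fleet/{fleet_name}",
        "--scalable-dimension", "appstream:fleet:DesiredCapacity",
        "--scheduled-action-name", action_name,
        "--schedule", schedule,
        "--scalable-target-action",
        f"MinCapacity={min_capacity},MaxCapacity={max_capacity}",
    ]


# "<cron-expression>" is a placeholder, exactly as in the command above.
args = put_scheduled_action_args("sample-fleet", "ExamplePolicy", "<cron-expression>", 2, 5)
print(shlex.join(args))  # shell-quoted command, ready to review before running
```

Building the list first and quoting it with `shlex.join` avoids quoting mistakes in the schedule expression, which contains spaces and parentheses.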

Schedule-based scaling example

Example Corp. is a software vendor that wants to use AppStream 2.0 to deliver online trials of their desktop application to a browser. Any customer can visit their website, register for an account, sign in, and start a trial. For this scenario:

  1. Example Corp. wants to provide users with instant access to their application without any wait time. To do this, they use an always-on fleet.
  2. Example Corp. expects the following usage patterns from their customers:
    1. Weekdays – 8:00 to 20:00 – 50 to 100 users
    2. Weekdays – 20:00 to 00:00 – 25 to 50 users
    3. Weekdays – 00:00 to 8:00 – 10 to 25 users
    4. Weekends – Throughout – 10 to 25 users
    5. Auto scale based on demand to accommodate demand spikes during the time schedules.
  3. Their scheduled actions would be as shown in the following example. Remember that the schedule times passed to the automatic scaling API calls are in UTC. Convert your schedules to UTC before making the API calls.
    # Register the fleet as the scaling target with the application autoscaling service
    $>aws application-autoscaling register-scalable-target --service-namespace appstream --resource-id fleet/samplefleet --scalable-dimension appstream:fleet:DesiredCapacity
    # Scheduled actions 
    $>aws application-autoscaling put-scheduled-action --service-namespace appstream --resource-id fleet/samplefleet --scheduled-action-name Policy1 --schedule="cron(0 8 ? * MON-FRI *)" --scalable-target-action MinCapacity=50,MaxCapacity=100 --scalable-dimension appstream:fleet:DesiredCapacity
    $>aws application-autoscaling put-scheduled-action --service-namespace appstream --resource-id fleet/samplefleet --scheduled-action-name Policy2 --schedule="cron(0 20 ? * MON-FRI *)" --scalable-target-action MinCapacity=25,MaxCapacity=50 --scalable-dimension appstream:fleet:DesiredCapacity
    $>aws application-autoscaling put-scheduled-action --service-namespace appstream --resource-id fleet/samplefleet --scheduled-action-name Policy3 --schedule="cron(0 0 ? * MON-FRI *)" --scalable-target-action MinCapacity=10,MaxCapacity=25 --scalable-dimension appstream:fleet:DesiredCapacity
    $>aws application-autoscaling put-scheduled-action --service-namespace appstream --resource-id fleet/samplefleet --scheduled-action-name Policy4 --schedule="cron(0 0 ? * SAT-SUN *)" --scalable-target-action MinCapacity=10,MaxCapacity=25 --scalable-dimension appstream:fleet:DesiredCapacity
    # List the scheduled actions associated with the fleet
    $>aws application-autoscaling describe-scheduled-actions --service-namespace appstream --resource-id fleet/samplefleet

    These scheduled actions set the minimum and maximum capacity of the fleet to the desired values. Between the minimum and maximum values, Example Corp. can also layer utilization-based scaling policies to scale out or scale in their AppStream 2.0 fleet based on user demand. To add the utilization-based scaling policies, they can use the AWS Management Console.
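Because schedule times must be given in UTC, a local schedule needs converting before you write the schedule expression. The sketch below converts a local weekday and time to UTC. It assumes a fixed UTC offset, so it ignores daylight saving time, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(weekday: int, hour: int, minute: int, utc_offset_hours: float):
    """Convert a local weekday and time to UTC.

    weekday uses 0 for Monday through 6 for Sunday.
    Returns a (weekday, hour, minute) tuple in UTC.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # 2024-01-01 was a Monday, so adding `weekday` days lands on the right day.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=local_tz) + timedelta(days=weekday)
    utc = local.astimezone(timezone.utc)
    return utc.weekday(), utc.hour, utc.minute
```

Note that the weekday can shift during conversion. For example, with an assumed offset of −8 hours, Friday 20:00 local time becomes Saturday 04:00 UTC, so a local MON-FRI schedule does not always map to MON-FRI in UTC.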


This post provided techniques for scaling your AppStream 2.0 fleets using utilization-based scaling and schedule-based scaling policies. For more information about the topics in this post, see:

  1. Monitoring Amazon AppStream 2.0 Resources
  2. Fleet Auto Scaling for Amazon AppStream 2.0
  3. Application Auto Scaling service – CLI reference
  4. Application Auto Scaling service – API reference