AWS HPC Blog

Putting bitrates into perspective

Recently, we talked about the advances NICE DCV has made to push pixels from cloud-hosted desktops or applications over the internet even more efficiently than before. Since we published that post on this blog channel, we’ve been asked by several customers whether all this efficient pixel-pushing could lead to outbound data charges moving up on their AWS bill. These are the “data-out” fees you see on your bill each month. They’re metered on the data flowing out of the cloud across the internet, and are typically quite small, often falling into the free tier.

Usually, the best answer to any question you might have about the cloud is “just give it a try”, since the cost of experimentation is small. Because we heard this question from several customers in a short space of time, we decided to try it on your behalf, and share the details with you in this post. The bottom line? The charges are unlikely to be significant unless you’re doing the kind of intensive streaming that gamers do, and there are easier optimizations (like EC2 Instance Savings Plans) that will have more impact.

Background

You might recall that – using a new transport called QUIC (RFC 9000) – DCV is able to mask even more of the effects of distance, so end-users running complex and graphically-intensive applications in the cloud feel like they’re just across campus from the data center. They don’t see “buffering” messages, and the video stream doesn’t stall when there’s ad hoc congestion on the internet somewhere.

The kinds of things that impact streaming performance vary. Latency, bandwidth, packet rates, and reliability all factor into whether a user will notice that the connection between their desktop and the server is anything less than perfect. But these are network supply-side factors. The demand side is about how many pixels we try to push down a connection of varying (and probably unpredictable) integrity. DCV works to optimize this equation by only moving pixels from the parts of the screen that have changed, and by retransmitting fragments of frames lost to dropped packets only when necessary.

The combination of these kinds of optimizations is what lets us continuously innovate ahead of voracious pixel-generating industries, like gaming or live streaming. You can reasonably expect to stream 4K gaming content at up to 60 frames per second (FPS) over a decent domestic-grade internet connection to your house. In the broader scheme of things, this is amazing, and it relies on many technology advances in the last 20 years that far outstrip the growth in network bandwidth, which you might have assumed was the primary factor.

Predicting data-transfer charges in advance sometimes feels like guesswork for many customers using the cloud for the first time. That’s because in a traditional, on-premises environment you pay a single, up-front (and often quite large) fee to have an always-on internet connection for your whole data center. It costs you money whether you’re using it or not, and you must know many months (or years) in advance how fat that pipe needs to be to satisfy all your users (and you probably never will). The cloud was built to reinvent all of that, and in doing so the mantra ‘pay only for what you use’ became the operating principle. If you’re wondering why we don’t have a data-in charge: it’s got a lot to do with the fact that data movement over the internet is an incredibly lop-sided equation. Just measuring data-out pretty much covers it.

Our setup

Given that DCV is used by a diverse set of customers, we needed to simulate several environments to make sure we weren’t misrepresenting anyone’s usage pattern. We settled on testing three screen resolutions: 1024×768 (Standard Definition, or SD), 1920×1080 (High Definition, or HD), and 3840×2160 (4K) – a spread of more than 10x in pixel count from one end of that range to the other. As you scale through that range, however, you’ll find that applications and GPU boards capable of pushing 4K are also likely running at higher framerates – including 60 FPS for the most intensive scenarios.
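To put those resolutions in perspective, here’s a quick back-of-the-envelope calculation (ours, purely for illustration; it isn’t part of the test data) of the pixel counts involved, and the raw bandwidth a 24-bit, 60 FPS stream would need if no compression were happening at all:

# Pixel counts and raw (uncompressed) bandwidth at each resolution we tested.
# Purely illustrative arithmetic; the raw figures are an upper bound, not
# anything DCV actually sends.
RESOLUTIONS = {
    "1024x768 (SD)":  (1024, 768),
    "1920x1080 (HD)": (1920, 1080),
    "3840x2160 (4K)": (3840, 2160),
}

for label, (width, height) in RESOLUTIONS.items():
    pixels = width * height
    # 24 bits per pixel at 60 frames per second, with no compression at all
    raw_gbps = pixels * 24 * 60 / 1e9
    print(f"{label:15s}: {pixels / 1e6:5.2f} Mpixels, ~{raw_gbps:5.2f} Gbit/s uncompressed at 60 FPS")

No protocol ever ships anything like those raw figures, of course – the gap between them and the bitrates we measured is exactly the headroom that compression and changed-region tracking exploit.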

Those scenarios had to vary, too. Our starting point was a simple document-editing or slide-preparation session using Microsoft Office. Next, we simulated a CAD/CAE environment using Paraview to manipulate a complex 3D structure undergoing fluid dynamics analysis. Finally, we stressed everything (including our home broadband connections) by streaming highly-animated 4K content from YouTube, along with some game benchmarks that are widely used in the industry to punish GPUs (we used Heaven and Superposition in our tests).

The advantage of the game benchmark is that it simulates the kind of action frequently seen in a game environment – tens or hundreds of objects are moving at the same time in all directions in some thrilling moment of the adventure.

As a baseline, we ran all our tests on an Amazon EC2 g4dn.xlarge instance, which has a single NVIDIA T4 GPU, 4 vCPUs and 16 GB of RAM. Your choice of instance should match the intensity of the graphics performance you need in your application. It’ll also change if you’re sharing the GPU between multiple users or streams, which you can do with DCV. You might do this if you’re running a video streaming service rather than an engineering design company. You can see our results below and make your own judgements about how you depart from this baseline.

Our tests

We ran our tests several times for each scenario, and for long periods (15 mins to more than an hour) to make a reasonable assessment of what an hour’s consumption would look like. We used DCV’s own built-in streaming-mode monitor (which you can see in Figure 1) to check what the peak and average usage was.


Figure 1: DCV’s Streaming Mode panel, which shows framerates, latency, and bandwidth consumption figures.

We normalized the average bitrates over the session times to hourly rates, to make it easier to estimate what sustaining the behavior over a full working day would look like. It’s unlikely that this exactly matches your scenario. Every workload is different – and very few slide creators or design engineers are going to sit at their desk driving the application for 8 hours straight without a bathroom break. That means these estimates are likely high-water marks. Nonetheless, these were our assumptions, and knowing them should help you assess how well the results apply to the situation you’re comparing them to in your company.
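If you want to run the same normalization on your own measurements, the arithmetic is simple. Here’s a minimal sketch in Python (the bitrates in it are placeholders we picked for illustration, not our measured values):

def mbps_to_gb_per_hour(avg_mbps: float) -> float:
    """Convert an average bitrate in megabits/second to gigabytes/hour."""
    bits_per_hour = avg_mbps * 1_000_000 * 3600   # megabits/s -> bits per hour
    return bits_per_hour / 8 / 1_000_000_000      # bits -> decimal gigabytes

# Placeholder bitrates for illustration only -- not our measured values.
for label, mbps in [("office app, HD", 1.5), ("CFD viz, HD", 4.0), ("4K streaming", 9.0)]:
    gb_hour = mbps_to_gb_per_hour(mbps)
    print(f"{label:14s}: {gb_hour:4.1f} GB/hour, {gb_hour * 8:5.1f} GB per 8-hour day")

Swap in the average bitrate you read from the DCV streaming monitor to see what a full working day of your own workload would move.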

If you want to visually inspect the scenarios to get a better feeling for the framerates and image movement, you can watch them yourself and listen to our discussion on the HPC Tech Shorts channel. Hopefully you’ll see a scenario you recognize from your own workplace and patterns, and can use our numbers and math as the basis for your own estimates.


Figure 2: We showed the scenarios we tested in our HPC Tech Shorts channel, which you can watch online at hpc.news/techshorts.

Results

The results are shown in Figure 3. In each scenario we tested, we turned the average bitrate into an hourly figure and multiplied it by the current data-out charge for the us-east-1 AWS Region to arrive at an hourly cost for data transmission over the internet. Then, we compared this hourly cost to the hourly price of the g4dn.xlarge instance we chose, so we could evaluate the data movement charges as a percentage of the overall charges for the workload, to put them in context. You’ll see why this matters when you look at the table.


Figure 3: Test results across a range of scenarios, resolutions and frame rates. Data consumption only really becomes significant (>50% above the instance cost) when aggressively streaming gaming content.
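If you’d like to reproduce the arithmetic behind those percentages with your own numbers, the calculation is straightforward. The sketch below uses a data-out rate and a g4dn.xlarge On-Demand price that were roughly the published us-east-1 figures at the time of writing (treat both as assumptions and check the current pricing pages), plus a few illustrative bitrates:

# Both prices are assumptions for illustration (roughly the published
# us-east-1 figures at the time of writing) -- check the current pricing pages.
DATA_OUT_PER_GB = 0.09      # USD per GB out to the internet (first paid tier)
INSTANCE_PER_HOUR = 0.526   # USD per hour, g4dn.xlarge On-Demand

def data_cost_share(avg_mbps: float) -> str:
    gb_per_hour = avg_mbps * 3600 / 8 / 1000            # Mbit/s -> GB per hour
    data_cost = gb_per_hour * DATA_OUT_PER_GB           # USD per hour for data-out
    uplift_pct = data_cost / INSTANCE_PER_HOUR * 100    # % on top of the instance price
    return (f"{avg_mbps:4.1f} Mbps -> {gb_per_hour:4.2f} GB/hour, "
            f"${data_cost:.3f}/hour data-out, {uplift_pct:4.1f}% uplift")

# Illustrative bitrates, chosen to land near the ranges we saw in testing.
for mbps in (1.5, 4.0, 9.0):
    print(data_cost_share(mbps))

Substitute your own measured bitrate, Region pricing, and instance type to get a figure for your workload.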

The impact of pixel streaming on overall costs is trivial for document editing, slide preparation, and even CFD visualization – in short, the interactive, desktop-oriented workloads. Even for HD resolutions – the most impactful amongst this class of cases – the data charges would only lift the overall cost of the EC2 instance by ~9-13%. Keep in mind, this assumes the operator sits at her desk hour after hour, constantly stimulating the application to keep the pixels in motion. That’s just not practical, and any normal human would eventually need some time away from the screen to think about the design problem while they rummage around in the kitchen making coffee. In short: if the application you’re streaming is an interactive, desktop-oriented one like these, then you can afford to offer your staff HD monitors and the costs will be negligible (in fact, you’ll get greater savings from optimizing the instances you’ve chosen, or considering savings plans). You should also use QUIC as your transport, because the interactivity it enables is vital for these kinds of activities, and it didn’t impact the bandwidth consumption at all.

If the workload you’re building for, however, involves video streaming, or perhaps a significant degree of over-subscription (multiple users running desktops and sharing a GPU, which DCV enables) then data movement charges begin to nibble at your overall bill. In these cases, the uplift ranges from 20-30% above the basic instance price, becoming most significant (a 67% uplift) when aggressively streaming in 4K resolution.

But it’s gaming where we really saw the impact: starting at a 45% uplift to the basic instance charge, and eventually peaking at 60-80%. Given the network congestion we experienced on the day of testing, QUIC made for a higher bandwidth experience than TCP, leading to higher data consumption (on our other testing days we didn’t see such a contrast). If you’re a game company offering 4K at high frame rates (30-60 FPS) to your customers, then the judgement is what you probably knew already: you owe it to yourself – and your customers – to do a Proof of Concept (POC) with your workload. Any data here is only a sketch outline of what you might expect, and your mileage will vary a lot. Chances are you’ll be using a lot of instances for way more than 8 hours a day and streaming multiple sessions off each one. So many things will factor into your overall costs that you are your own benchmark at this point.

Conclusions

DCV is a highly efficient pixel streaming protocol. It makes fast and accurate decisions about which regions of the screen to send down the pipe, and it makes efficient use of the pipe itself, to the point of compensating for the inadequacies of the unpredictable medium that is the internet. Of the wide range of scenarios we tested, most generated such small amounts of actual traffic that the charges incurred are negligible – at least compared to the charges for the instances themselves (and not accounting for licensed applications, which would only emphasize this point). However, if you’re planning to aggressively stream 4K ultra-high-definition video, and certainly if you’re going to do so at the high frame rates that gamers expect, you should do the experiment yourself, since the data we present here is only a guide.

Luckily, doing that experiment is easy: you can find a lot of resources for getting started on the DCV home page, including a video tutorial to step you through the process. There are even AMIs in AWS Marketplace with DCV server pre-installed.


Brendan Bouffler

Brendan Bouffler is the head of Developer Relations in HPC Engineering at AWS. He’s been responsible for designing and building hundreds of HPC systems in all kinds of environments, and joined AWS when it became clear to him that cloud would become the exceptional tool the global research & engineering community needed to bring on the discoveries that would change the world for us all. He holds a degree in Physics and an interest in testing several of its laws as they apply to bicycles. This has frequently resulted in hospitalization.


Jyothi Venkatesh

Jyothi Venkatesh is an HPC Solutions Architect at AWS focused on building optimized solutions for HPC customers in different industry verticals, including Health Care Life Sciences and Oil & Gas. Prior to joining AWS, she spent close to 10 years in HPC, both as a software engineer working on parallel I/O and contributing to OpenMPI, and as a systems engineer at Dell leading the engineering development of the Lustre storage solution for the HPC storage portfolio. She holds an M.S. in Computer Science from the University of Houston.