AWS Compute Blog
Deploying a 4x4K, GPU-backed Linux desktop instance on AWS
Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services
AWS currently supports many managed desktop delivery mechanisms. Amazon WorkSpaces and Amazon AppStream 2.0 both deliver managed Windows-based machine images with GPU-backed instances. However, many desktop services and applications are better served through a Linux-backed instance. Given the variety of Linux distributions as well as desktop managers, it can be valuable to have a generic solution for provisioning a Linux desktop on Amazon EC2.
A GPU-backed instance reduces the computational requirements on the client (local) machine, eliminating the need for a local discrete GPU to run graphical workloads. The framebuffer objects generated by the GPU are compressed when sent over the network, and decompressed by the local CPU. This allows clients to take advantage of the server GPU and display the high-resolution content on local thin clients, mobile devices, and low-powered desktops and laptops. Such GPU-backed Linux instances have been used for VFX rendering, computational drug discovery, and computational fluid dynamics (CFD) simulation use cases. An upcoming follow-up post details enabling this technology on the Windows platform.
Configuration
In this configuration, a client machine connects to the provisioned desktop (server) in the cloud. The server captures the framebuffer, which is sent in real time to the client machine over the network. Thus latency is an important metric to consider when provisioning this solution. I recommend choosing the nearest AWS Region (under 100 ms). Some customers may even prefer to install AWS Direct Connect.
Region | Latency
---|---
US East (Virginia) | 18 ms
US East (Ohio) | 31 ms
US West (California) | 77 ms
US West (Oregon) | 97 ms
Canada (Central) | 29 ms
Europe (Ireland) | 89 ms
Europe (London) | 90 ms
Europe (Frankfurt) | 108 ms
Asia Pacific (Mumbai) | 197 ms
Asia Pacific (Seoul) | 198 ms
Asia Pacific (Singapore) | 288 ms
Asia Pacific (Sydney) | 218 ms
Asia Pacific (Tokyo) | 188 ms
South America (São Paulo) | 138 ms
China (Beijing) | 267 ms
AWS GovCloud (US) | 97 ms
Source: http://www.cloudping.info/ from the Amazon offices located in Herndon, VA
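To shortlist Regions against the 100 ms guideline, the table can be filtered programmatically. The following is a minimal sketch: the latencies are copied from the table (measured from Herndon, VA, so your own measurements will differ), and the function name `regions_within` is illustrative rather than part of any AWS tool.

```python
# Latencies in milliseconds, copied from the table above.
# They were measured from Herndon, VA; substitute your own measurements.
LATENCY_MS = {
    "US East (Virginia)": 18,
    "US East (Ohio)": 31,
    "US West (California)": 77,
    "US West (Oregon)": 97,
    "Canada (Central)": 29,
    "Europe (Ireland)": 89,
    "Europe (London)": 90,
    "Europe (Frankfurt)": 108,
    "Asia Pacific (Mumbai)": 197,
    "Asia Pacific (Seoul)": 198,
    "Asia Pacific (Singapore)": 288,
    "Asia Pacific (Sydney)": 218,
    "Asia Pacific (Tokyo)": 188,
    "South America (São Paulo)": 138,
    "China (Beijing)": 267,
    "AWS GovCloud (US)": 97,
}


def regions_within(latencies, threshold_ms=100):
    """Return (region, latency) pairs below the threshold, nearest first."""
    candidates = [(region, ms) for region, ms in latencies.items()
                  if ms < threshold_ms]
    return sorted(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    for region, ms in regions_within(LATENCY_MS):
        print(f"{region}: {ms} ms")
```

Lowering `threshold_ms` narrows the shortlist further if you need a tighter latency budget.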
Bandwidth requirements depend on the quality of the desktop experience as well as the desired resolution. Provision the backend Linux desktop instance with a 4096×2160 (4K) resolution. Depending on the specific G3 instance type selected, multi-GPU managed desktops give additional performance benefits. Each instance can also host multiple users, either in collaborative sessions, or with up to four independent 4K monitors. The GPU framebuffer memory used per session generally limits the number of sessions per managed desktop.
A smooth, reliable experience depends on a low-latency, high-bandwidth connection to the EC2 instance hosting the desktop. One of the benefits of using a multithreaded framebuffer reader is that only the block of the rendered desktop that is changing needs to be sent over the network. Full-screen redraws are necessary only in rare cases. The minimum requirements for this 4K (3840×2160) configuration are as follows:
- Bandwidth: 50 Mbps
- Latency: < 30 ms
- Jitter: < 5 ms
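To see why compression and partial updates matter, it helps to compare the uncompressed size of a 4K stream with the 50 Mbps minimum. The arithmetic below is a rough sketch that assumes 24 bits per pixel and a 60 Hz refresh rate. Both values are assumptions for illustration, and real traffic depends on the codec and on how much of the screen changes.

```python
def raw_bitrate_bps(width, height, refresh_hz=60, bits_per_pixel=24):
    """Uncompressed video bitrate in bits per second."""
    return width * height * bits_per_pixel * refresh_hz


if __name__ == "__main__":
    raw = raw_bitrate_bps(3840, 2160)
    link_bps = 50_000_000  # the 50 Mbps minimum listed above
    print(f"Raw 4K stream: {raw / 1e9:.2f} Gbps")
    print(f"Reduction needed to fit 50 Mbps: about {raw / link_bps:.0f}x")
```

Under these assumptions the raw stream is roughly 11.94 Gbps, so the protocol has to reduce the data by a factor of about 239 to fit within the minimum bandwidth.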
Deployment
Use RHEL/CentOS for the deployment. Except for DCV, this stack is compatible with Debian/Ubuntu distributions. Use the CentOS 7.5 Server AMI and install the NVIDIA/Xorg/KDE stack to create a fully functioning desktop environment with a maximum resolution of 16384×8640 (that is, 4x4K) at 60 Hz.
This stack contains the following software:
- CentOS 7.5 Base
- Xorg 1.19
- NVIDIA Grid Driver 6.1 (for the G3 instance family)
- KDE Desktop environment
- VirtualGL
- TurboVNC
- NICE DCV
To make the most efficient use of the NVIDIA Tesla M60 framebuffer memory, disable the compositing features of the desktop manager. Non-compositing desktop managers (such as XFCE or MATE) are supported as well. This ensures that the GPU is reserved for the application’s OpenGL API tasks, and that performance is not impacted by the desktop environment decorations.
Start up a CentOS 7.5 server desktop based on the latest AMI available in the closest Region:
Now install the Xorg stack with the KDE desktop manager:
Download the NVIDIA Grid driver (6.1). For more information, see Installing the NVIDIA Driver on Linux Instances.
Deposit the xorg.conf file in /etc/X11/xorg.conf:
Reboot again and check that the nvidia-gridd service is running. You may notice errors. They can be safely ignored after the nvidia-gridd service successfully acquires a license.
You can confirm that 4K resolution is enabled by running the following command:
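A scripted variant of this check is sketched below. It assumes the `xrandr` utility is installed and that the X server runs on display `:0`. Neither is confirmed by this post, and the helper names are illustrative. The parser relies on the "Screen" summary line that `xrandr` prints first.

```python
import os
import re
import subprocess

# Matches the summary line, e.g.
# "Screen 0: minimum 8 x 8, current 4096 x 2160, maximum 16384 x 8640"
SCREEN_RE = re.compile(r"current (\d+) x (\d+), maximum (\d+) x (\d+)")


def parse_screen_line(xrandr_output):
    """Extract the current and maximum screen size from xrandr text output."""
    match = SCREEN_RE.search(xrandr_output)
    if match is None:
        raise ValueError("no 'Screen' summary line found in xrandr output")
    cur_w, cur_h, max_w, max_h = map(int, match.groups())
    return (cur_w, cur_h), (max_w, max_h)


def is_at_least_4k(size, min_width=3840, min_height=2160):
    """True if a (width, height) pair covers at least a 3840x2160 desktop."""
    width, height = size
    return width >= min_width and height >= min_height


def query_xrandr(display=":0"):
    """Run xrandr against an X display (assumed to be :0) and return stdout."""
    env = dict(os.environ, DISPLAY=display)
    result = subprocess.run(["xrandr"], env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

On the instance, `parse_screen_line(query_xrandr())` returns the current and maximum sizes, and `is_at_least_4k` can then be applied to the current size.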
Finally, check that your underlying GL renderer is using the NVIDIA driver by querying glxinfo:
At the time of publication, OpenGL 4.5 is enabled. Your applications can take advantage of that API for rendering.
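If you want to automate the renderer check, the `glxinfo` text output can be parsed for its "OpenGL ... string:" fields. The following is a minimal sketch: the sample text is an illustrative stand-in rather than output captured from this deployment, and the helper names are hypothetical.

```python
# Illustrative stand-in for glxinfo output; not captured from a real instance.
SAMPLE = """\
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Tesla M60/PCIe/SSE2
OpenGL version string: 4.5.0 NVIDIA <driver version>
"""


def parse_glxinfo(text):
    """Collect the 'OpenGL <name> string:' fields from glxinfo output."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("OpenGL ") and " string:" in line:
            key, _, value = line.partition(" string:")
            # Drop the leading "OpenGL " so keys read "vendor", "renderer", ...
            fields[key[len("OpenGL "):].strip()] = value.strip()
    return fields


def uses_nvidia(fields):
    """True if the reported OpenGL vendor is NVIDIA."""
    return "NVIDIA" in fields.get("vendor", "")


if __name__ == "__main__":
    info = parse_glxinfo(SAMPLE)
    print(info.get("renderer"), "| NVIDIA driver in use:", uses_nvidia(info))
```

On the instance, you would pass the real `glxinfo` output to `parse_glxinfo` in place of `SAMPLE`. A software renderer would report a different vendor, and `uses_nvidia` would return `False`.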
To interact with the instance, install server-side desktop remote display software that can specifically take advantage of the 3D hardware acceleration. For example, AWS provides the NICE DCV platform.
DCV is an accelerated remote desktop framework that provides in-browser desktop connections. DCV is supported on both Windows and Linux (RHEL/CentOS). On the Windows platform, OpenGL and DirectX are fully supported. DCV entitlement is free when provisioning on AWS. NICE DCV is also provided as a component of the AWS EnginFrame and myHPC solutions.
To install DCV, download the NICE DCV 2017 EL7 archive and Administrative Guide. After you extract the archive on the instance, you see a list of nice-* RPMs. You don’t have to worry about licensing, as the installer detects that the instance is running in AWS.
When the DCV server starts, you have the option to create a single console session or multiple virtual sessions. You must assign a password for the CentOS user by running the following command:
Start the console session:
Make sure that the instance’s security groups allow inbound TCP traffic on port 8443. You then see the DCV login portal and can interact with the instance. Other popular frameworks include the following:
You can also find plug and play images for managed desktops in the AWS Marketplace.
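If the DCV login portal does not load, a quick reachability test of TCP port 8443 from the client can help separate security group problems from server problems. The sketch below uses only the Python standard library. The function name `port_open` is illustrative, and the host you pass in is whatever address your instance exposes.

```python
import socket


def port_open(host, port=8443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_open("<instance address>")` returns `False` when the port is filtered or when nothing is listening. It only tests TCP reachability and does not verify that the listener is actually DCV.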
Optimization
Implement the changes outlined in the Optimizing GPU Settings (P2, P3, and G3 Instances) topic. You can turn off the autoboost feature and set the maximum graphics and memory clocks manually.
Application testing
For testing, look at PyMOL (PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.). PyMOL is a standard commercial drug discovery application that is used for processing and visualizing biochemical structures. I used the open-source fork.
With the NVIDIA GRID licensing enabled earlier, PyMOL can take advantage of the Quadro features supplied by the Tesla M60. After it’s installed and loaded, you can confirm the functionality of the entire G3 instance software stack installed earlier:
In the PyMOL window, run “fetch 5ta3” to load a 39k amino acid protein under the 4K desktop environment. Rotating and translating the protein should be smooth and respond quickly to pointer events.
The PyMOL Gallery contains other representative examples that take advantage of various visualization and processing workflows. Also, you can find many demos (choose Wizard, Demo).
Under the Sculpting demo, you can show the pointer latency between the client and server.
Finally, look at ray tracing. From the PyMOL wiki, take a chemical structure and render each frame with ray tracing to produce a video. On the Tesla M60 with Quadro features enabled, the total render time was approximately 1 minute.
Scalability
As I mentioned previously, the framebuffer redirection protocols can create multiple virtual sessions per node. A virtual session is not necessarily tied to a single user, either. The number of independent virtual sessions is limited by the total GPU framebuffer memory that all sessions on a GPU consume. Thus, it’s possible to scale horizontally by increasing the number of G3 instances, or vertically by using larger instance types in the G3 family.
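The scaling trade-off can be estimated with simple arithmetic. The following is a rough sketch, not a sizing tool. It assumes 8 GiB (8192 MiB) of framebuffer memory per GPU and a hypothetical 2048 MiB consumed per session. Measure the real per-session figure for your application, because it varies widely.

```python
import math


def sessions_per_gpu(gpu_memory_mib, per_session_mib):
    """Whole sessions that fit in one GPU's framebuffer memory."""
    return gpu_memory_mib // per_session_mib


def instances_needed(total_sessions, gpus_per_instance,
                     gpu_memory_mib=8192, per_session_mib=2048):
    """Instances required, assuming a session never spans two GPUs."""
    per_instance = gpus_per_instance * sessions_per_gpu(gpu_memory_mib,
                                                        per_session_mib)
    if per_instance == 0:
        raise ValueError("a single session does not fit on one GPU")
    return math.ceil(total_sessions / per_instance)


if __name__ == "__main__":
    # Horizontal scaling: many single-GPU instances.
    print("1 GPU per instance:", instances_needed(10, gpus_per_instance=1))
    # Vertical scaling: fewer instances with four GPUs each.
    print("4 GPUs per instance:", instances_needed(10, gpus_per_instance=4))
```

With these assumed numbers, ten sessions need three single-GPU instances or one four-GPU instance.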
Summary
The G3 instance type is purpose-built to provide a managed high-end professional graphics infrastructure for visual computing needs. With NICE DCV, we can take advantage of NVIDIA Quadro software features for a range of applications including drug discovery and VFX rendering. This deployment also works with the new 8K and 12K monitors that are now commercially available. Connected with the AWS high-performance network backbone, the instance can become an integral part of your graphics workload pipeline. Now, you can power up and deliver your applications to teams working anywhere in the world.