AWS Spatial Computing Blog

VR Virtual Desktop Prototype on the Meta Quest

There is a lot of attention on virtual reality (VR) these days stemming from advances in hardware, collaboration on emerging standards, and availability of new software libraries, tools, and frameworks. What is most exciting is the invention and innovation that happens under these conditions.

Amazon Web Services (AWS) strives to enable the community, customers, and partners to bring new use cases and ideas to life through rapid prototyping and experimentation. It is invaluable to progress a theoretical discussion into something users can experience, turning speculation into real, actionable feedback. As a bonus, building something is an effective way to learn AWS services.

Meta is at the forefront of VR hardware and technology. In 2021, Meta unveiled support for 2D apps and Progressive Web Apps (PWAs), with a vision of an infinite office with unlimited screen space for multiple apps. Users can create virtual screens for 2D apps, freely resize and rearrange them, and multitask across them, and with the standalone power of their device, they can take their office anywhere. Combined with color passthrough and input support for a Bluetooth mouse and tracked keyboard, this lets users build an environment tailored for productivity.

Over the past few months, Josh Burns and I have been in touch with our co-author, Abhijit Srikanth (Partner Engineer at Meta), about how VR could play a part in enterprise productivity. The conversation went in a few different directions as we thought about different personas, their use cases, and ways of being productive in extended realities. VR empowers users to work in a 3D context, but users also rely on common desktop applications throughout the day. This led to the question: how can we bring our desktops to VR? Meta Quest 2 and Meta Quest Pro devices both support desktop streaming from computers on the local network. Could the desktop instead be a virtual desktop in the cloud, streamed to the headset? Virtual desktops in the cloud offer flexible resource configurations on demand. What would the ideal user experience for managing and accessing virtual desktops look like?

A prototype presented itself, and Meta was on board to explore the idea of blending the two worlds of cloud-based virtual desktops and VR. Using the 2D PWA technology on Quest, Josh and I set out to prototype a PWA for Quest to create, manage, and connect to virtual desktops in VR, removing barriers to productivity on the go. Here’s a demo video that shows what multitasking in VR could look like with a sneak peek of the remote desktop prototype –

Figure 1 shows a high-level architecture of the prototype we built. The rest of the article walks through aspects of the prototype, its supporting architecture, and why certain design decisions were made.

Figure 1: High level architecture

The VR virtual desktop prototype architecture is composed of a client application, an API created with Amazon API Gateway, a backend Amazon DynamoDB database, and AWS Secrets Manager to store desktop passwords. The client application authenticates with Amazon Cognito and uses the provided JWT to authorize access to API resources. The API backend uses AWS Lambda to carry out requests such as creating and managing virtual desktops.

Desktop streaming with NICE DCV

To address the core requirement of desktop streaming, we chose to use NICE DCV, a high-performance remote display protocol that provides a secure way to deliver remote desktops and application streaming to any device. This means customers can run compute and graphics demanding applications on remote Amazon Elastic Compute Cloud (Amazon EC2) instances and stream the pixels to lower powered clients. Additionally, NICE DCV streams to a web client or custom web apps built using React components from an accompanying Web UI SDK. This was ideal for the prototype because the Quest device supports Progressive Web Apps (PWAs).

NICE DCV also provides configurable security features to satisfy common enterprise requirements. While prototypes are considered non-production and meant to focus on the art of the possible, security is still a priority. A key point to note about NICE DCV is that it streams pixels, not geometry data or data files, so data remains on the server. Additionally, NICE DCV uses AES-256 TLS encryption to encrypt pixels and user inputs in transit between the client and server.

When establishing a session to a host, NICE DCV can be configured to integrate with an external authentication solution (see Figure 2). The external authentication configuration we implemented does two things: it authorizes remote session creation and confirms ownership of the target desktop. When a NICE DCV instance receives a connection request, it expects to be passed an auth token (a JSON Web Token, or JWT) from the client application. It then passes the auth token to an authentication service implemented with Amazon API Gateway (API Gateway) and AWS Lambda. The backend Lambda function validates the auth token with Amazon Cognito, which is the identity provider, the issuer of the token, and the user directory for the client application. Additionally, it compares user information within the JWT against virtual desktop ownership information stored in a backend Amazon DynamoDB user table. The important takeaway here is that NICE DCV gives you options to customize external authentication to your requirements.

Figure 2: NICE DCV session security

The client app passes the user's JWT as a query parameter when connecting to the NICE DCV host. The connection is TLS encrypted. Header configurations allow the DCV web client to render in an iframe. External authentication sends a POST message with the JWT to a custom authentication service behind Amazon API Gateway. The authentication service validates the user information contained in the JWT and checks whether the virtual desktop is owned by the user.
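To make the two checks concrete, here is a minimal Python sketch of the decision an authentication service like this could make. It is not the prototype's code. The names authenticate, verify_token, lookup_owner, desktop_id, and local_username are hypothetical, and the token validation and ownership lookup are passed in as plain functions so the sketch runs without AWS access. The XML reply shape is our reading of the NICE DCV external authenticator documentation and should be checked against the version you deploy.

```python
from typing import Callable, Mapping, Optional
from xml.sax.saxutils import escape

Claims = Mapping[str, str]


def build_auth_response(username: Optional[str], message: str = "access denied") -> str:
    """Render the reply for the DCV external authenticator (assumed XML format)."""
    if username is None:
        return f'<auth result="no"><message>{escape(message)}</message></auth>'
    return f'<auth result="yes"><username>{escape(username)}</username></auth>'


def authenticate(
    token: str,
    desktop_id: str,
    verify_token: Callable[[str], Claims],
    lookup_owner: Callable[[str], Optional[str]],
    local_username: str,
) -> str:
    """Allow the session only if the token is valid and its user owns the desktop.

    verify_token stands in for validating the JWT against the identity
    provider; it is expected to raise ValueError for a bad token.
    lookup_owner stands in for the ownership lookup in the backend table.
    """
    try:
        claims = verify_token(token)
    except ValueError:
        return build_auth_response(None, "invalid token")

    owner = lookup_owner(desktop_id)
    # "sub" is the unique user identifier claim in the JWT.
    if owner is None or owner != claims.get("sub"):
        return build_auth_response(None, "desktop not owned by user")

    # On success, DCV is told which local OS user the session belongs to.
    return build_auth_response(local_username)
```

Injecting the two lookups keeps the ownership rule easy to unit test; in a real Lambda function they would wrap calls to the identity provider and the database.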

Management API

API Gateway also serves as the management API for creating and managing the state of a virtual desktop – e.g., create, start/stop, and list provisioned desktops. When a user creates a virtual desktop from the client app, the API request invokes a backend Lambda function which launches and bootstraps an Amazon EC2 instance based on a NICE DCV AMI (Amazon Machine Image) from the AWS Marketplace. There are many variants of the AMI to support different operating systems, processors (e.g., x86, ARM), and Amazon EC2 instances (e.g., graphics-intensive G-series instances like the G5 or non-GPU instances like the C5). To simplify the choice for the end user, Josh and I created resource bundles that group OS and Amazon EC2 instance types for users to select from. These bundles were defined and stored in DynamoDB.
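As an illustration of the bundle idea, the sketch below models a bundle as a small record and turns a user's selection into the parameters an EC2 launch request takes. The bundle IDs, AMI placeholders, Owner tag, and the launch_parameters helper are all hypothetical, and an in-memory dictionary stands in for the DynamoDB table. The dictionary keys follow the EC2 RunInstances parameter names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bundle:
    bundle_id: str
    os_family: str      # "linux" or "windows"
    architecture: str   # "x86_64" or "arm64"
    instance_type: str
    ami_id: str         # placeholder, not a real AMI ID


# Hypothetical bundles; in the prototype these lived in DynamoDB.
BUNDLES = {
    b.bundle_id: b
    for b in (
        Bundle("general-linux", "linux", "x86_64", "c5.xlarge", "ami-PLACEHOLDER-LINUX"),
        Bundle("graphics-windows", "windows", "x86_64", "g5.xlarge", "ami-PLACEHOLDER-WINDOWS"),
    )
}


def launch_parameters(bundle_id: str, owner_id: str) -> dict:
    """Translate a bundle selection into EC2 RunInstances-style parameters."""
    bundle = BUNDLES.get(bundle_id)
    if bundle is None:
        raise KeyError(f"unknown bundle: {bundle_id}")
    return {
        "ImageId": bundle.ami_id,
        "InstanceType": bundle.instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Tagging the instance with its owner is an assumption made here
        # for illustration; the prototype tracked ownership in DynamoDB.
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": [{"Key": "Owner", "Value": owner_id}]}
        ],
    }
```

The user only ever picks a bundle ID; the OS, architecture, and instance type decisions stay on the server side.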

Instance bootstrapping

When Lambda launches a new virtual desktop for a user, the function submits a PowerShell or shell script depending on the OS as Amazon EC2 user data (see Run commands on your Linux instance at launch or Run commands on your Windows instance at launch in the Amazon EC2 documentation). These scripts configure a local user account and NICE DCV software settings. Local user passwords are stored in AWS Secrets Manager (Secrets Manager) and encrypted at rest, with keys managed by AWS Key Management Service (AWS KMS). Auditable access history and role-based access for secrets are additional security benefits of Secrets Manager.
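The OS-dependent choice of script format can be sketched as follows. This is a simplified illustration rather than the prototype's bootstrap code: render_user_data and the os_family values are hypothetical names, and the commands are supplied by the caller. It relies on two documented EC2 conventions: Linux user data scripts start with a shebang line, and Windows PowerShell user data is wrapped in powershell tags.

```python
from typing import Sequence


def render_user_data(os_family: str, commands: Sequence[str]) -> str:
    """Wrap bootstrap commands in the user data format expected for the OS."""
    body = "\n".join(commands)
    if os_family == "linux":
        # Shell scripts must begin with a shebang; fail fast on errors.
        return f"#!/bin/bash\nset -euo pipefail\n{body}\n"
    if os_family == "windows":
        # Windows instances run PowerShell placed between these tags.
        return f"<powershell>\n{body}\n</powershell>"
    raise ValueError(f"unsupported OS family: {os_family!r}")


if __name__ == "__main__":
    # Example: a single (illustrative) command that creates a local user.
    print(render_user_data("linux", ["useradd --create-home alice"]))
```

The resulting string would be passed as the user data of the launch request; depending on the client library, it may need to be base64-encoded first.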

For NICE DCV, the script does a few things: downloads and installs a valid certificate stored in Amazon Simple Storage Service (Amazon S3) to remove client security warnings (out of the box, NICE DCV comes with self-signed certificates); configures external authentication as described earlier; associates the default session with the newly created local user; and configures HTTP response headers to support embedding the NICE DCV web client in an iframe. More on this later.

AWS Systems Manager (AWS SSM), which lets customers apply automation runbooks and run commands on instances, was also considered. However, because user data runs at instance launch, we chose user data to minimize the time it takes to deliver a virtual desktop to a user once requested.

Infrastructure deployment

All backend infrastructure is provisioned with the AWS Cloud Development Kit (AWS CDK), a framework to define AWS infrastructure as code using common programming languages like TypeScript, JavaScript, and Python (the full list can be found in the AWS CDK FAQ). AWS CDK is a close relative to AWS CloudFormation (CloudFormation) because AWS CDK code synthesizes and deploys as CloudFormation templates, which allow for predictable, repeatable deployments with support for rollbacks and change sets.

AWS CDK libraries provide developers with constructs, each of which can be thought of as a component deployed on AWS. A construct could be a single AWS resource or a higher-level abstraction that combines related AWS services. These building blocks are grouped into a Stack, and one or more Stacks make up an AWS CDK App. Being able to apply programming concepts like loops and inheritance when expressing your AWS resources as code is incredibly powerful and saves a lot of time compared to building CloudFormation JSON or YAML templates by hand, especially when tens of lines of AWS CDK code can output a CloudFormation template hundreds of lines long.
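The leverage that loops give can be shown without AWS CDK at all. The sketch below is plain Python, not CDK code, and it does not reflect the prototype's actual stacks: a short loop emits one CloudFormation DynamoDB table resource per entry, and the rendered JSON is several times longer than the code that produced it. The table names and key names are hypothetical.

```python
import json

# Hypothetical tables: logical name -> partition key attribute.
TABLES = {"Users": "userId", "Desktops": "desktopId", "Bundles": "bundleId"}


def synthesize(tables: dict) -> dict:
    """Build a CloudFormation-style template with one table per entry."""
    resources = {}
    for name, key in tables.items():
        resources[f"{name}Table"] = {
            "Type": "AWS::DynamoDB::Table",
            "Properties": {
                "BillingMode": "PAY_PER_REQUEST",
                "AttributeDefinitions": [{"AttributeName": key, "AttributeType": "S"}],
                "KeySchema": [{"AttributeName": key, "KeyType": "HASH"}],
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


if __name__ == "__main__":
    rendered = json.dumps(synthesize(TABLES), indent=2)
    print(f"{len(rendered.splitlines())} template lines from {len(TABLES)} entries")
```

AWS CDK constructs go much further than this (defaults, permissions, cross-resource references), but the underlying idea of generating a long declarative template from short imperative code is the same.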

Client application

The prototype frontend is a React-based, progressive web application (PWA) converted to an APK using a Meta packaging tool (see PWA Tools and Packaging in the Meta Quest documentation). PWAs on Meta Quest are powered and run by the same rendering engine as the Meta Quest Browser, based on Chromium. The packaged app can be sideloaded (i.e., locally installed) on the Quest device using the Meta Quest Developer Hub (MQDH), making it available in the device’s app library for testing.

When developing the application, AWS Amplify makes it easy to use an existing Amazon Cognito user pool and identity pool. The application uses a Cognito user pool as the user directory for authentication and authorization to the management API (see Accessing resources with API Gateway and Lambda after sign-in). AWS Amplify libraries include an AuthClass to help keep the user’s token fresh to authorize API access. It is worth noting that Amplify can also provision backend resources. This prototype uses AWS CDK for all resource provisioning for consistency. The application is deployed with AWS Amplify hosting to make it accessible through Amazon CloudFront, a content delivery network.

The look and feel of the UI is based on Material UI. Another version of the app is under development that uses the Cloudscape Design System. If you work on UIs, check out the components overview to see what’s available in the library.

Rendering the desktop client

To render the NICE DCV session, there are a couple of options: embed the NICE DCV web client in an iframe or build a custom experience using the Web UI SDK. The iframe approach was the faster option during development. It requires additional HTTP response headers to be set on the NICE DCV host to allow the web client to be embedded in an iframe, as noted in the documentation. Specifically, the NICE DCV server settings that control these headers are web-x-frame-options and web-extra-http-headers; the latter is used to set a content security policy specifying the frame ancestor.
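For reference, a bootstrap script could generate those two settings with a small helper like the one below. This is a sketch under assumptions: the [connectivity] section name and the value syntax follow our reading of the NICE DCV documentation and should be verified against the server version in use, and iframe_connectivity_settings and parent_origin are hypothetical names. The origin shown in the example is a placeholder for wherever the client app is hosted.

```python
def iframe_connectivity_settings(parent_origin: str) -> str:
    """Render DCV server settings that allow embedding from one origin.

    parent_origin is the HTTPS origin of the page hosting the iframe.
    """
    if not parent_origin.startswith("https://"):
        raise ValueError("parent_origin must be an https:// origin")
    return "\n".join(
        [
            "[connectivity]",
            # Legacy header, still honored by some browsers.
            f'web-x-frame-options="ALLOW-FROM {parent_origin}"',
            # Content security policy naming the allowed frame ancestor.
            "web-extra-http-headers="
            f'[("Content-Security-Policy", "frame-ancestors {parent_origin}")]',
        ]
    )


if __name__ == "__main__":
    print(iframe_connectivity_settings("https://app.example.com"))
```

Restricting the frame ancestor to a single origin keeps the web client from being embedded by arbitrary pages.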

The Web UI SDK is a JavaScript React component library that interacts with the remote stream from the NICE DCV instance. This would be an option worth exploring if you wanted to integrate the NICE DCV stream into a custom web experience.

Demo Video

Conclusion

Overall, the prototype was a success and delivered a virtual desktop experience in VR. The final product had some rough edges, but it is a foundation to learn and iterate on; it was a prototype after all.

Here are some observations and follow-on questions from the prototype –

  • Bluetooth peripherals (keyboard, mouse) were essential for navigation and input (e.g., entering complex passwords). Could there be a better login experience when accessing apps and systems in VR? This is potentially another prototype for a VR optimized auth experience that could integrate with existing auth systems and identity providers.
  • Don’t have a huge monitor? No problem, the virtual desktop app window can be resized in your VR environment. It is quite nice to have the flexibility, but it also raises the question, is a full desktop needed? Would specific apps be a better user experience since VR offers so much more real estate to work in?

Prototypes open minds and show what is possible, and this project did that. It enabled testing of a novel approach to enterprise productivity. It raised questions about how to improve productivity as VR becomes more integrated into the future of work. If you have ideas, we would love to hear about them in the comments.