AWS Spatial Computing Blog

Partner Spotlight: Building Cloud-First Enterprise XR Applications

This blog post explores how to develop and deploy cloud-first enterprise extended reality (XR) applications that bring together people, data, and AI. The StreamRoom AI example application from Innoactive demonstrates how to combine cutting-edge technology into a scalable, enterprise-grade, cloud-native solution for enterprise XR. These applications allow users to meet virtually with remote colleagues and to load and visualize 3D data, all without a PC.

A screenshot of the StreamRoom AI application, showing a photorealistic car within a group of four meeting participants and a virtual AI assistant.

Figure 1: StreamRoom AI is a cloud-native XR application that allows users to collaborate while using Standalone XR devices or simply a web browser

The Dream of Spatially Uniting People and Data

Daniel Seidl, CEO of Innoactive, summarizes the visions of enterprise customers:

  • Virtual meetings with remote individuals across the globe
  • The ability to load large 3D datasets from anywhere in the organization’s network
  • Customized meeting experiences to support specific use cases
  • Using reliable, affordable and portable Virtual Reality (VR) headsets (no PC, no wires)
  • Allowing non-VR users to join meetings via their browsers
  • Global accessibility with a single click
  • The ability to incorporate an intelligent virtual AI assistant in meetings

Today’s Reality: StreamRoom AI

After five years of effort, Daniel’s vision is now achievable. StreamRoom AI is an example application built on Unity and hosted on the Innoactive XR Streaming Platform. It demonstrates how advanced technology can be combined to create globally scalable, enterprise-grade XR applications using a cloud-native approach:

  • Streaming PC-grade graphics to Standalone VR using NVIDIA CloudXR
  • 3D/CAD asset run-time loading from AWS VAMS (visual asset management system)
  • Siemens JT Open Unity Toolkit library for importing JT files at run-time
  • Virtual collaboration using Photon Fusion
  • Personalized Avatars using ReadyPlayerMe
  • A ChatGPT-like AI assistant built with OpenAI API, Amazon Transcribe for speech-to-text, and Amazon Polly for text-to-speech

StreamRoom AI is a Windows-based Unity application using OpenXR, running on the Innoactive XR Streaming Platform, hosted on Amazon Web Services (AWS). Using AWS Local Zones and Amazon Elastic Compute Cloud (Amazon EC2) GPU instances (G4/G5), Innoactive offers a globally available, low-latency edge orchestration service that allows users to start a CloudXR session with one click.

Let’s break down the demo and dive into each core technology component that brought this virtual collaboration use case to life.

Streaming PC-Grade Graphics with NVIDIA CloudXR

While standalone VR devices are affordable and easy to use, many XR use cases require PC-grade graphics due to the complexity of the 3D data involved. NVIDIA CloudXR provides a seamless solution by allowing Windows-based PC VR applications to be rendered in the cloud and streamed to standalone VR devices. This approach eliminates the need for frequent application updates to support new VR devices on the market and enhances data security by preventing sensitive 3D data from being downloaded and rendered locally. Getting started with CloudXR is simple: the Innoactive XR Streaming Platform allows initiating an NVIDIA CloudXR session with one click.

Architectural overview of the NVIDIA CloudXR solution: A client/server system that allows PC VR applications to be streamed via various networks to XR devices

Figure 2: NVIDIA CloudXR is a client/server SDK that allows PC VR applications to be streamed via various networks to XR devices

Establishing a 3D Asset Database and Pipelines with VAMS

As most XR applications heavily rely on 3D assets, it is crucial to establish a scalable and manageable way to organize and access this data. The Visual Asset Management System (VAMS) is an open-source tool that serves as a central 3D asset database. It allows enterprises to configure 3D asset conversion pipelines on AWS infrastructure, providing the required 3D files to XR applications at scale.

VAMS is an enterprise-grade 3D asset database and pipeline orchestration system. It gives XR applications access to spatial data from various 3D/CAD asset sources, such as PLM systems, 3D scanners, and more. VAMS can also be configured to transcode and convert assets between formats by setting up pipelines and workflows that use third-party conversion tools. After setting it up and uploading assets, enterprise XR applications such as StreamRoom AI can load optimized 3D assets via the VAMS RESTful API. See how to deploy VAMS using the VAMS Workshop.
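As a rough sketch of what a client-side request against a VAMS deployment could look like, the snippet below (in Python for brevity; the production StreamRoom AI client is a Unity/C# application) composes an illustrative asset-download URL and auth header. The endpoint path, IDs, and token handling are assumptions for illustration only, not the actual VAMS routes; consult the API documentation of your VAMS deployment for the real ones.

```python
# Hypothetical sketch of preparing a VAMS asset-download call.
# Endpoint path and header layout are illustrative assumptions.
from urllib.parse import urljoin

def build_asset_request(base_url: str, database_id: str, asset_id: str, token: str):
    """Compose URL and auth header for an illustrative asset-download request."""
    url = urljoin(base_url, f"/database/{database_id}/assets/{asset_id}/download")
    headers = {"Authorization": f"Bearer {token}"}  # SSO-issued token, passed by the platform
    return url, headers

url, headers = build_asset_request(
    "https://vams.example.com", "vehicles", "ev-sedan-01", "token123")
print(url)  # https://vams.example.com/database/vehicles/assets/ev-sedan-01/download
```

The returned URL would then be fetched over HTTPS and the payload streamed into the runtime loader.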

An electric vehicle 3D asset shown in the GUI of VAMS

Figure 3: VAMS can be used to manage, convert and distribute various 3D assets to XR applications

Loading CAD Files with Siemens JT Open Toolkit

While many 3D assets are available in Unity-compatible formats, some valuable data still resides in PLM systems like Siemens Teamcenter or SAP Enterprise Product Development (EPD).

One way to load JT files from Teamcenter is to use the Siemens JT Open Toolkit. It comes with two components:

  • JTLoader, a lightweight Unity plugin that loads a JT file (in JSON representation) into a Unity scene at run-time.
  • A (lossless) JT-to-JSON converter that transcodes the JT file into a JSON representation, so it can be loaded by the JTLoader plugin.

To automate and orchestrate this process, developers can set up a VAMS pipeline: when a user uploads (or a service pushes) a new JT file to VAMS, it is automatically converted to a JSON representation. That file can then be downloaded into the Unity application on demand and imported with the JTLoader plugin.
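The pipeline decision described above can be sketched as a small routing function (shown in Python for brevity). The format lists and pipeline names below are illustrative assumptions, not the actual VAMS pipeline configuration.

```python
# Illustrative routing logic for a VAMS upload: decide whether a new asset
# needs conversion before the Unity client can load it at run-time.
# Pipeline names and format sets are assumptions for this sketch.

CONVERTIBLE = {".jt": "jt-to-json"}                       # JT goes through the JT-to-JSON converter
DIRECTLY_LOADABLE = {".json", ".glb", ".gltf", ".obj", ".fbx"}

def plan_pipeline(filename: str) -> str:
    """Return the conversion step for an uploaded asset, or 'pass-through'."""
    ext = "." + filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext in CONVERTIBLE:
        return CONVERTIBLE[ext]
    if ext in DIRECTLY_LOADABLE:
        return "pass-through"
    return "unsupported"

print(plan_pipeline("radial_engine.JT"))   # jt-to-json
print(plan_pipeline("note_board.glb"))     # pass-through
```

In a real deployment, this decision would be encoded in the VAMS pipeline configuration rather than in application code.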

SAP EPD customers can request a Unity integration toolkit (UIT) from SAP for the same purpose. For other 3D assets, StreamRoom AI uses the TriLib 2 library, which supports run-time loading of OBJ, FBX, GLB, glTF, and other formats. Combining multiple loader libraries allows StreamRoom AI to render almost any asset that VAMS can provide.

A screenshot of the Unity-based StreamRoom AI application which displays a radial engine model in JT format as well as another meeting participant

Figure 4: StreamRoom AI rendering a JT CAD file, loaded with Siemens JT Open Toolkit from VAMS

Enabling Virtual Collaboration with Photon Fusion and ReadyPlayerMe

To add collaboration, StreamRoom integrates Photon Fusion. This allows users to join a session, see each other in the virtual space, synchronize their actions, and talk to each other. To get started with Fusion and Industrial Metaverse use cases, Photon recently launched the “Industries Circle” offering, which provides 1:1 support and access to Metaverse-related Unity templates.

To make the VR meetings more lifelike, StreamRoom AI uses ReadyPlayerMe. Besides choosing from pre-built avatars, users can personalize their own avatar with ReadyPlayerMe Studio. The avatar is saved as a persistent user setting, so the application remembers it for the next session.

A collaborative scene showing meeting participants in an XR application built with the Photon Metaverse template

Figure 5: Photon Fusion Metaverse template, available through the Photon Industries Circle membership

Adding a ChatGPT-powered AI Assistant

The recent wave of AI services will change how users interact with XR applications. Rather than building a controller-based UI for spawning tools or interacting with notes, the StreamRoom application features an AI assistant that provides those tools on voice request. In the example application, the AI assistant, named Paula, responds to voice commands and can execute specific actions in the Unity application, such as handing a pen of a specific color to the user, or generating ideas on a given topic and putting them on notes.

To implement this, the OpenAI Chat Completion REST API was combined with Amazon Transcribe and Amazon Polly, using the AWS SDK for .NET:

  1. The user’s voice is transcribed into text using Amazon Transcribe
  2. The text is processed via OpenAI API Chat Completion
  3. A specifically designed prompt instructs the assistant to respond with structured JSON commands
  4. The StreamRoom application interprets those commands (speak, give pen, create note, load asset …)
  5. Finally, the commands are executed (e.g. blue pen is created) and the assistant’s voice is generated (via Amazon Polly voice synthesis)
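The command-interpretation step (step 4) can be sketched as a small dispatcher, shown here in Python for brevity (the production application implements this in Unity/C#). The JSON schema (an `action` field plus parameters) and the command names are assumptions for illustration; the actual schema is whatever the prompt in step 3 instructs the assistant to produce.

```python
# Minimal sketch of interpreting the assistant's structured JSON reply.
# Command names and fields ("action", "color", "text") are illustrative
# assumptions, not StreamRoom AI's actual command schema.
import json

def interpret(reply: str) -> list:
    """Turn the assistant's JSON reply into a list of executable actions."""
    actions = []
    for cmd in json.loads(reply):
        if cmd["action"] == "speak":
            actions.append(f"synthesize: {cmd['text']}")   # handed to Amazon Polly
        elif cmd["action"] == "give_pen":
            actions.append(f"spawn pen ({cmd['color']})")
        elif cmd["action"] == "create_note":
            actions.append(f"create note: {cmd['text']}")
        else:
            actions.append("ignore unknown command")       # fail safe on unexpected output
    return actions

reply = '[{"action": "speak", "text": "Here is your pen."}, {"action": "give_pen", "color": "blue"}]'
print(interpret(reply))
```

Constraining the model to a fixed command vocabulary and ignoring anything outside it keeps the assistant from triggering unintended behavior in the application.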

Paula, the ChatGPT-based AI assistant in the StreamRoom AI application, presents ideas written on meeting notes to the meeting participants. In the background, an industrial robotic cell is shown.

Figure 6: Paula, a ChatGPT-powered AI assistant can generate ideas, put them on notes and give them to the meeting participants

XR Cloud Streaming with Innoactive Portal

The next step is to make StreamRoom AI available to users – as seamlessly as possible. The Innoactive XR Streaming Platform renders the StreamRoom AI application in the cloud and streams it via NVIDIA CloudXR to any standalone VR headset or to a web browser. A StreamRoom meeting can now be started with one click, as all the underlying technology is orchestrated by Innoactive:

  1. Upload the application to the Innoactive Portal
  2. Install the Innoactive VR client application on the standalone headset (e.g., Meta Quest 2, Vive Focus 3, or Pico 4 Enterprise)
  3. A 6-digit code pairs the headset with the Innoactive Portal
  4. When a user clicks “Run in Cloud”, the platform selects a suitable Amazon EC2 instance at a nearby edge location, deploys the application, and initiates the NVIDIA CloudXR session
  5. Within seconds, the session is ready and the user is notified to put on the headset
  6. The user is in the VR meeting application and can talk to the other participants, all rendered in the cloud and streamed to the headset

Innoactive XR streaming website showing the rendered StreamRoom AI application in a collaborative car design review session. In the center there is an electric vehicle, surrounded by meeting participants and an AI assistant.

Figure 7: StreamRoom AI application running on Innoactive XR Streaming Platform, using an Amazon EC2 G4 instance at the AWS Region Europe (Frankfurt)

Authentication via SSO

To give users secure access, Innoactive supports Single Sign-On (SSO) via SAML, Google, OpenID Connect, or Azure AD. Additionally, Innoactive passes the user identity to the StreamRoom application so it can display the username above the ReadyPlayerMe avatar.

Innoactive Portal Single Sign-On configuration screen with options to activate and set up SSO with Azure AD, Google, OpenID Connect, and SAML

Figure 8: XR administrators can set up Single Sign-On in the Innoactive Portal organization settings

Sharing Meeting Links via Calendar Invite

The Innoactive XR Streaming Platform allows users to share links to applications that include custom parameters, such as the meeting ID or an asset ID that should be loaded at start. This allows StreamRoom meeting participants to join an XR meeting with one click, directly from a calendar invite. The link starts the CloudXR session and passes all parameters (including the user’s identity obtained through SSO) to the Unity or Unreal application. Developers can read these and other launch arguments at application start, set the Photon room name, and load the requested asset accordingly. Refer to the Innoactive Knowledge Base for how to use launch arguments.
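A minimal sketch of reading such launch arguments at startup is shown below, in Python for brevity (a Unity application would read them from the command line in C#). The argument names (`--meeting-id`, `--asset-id`, `--user-name`) are hypothetical; the real names are defined by the link parameters configured in the Innoactive Portal.

```python
# Sketch of parsing custom launch arguments at application start.
# Argument names are illustrative assumptions, not Innoactive's actual scheme.
import argparse

def parse_launch_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--meeting-id", default=None)   # used as the Photon room name
    parser.add_argument("--asset-id", default=None)     # VAMS asset to load at start
    parser.add_argument("--user-name", default=None)    # identity passed through SSO
    args, _unknown = parser.parse_known_args(argv)      # tolerate engine-supplied flags
    return args

args = parse_launch_args(["--meeting-id", "design-review-42", "--asset-id", "ev-sedan-01"])
print(args.meeting_id, args.asset_id)   # design-review-42 ev-sedan-01
```

Using a tolerant parser (`parse_known_args` here) matters because the platform and engine may append their own flags to the command line.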

A meeting invite in Google Calendar with an embedded link that guides users directly to the respective XR application hosted on Portal. The link contains the application ID, meeting ID, and asset ID.

Figure 9: Using Innoactive XR Streaming Platform, users can join a StreamRoom AI meeting directly from a calendar invite

Accessibility for non-VR Users

In addition to VR headset support, the Innoactive XR Streaming Platform allows users to stream the session to their browser. In StreamRoom AI, non-VR users can navigate the meeting using mouse and keyboard. The Innoactive XR Streaming Platform supplies launch arguments to the Unity app so it can automatically adapt to VR or non-VR mode.

Scaling CloudXR Applications Internationally with 5G, AWS Local Zones, and AWS Wavelength

To achieve low-latency XR streaming, two components are required: a nearby edge instance to render the application and a fast broadband connection (50+ Mbps). In cases where a wired broadband connection is unavailable, XR streaming performs well through native 5G public networks, which are becoming increasingly available.

AWS Local Zones play a crucial role in reducing latency by providing additional data center locations that are in close proximity to users. Currently, the Innoactive XR Streaming Platform utilizes over 60 edge locations, including 29 Local Zones. With these resources, it can now offer low-latency CloudXR sessions to users in various locations across the US, Europe, and many parts of Asia.

Furthermore, AWS Wavelength can reduce latency further by offering AWS compute and storage services within communications service providers’ (CSP) 5G networks, eliminating unnecessary network hops. AWS Wavelength is available through AWS and its partners Verizon (US) and Vodafone (EU), along with other partners worldwide.

As a final option, enterprises with large facilities can choose to invest in a local 5G private network and purchase dedicated local GPU compute clusters, allowing for connectivity and compute capabilities even in areas where public networks and resources are currently unavailable. This enables XR streaming to be accessible everywhere.

A world map of the AWS Local Zones and Regions that Innoactive utilizes.

Figure 10: World map of the AWS Local Zones and Regions that Innoactive utilizes for low-latency XR streaming

Powerful XR, accessible to everyone, everywhere

Through advancements in cloud technology and XR development, technology has reached a stage where powerful XR experiences are accessible to everyone, everywhere. By following the approaches and solutions presented in this blog post, enterprises can develop and deploy XR applications that enable seamless collaboration, data visualization, and AI integration. The combination of cloud infrastructure, streaming technologies, 3D asset management, and powerful AI assistants changes the way users work and interact in XR environments.

XR streaming via NVIDIA CloudXR plays a pivotal role in streamlining the deployment and accessibility of XR applications. The Innoactive XR Streaming Platform simplifies the Amazon EC2 instance orchestration, upload and deployment process, allowing users to get into XR application streams with a single click.

StreamRoom AI, powered by Innoactive Portal, showcases the potential of cloud-first XR development. With its extensive features, seamless integration of technologies, and ability to enhance collaboration, data visualization, and AI integration, Innoactive Portal empowers enterprises to build scalable, secure, and immersive XR experiences. To learn more about the Innoactive XR Streaming Platform and how it can revolutionize enterprise XR application deployment, visit Innoactive.

AWS Partner Spotlight

Innoactive is a leading provider of innovative XR software solutions. Their platform empowers organizations to create immersive training and simulation experiences that enhance learning and improve performance. With a focus on enterprise applications, Innoactive.io enables businesses to leverage the power of XR technology to streamline training processes, reduce costs, and drive operational efficiency.

Learn more about Innoactive on AWS »