AWS Media Blog

How-to: Create a Virtual Trainer With Amazon Sumerian Virtual Reality and Amazon Machine Learning – Part 2

Guest post by Chirag Oswal and Saurabh Shrivastava, AWS Partner Solution Architects

In part 1 of this blog series, you learned how to build the service environment in support of an Amazon Sumerian scene using key AWS services such as Amazon Lex, Amazon Cognito, Amazon DynamoDB, and AWS Lambda. In part 2, you will learn how to create your virtual trainer using Amazon Sumerian and explore important components such as Sumerian entities, the State Machine, host behavior, and the Speech and Script components.

By the end of this post, you will have your own virtual trainer ready to teach about the AWS services used in this post. You can interact with the AWS trainer by opening the Amazon Sumerian scene below in the latest version of Chrome or Firefox. This post will teach you how to build that scene.

Learn AWS Services with Virtual Trainer

Creating Your Own Virtual Trainer Experience with Amazon Sumerian:

Let’s get started on creating your own virtual trainer scene.

Navigate to the Sumerian Dashboard from the AWS Console and choose an empty scene template from the Create scene from template section. Give your scene a name and choose Create. Here we are naming it “VirtualTrainer”, as shown in the screenshot below:

Open the Asset Library by choosing Import Assets above the canvas.

In the left panel, import the zip file “virtualtrainerdemo-bundle-sumerian.zip” provided in the StarterPack by clicking the Browse icon, as shown below:

After importing, your Sumerian editor will look like this:

To access other AWS services from Sumerian, you need to configure a Cognito Identity Pool ID. With the project name selected in the Entities panel, expand the AWS Configuration component in the inspector panel. Insert the Cognito Identity Pool ID that you noted while setting up the required AWS services using AWS CloudFormation. Refer to Configuring AWS Credentials for Your Amazon Sumerian Scene for more details.
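A mistyped or truncated pool ID is an easy mistake at this step. The following is a minimal sketch of a sanity check you could run before pasting the value, assuming the ID follows the usual "<region>:<UUID>" shape. The helper name pool_region and the sample ID are hypothetical and are not part of the StarterPack.

```python
import re

# Assumption: an identity pool ID looks like "<region>:<UUID>",
# for example "us-east-1:12345678-90ab-cdef-1234-567890abcdef".
_POOL_ID = re.compile(
    r"(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+):"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def pool_region(pool_id: str) -> str:
    """Return the region prefix of a pool ID, or raise ValueError."""
    # Strip stray whitespace that often sneaks in when copying from a console.
    match = _POOL_ID.fullmatch(pool_id.strip())
    if match is None:
        raise ValueError(f"unexpected identity pool ID format: {pool_id!r}")
    return match.group("region")
```

The region returned by the check should match the region in which you created the CloudFormation stack in part 1.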

To have the host look in the direction of the audience, use the fixed camera as the main camera. Once the scene is loaded, select the “Fixed Cam” entity and navigate to the inspector panel. Under the Camera component, select the Main Camera checkbox as shown below. To learn more about lights and cameras, refer to the Camera and Light Tutorial.

With the Luke entity selected, navigate to the inspector panel and make sure the Point of Interest and Dialogue properties for the Sumerian host are set as shown below:

In the Dialogue setting, you configure the Amazon Lex bot that the Sumerian host will use for conversation.

Explore State Machine Behavior

A State Machine is the system used to build behaviors, which are changes to entities initiated by a user. Learn more with the State Machine Basics Tutorial.

State Machine behaviors are already configured on the entities below. In this section, we will explore them and become familiar with the behavior of each entity.

Let’s start with the main state machine behavior, which is associated with the Sumerian host Luke. Select the Luke entity and go to the State Machine section in the inspector panel. There, click edit on the behavior named “SceneFlow”, as shown in the screenshot below:

Since this is a large state machine, we will explain it in parts.

State Machine Start section:

In this section, you will learn about the initialization part of the state machine, as shown in the screenshot below:

Start State: This is the start state of our Scene Flow, which is configured to wait for 2 seconds before the host starts speaking.

Intro Speech: This state plays a speech file that introduces today’s lecture. You will find multiple speech files configured here to explain each part of the training. Learn more about the speech component in this Amazon Sumerian tutorial.

Emits ShowAwsDiagram: This state emits the message “ShowAwsDiagram”, which brings the AWS diagram onto the screen. This state also has a speech file that gives instructions on how to proceed to the next steps. Learn more about Emit and Listen events in the Amazon Sumerian Tutorial.

State Machine Listen events section:

Consider this as the heart of our state machine. All the click-related events happen in this part of the state machine.

Listen: This state has five listen actions, one for each message emitted by the diagram entities: Amazon Sumerian, Amazon Polly, Amazon Lex, AWS Lambda, and Amazon DynamoDB. When you click the Sumerian entity on the AWS architecture diagram in the scene, it emits a message called “PlaySumerian”, which is picked up by the matching listen action in this state. The state machine then transitions to the “Play Sumerian Speech” state, which plays a speech describing Amazon Sumerian.
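The dispatch performed by the Listen state can be pictured as a lookup table from message to next state. The sketch below is a plain-Python model of that idea and does not use the Sumerian API. Only “PlaySumerian” and “Play Sumerian Speech” are named in this post; the other four entries are assumed to follow the same naming pattern and may differ in the actual bundle.

```python
from typing import Dict

# Message -> state to enter. Only the first entry is confirmed by this post.
LISTEN_TRANSITIONS: Dict[str, str] = {
    "PlaySumerian": "Play Sumerian Speech",
    "PlayPolly": "Play Polly Speech",        # assumed name
    "PlayLex": "Play Lex Speech",            # assumed name
    "PlayLambda": "Play Lambda Speech",      # assumed name
    "PlayDynamoDB": "Play DynamoDB Speech",  # assumed name
}


def next_state(current: str, message: str) -> str:
    """Return the state to enter when `message` arrives while in `current`."""
    if current != "Listen":
        # In this model, only the Listen state reacts to click messages.
        return current
    # Unknown messages leave the machine waiting in Listen.
    return LISTEN_TRANSITIONS.get(message, "Listen")
```

For example, next_state("Listen", "PlaySumerian") yields "Play Sumerian Speech", while an unrecognized message keeps the machine in "Listen".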

The Listen state also has “Microphone Recording” actions. Press and hold the space key to record your question; the recording is then sent to Amazon Lex for processing.

State Machine Amazon Lex Interaction section:

This is the part where all the Amazon Lex interactions happen. The user sends a request to Amazon Lex, and Amazon Lex returns a response using AWS Lambda and Amazon DynamoDB.

Process With Lex: This state takes the user request and passes it to Amazon Lex for further processing. You will find two more listen actions in this state, which listen for the messages emitted in response to the Amazon Lex reply. Depending on the message received, the state machine transitions to either the “Play Lex Response” state or the “Listen” state.

Play Lex Response: This state will play the Amazon Lex response and an additional speech file which will give you more instructions on how to proceed in the demo.

Exploring AWS Architecture Diagram Animation

The animation brings the AWS architecture diagram into the scene. To create this animation, we have preconfigured state machines on the following entities.

  1. Architecture Diagram
  2. Sumerian
  3. Polly
  4. Lex
  5. Lambda
  6. DynamoDB

You will see the following state machine behavior for the animation on each of the entities above.

Hide State: When the Sumerian scene starts, the AWS diagram is hidden using the Hide action.

Listen State: This state listens for the “ShowAwsDiagram” message emitted in the State Machine start section.

ShowAwsDiagram: Once the message is received, this state starts an animation using the Show action and reveals the AWS architecture diagram that was hidden initially.
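The Hide, Listen, and ShowAwsDiagram states above amount to a small publish/subscribe pattern: every diagram entity starts hidden and reveals itself when one shared message arrives. The following is a minimal plain-Python model of that flow. MessageBus and DiagramEntity are hypothetical names used for illustration and are not Sumerian classes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class MessageBus:
    """Tiny stand-in for the scene-wide emit/listen mechanism."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[[], None]]] = defaultdict(list)

    def listen(self, channel: str, callback: Callable[[], None]) -> None:
        self._listeners[channel].append(callback)

    def emit(self, channel: str) -> None:
        # Copy the list so callbacks may register listeners while we iterate.
        for callback in list(self._listeners[channel]):
            callback()


class DiagramEntity:
    def __init__(self, name: str, bus: MessageBus) -> None:
        self.name = name
        self.visible = False                      # Hide State
        bus.listen("ShowAwsDiagram", self._show)  # Listen State

    def _show(self) -> None:
        self.visible = True                       # ShowAwsDiagram state


ENTITY_NAMES = ["Architecture Diagram", "Sumerian", "Polly", "Lex", "Lambda", "DynamoDB"]
bus = MessageBus()
entities = [DiagramEntity(name, bus) for name in ENTITY_NAMES]
```

Calling bus.emit("ShowAwsDiagram") once flips all six entities to visible, which mirrors how a single emit from the SceneFlow behavior drives the state machines on every diagram entity.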

Explore Script file for Lex response

We have preconfigured a script file on the Luke entity. With the Luke entity selected, navigate to the inspector panel and select the Script component. Open the LexResponse script by clicking edit, as shown in the screenshot below, and the code editor will open with the script:

This code emits messages based on the Lex dialog state. These messages are picked up by the listen actions under the “Process With Lex” state in the State Machine.
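The decision the script makes can be sketched as follows. This is a plain-Python model, not the LexResponse script itself (which runs as JavaScript inside Sumerian). It assumes the Lex response carries the dialogState and message fields and that only a fulfilled response with text is played back. The channel names LexResponseReady and LexResponseNotReady are hypothetical; the script in the bundle may use different ones.

```python
from typing import Any, Dict

# Hypothetical channel names, chosen for illustration only.
RESPONSE_READY = "LexResponseReady"
RESPONSE_NOT_READY = "LexResponseNotReady"

# Which state each emitted message leads to from "Process With Lex".
NEXT_STATE: Dict[str, str] = {
    RESPONSE_READY: "Play Lex Response",
    RESPONSE_NOT_READY: "Listen",
}


def message_for(lex_response: Dict[str, Any]) -> str:
    """Pick the message to emit for a given Lex response."""
    fulfilled = lex_response.get("dialogState") == "Fulfilled"
    has_text = bool(lex_response.get("message"))
    # Assumption: only a fulfilled intent with a reply is spoken by the host.
    return RESPONSE_READY if fulfilled and has_text else RESPONSE_NOT_READY


def state_after_lex(lex_response: Dict[str, Any]) -> str:
    """Return the state entered after Lex has replied."""
    return NEXT_STATE[message_for(lex_response)]
```

Under these assumptions, a response with dialogState "Fulfilled" and a non-empty message leads to “Play Lex Response”, and anything else (for example "Failed") returns the host to “Listen”.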

Time to Play Your Scene!

Test your scene by choosing the play button, as shown in the screenshot below. Watch and follow along as your host presents your speech. You can publish and share your scene by clicking the publish button at the top left corner.

Conclusion

Here we addressed an educational use case using Sumerian, but the idea can be extended to many industries and verticals where a lot of money and time is spent on training. For example, you can create an animated, interactive tutorial to make a subject more interesting for students, or listen to the morning news in a regional language from a Sumerian host avatar of your choice. Amazon Sumerian can reduce much of that training overhead and help transform monotonous learning into an interactive and exciting experience.

About the Authors

Saurabh Shrivastava is a partner solutions architect and big data specialist working with global systems integrators. He works with AWS partners and customers to provide them with architectural guidance for building scalable architecture in hybrid and AWS environments. He enjoys spending time with his family outdoors and traveling to new destinations to discover new cultures.

Chirag Oswal is a partner solutions architect and AR/VR specialist working with global systems integrators. He works with AWS partners and customers to help them adopt a cloud operating model at scale. He enjoys video games and travel.