AWS Media Blog

Building a Virtual Assistant using AWS – Part 2

In part 1 of this blog series, you learned how to build the services required to support an Amazon Sumerian scene, using key AWS services such as Amazon Lex and Amazon Cognito. In part 2, you will learn how to create an Amazon Sumerian virtual assistant and integrate Google Maps into your scene, and you will explore other important components such as State Machines, Host behaviors, Entities, and Scripts.

By the end of this post, you will have your own virtual assistant ready to find local attractions and businesses.

In part 1, we set up a partial Amazon Sumerian scene with a Host, Cristine, and an HTML3D entity. Now we will add a behavior on the Host and write custom scripts to complete our demo.

Step 1: Create State Machine Behavior

In this step, we will add a State Machine to our Host that triggers the Amazon Lex bot when the user initiates a conversation.

  1. With the Host entity still selected, open the State Machine component and add a new behavior.
  2. Rename the behavior to “Bot Behavior”.
  3. Rename State 1 to “Start”.
  4. Choose Add Action, then search for and add the AWS SDK Ready action.

Adding States

In the following steps, we will create several states and add actions. We’ll follow a similar process:

  • Add a state.
  • Rename the state.
  • Add actions.
  • Adjust relevant action properties.

State: Wait for Input

This state waits for you to press the spacebar. Pressing the spacebar will transition to the next state.

  1. Add a new state.
  2. Rename the state to “Wait for Input”.
  3. Add the Key Down action.
  4. Click inside the Key property text box and press the spacebar. This sets the Key Down action to the spacebar.

State: Start Recording

This state records your voice through your default microphone. Releasing the spacebar will transition to the next state.

  1. Add a new state.
  2. Rename the state to “Start Recording”.
  3. Add the Start Microphone Recording action.
  4. Add the Key Up action and change the Key to Space.

State: Stop Recording

This state stops recording.

  1. Add a new state.
  2. Rename the state to “Stop Recording”.
  3. Add the Stop Microphone Recording action.

State: Process with Amazon Lex

This state sends your audio recording to Amazon Lex for processing. Amazon Lex then sends a speech response back to Amazon Sumerian.

  1. Add a new state.
  2. Rename the state to “Process with Amazon Lex”.
  3. Add the Send Audio Input to Dialogue Bot action. Do not select the Log user input or Log bot response check boxes.
    Note: Select the “Log user input” and “Log bot response” check boxes only when you need to debug your scene. They print to your browser’s console the microphone input as understood by Amazon Lex and the response from the Amazon Lex bot, respectively. If the user input is blank, check that you have granted the browser access to your microphone. For those who have used the Amazon Lex API, these two check boxes print the inputTranscript and message properties of the object returned from lex.postContent(). Refer to your browser’s documentation on opening the console for Chrome and Firefox.
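
To make the note above concrete, here is a rough sketch of the kind of output the two check boxes produce. It assumes a response object carrying the inputTranscript and message properties mentioned above. The helper name describeLexResponse and the sample values are hypothetical, and the exact text that Sumerian logs may differ.

```javascript
// Hypothetical helper: formats the two fields that the
// "Log user input" and "Log bot response" check boxes print.
function describeLexResponse(response) {
  const lines = [];
  if (response.inputTranscript) {
    lines.push(`User said: ${response.inputTranscript}`);
  } else {
    // The post suggests checking microphone access when the input is blank.
    lines.push('User input is blank: check the browser microphone permission.');
  }
  lines.push(`Bot replied: ${response.message || '(no message)'}`);
  return lines;
}

// Sample object shaped like a Lex response (values are made up).
const sample = {
  inputTranscript: 'show me restaurants',
  message: 'Here are some restaurants near you.',
};
describeLexResponse(sample).forEach((line) => console.log(line));
```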

State: Play Response

This state starts the Speech component, which plays the speech response sent from the Amazon Lex bot.

  1. Add a new state.
  2. Rename the state to “Play Response”.
  3. Add the Start Speech action.
  4. In the Start Speech action, select the Use Lex Response check box.

    This state transitions back to Wait for Input so that the cycle repeats until the conversation tree built in Amazon Lex is complete.

Adding Transitions

Now we need to add transitions for the State Machine. Create transitions by clicking a state and dragging the arrow to another state.

  1. Start with the state we made first (Start, which holds the AWS SDK Ready action). Make the transitions in the order in which you created the states.
  2. On the Process with Amazon Lex state, drag the transition from the On Response Ready output.
  3. The final state, Play Response, should transition back to the Wait for Input state.
  4. Select the Start state and choose Set as Initial State.

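The final graph connects the states in this order: Start, Wait for Input, Start Recording, Stop Recording, Process with Amazon Lex, and Play Response, which loops back to Wait for Input.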
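
The states and transitions from the steps above can also be read as a small transition table. The sketch below is only an illustration in plain JavaScript, not how Sumerian implements State Machines internally. The event names (sdkReady, keyDown, keyUp, recordingStopped, responseReady, speechDone) are hypothetical labels for the action outputs described in the steps.

```javascript
// Illustrative transition table for the "Bot Behavior" state machine.
// Event names are hypothetical labels, not Sumerian identifiers.
const transitions = {
  'Start': { sdkReady: 'Wait for Input' },
  'Wait for Input': { keyDown: 'Start Recording' },
  'Start Recording': { keyUp: 'Stop Recording' },
  'Stop Recording': { recordingStopped: 'Process with Amazon Lex' },
  'Process with Amazon Lex': { responseReady: 'Play Response' },
  'Play Response': { speechDone: 'Wait for Input' },
};

// Return the next state, or stay in the current state if the event
// does not apply there (for example, keyUp while waiting for input).
function nextState(state, event) {
  const row = transitions[state] || {};
  return row[event] || state;
}

// Walk one full conversation turn, starting from the initial state.
function runTurn() {
  const events = ['sdkReady', 'keyDown', 'keyUp', 'recordingStopped', 'responseReady', 'speechDone'];
  const visited = ['Start'];
  for (const event of events) {
    visited.push(nextState(visited[visited.length - 1], event));
  }
  return visited;
}

console.log(runTurn().join(' -> '));
```

Each turn ends where it began listening, in Wait for Input, which is why the Host can carry on a multi-step conversation without reloading the scene.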

Step 2: Adding a Script Component and a Custom Script on Host Entity

  1. With the Host selected, go to the Inspector panel, choose Add Component and then select Script.
  2. You can see a new Script component tab in the Inspector panel. To add a script to the Script component, click the + button.
  3. You are shown several preset scripts. However, for this tutorial, we want to write our own script. Choose Custom.
  4. In the Script component, click the edit button (pencil icon) next to the script that you just created. The script editor opens.

Add some code to the script

The script has three functions in it: setup, update, and cleanup. For our simple script, we only need to worry about the setup and cleanup functions.

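  1. Replace the setup function with the following code:
    function setup(args, ctx) {
    	// Handle each response that Amazon Lex returns for this Host.
    	ctx.onLexResponse = (data) => {
    		if (data.dialogState === "Fulfilled") {
    			// The intent is complete: pass the requested business type to the map script.
    			sumerian.SystemBus.emit('searchOnMap', data.slots.localBusiness);
    			console.log(data.dialogState);
    		}
    		// Print the full response to help with debugging.
    		console.dir(data);
    	};
    	// Lex responses arrive on a channel scoped to this entity's ID.
    	sumerian.SystemBus.addListener(`${sumerian.SystemBusMessage.LEX_RESPONSE}.${ctx.entity.id}`, ctx.onLexResponse);
    }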

    Explaining the above code:

    – sumerian.SystemBus.addListener listens for the Lex response on this entity’s channel. When a response arrives, it calls the ctx.onLexResponse function.

    – ctx.onLexResponse checks whether the dialog state of the response is Fulfilled. If it is, the function emits a searchOnMap message and passes along the data.slots.localBusiness value, which is the keyword you asked Amazon Lex for (restaurants, malls, shops, and so on).

    – The updateMap script (created in the next step) listens for the searchOnMap message and handles the rest of the processing.

  2. Insert the following code inside the cleanup function:
    sumerian.SystemBus.removeListener( `${sumerian.SystemBusMessage.LEX_RESPONSE}.${ctx.entity.id}`, ctx.onLexResponse);

    This line of code removes the listener of the Amazon Lex channel.

  3. Rename the script file to “LexResponse” and save the file, as shown in the screenshot below.
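
To see the listener lifecycle in isolation, the sketch below runs the same logic outside Sumerian. MockBus is a hypothetical stand-in for sumerian.SystemBus (only addListener, removeListener, and emit are imitated), the channel name 'lexResponse.host' is made up, and the sample responses are invented for illustration.

```javascript
// Hypothetical, minimal stand-in for sumerian.SystemBus.
class MockBus {
  constructor() {
    this.listeners = new Map();
  }
  addListener(channel, fn) {
    if (!this.listeners.has(channel)) this.listeners.set(channel, new Set());
    this.listeners.get(channel).add(fn);
  }
  removeListener(channel, fn) {
    const set = this.listeners.get(channel);
    if (set) set.delete(fn);
  }
  emit(channel, data) {
    const set = this.listeners.get(channel);
    if (set) [...set].forEach((fn) => fn(data));
  }
}

const bus = new MockBus();
const LEX_CHANNEL = 'lexResponse.host'; // made-up channel name
const searches = [];

// Same rule as the LexResponse script: forward the slot only when fulfilled.
const onLexResponse = (data) => {
  if (data.dialogState === 'Fulfilled') {
    bus.emit('searchOnMap', data.slots.localBusiness);
  }
};
const onSearch = (term) => searches.push(term);

// Equivalent of setup: register both listeners.
bus.addListener(LEX_CHANNEL, onLexResponse);
bus.addListener('searchOnMap', onSearch);

// An unfinished conversation is ignored; a fulfilled one is forwarded.
bus.emit(LEX_CHANNEL, { dialogState: 'ElicitSlot', slots: { localBusiness: null } });
bus.emit(LEX_CHANNEL, { dialogState: 'Fulfilled', slots: { localBusiness: 'restaurants' } });

// Equivalent of cleanup: later Lex responses no longer reach the handler.
bus.removeListener(LEX_CHANNEL, onLexResponse);
bus.emit(LEX_CHANNEL, { dialogState: 'Fulfilled', slots: { localBusiness: 'malls' } });

console.log(searches);
```

Only the fulfilled response that arrives while the listener is registered reaches the map handler, which is the reason the script pairs addListener in setup with removeListener in cleanup.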

Step 3: Adding a Script Component and a custom Script to display Google Maps

In this step, we will add a Script component to the HTML3D entity. With the HTML3D entity selected, create a custom script as we did in the previous step.

We will update the setup and cleanup functions in the script file.

  1. Replace the setup function with the following code:
    function setup(args, ctx) {
    	// Embed template. SEARCH_TERM is a placeholder that is swapped for the requested business type.
    	const mapIframeCode = '<iframe src="https://www.google.com/maps/embed?pb=!1m16!1m12!1m3!1d3022.617540796677!2d-73.98785308479034!3d40.74844047932796!2m3!1f0!2f0!3f0!3m2!1i1024!2i768!4f13.1!2m1!1sSEARCH_TERM!5e0!3m2!1sen!2sus!4v1550399525495" width="600" height="450" frameborder="0" style="border:0" allowfullscreen></iframe>';
    	
    	// Container element (id "map") that holds the iframe.
    	const mapElement = document.getElementById('map');
    	
    	// Rebuild the iframe whenever a searchOnMap message arrives.
    	ctx.onSearchOnMap = (searchTerm) => {
    		const mapInnerHtml = mapIframeCode.replace('SEARCH_TERM', searchTerm);
    		mapElement.innerHTML = mapInnerHtml;
    		
    		console.log(`Searching for ${searchTerm}`);
    	};
    	
    	sumerian.SystemBus.addListener('searchOnMap', ctx.onSearchOnMap);
    	
    	// Show the map with no search term when the scene opens.
    	const mapInnerHtml = mapIframeCode.replace('SEARCH_TERM', '');
    	mapElement.innerHTML = mapInnerHtml;
    }

    Explaining the above code:

    – sumerian.SystemBus.addListener listens for the searchOnMap message. When the message arrives, it calls the ctx.onSearchOnMap function.

    – ctx.onSearchOnMap replaces the SEARCH_TERM placeholder in the iframe tag with the data.slots.localBusiness value that was passed from the LexResponse script.

    – The Google map in the Sumerian scene is then updated.

  2. Insert the following code inside the cleanup function:
    sumerian.SystemBus.removeListener('searchOnMap', ctx.onSearchOnMap);
  3. Rename the script file to “updateMap” and save the file.
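
The placeholder substitution at the heart of updateMap can be tried on its own. The sketch below is a variation rather than the script above: the helper name buildMapEmbed and the shortened template are hypothetical, and it adds encodeURIComponent so that a multi-word term such as "coffee shops" is URL-encoded. That extra step is an assumption about what you may want, not something the original script does, and it has not been verified against the Google Maps embed format.

```javascript
// Shortened, hypothetical template; the real one carries the full pb=... string.
const template = '<iframe src="https://www.google.com/maps/embed?pb=!2m1!1sSEARCH_TERM!5e0"></iframe>';

// Replace the SEARCH_TERM placeholder with a URL-encoded search term.
// An empty or missing term yields the map with no search applied.
function buildMapEmbed(iframeTemplate, searchTerm) {
  const safeTerm = encodeURIComponent(searchTerm || '');
  return iframeTemplate.replace('SEARCH_TERM', safeTerm);
}

console.log(buildMapEmbed(template, 'coffee shops'));
```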

NOTE: The Google Maps location in the above iframe tag corresponds to Manhattan in New York City. To get an iframe tag for your own location, follow these steps:

  • Open Google Maps.
  • In the Google Maps search bar, enter the location you are interested in, such as San Francisco.
  • Press Enter. Google Maps shows that location.
  • Replace “San Francisco” in the search bar with “restaurants” or “shops”. If you use “restaurants”, Google Maps shows restaurants as pinned locations.
  • Click the menu icon in the search bar, scroll down to share and embed map, and click it.
  • Select “Embed Map” and copy the HTML code.
    • Your iframe tag will look like the following:
      • <iframe src="https://www.google.com/maps/embed?pb=!1m16!1m12!1m3!1d50471.57291024247!2d-122.47514033430987!3d37.75549881339483!2m3!1f0!2f0!3f0!3m2!1i1024!2i768!4f13.1!2m1!1srestaurants!5e0!3m2!1sen!2sus!4v1550605938845" width="600" height="450" frameborder="0" style="border:0" allowfullscreen></iframe>
  • Replace the word “restaurants” in the iframe tag with “SEARCH_TERM”.
    • Your final iframe tag should look like the following:
      • <iframe src="https://www.google.com/maps/embed?pb=!1m16!1m12!1m3!1d50471.57291024247!2d-122.47514033430987!3d37.75549881339483!2m3!1f0!2f0!3f0!3m2!1i1024!2i768!4f13.1!2m1!1sSEARCH_TERM!5e0!3m2!1sen!2sus!4v1550605938845" width="600" height="450" frameborder="0" style="border:0" allowfullscreen></iframe>
  • Now paste this iframe tag into the “updateMap” script file that you created in the step above.
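
The manual replacement in the steps above can also be done with a few lines of code. This is a sketch only: toTemplate is a hypothetical helper, the embed string is shortened, and it assumes the term you searched for appears in the copied code as !1s followed by the term and another !, as it does in the example above.

```javascript
// Hypothetical helper: swap the searched term in a copied embed code
// for the SEARCH_TERM placeholder that the updateMap script expects.
function toTemplate(embedCode, searchedTerm) {
  const marker = `!1s${searchedTerm}!`;
  if (!embedCode.includes(marker)) {
    throw new Error(`Term "${searchedTerm}" not found in embed code`);
  }
  return embedCode.replace(marker, '!1sSEARCH_TERM!');
}

// Shortened embed code for illustration; paste your own copied code here.
const copied = '<iframe src="https://www.google.com/maps/embed?pb=!4f13.1!2m1!1srestaurants!5e0!3m2!1sen!2sus"></iframe>';
console.log(toTemplate(copied, 'restaurants'));
```

Matching on the surrounding !1s and ! markers, rather than on the bare word, avoids touching other parts of the URL such as the !1sen language field.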

Test Your Scene:

Your scene is ready to test. Play the scene. With the scene in play mode, hold down the space key while you talk. The following utterances are available for you to test:

  • Show me {localBusiness}
  • Where is {localBusiness}
  • Where are {localBusiness}

Here, {localBusiness} can be any of the following:

  • Restaurants
  • Shops
  • Malls
  • Theaters

Also, if you use any of the utterances below:

  • Hello
  • Hello Cristine
  • Open map

Cristine will ask you “What are you looking for?” and then you can ask for shops, restaurants, malls, etc.

Now publish and share your scene!

Conclusion:

The majority of digital concierge systems are used in the hospitality industry, but other industries can take advantage of them as well. Any guest-facing business, whether in media and entertainment, retail, or banking, can put the technology to work. The following are some additional uses for a virtual concierge in other industries:

  • Convey product information to a customer
  • Live streaming
  • Advertise amenities
  • Wayfinding
  • Promote vendor relationships

Another example is retail showrooms, where virtual products can be displayed on kiosks and digital screens, and users can view and rotate three-dimensional versions of a building or a vehicle. A virtual assistant can also be implemented in a medical operating room, where the surgeon’s hand movements communicate specific instructions.

The intelligent virtual assistant, also known as the personal digital assistant, gives companies an opportunity to use their existing digital infrastructure to offer personalized assistance to customers and associates. With Amazon Sumerian, you can build the above use cases quickly, without prior knowledge of AR/VR and without setting up complex software and hardware, which saves time, money, and effort.

Leo Chan

Leo Chan is a Senior Specialist Solutions Architect working with AR/VR technologies. He loves working at the intersection of technology and art, has built tools such as Maya and Houdini for the film and video gaming industries, and has created blockbuster content with studios including Pixar and Electronic Arts.

Chirag Oswal

Chirag Oswal is a Partner Solutions Architect and AR/VR specialist working with global systems integrators. He works with AWS partners and customers to help them adopt a cloud operating model at scale. He enjoys playing sports and traveling.

Jacob Smeester

Jake is a Creative Content Specialist on the Amazon Sumerian team, focusing on building valuable educational content, both written and video, for Sumerian customers at all levels of expertise.