Business Productivity

Adding Conversation AI Capabilities to Amazon Chime SDK

This is a guest blog written by Toshish Jawale, CTO and co-founder of Symbl.ai. The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Symbl is a conversational AI platform that provides developers with capabilities to apply contextual comprehension of natural human conversations to applications. Symbl recently made available a Conversational AI adapter for the Amazon Chime SDK for JavaScript.

Conversational AI can be enabled in applications across a wide array of use cases and verticals to effectively extract and analyze key insights from human-to-human conversations. For example, sales enablement platforms are capable of automatically detecting customer intent, generating action items, and scheduling follow-up meetings during a virtual sales call. This helps in accelerating sales efforts, increasing conversions, and driving business growth. Another example could be a tele-health application dynamically surfacing and saving key patient information, such as instructions or follow-up recommendations, during a patient call. This helps improve the patient experience and minimizes the risk of miscommunication.

With Symbl, developers can enable applications driven by conversation or speech-specific core functionalities to extract contextual insights to help generate key business intelligence and enhance user experiences. Symbl provides a domain agnostic AI that does not require upfront model training efforts. Developers are able to ‘plug and play,’ helping to save time and cost, and can focus on building applications without ever having to maintain or build any machine learning models.

The Symbl Conversational AI Adapter allows developers to add various conversational AI capabilities to applications built with the Amazon Chime SDK for JavaScript. Out-of-the-box capabilities include the following:

  • Automatically detect Action Items and Follow Ups from calls
  • Automatically detect Questions asked in the call
  • Detect contextually relevant Topics of Discussion with sentiment
  • Auto-generate transcriptions of calls, separated by Speakers
  • Provide live closed captioning and real-time insights detection
  • Generate advanced speaker level analytics
  • And more

Any application with audio/video communication features built with the Amazon Chime SDK for JavaScript can use this adapter in its client-side web application. For example:

  • Sales enablement products that connect sales representatives and customers can automatically suggest next steps to follow up on, offer to set up meetings automatically, show sentiment for the contextual topics discussed in the sales call, identify key questions raised during the call, and provide full speaker-separated transcripts.
  • Tele-health applications that connect doctors and patients from various regions and cultures can also add these capabilities, helping bridge cultural barriers by showing live captioning and generating a record with follow-ups.
  • E-learning products that conduct lectures or courses between teachers and students can get auto-generated notes from each lecture, along with AI filters for indexing and easily searching through video content.

These are just a few of the use cases, but there are many other possibilities.

All of this intelligence is available in real time (that is, while your call or meeting is in progress) through the adapter, and is also accessible with simple REST API calls after the call is over. This gives you the flexibility to build both in-meeting and post-meeting experiences for your specific use case. After a call is over, you also receive a few pre-built, customizable Conversation Summary experiences from Symbl.

Technical Overview

The Symbl Conversational AI Adapter is an open source library that works alongside the Amazon Chime SDK for JavaScript in your application. It is compatible with JavaScript, TypeScript, and JavaScript-based frameworks (React, Angular, Vue, etc.). The adapter integrates with Symbl's WebSocket API to stream audio in real time and receives the output asynchronously over the same WebSocket. It can also work with multiple audio streams, one per speaker in your app, which gives you highly accurate speaker separation in your transcripts and in the other insights detected by Symbl.

Here's a high-level diagram showing the logical flow in a typical client-server application with the adapter:

  1. The client requests an authentication token from the application server, which has the Symbl credentials configured on the server side.
  2. The application server requests an access token from Symbl using Symbl's Authentication API.
  3. Symbl responds to the application server with an ephemeral access token after successful authentication.
  4. The application server returns the same token to the client.
  5. The adapter uses the token to connect to Symbl's WebSocket API and starts streaming the participant's audio to Symbl.
  6. Symbl sends events to the adapter in real time as insights and transcripts are generated while the call progresses.

Getting Started

You can get started with the Symbl Conversational AI Adapter to build various use cases in your application built on top of the Amazon Chime SDK for JavaScript.

Note: Deploying and receiving traffic from the demos created in this post can incur AWS charges.

Prerequisites

  • You have read Building a Meeting Application using the Amazon Chime SDK, understand the basic architecture of the Amazon Chime SDK for JavaScript, and have deployed a serverless browser demo meeting application.
  • You have signed up and have a Trial or Active Symbl account. Note that Symbl accounts come with 1,000 free minutes; you will need to upgrade to use more minutes beyond that, or connect with our team to get additional free credits.
  • You have basic to intermediate understanding of JavaScript.
  • Node.js v10+
  • NPM 6.11 or higher
  • Install the AWS CLI
  • Install the AWS SAM CLI

Adding the Symbl Chime Adapter to your Web Application

Install the Symbl Conversational AI Adapter in your web app project by adding it as a dependency:

npm install --save symbl-chime-adapter

Symbl Authentication

The Symbl Conversational AI Adapter requires a valid Symbl auth token. You need your Symbl appId and appSecret to generate one; you can find them on the Symbl Console.

You can generate a token using the Symbl Authentication API. Here's a sample JavaScript function:

const getSymblToken = async () => {
  const res = await fetch('https://api.symbl.ai/oauth2/token:generate', {
    method: 'POST',
    mode: 'cors',
    cache: 'no-cache',
    credentials: 'same-origin',
    headers: {
      'Content-Type': 'application/json',
    },
    redirect: 'follow',
    referrerPolicy: 'no-referrer',
    body: JSON.stringify({
      type: 'application',
      appId: symblAppId, // Your Symbl App ID from the Symbl Console
      appSecret: symblAppSecret, // Your Symbl App Secret from the Symbl Console
    }),
  });
  const result = await res.json();
  return result.accessToken;
};

Note: It's not recommended to use your Symbl credentials in your client-side code in production, because they would be visible to anyone using your application. Instead, create a backend service that generates the Symbl token, and make sure the appId and appSecret are stored safely in the backend system.
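For illustration, here is a minimal sketch of such a backend token endpoint using Express. The /symbl-token route and the SYMBL_APP_ID/SYMBL_APP_SECRET environment variable names are hypothetical choices for this sketch; only the token:generate endpoint shown above is Symbl's actual API.

// Minimal sketch of a server-side token service (assumes Node.js 18+ for the global fetch API).
// The route and environment variable names are illustrative, not part of the adapter.
const express = require('express');
const app = express();

app.get('/symbl-token', async (req, res) => {
  try {
    const response = await fetch('https://api.symbl.ai/oauth2/token:generate', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        type: 'application',
        appId: process.env.SYMBL_APP_ID,         // stored on the server, never shipped to the client
        appSecret: process.env.SYMBL_APP_SECRET, // stored on the server, never shipped to the client
      }),
    });
    const { accessToken } = await response.json();
    // Only the short-lived access token is returned to the client.
    res.json({ accessToken });
  } catch (err) {
    res.status(500).json({ error: 'Failed to generate Symbl token' });
  }
});

app.listen(3000);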

Initialization

After the access token has been set and your client has joined the meeting, you can instantiate the Symbl class.

The Symbl constructor takes two parameters; you then invoke the start() method to start processing through Symbl.

Constructor Parameters:

  • attendeeId: A unique identifier for a conversation participant
  • meetingId: A unique identifier for the meeting for which participants connect to the conversation

There are also two optional parameters that can be passed into the constructor to give the meeting more context:

  • userName: The user’s name
  • meeting: The conversation or meeting name.

It is best practice to use the attendee ID and meeting ID generated by your Amazon Chime SDK environment when a user connects. The meeting ID ensures that all participants are conversing on the same channel. The Symbl serverless demo application shows an example of how this information is generated and passed back to the client.

// Import/Require adapter
const {Symbl} = require('symbl-chime-adapter');

// getSymblToken is async, so await its result (assumes an async context)
Symbl.ACCESS_TOKEN = await getSymblToken();

const attendeeId = '<chime_attendee_id>';
const meetingId = '<unique_identifier_for_the_active_meeting>';
const userName = '<chime_userName>';
const meetingName = '<meeting_name>';

// Create an instance of the Symbl helper class
const symbl = new Symbl(
    { // Amazon Chime configuration
        attendeeId: attendeeId,
        userName: userName,
        meetingId: meetingId,
        meeting: meetingName,
    }, {
        confidenceThreshold: 0.5, // Confidence threshold above which action items, follow-ups, and questions are detected
        languageCode: 'en-US', // Valid language codes: en-US, en-GB, en-AU, it-IT, nl-NL, fr-FR, fr-CA, de-DE, es-US, ja-JP
        insightsEnabled: true, // default: true - true if insights should be generated for the conversation
    }
);

// Subscribe to real-time event publishers (see below)
symbl.start();

Now you've successfully added the Symbl Conversational AI Adapter to your Amazon Chime SDK based app. Below are the steps to get started on a few of the use cases that you can add to your app using the Symbl Conversational AI Adapter.

Adding Real-Time Contextual Insights

To receive insights like action items, questions, and follow-ups in real time in your app, subscribe to insights by registering your handler using subscribeToInsightEvents().

// Subscribe to real-time insights
symbl.subscribeToInsightEvents({
  onInsightCreated: (insight) => {
    console.log('Insight received: ', insight);

    // Creates a pre-defined insight element
    const element = insight.createElement();

    // Customize any styling
    element.classList.add('mx-auto');
    element.style.width = '98%';

    /** OR create your own element from the raw data in the insight object
    const insightData = insight.data;
    insight.element = document.createElement('div');
    insight.element.innerHTML = `
        <div style="width: auto; height: 400px">
            <h3 style="font-weight: 400;">
                ${insightData.type}
            </h3>
            <br>
            <h2>
                ${insightData.text}
            </h2>
        </div>`;
    **/

    // Retrieve the container you wish to add insights to. Use your container id.
    const insightContainer = document.getElementById('insight-container');

    // Call add on the insight object to add it to the DIV
    insight.add(insightContainer);
  }
});

Here's an example of how the pre-built insight elements look in your app:

Symbl also automatically detects date/time and person entities in follow-up and action-item insights. You can use them to create suggestions for actionable triggers, for example, setting up a calendar invite from a follow-up suggestion. Here's a code sample that shows how you can achieve that in your app.

// Subscribe to real-time insights
symbl.subscribeToInsightEvents({
  onInsightCreated: (insight) => {
    console.log('Insight received: ', insight);
    const insightData = insight.data;
    if (insightData.type === 'follow_up') {
      if (insightData.tags && insightData.tags.length > 0) {
        const tags = insightData.tags.filter(tag => !!tag.dueBy);
        if (tags.length > 0) {
          const {dueBy} = tags[0];
          // A follow-up with a due date/time can be suggested as a calendar invite or an email follow-up
        }
      }
    }
  }
});

Adding Live Captioning and Real-Time Transcripts

You can add live captioning to your app by registering a handler using the subscribeToCaptioningEvents() method.

The closed caption handler has three callback functions:

  • onClosedCaptioningToggled – Called whenever closed captioning is toggled on or off.
  • onCaptionCreated – Called whenever speech is first detected and a new caption object is created.
  • onCaptionUpdated – Called when subsequent speech is detected for an existing caption.

const captioningHandler = {
    onClosedCaptioningToggled: (ccEnabled) => {
        // ccEnabled - boolean value indicating if closed captioning is enabled
        // Implement
    },
    onCaptionCreated: (caption) => {
        console.warn('Caption created', caption);
        // Retrieve the video element that you wish to add the subtitle tracks to.
        // getActiveVideoElement() is your own helper that returns the active video element.
        const activeVideoElement = getActiveVideoElement();
        if (activeVideoElement) {
            caption.setVideoElement(activeVideoElement);
        }
    },
    onCaptionUpdated: (caption) => {
        const activeVideoElement = getActiveVideoElement();
        caption.setVideoElement(activeVideoElement);
    }
};

symbl.subscribeToCaptioningEvents(captioningHandler);

The Symbl Conversational AI Adapter also lets you subscribe to properly formed transcription elements, which is useful for showing real-time transcription in your app, by registering a handler using the subscribeToTranscriptEvents() method.

const transcriptHandler = {
    onTranscriptCreated: (transcript) => {
        // Handle transcript item
    }
};

symbl.subscribeToTranscriptEvents(transcriptHandler);
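For example, a transcript handler could append each item to a list in your UI. This is a minimal sketch: the transcript-container element is your own markup, and the userName and message fields are assumptions about the transcript object's shape, so verify the property names against the adapter's documentation.

const transcriptHandler = {
    onTranscriptCreated: (transcript) => {
        // userName and message are assumed field names; check the adapter docs for the exact shape.
        const line = document.createElement('div');
        line.textContent = `${transcript.userName}: ${transcript.message}`;
        // 'transcript-container' is an element defined in your own markup.
        document.getElementById('transcript-container').appendChild(line);
    }
};

symbl.subscribeToTranscriptEvents(transcriptHandler);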


Generating Meeting Summary URL

To generate a meeting summary URL, you only need to call the getSummaryUrl function of your Symbl instance. It uses Symbl's Experience API to generate the supported pre-built summary experiences.

const meetingSummaryUrl = await symbl.getSummaryUrl();

Below is an example of the pre-built Video Summary UI that can be generated after the conversation is processed, using the URL to the video.
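If you prefer to generate an experience directly over REST, for example to attach a recording URL for the Video Summary UI, a request to Symbl's Experience API might look like the following sketch. The name and videoUrl fields are based on Symbl's Experience API documentation; treat them as assumptions and verify them against the current API reference.

// Sketch: requesting a pre-built experience via the Experience API after the call.
// Verify the request body fields against Symbl's current API reference.
const res = await fetch(`https://api.symbl.ai/v1/conversations/${conversationId}/experiences`, {
    method: 'POST',
    headers: {
        'Authorization': `Bearer ${accessToken}`,
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({
        name: 'video-summary', // assumed experience name for the Video Summary UI
        videoUrl: 'https://example.com/recording.mp4' // hypothetical URL to your call recording
    })
});
const {url} = await res.json(); // URL of the generated summary experience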

Retrieving Data After the Meeting Is Finished

Symbl's Conversation API allows you to fetch all the analyzed data, like insights, topics, sentiments, analytics, entities, and transcripts, even after the meeting or conversation is over.

To extract any of the analyzed data from the Conversation API, all you need is a valid Symbl access token and the unique conversationId of the meeting or conversation processed with the adapter. You can find the conversationId of your call by invoking the conversationId getter on the instance of the Symbl class.

This example shows how to retrieve a transcription of a meeting at a later time after it’s over.

const accessToken = 'eyJhbGciOiJSUzI1NiIsInR5cCI....'; // Symbl Access Token
const conversationId = symbl.conversationId; // Conversation ID of your meeting/call

const res = await fetch(`https://api.symbl.ai/v1/conversations/${conversationId}/messages`, {
    headers: {
        'Authorization': `Bearer ${accessToken}`
    }
});
const {messages} = await res.json(); // Array of messages representing the transcript of the meeting


You can use any of the other Conversation APIs to fetch different types of insights, such as topics, trackers, sentiment by topic or per message, entities, or analytics generated by Symbl, even after the call is finished. You can find details of all the endpoints in the Conversation API documentation.
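For instance, fetching the detected topics follows the same pattern as the messages request above; here is a minimal sketch using the Conversation API's topics endpoint:

// Fetch topics detected in the conversation (same token and conversationId as above).
const topicsRes = await fetch(`https://api.symbl.ai/v1/conversations/${conversationId}/topics`, {
    headers: {
        'Authorization': `Bearer ${accessToken}`
    }
});
const {topics} = await topicsRes.json(); // Array of detected topic objects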

Demo Meeting Application

You can refer to this GitHub repository for an open source demo application that uses Symbl Conversational AI Adapter with Amazon Chime SDK for JavaScript.

Deploying Demo Meeting Application

You can deploy the demo meeting applications with AWS Lambda:

  1. Clone the project from GitHub.
    git clone https://github.com/symblai/symbl-adapter-for-chime-demo.git
  2. Navigate to the following directory:
    cd demos/serverless
  3. Create a .env file in the src directory under demos/serverless and make sure it is updated with your Symbl credentials, like this:
    SYMBL_APP_ID=<xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
    SYMBL_APP_SECRET=<xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
  4. Make sure you're in the demos/serverless directory, and run this command to install the npm dependencies:
    npm install
  5. Run the deployment script. This will create an AWS CloudFormation stack containing a Lambda and Amazon API Gateway deployment that runs the demo. Replace <my-bucket> with the name of the Amazon Simple Storage Service (Amazon S3) bucket where the deployment will be stored, and replace <my-stack-name> with the name you'd like to give this deployment:
    npm run deploy -- -r us-east-1 -b <my-bucket> -s <my-stack-name> -a meetingV2
  6. After the command finishes, it outputs the URL that you can open in the browser to start using the demo application.

Cleaning up

If you deployed the demo application using the steps above and don't want to continue to incur AWS charges, you can clean up by deleting the AWS CloudFormation stack and the resources created when deploying the demo app.

To delete the stack and its resources:

From the AWS CloudFormation console in us-east-1, select the stack that you created for this demo and click Delete Stack. In the confirmation message that appears, click Yes, Delete. The status for your stack changes to DELETE_IN_PROGRESS. In the same way you monitored the creation of the stack, you can monitor its deletion using the Events tab. When AWS CloudFormation completes the deletion of the stack, it removes the stack from the list.

Conclusion

In this blog, we talked about what the Symbl Conversational AI Adapter is and how it can be used with the Amazon Chime SDK to build conversational AI applications that analyze free-flowing human conversations. To learn more about Symbl and its services, visit Symbl's website.

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Jennie Tietema

Jennie is a Principal Product Manager at AWS focusing on the Amazon Chime SDK, helping builders embed real-time communication directly into their applications. She previously worked on the Amazon Chime meetings application, a pay-as-you-go communications service that lets users collaborate with the security of AWS for online meetings, video conferencing, calls, and chat. Jennie has worked for Amazon Web Services for five and a half years and led the product team's go-to-market efforts for the initial launch of Amazon Chime in 2017. Prior to joining Amazon, Jennie worked for Biba, an early-stage start-up also in the unified communications space. Her specialties include go-to-market, pricing, compliance, and launch operations.