AWS Contact Center

AI Powered Speech Analytics for Amazon Connect (Preview)

One of the primary motivations for Amazon Connect is to make it easy for our customers to deliver better customer service at a lower cost. Applied Artificial Intelligence (AI) permits us to investigate new ways to raise the bar for customer service. Can an AI-based agent participate in the conversation and recommend actions or find other ways to assist the call center agent? Can we use automatic translation to help bilingual agents when they’re working in their non-dominant language? Can we simply highlight important facts so agents don’t have to write them down? In short, what can we do to make it easier for the call center agent to devote more time and attention to the caller and address customer needs? Taking it further, what can we do to learn from customer interactions so that call recordings are really used to improve customer service? Amazon Connect Live Media Streaming makes customer audio available during a call so we can explore these questions.

With Live Media Streaming, Amazon Connect can send customer audio to an Amazon Kinesis Video Stream for streaming transcription using Amazon Transcribe. The text that represents customer speech, a transcript complete with confidence scores and timestamps, can then be processed in parallel using services like Amazon Comprehend for sentiment analysis and Amazon Translate for automatic translation. Using pre-trained services like Amazon Comprehend eliminates the data curation, model building, and training required to integrate custom machine learning models; it also makes it easier to collect data and justify the additional investment in custom models when the team is ready to make it.
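To make the "processed in parallel" idea concrete, here is a minimal sketch of how one transcript segment might be sent to Amazon Comprehend and Amazon Translate at the same time. It is an illustration rather than the solution's actual code: the function name `enrich_segment`, the returned dictionary keys, and the default language codes are assumptions. The clients are passed in so the function can be tried with boto3 clients or with stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_segment(text, comprehend, translate, source_lang="en", target_lang="es"):
    """Run sentiment analysis and translation on one transcript segment in parallel.

    `comprehend` and `translate` are expected to behave like
    boto3.client("comprehend") and boto3.client("translate").
    The language defaults are placeholders, not values from the solution.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both requests are submitted before either result is awaited.
        sentiment_future = pool.submit(
            comprehend.detect_sentiment, Text=text, LanguageCode=source_lang
        )
        translation_future = pool.submit(
            translate.translate_text,
            Text=text,
            SourceLanguageCode=source_lang,
            TargetLanguageCode=target_lang,
        )
        sentiment = sentiment_future.result()
        translation = translation_future.result()

    # Hypothetical result shape for a front end to consume.
    return {
        "text": text,
        "sentiment": sentiment["Sentiment"],
        "sentiment_scores": sentiment["SentimentScore"],
        "translated_text": translation["TranslatedText"],
    }


# Example call (requires AWS credentials and a configured region):
#   import boto3
#   enrich_segment("My order arrived late.",
#                  boto3.client("comprehend"), boto3.client("translate"))
```

Because the two service calls are independent, running them side by side keeps the added latency close to that of the slower call, which matters when results are shown to an agent during a live conversation.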

At a minimum, a live transcript helps the agent by reducing the time spent recording notes; more importantly, rather than capturing just the actions requested by the customer, a live transcript better reflects the voice of the customer. How often were competitors or competitive offerings mentioned? Were any pain points mentioned? How do the best agents communicate when customers call with problems? Beyond better notes, live transcripts allow post-call analysis that can offer insights, not just to contact center managers, but also to product owners. There are at least a dozen ways in which these building blocks can be assembled to increase efficiency for contact center agents while providing a better customer service experience. While we expect customers to experiment and find what works best, we’ve tried to distill what we’ve learned into something that customers can start using immediately.

The Solution

In this video, my colleague Yasser shows us how the AI Powered Speech Analytics for Amazon Connect solution can help contact center agents focus on the caller, be productive, and track customer sentiment. He’s using the default agent experience included with Amazon Connect, but recall that the experience was designed to be easy to incorporate into existing workflows. For example, customers often incorporate slightly modified versions of this experience in their CRM system.

Notice how the speech is transcribed to text so that the agent does not have to stop to take notes. Automatic translation helps to avoid misunderstanding when working in a different language. Also notice that special terms, such as names or dates, and phrases are tagged and highlighted. With keyword spotting, certain phrases can be configured to trigger recommendations for next best actions (e.g., “25% off the next purchase”). Finally, the real-time sentiment gauge allows the agent to track the conversation and manage customer sentiment.
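Keyword spotting of the kind described above can be sketched in a few lines. The example below is only an illustration and is not taken from the solution: the `RULES` table, the trigger phrases, and the `spot_keywords` function are hypothetical, and only the “25% off the next purchase” recommendation comes from the text.

```python
import re

# Hypothetical configuration: trigger phrase -> recommended next best action.
RULES = {
    "cancel my account": "Offer 25% off the next purchase",
    "late delivery": "Offer free expedited shipping",
}


def spot_keywords(transcript, rules=RULES):
    """Return the recommended action for every configured phrase found in the transcript."""
    # Lowercase and collapse whitespace so matching ignores case and spacing.
    normalized = re.sub(r"\s+", " ", transcript.lower())
    recommendations = []
    for phrase, action in rules.items():
        # Word boundaries prevent a phrase from matching inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            recommendations.append(action)
    return recommendations
```

A front end could call `spot_keywords` on each new transcript segment and surface any returned actions next to the live transcript. Exact phrase matching is the simplest possible approach; it will miss paraphrases such as “close my account” unless they are added as separate rules.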

Best of all, this information can be used to annotate the Contact Trace Record so the data is available to supervisors for analysis and feedback. This call is indeed being recorded for training purposes!

Figure 1: An example of how the AI Powered Speech Analytics solution can help the call center agent assess caller sentiment

How does it work?

On the service side, the Start media streaming block in the contact flow signals Amazon Connect to push customer audio as audio/L16 content to a Kinesis Video Stream; this continues until either the call terminates or a Stop media streaming block is encountered. The contact flow synchronously invokes a “trigger” AWS Lambda function with information about the call, and this function asynchronously invokes a “transcriber” function, which establishes a connection to Amazon Transcribe streaming and transcribes the available audio. The raw audio is saved as an Amazon S3 object, and the asynchronous responses from Amazon Transcribe streaming are used to save transcript “segments” in an Amazon DynamoDB table. On the client side, a front-end web experience can use an API call over WebSockets to get the most recent results and enrich them using services like Amazon Comprehend and Amazon Translate.
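The hand-off from the “trigger” function to the “transcriber” function can be sketched as follows. This is a minimal illustration under stated assumptions, not the solution's source code. The function name `transcriber`, the payload keys, and the response keys are hypothetical. The event fields read here (`Details.ContactData.ContactId` and `MediaStreams.Customer.Audio`) follow the Amazon Connect contact flow event format as I understand it, so check them against the current documentation before relying on them.

```python
import json


def build_transcriber_payload(event):
    """Extract the call details the transcriber needs from the contact flow event."""
    contact = event["Details"]["ContactData"]
    audio = contact["MediaStreams"]["Customer"]["Audio"]
    # Payload key names are hypothetical; the transcriber defines its own contract.
    return {
        "contactId": contact["ContactId"],
        "streamARN": audio["StreamARN"],
        "startFragmentNumber": audio["StartFragmentNumber"],
    }


def trigger_handler(event, context, lambda_client=None, transcriber_name="transcriber"):
    """Lambda entry point invoked synchronously by the contact flow."""
    if lambda_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS access

        lambda_client = boto3.client("lambda")

    payload = build_transcriber_payload(event)

    # InvocationType="Event" queues the invocation and returns immediately,
    # so the contact flow is not blocked while transcription runs.
    lambda_client.invoke(
        FunctionName=transcriber_name,
        InvocationType="Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )

    # Return a flat map of simple values for the contact flow to read.
    return {"lambdaResult": "Success", "contactId": payload["contactId"]}
```

Keeping the trigger function this small matters because the contact flow waits for it to return; the long-running work of reading the stream and calling Amazon Transcribe belongs in the asynchronously invoked function.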

Figure 2: Architecture overview

Like other AWS Solutions, the deployment is orchestrated using an AWS CloudFormation template, so all you need to do is provide some parameter values and wait for the stack to be created. Once the solution is ready and the contact flow is active, you can look for those transcripts. Call transcripts and Contact Trace Records can be aggregated for analysis so you can get to those nagging mysteries like, “Exactly what prompts people to call in on Monday nights at 8 PM?” Feedback on recommended actions can be used to inform which approaches work best. All this, without nagging agents to do one more data collection activity.

Getting Started

To learn more, visit the solution page and request access to the preview.