AWS Contact Center

CX Insights – Generative AI delivering results now

In the summer of 2023, there were open feasibility questions about whether and when generative artificial intelligence (AI) could deliver accurate, automated, conversational results to end-customers at scale. In a 2023 Gartner® Market Trend Report, Gartner makes the strategic planning assumption that “By 2030, the tech spend on virtual agents will exceed the tech spend on human agents, starting from a virtual agent tech spend of roughly 2% in 2023.”1 Given what I am seeing less than a year later, I believe the transition to virtual agents may happen even sooner.

Generative AI is delivering real customer experience (CX) results in production at enterprise scale sooner than expected. In early 2024, DoorDash deployed generative AI-powered self-service in their Amazon Connect contact center. The solution fields hundreds of thousands of calls per day and is materially reducing call volumes, escalations, and costs. “We’ve built a solution that gives Dashers reliable and simple-to-understand access to the information they need, when they need it,” says Chaitanya Hari, Contact Center Product Lead at DoorDash. “This has cascading positive impacts on our users and the platform as a whole, and we look forward to expanding to new use cases in the future.”

This raises the question: what were those feasibility questions last summer, and what has changed since then?

Responsive? Could a virtual agent provide answers fast enough to deliver a human-like voice conversation that end-customers would appreciate and not bypass?
Just nine months ago, large language models (LLMs) – the core of generative AI – took far too long to return the accurate responses a useful voice conversation requires. In March 2024, Anthropic announced the Claude 3 Haiku LLM, hosted on Amazon Bedrock, which gives DoorDash the speed required to deliver a good customer experience: conversational responses in 2.5 seconds or less. Given the rate of LLM innovation, responsiveness will continue to improve, rapidly translating into more human-like conversations.

Accurate? Could we trust the answers a voice-enabled virtual agent provides to customers?
Generative AI LLMs can be unpredictable, confidently providing wrong answers (“hallucinations”) to some questions. DoorDash leverages the Claude LLM’s integrated hallucination mitigation, abusive language detection, and accuracy guardrails. The guardrails combine DoorDash’s existing website content and internal knowledge bases (via Knowledge Bases for Amazon Bedrock) with retrieval-augmented generation to deliver more relevant and accurate responses.
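The retrieval-augmented generation pattern described above can be sketched in a few lines of Python. This is a minimal, illustrative sketch, not DoorDash’s implementation: the knowledge base is an in-memory list, retrieval is naive keyword overlap rather than vector search, and the resulting prompt stands in for what would be sent to an LLM such as Claude on Amazon Bedrock.

```python
def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query
    (a real system would use embeddings / vector search)."""
    query_terms = set(query.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_grounded_prompt(query: str, knowledge_base: list[str]) -> str:
    """Anchor the model's answer in retrieved content to reduce hallucinations."""
    context = "\n".join(retrieve(query, knowledge_base))
    return (
        "Answer ONLY using the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical knowledge-base snippets for illustration only
kb = [
    "Dashers are paid weekly via direct deposit.",
    "Fast Pay lets Dashers cash out earnings daily for a small fee.",
    "Support is available 24/7 for active deliveries.",
]
prompt = build_grounded_prompt("When do Dashers get paid?", kb)
# prompt now contains the most relevant snippets plus the question,
# ready to be sent to the LLM
```

The key design point is that the model is instructed to answer only from retrieved, trusted content, which is what makes the responses more relevant and less prone to hallucination.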

Tested & monitored? How could we efficiently validate and maintain confidence in such a solution?
Given the massive number of permutations, validating a generative AI-powered, voice-enabled virtual agent with human testers would require weeks or months of ongoing staff time and cost. DoorDash used Amazon SageMaker to build an automated test and evaluation framework – quickly drawing insights from A/B testing, evaluating key success metrics at scale, and validating LLM responses against ground-truth data. The framework delivers a 50x increase in testing capacity, from a small number of human tests to thousands of automated tests per hour. To monitor accuracy on an ongoing basis, the solution uses a separate LLM to evaluate answers and raise alarms for follow-up action.
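An evaluation loop like the one described – automated tests scored against ground truth, with alarms when accuracy drops – can be sketched as follows. This is an illustrative outline, not DoorDash’s actual framework: `judge` is a deterministic stand-in for the second “judge” LLM, and the alarm threshold is an arbitrary assumption.

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    question: str
    ground_truth: str
    model_answer: str

def judge(answer: str, ground_truth: str) -> bool:
    """Stand-in for a second 'judge' LLM: here, a trivial containment check.
    In practice this would be a model call grading semantic agreement."""
    return ground_truth.lower() in answer.lower()

def evaluate(cases: list[TestCase], alarm_threshold: float = 0.9) -> dict:
    """Score a batch of automated tests; raise an alarm flag if accuracy drops."""
    passed = sum(judge(c.model_answer, c.ground_truth) for c in cases)
    accuracy = passed / len(cases)
    return {"accuracy": accuracy, "alarm": accuracy < alarm_threshold}

# Hypothetical test cases for illustration only
cases = [
    TestCase("When are Dashers paid?", "weekly", "Dashers are paid weekly."),
    TestCase("What is Fast Pay?", "daily cash out", "Fast Pay has a small fee."),
]
report = evaluate(cases)
# report["accuracy"] == 0.5, so report["alarm"] is True at a 0.9 threshold
```

Because every case is scored automatically, thousands of such tests can run per hour, which is what turns a weeks-long manual validation effort into a continuous monitoring loop.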

Anthropic is the AI firm behind the LLM used in the DoorDash self-service solution. Dario Amodei, Anthropic’s CEO, describes an AI scaling law: increases in compute power and data boost AI system capabilities in a predictable manner. In his New York Times interview with Ezra Klein, Amodei explains the pace of improvement this scaling law predicts for generative AI models – “the industry is going to get a new generation of models probably every four to eight months.”

Given the speed of progress in applying generative AI to CX, I anticipate significant opportunities to improve CX while driving down operational costs. The next question for many CX business leaders is, “OK, I can see the reality and value of generative AI, and we have budget to experiment… where should we start?”

In the next CX Insights blog post, I will dive into generative AI innovation learnings from other enterprises including starting points, business value, team mindset, and technology. I’ll also discuss Amazon.com’s CX technology team that uses generative AI to both streamline the experience of millions of global customers and empower the 110,000+ CX agents that support them.

1 Gartner, Market Trend: Hyperscaler Use of GenAI to Disrupt the Cloud Contact Center Market, By Daniel O’Connell, Megan Fernandez, Khurram Shahzad, 21 September 2023. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.

About the author

Matt Taylor is a Principal Product Manager for Amazon Connect at Amazon Web Services (AWS) based in Seattle, Washington. He is focused on understanding our customers’ business and operational drivers, diving deep into how AI and other technology will impact contact centers and their end customers, and bringing these learnings together to drive change for our customers. Matt enjoys discovering new places, spending time with family, and reading about the human condition, great leaders, and science.