AWS Machine Learning Blog

Use Amazon Lex to capture street addresses

Amazon Lex provides automatic speech recognition (ASR) and natural language understanding (NLU) technologies to transcribe user input, identify the nature of their request, and efficiently manage conversations. Lex lets you create sophisticated conversations, streamline your user experience to improve customer satisfaction (CSAT) scores, and increase containment in your contact centers.

Natural, effective customer interactions require that the Lex virtual agent accurately interprets the information provided by the customer. One scenario that can be particularly challenging is capturing a street address during a call. For example, consider a customer who has recently moved to a new city and calls in to update their street address for their wireless account. Even a single United States zip code can contain a wide range of street names. Getting the right address over the phone can be difficult, even for human agents.

In this post, we’ll demonstrate how you can use Amazon Lex and Amazon Location Service to provide an effective user experience for capturing a customer’s address via voice or text.

Solution overview

For this example, we’ll use an Amazon Lex bot that provides self-service capabilities as part of an Amazon Connect contact flow. When the user calls in on their phone, they can ask to change their address, and the bot will ask them for their customer number and their new address. In many cases, the new address will be captured correctly on the first try. For more challenging addresses, the bot may ask them to restate their street name, spell their street name, or repeat their zip code or address number to capture the correct address.

Here’s a sample user interaction to model our Lex bot:

IVR: Hi, welcome to ACME bank customer service. How can I help? You can check account balances, order checks, or change your address.

User: I want to change my address.

IVR: Can you please tell me your customer number?

User: 123456.

IVR: Thanks. Please tell me your new zip code.

User: 32312.

IVR: OK, what’s your new street address?

User: 6800 Thomasville Road, Suite 1-oh-1.

IVR: Thank you. To make sure I get it right, can you tell me just the name of your street?

User: Thomasville Road.

IVR: OK, your new address is 6800 Thomasville Road, Suite 101, Tallahassee Florida 32312, USA. Is that right?

User: Yes.

IVR: OK, your address has been updated. Is there anything else I can help with?

User: No thanks.

IVR: Thank you for reaching out. Have a great day!

As an alternative approach, you can capture the whole address in a single turn, rather than asking for the zip code first:

IVR: Hi, welcome to ACME bank customer service. How can I help? You can check account balances, order checks, or change your address.

User: I want to update my address.

IVR: Can you please tell me your customer number?

User: 123456.

IVR: Thanks. Please tell me your new address, including the street, city, state, and zip code.

User: 6800 Thomasville Road, Suite 1-oh-1, Tallahassee Florida, 32312.

IVR: Thank you. To make sure I get it right, can you tell me just the name of your street?

User: Thomasville Road.

IVR: OK, your new address is 6800 Thomasville Road, Suite 101, Tallahassee Florida 32312, US. Is that right?

User: Yes.

IVR: OK, your address has been updated. Is there anything else I can help with?

User: No thanks.

IVR: Thank you for reaching out. Have a great day!

Solution architecture

We’ll use an Amazon Lex bot integrated with Amazon Connect in this solution. When the user calls in and provides their new address, Lex uses automatic speech recognition to transcribe their speech to text. Then, it uses an AWS Lambda fulfillment function to send the transcribed text to Amazon Location Service, which performs address lookup and returns a normalized address.

As part of the AWS CloudFormation stack, you can also create an optional Amazon CloudWatch Logs log group for capturing Lex conversation logs, which can be used to create a conversation analytics dashboard to visualize the results (see the post Building a business intelligence dashboard for your Amazon Lex bots for one way to do this).

How it works

This solution combines several techniques to create an effective user experience, including:

  • Amazon Lex automatic speech recognition technology to convert speech to text.
  • Integration with Amazon Location Service for address lookup and normalization.
  • Lex spelling styles, to implement a “say-spell” approach when voice inputs are not clear (for example, ask the user to say their street name, and then if necessary, to spell it).

The first step is to make sure that the required slots have been captured.

In the first code section that follows, we prompt the user for their zip code and street address using the Lex ElicitSlot dialog action. The elicit_slot_with_retries() function prompts the user based on a set of configurable prompts.

 
# check for the ZipCode slot; if it's not available, elicit it
zip_code = None
zipCode = slot_values.get('ZipCode', None)
if zipCode is not None:
    zip_code = zipCode['value'].get('interpretedValue', None)
else:
    response = helpers.elicit_slot_with_retries(intent, activeContexts, sessionAttributes, 'ZipCode', requestAttributes)
    return response

# check for the StreetAddress slot
street_address = None
streetAddress = slot_values.get('StreetAddress', None)
if streetAddress is not None:
    street_address = streetAddress['value'].get('interpretedValue', None)
else:
    # give the caller extra time for this response
    sessionAttributes['x-amz-lex:audio:end-timeout-ms:' + intent_name + ':StreetAddress'] = 2000
    response = helpers.elicit_slot_with_retries(intent, activeContexts, sessionAttributes, 'StreetAddress', requestAttributes)
    return response

street_address = parse_address.parse(street_address)
sessionAttributes['inputAddress'] = street_address

The last section of code above uses a helper function parse_address.parse() that converts spoken numbers into digits (for example, it converts “sixty eight hundred” to “6800”).
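To illustrate the idea behind this helper, here’s a minimal sketch of a spoken-number parser. The actual parse_address.parse() function in the sample solution handles more cases; this illustrative version only converts simple spoken-number phrases like “sixty eight hundred” into digits, and passes anything else through unchanged:

```python
# A minimal, illustrative spoken-number parser (not the actual
# parse_address.parse() helper from the sample solution).
WORD_TO_NUM = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}

def parse_spoken_number(text: str) -> str:
    """Convert phrases like 'sixty eight hundred' to '6800'."""
    total = 0
    for word in text.lower().split():
        if word == "hundred":
            # "sixty eight hundred" -> (60 + 8) * 100
            total *= 100
        elif word in WORD_TO_NUM:
            total += WORD_TO_NUM[word]
        else:
            # not a pure spoken number; leave the text unchanged
            return text
    return str(total) if total else text
```

For example, parse_spoken_number("sixty eight hundred") returns "6800", while an input like "Thomasville Road" is returned as-is.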

Then, we send the user’s utterance to Amazon Location Service and inspect the response. We discard any entries that don’t have a street, a street number, or have an incorrect zip code. In cases where we have to re-prompt for a street name or number, we also discard any previously suggested addresses.

# validate the address using Amazon Location Service
location_response = locationClient.search_place_index_for_text(IndexName='explore.place', Text=street_address)

# inspect the response from Amazon Location Service
if location_response.get('Results', None) is not None:
    for address in location_response['Results']:
        if address.get('Place', None) is not None:
            addressLabel = address['Place'].get('Label', None)
            addressNumber = address['Place'].get('AddressNumber', None)
            street = address['Place'].get('Street', None)
            postalCode = address['Place'].get('PostalCode', None)
            if street is None:
                continue
            if addressNumber is None:
                continue
            if zip_code is not None:
                # skip entries with no postal code, or with a different zip code
                if postalCode is None or postalCode[:len(zip_code)] != zip_code:
                    continue
            already_tried = False
            prior_suggestions = helpers.get_all_values('suggested_address', sessionAttributes)
            for prior_suggestion in prior_suggestions:
                if addressLabel == prior_suggestion:
                    already_tried = True
                    break
            if already_tried:
                continue
            # the first entry with a valid street that was not already tried is the next best guess
            resolvedAddress = addressLabel
            break

Once we have a resolved address, we confirm it with the user.

if event.get('inputMode') == 'Speech':
    response_string = '<speak>OK, your new address is <say-as interpret-as="address">'
    response_string += resolvedAddress + '</say-as>. Is that right?</speak>'
    response_message = helpers.format_message_array(response_string, 'SSML')
else:
    response_string = 'OK, your new address is ' + resolvedAddress + '. Is that right?'
    response_message = helpers.format_message_array(response_string, 'PlainText')

intent['state'] = 'Fulfilled'
response = helpers.confirm(intent, activeContexts, sessionAttributes, response_message, requestAttributes)
return response

If we don’t get a resolved address back from Amazon Location Service, or if the user says the address that we suggested wasn’t right, then we re-prompt for additional information and try again. The additional information slots include:

  • StreetName: slot type AMAZON.StreetName
  • SpelledStreetName: slot type AMAZON.AlphaNumeric (using Amazon Lex spelling styles)
  • StreetAddressNumber: slot type AMAZON.Number
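In Amazon Lex V2, a spelling style is requested at runtime via a session attribute of the form x-amz-lex:spelling-style:&lt;intentName&gt;:&lt;slotName&gt;. Here’s a hedged sketch of how the SpelledStreetName slot might be elicited with a spelling style; the elicit_spelled_street() helper shown is illustrative, and its response shape follows the Lex V2 Lambda response format:

```python
# Illustrative sketch (not the exact helper from the sample solution):
# elicit the SpelledStreetName slot, asking Lex to interpret the next
# utterance as spelling, either letter by letter ("SpellByLetter") or
# with words for letters ("SpellByWord").
def elicit_spelled_street(intent, session_attributes, style, message):
    intent_name = intent["name"]
    # Lex V2 spelling-style session attribute, scoped to intent and slot
    session_attributes[
        f"x-amz-lex:spelling-style:{intent_name}:SpelledStreetName"
    ] = style
    return {
        "sessionState": {
            "dialogAction": {
                "type": "ElicitSlot",
                "slotToElicit": "SpelledStreetName",
            },
            "intent": intent,
            "sessionAttributes": session_attributes,
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }
```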

The logic to re-prompt is controlled by the next_retry() function, which consults a list of actions to try:

RETRY_ACTIONS = [
    { "street_name": {
          "method": elicit_street_name,
          "style": None,
          "no-match": "Thank you. To make sure I get it right, can you tell me just the name of your street?",
          "incorrect": "Let's try again. Can you tell me just the name of your street?"
       }
    },
    { "street_name_spelled_by_letter": {
          "method": elicit_spelled_street, 
          "style": "SpellByLetter",
          "no-match": "Let's try a different way. Can you please spell just the name of your street?",
          "incorrect": "Let's try a different way. Can you please spell just the name of your street?"
       }
    },
    { "street_address_number": {
          "method": elicit_street_address_number, 
          "style": None,
          "no-match": "I didn't find a matching address. Can you please tell me your street address number?",
          "incorrect": "OK, let's try your street address number. Can you tell me that once more?"
       }
    },
    { "street_name_spelled_by_word": {
          "method": elicit_spelled_street, 
          "style": "SpellByWord",
          "no-match": "Let's try one last time. Please spell the name of your street. You can use words for letters, such as a as in apple, or b like bob.",
          "incorrect": "Let's try one last time. Please spell the name of your street. You can use words for letters, such as a as in apple, or b like bob."
       }
    },
    { "agent": {
          "method": route_to_agent, 
          "style": None,
          "no-match": "Sorry, I was not able to find a match for your address. Let me get you to an agent.",
          "incorrect": "Sorry, I was not able to find a match for your address. Let me get you to an agent."
       }
    }
]

The next_retry() function will try these actions in order. You can modify the sequence of prompts by changing the order in the RETRY_ACTIONS list. You can also configure different prompts for scenarios where Amazon Location Service doesn’t find a match, versus when the user says that the suggested address wasn’t correct. As you can see, we may ask the user to restate their street name, and failing that, to spell it using Amazon Lex spelling styles. We refer to this as a “say-spell” approach, and it’s similar to how a human agent would interact with a customer in this scenario.
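As a rough sketch of this control flow, a next_retry() implementation might track the index of the last attempted action in a session attribute (the attribute name "retry_index" below is an assumption, not necessarily what the sample solution uses) and invoke the next action’s method with the prompt that matches the failure mode:

```python
# Illustrative sketch of next_retry(); "retry_index" is an assumed
# session attribute name, and each action's "method" is called with
# its spelling style and the chosen prompt.
def next_retry(retry_actions, session_attributes, match_found):
    # advance to the next action in the configured sequence
    index = int(session_attributes.get("retry_index", "-1")) + 1
    if index >= len(retry_actions):
        # fall through to the last action (route to an agent)
        index = len(retry_actions) - 1
    session_attributes["retry_index"] = str(index)

    action_name, action = next(iter(retry_actions[index].items()))
    # pick the prompt: the user rejected a suggestion, or nothing matched
    prompt = action["incorrect"] if match_found else action["no-match"]
    return action["method"](action["style"], prompt)
```

Reordering RETRY_ACTIONS changes the escalation sequence without touching this function, which is what makes the prompt strategy configurable.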

To see this in action, you can deploy it in your AWS account.

Prerequisites

You can use the CloudFormation link that follows to deploy the solution in your own AWS account. Before deploying this solution, you should confirm that you have the following prerequisites:

  • An available AWS account where you can deploy the solution.
  • Access to the following AWS services:
    • Amazon Lex
    • AWS Lambda, for integration with Amazon Location Service
    • Amazon Location Service, for address lookup
    • AWS Identity and Access Management (IAM), for creating the necessary policies and roles
    • CloudWatch Logs, to create log groups for the Lambda function and optionally for capturing Lex conversation logs
    • CloudFormation to create the stack
  • An Amazon Connect instance (for instructions on setting one up, see Create an Amazon Connect instance).

The following AWS Regions support Amazon Lex, Amazon Connect, and Amazon Location Service: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo).

Deploying the sample solution

Sign in to the AWS Management Console in your AWS account, and select the following link to deploy the sample solution:

This will create a new CloudFormation stack.

Enter a Stack name, such as lex-update-address-example. Enter the ARN (Amazon Resource Name) for the Amazon Connect instance that you’ll use for testing the solution. You can keep the default values for the other parameters, or change them to suit your needs. Choose Next, and add any tags that you may want for your stack (optional). Choose Next again, review the stack details, select the checkbox to acknowledge that IAM resources will be created, and then choose Create stack.

After a few minutes, your stack will be complete, and include the following resources:

  • A Lex bot, including a published version with an alias (Development-Alias)
  • A Lambda fulfillment function for the bot (BotHandler)
  • A CloudWatch Logs log group for Lex conversation logs
  • Required AWS IAM roles
  • A custom resource that adds a sample contact flow to your Connect instance

At this point, you can try the example interaction above in the Lex V2 console. You should see the sample bot with the name that you specified in the CloudFormation template (e.g., update-address-bot).

Choose this bot, choose Bot versions in the left-side navigation panel, choose the Version 1 version, and then choose Intents in the left-side panel. You’ll see the list of intents, as well as a Test button.

To test, select the Test button, select Development-Alias, and then select Confirm to open the test window.

Try “I want to change my address” to get started. This will use the UpdateAddressZipFirst intent to capture an address, starting by asking for the zip code, and then asking for the street address.

You can also say “I want to update my address” to try the UpdateAddress intent, which captures an address all at once with a single utterance.

Testing with Amazon Connect

Now let’s try this with voice using a Connect instance. A sample contact flow was already added to your Connect instance by the CloudFormation stack.

All you need to do is set up a phone number, and associate it with this contact flow. To do this, follow these steps:

  • Launch Amazon Connect in the AWS Console.
  • Open your Connect instance by selecting the Access URL, and logging in to the instance.
  • In Dashboard, select View phone numbers.
  • Select Claim a number, choose a country from the Country drop-down, and choose a number.
  • Enter a Description, such as “Example flow to update an address with Amazon Lex”, and select the sample contact flow that the CloudFormation stack created.
  • Choose Save.

Now you’re ready to call in to your Connect instance to test your bot using voice. Just dial the number on your phone, and try some US addresses. To try the zip-code-first approach, say “change my address”. To try capturing the address in a single turn, say “update my address”. You can also just say “my new address is”, followed by a valid US address.

But wait… there’s more

Another challenging use case for voice scenarios is capturing a user’s email address. This is often needed for user verification purposes, or simply to let the user change their email address on file. Lex has built-in support for email addresses using the AMAZON.EmailAddress built-in slot type, which also supports Lex spelling styles.

Using a “say-spell” approach for capturing email addresses can be very effective, and since the approach is similar to the user experience in the street address capture scenarios that we described above, we’ve included it here. Give it a try!

Clean up

You may want to clean up the resources created as part of the CloudFormation template when you’re done using the bot to avoid incurring ongoing charges. To do this, delete the CloudFormation Stack.

Conclusion

Amazon Lex offers powerful automated speech recognition and natural language understanding capabilities that can be used to capture the information needed from your users to provide automated, self-service functionality. Capturing a customer’s address via speech recognition can be challenging due to the range of names for streets, cities, and towns. However, you can easily integrate Amazon Lex with the Amazon Location Service to look up the correct address, based on the customer’s input. You can incorporate this technique in your own Lex conversation flows.


About the Author

Brian Yost is a Senior Technical Program manager on the AWS Lex team. In his spare time, he enjoys mountain biking, home brewing, and tinkering with technology.