AWS Machine Learning Blog

Create a translator chatbot using Amazon Translate and Amazon Lex

Powered by the same deep learning technologies as Amazon Alexa, Amazon Lex is a service for building conversational interfaces into any application that uses voice and text. Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. In this post, I show how you can use a custom slot with Amazon Lex to take free text as input, submit it to Amazon Translate for translation, and then present the result to the user. The solution is scalable, uses serverless technologies, and is designed to allow secure, anonymous access to the translation UI through Amazon Cognito. I’ll take advantage of an existing AWS CloudFormation template and Amazon CloudFront to create a web-based implementation of the translator bot.

Overview

In this post, I create an intent in Amazon Lex for a translation action. This intent prompts the user for a source and target language, for example English to Spanish, followed by a text string to translate. Users are free to switch languages at any time during the interaction with Amazon Lex. The solution in the following illustration makes full use of Serverless Computing technologies to enable seamless scaling to thousands of users without the need for further engineering effort.

Prerequisites

To follow the procedures in this blog post, you need the following:

  • An AWS account and access to a Region that supports Amazon Lex and Translate
  • Familiarity with Python for the AWS Lambda function
  • A training dataset of text

Step 1. Downloading and preprocessing your dataset

As a training dataset, I use the publicly available, open source text of The Adventures of Sherlock Holmes by Arthur Conan Doyle. The data populates a custom slot in Lex that can accept free text input. You can download the e-book in Plain Text UTF-8 format and save it as sherlock.txt, or you can use any other training data for the custom slot. The core requirement for preprocessing is to have many different examples of varying input text (for example, 50 or more slot values). A secondary requirement is that each slot value is no longer than 140 UTF-8 characters. The rest of this post assumes you have used the Sherlock Holmes e-book.

From sherlock.txt, I extract sentences of a reasonable length, but fewer than 140 characters, to populate the slots in Lex.

I want to clean up the data and restrict the sample to 100 randomly selected sentences, each shorter than 140 characters (cf. the slot type limits for Amazon Lex). I use the Python script below to create a zipped JSON file of the 100 slot entries. This script uses the Natural Language Toolkit library (NLTK) to tokenize sentences from the Sherlock Holmes e-book so that I can store each individual sentence as a separate enumeration value. You might need to install this library before you run the script. To install it on a Linux operating system:

sudo pip install -U nltk

From there…

# This script will take the Sherlock Holmes open source txt from the Open Source Project Gutenberg site (http://www.gutenberg.org/)
# Before running this script, please download the adventures of Sherlock Holmes in Plain Text UTF-8 format
# Source data location: www.gutenberg.org/ebooks/1661 "The Adventures of Sherlock Holmes by Arthur Conan Doyle"
# Script also uses the Natural Language Toolkit library (NLTK) available from https://www.nltk.org/

import re
import json
import random
import zipfile
import nltk

# Download the punkt sentence tokenizer data for NLTK
# (recent NLTK releases may ask for 'punkt_tab' instead)
nltk.download('punkt')

# Open source text data - saved as 'sherlock.txt'
# Use NLTK to put it into a list of sentences
with open('sherlock.txt', 'r', encoding='utf-8') as trainingfile:
    trainingdata = trainingfile.read()
    sentences = nltk.sent_tokenize(trainingdata)


# Pull out 'useful' sentences which are up to 140 characters in length
# then strip out punctuation characters
shortSentences = []
cleanSentences = []

# Collapse the e-book's hard line breaks into single spaces, then keep sentences that
# start with an alphanumeric character ('\w') and are at most 140 characters long
for text in sentences:
    sentence = ' '.join(text.split())
    if len(sentence) <= 140 and re.match(r'\w', sentence):
        shortSentences.append(sentence)

# remove punctuation characters that may appear
stripChar = '"\'.?,!-()'
for s in shortSentences:
    cleanSentences.append(s.translate(str.maketrans("", "", stripChar)))

# Create a list containing only sentences which are greater than 5 words in length
slotEntries = []
nWords = 5
for e in cleanSentences:
    if len(e.split()) > nWords:
        slotEntries.append(e)

# Restrict the sample to 100 randomly selected sentences
nSamples = 100
slotEntries = random.sample(slotEntries, min(nSamples, len(slotEntries)))

# Create the zipped json file to import into Lex as the slot training data
lexSlotTextFile = 'trainingText.json'
lexSlotZipFile = 'trainingText.zip'
data = {
  "metadata": {
    "schemaVersion": "1.0",
    "importType": "LEX",
    "importFormat": "JSON"
  },
  "resource": {
    "description": "Training text input for custom free flow slot",
    "name": "trainingText",
    "version": "1",
    "enumerationValues": [],
    "valueSelectionStrategy": "ORIGINAL_VALUE"
  }
} 

# populate json with slot values
for slot in slotEntries:  
    data["resource"]["enumerationValues"].append({"value": slot})

# write the json file with the data (the with block closes the file)
with open(lexSlotTextFile, 'w') as f:
    json.dump(data, f)

# zip up file ready for import to Lex (the with block closes the archive)
with zipfile.ZipFile(lexSlotZipFile, 'w') as myzip:
    myzip.write(lexSlotTextFile)

print("Created json zip file for Lex Slot Import as", lexSlotZipFile)

Output data

When I run the Python script against the Sherlock Holmes e-book, it stores the clean data in a format ready for import to Lex as a new slot. This takes the form of a .zip file called trainingText.zip. It’s stored in the same directory as my original Python script. This .zip file is a compressed JSON file that can be directly imported into Lex to create the new slot with the enumeration values (all 100 of them!).
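Before importing the archive, it can be worth confirming that it holds what Lex expects. The following is a minimal sketch, assuming the file names used by the script above and the 140-character limit quoted earlier; the helper name check_slot_archive is illustrative and not part of the original script.

```python
import json
import zipfile

MAX_SLOT_VALUE_CHARS = 140  # limit quoted earlier in this post


def check_slot_archive(zip_path, json_name="trainingText.json"):
    """Return (number of slot values, length of the longest value) for a zipped slot file."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(json_name) as handle:
            data = json.load(handle)

    values = [entry["value"] for entry in data["resource"]["enumerationValues"]]
    too_long = [v for v in values if len(v) > MAX_SLOT_VALUE_CHARS]
    if too_long:
        raise ValueError(
            "{} slot values exceed {} characters".format(len(too_long), MAX_SLOT_VALUE_CHARS)
        )
    longest = max((len(v) for v in values), default=0)
    return len(values), longest
```

Calling check_slot_archive("trainingText.zip") in the directory where the script ran returns the number of enumeration values and the length of the longest one, or raises an error if any value is too long for a slot.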

Step 2. Creating the Amazon Lex translator bot

In this step, I create the framework for Amazon Lex to receive text and process it via Amazon Translate, presenting the translation to the user.

In the AWS Management Console, I select the US East (N. Virginia) Region (us-east-1), go to Amazon Lex, and choose to create a Custom bot. I populate the bot details as follows, accept the defaults for the IAM role, and then choose Create.

A note on Amazon Lex Intents, Utterances, Slots and Prompts

To build an Amazon Lex bot, you need to identify a set of actions—known as intents—that you want your bot to fulfill. A bot can have multiple intents. For example, a BookTickets bot can have intents to make reservations, cancel reservations, and review reservations.

An utterance is the spoken or typed phrase to invoke an intent. For example, to invoke the intent to make reservations, you would provide a sample utterance such as, “Can I make a reservation?”

To fulfill an intent, the Amazon Lex bot needs information from the user. This information is captured in slots. For example, you would define the show name and time as slots for the intent to make reservations.

Amazon Lex elicits the defined slots by using the prompts provided. For example, to elicit a value for the time slot, you define a prompt such as “What show time would you like to reserve?” Amazon Lex is capable of eliciting multiple slot values via a multi-turn conversation.

You can learn more about these features of Amazon Lex from the AWS Developer Guide at Amazon Lex: How It Works.

To get started with my translator bot, I first have to create an intent. This is the action that I want the bot to perform. I choose Create Intent and then add a new intent via the Create intent option. I provide a name for the intent. So that the Lambda code later in this post can identify the intent being called, I name my intent translate_phrase.

 

Next, I add sample utterances to capture the user request to translate. In this case I’m going to provide several different ways that a user could invoke the translation intent.

Amazon Lex uses the above as a guide when determining the intent of a user. The curly brackets for {source_lang} and {target_lang} tell Amazon Lex to expect input in the form of Slots. In the above example, I have two slots to populate: source_lang and target_lang. These are used to later tell Amazon Translate the languages to translate between.
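To make the placeholder idea concrete, the sketch below mimics it with a regular expression. It is only an illustration and not how Amazon Lex matches utterances internally; the sample utterance and the helper name extract_slots are assumptions made for this example.

```python
import re


def extract_slots(utterance_template, user_text):
    """Match user_text against a template such as 'Translate from {a} to {b}'."""
    # Splitting on {name} placeholders leaves the slot names at the odd indexes,
    # because the capture group is kept in the result of re.split
    parts = re.split(r"\{(\w+)\}", utterance_template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2:
            pattern += "(?P<{}>.+?)".format(part)  # slot placeholder
        else:
            pattern += re.escape(part)             # literal text
    match = re.fullmatch(pattern, user_text, flags=re.IGNORECASE)
    return match.groupdict() if match else None


# Hypothetical sample utterance using the two slot names from this post
template = "Translate from {source_lang} to {target_lang}"
print(extract_slots(template, "translate from English to French"))
# {'source_lang': 'English', 'target_lang': 'French'}
```

Lex itself is far more tolerant of variation than a fixed pattern, which is why several sample utterances are enough to guide it.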

I need to create a Slot Type for the source and target languages. For an up-to-date list of supported languages, see What is Amazon Translate.

From the left-hand side menu, choose the + symbol beside Slot Types, then Create Slot Type. I use the parameters in the following screenshot with the slot type name set to desiredLanguage, and select the Slot Resolution option Expand Values.

I add the following values, one per slot:

  • arabic
  • simplified chinese
  • traditional chinese
  • czech
  • english
  • french
  • german
  • italian
  • japanese
  • portuguese
  • russian
  • spanish
  • turkish

After I populate the slot, I choose Add slot to intent.

I can now add slots to the intent along with a prompt to elicit input.

I create a slot called source_lang, with a prompt to capture the source language for the translation request, and another slot called target_lang, with a prompt for the user to provide the target language to convert to. After I add these slots, the placeholders in my utterances are highlighted.

Creating the custom text input slot

By default, Amazon Lex expects values that are exactly or very similar to those entered as slot values. Unlike Alexa’s AMAZON.SearchQuery slot, there is no equivalent Amazon Lex slot option for more open text input. Instead, I overload the slot values of Amazon Lex to accept any input by providing a lot of slot values, many more than you might normally assign to a slot. This does limit which additional slots the chatbot can respond to. Effectively, everything matches the custom slot after the source and target languages have been set, so I use custom logic in the Lambda function instead.

To achieve this, I import a new slot type using the trainingText.zip file created from the Python script that was used to clean the Sherlock Holmes e-book. This creates a Lex slot called trainingText that is populated with the 100 enumeration values.

Now that the custom slot trainingText has been created, I’ll create a new required slot called phrase, which will accept the text that I want to translate.

To test whether the custom slot works, on the Amazon Lex console, I choose Build. After the build is complete, I test the interaction in the Test Bot window to ensure it recognizes the correct slot inputs.

You can also choose Detail to view the JSON request. After I see my custom text appear in the test chatbot, I know the inputs are working and I can move on to validation and translation of the text via AWS Lambda and Amazon Translate.

Step 3. Creating the Lambda function to translate text

An AWS Lambda function validates the input, extracts the slot text supplied via the chatbot, calls the Amazon Translate API, and returns the result to Lex. To speed up development, I started from an existing AWS Lambda blueprint for Amazon Lex, lex-book-trip-python. The Lambda function code below handles validation of input from Amazon Lex. The code that integrates the translation is very small: a single call to the Amazon Translate API, translate_text, performs the translation.

# -*- coding: utf-8 -*-
"""
This function interacts with Amazon Lex and the Amazon Translate AWS services to receive input speech 
and then translate it to the requested language. 
Current supported translations as of July 2018 are:

English to
    Arabic (ar)
    Chinese (Simplified) (zh)
    Chinese (Traditional) (zh-TW)
    Czech (cs)
    French (fr)
    German (de)
    Italian (it)
    Japanese (ja)
    Portuguese (pt)
    Russian (ru)
    Spanish (es)
    Turkish (tr)

Please see the Amazon Translate API docs for an up to date list: https://docs.aws.amazon.com/translate/latest/dg/API_TranslateText.html 
"""

from __future__ import print_function
import boto3
import logging


logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

# boto3 translate client
translate = boto3.client('translate')

# Use a dictionary to map supported languages with their ISO639-1 codes
lang_map = {
    'arabic': 'ar', 
    'simplified chinese': 'zh',
    'traditional chinese': 'zh-TW',
    'czech': 'cs',
    'english': 'en',
    'french': 'fr',
    'german': 'de',
    'italian': 'it',
    'japanese': 'ja',
    'portuguese': 'pt',
    'russian': 'ru',
    'spanish': 'es',
    'turkish': 'tr'
    }

# --------------- Helpers that build all of the responses ----------------------

def get_slots(intent_request):
    return intent_request['currentIntent']['slots']

def elicit_slot(session_attributes, intent_name, slots, slot_to_elicit, message):
    logger.debug('Elicit Slot function intentName: {}'.format(intent_name))
    logger.debug('Elicit Slot function slots: {}'.format(slots))
    logger.debug('Elicit Slot function slotToElicit: {}'.format(slot_to_elicit))
    logger.debug('Elicit Slot function message: {}'.format(message))
    
    
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'ElicitSlot',
            'intentName': intent_name,
            'slots': slots,
            'slotToElicit': slot_to_elicit,
            'message': message
        }
    }


def confirm_intent(session_attributes, intent_name, slots, message):
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'ConfirmIntent',
            'intentName': intent_name,
            'slots': slots,
            'message': message
        }
    }


def close(session_attributes, fulfillment_state, message):
    response = {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'Close',
            'fulfillmentState': fulfillment_state,
            'message': message
        }
    }

    return response


def delegate(session_attributes, slots):
    return {
        'sessionAttributes': session_attributes,
        'dialogAction': {
            'type': 'Delegate',
            'slots': slots
        }
    }

# --------------- Functions that control the skill's behavior ------------------
def try_ex(func):
    """
    Call passed in function in try block. If KeyError is encountered return None.
    This function is intended to be used to safely access dictionary.

    Note that this function would have negative impact on performance.
    """

    try:
        return func()
    except KeyError:
        return None

# return JSON-formatted descriptive response if validation fails
def build_validation_result(isvalid, violated_slot, message_content):
    return {
        'isValid': isvalid,
        'violatedSlot': violated_slot,
        'message': {'contentType': 'PlainText', 'content': message_content}
    }

# validate source and target languages 
def validate_languages(source_lang, target_lang):
    if source_lang is not None and source_lang not in lang_map:
        return build_validation_result(False,
                                       'source_lang',
                                       'We do not currently support {} as a source language.'.format(source_lang))

    if target_lang is not None and target_lang not in lang_map:
        return build_validation_result(False,
                                       'target_lang',
                                       'We do not currently support {} as a target language.'.format(target_lang))

    return build_validation_result(True, None, None)

# Check phrase is not empty
def validate_phrase(phrase, source_lang, target_lang):
    if phrase is not None: 
        logger.debug('***** phrase is not none *****')
        return build_validation_result(True, None, None)
    else:
        logger.debug('***** phrase is none *****')
        return build_validation_result(False,
                                       'phrase',
                                       'Translating from {} to {}. What is the phrase you would like to translate?'.format(source_lang.capitalize(), target_lang.capitalize()))

# --------------- Main Translator function ------------------

def translatePhrase(intent_request):
    """
    Performs dialog management and fulfillment for translating a phrase.

    Uses Amazon Translate to translate a phrase from a source -> destination phrase and provide to customer
    """

    slots = get_slots(intent_request)
    # Slot values are None until the user has supplied them, so check before calling lower()
    source_lang = slots['source_lang'].lower() if slots['source_lang'] else None
    target_lang = slots['target_lang'].lower() if slots['target_lang'] else None
    phrase = slots['phrase']
    source = intent_request['invocationSource']

    logger.debug('Intent Request={}'.format(intent_request))
    logger.debug('source_lang={}'.format(source_lang))
    logger.debug('target_lang={}'.format(target_lang))
    logger.debug('Phrase={}'.format(phrase))
    logger.debug('inputTranscript={}'.format(intent_request['inputTranscript']))

    # Set session attributes for the value of source and target languages
    session_attributes = intent_request['sessionAttributes'] if intent_request['sessionAttributes'] is not None else {}

    # If a language is still missing, hand control back to Lex so that it prompts for it
    if source == 'DialogCodeHook' and (source_lang is None or target_lang is None):
        return delegate(session_attributes, slots)


    if source == 'DialogCodeHook':
        # Validate any slots which have been specified.  If any are invalid, re-elicit for their value
        slots = get_slots(intent_request)

        validation_result = validate_languages(source_lang, target_lang) #check languages are valid and supported
        if not validation_result['isValid']:
            slots[validation_result['violatedSlot']] = None
            return elicit_slot(intent_request['sessionAttributes'],
                               intent_request['currentIntent']['name'],
                               slots,
                               validation_result['violatedSlot'],
                               validation_result['message'])
        logger.debug('Validated Languages')        
        
        # Convert languages to ISO639-1 codes for Amazon Translate
        sourceISO = lang_map[source_lang]
        targetISO = lang_map[target_lang]

        logger.debug('sourceISO={}'.format(sourceISO))
        logger.debug('targetISO={}'.format(targetISO))
        #Set session attributes for the value of source and target ISO codes
        session_attributes['source_lang'] = sourceISO
        session_attributes['target_lang'] = targetISO

        phrase = get_slots(intent_request)['phrase']
        logger.debug('Phrase in dialogCodeHook code={}'.format(phrase))
        valid_phrase = validate_phrase(phrase, source_lang, target_lang)
        if not valid_phrase['isValid']:
            slots[valid_phrase['violatedSlot']] = None
            # Pass the updated session_attributes so the ISO codes set above are kept
            return elicit_slot(session_attributes,
                               intent_request['currentIntent']['name'],
                               slots,
                               valid_phrase['violatedSlot'],
                               valid_phrase['message'])
        logger.debug('Validated Phrase')
        
        # Phrase is valid. Convert phrase to translated text then reset slot to none for next translation
        if phrase is not None:
            translation = try_ex(lambda: translate.translate_text(Text=phrase, SourceLanguageCode=sourceISO, TargetLanguageCode=targetISO))
            translatedText = str(translation.get('TranslatedText'))
            logger.debug('Translated Text is={}'.format(translatedText))
            print("Translation is: ", translatedText)

            # Clear the phrase slot itself (slot values live under currentIntent['slots'])
            slots['phrase'] = None
            logger.debug('Set phrase slot to None and returning elicit slot with translation')
            logger.debug('Phrase={}'.format(slots['phrase']))

            return elicit_slot(session_attributes,
                               intent_request['currentIntent']['name'],
                               slots,
                               'phrase',
                               {
                               'contentType': 'PlainText',
                               'content': translatedText
                               })
        logger.debug('Translated Phrase')
        return delegate(session_attributes, intent_request['currentIntent']['slots'])

    # Fulfillment: the DialogCodeHook branch above did not run, so look up the ISO codes here
    sourceISO = lang_map[source_lang]
    targetISO = lang_map[target_lang]

    # Convert phrase to translated text
    translation = try_ex(lambda: translate.translate_text(Text=phrase, SourceLanguageCode=sourceISO, TargetLanguageCode=targetISO))
    translatedText = str(translation.get('TranslatedText'))

    logger.debug('Translated Text is={}'.format(translatedText))
    print("Translation is: ", translatedText)
    

    return close(
        session_attributes,
        'Fulfilled',
        {
            'contentType': 'PlainText',
            'content': 'Thank you for using the translator bot'
        }
    )


# --- Lex intent capture ---


def dispatch(intent_request):
    """
    Called when the user specifies an intent for this bot.
    """

    logger.debug('dispatch userId={}, intentName={}'.format(intent_request['userId'], intent_request['currentIntent']['name']))

    intent_name = intent_request['currentIntent']['name']

    # Dispatch to your bot's intent handlers
    if intent_name == 'translate_phrase':
        return translatePhrase(intent_request)
    raise Exception('Intent with name ' + intent_name + ' not supported')


# --- Main lambda handler ---


def lambda_handler(event, context):
    """
    Route the incoming request based on intent.
    The JSON body of the request is provided in the event slot.
    """
    logger.debug('event.bot.name={}'.format(event['bot']['name']))

    return dispatch(event)
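Stripped of the dialog handling, the translation step in the listing above reduces to one translate_text call. The following is a minimal sketch of that call in isolation; the function name translate_phrase and the three-language LANGUAGE_CODES subset are assumptions for illustration, and the client is expected to be one created with boto3.client('translate').

```python
# Subset of the lang_map dictionary from the listing above
LANGUAGE_CODES = {"english": "en", "french": "fr", "spanish": "es"}


def translate_phrase(client, phrase, source_lang, target_lang):
    """Translate phrase with an Amazon Translate client and return the translated text."""
    response = client.translate_text(
        Text=phrase,
        SourceLanguageCode=LANGUAGE_CODES[source_lang.lower()],
        TargetLanguageCode=LANGUAGE_CODES[target_lang.lower()],
    )
    return response["TranslatedText"]


# Usage, assuming AWS credentials and a Region that supports Amazon Translate:
#   import boto3
#   print(translate_phrase(boto3.client("translate"), "Good morning", "English", "French"))
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object before any AWS resources exist.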

To integrate the Lambda code with Lex, I first create a new IAM role for the Lambda function that allows it to call the Amazon Translate APIs. I do this using the IAM Management Console, by creating a role called “LambdaFullwithTranslate” and attaching the following AWS managed policies:

  • AWSLambdaFullAccess
  • TranslateReadOnly

To create a Lambda function with the above code, I ensure I’m in the same region as my Amazon Lex bot (N. Virginia), and then create a new Lambda function called lexTranslate by selecting “Author from Scratch” and using the below as input parameters:

In the above screenshot, I ensure that the existing role attached is the IAM role that I created previously to give the function access to supporting Lambda services and Amazon Translate. To finish, I choose Create Function.

On the Lambda configuration page, I leave everything at its default and, in the Function Code section, select the code entry type Edit code inline. I choose Python 2.7 as the runtime (the code also runs on a Python 3 runtime if Python 2.7 is not offered) and keep the default handler name. I then paste the Python code above into the lambda_function.py window.

When ready, I choose Save to store the new Lambda function.
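One way to try the function before wiring it to the bot is to configure a test event in the Lambda console. The sketch below builds an event containing the fields that the code above reads (currentIntent, invocationSource, sessionAttributes, inputTranscript, userId, and bot). The helper name build_test_event and the sample values, including the user ID and bot alias, are made up for illustration.

```python
import json


def build_test_event(source_lang, target_lang, phrase=None, invocation_source="DialogCodeHook"):
    """Build a dictionary with the fields the Lambda code above reads from a Lex request."""
    return {
        "messageVersion": "1.0",
        "invocationSource": invocation_source,
        "userId": "test-user",  # hypothetical user ID
        "sessionAttributes": {},
        "bot": {"name": "translatorBot", "alias": "$LATEST", "version": "$LATEST"},
        "outputDialogMode": "Text",
        "currentIntent": {
            "name": "translate_phrase",
            "slots": {
                "source_lang": source_lang,
                "target_lang": target_lang,
                "phrase": phrase,  # None until the user has supplied text to translate
            },
            "confirmationStatus": "None",
        },
        "inputTranscript": phrase or "Translate from {} to {}".format(source_lang, target_lang),
    }


# Print JSON that can be pasted into a Lambda console test event
print(json.dumps(build_test_event("english", "french"), indent=2))
```

With phrase left as None, the handler above should answer with an ElicitSlot action for the phrase slot; with a phrase supplied, it should return the translation in the message content.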

Adding the Lambda function to the Amazon Lex bot

To add the Lambda function to the Amazon Lex bot, I reference the name of the Lambda function (lexTranslate) for both the Lambda initialization and validation section, and again for the Fulfillment section.

I choose Build and try testing the Lex bot again in the test chatbot window to translate a phrase. I now see results for any text that I enter for translation.
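The same check can be scripted instead of typed into the test window. The sketch below sends a short conversation through the post_text operation of the Amazon Lex runtime client in boto3. The function name converse is hypothetical, and the bot alias and user ID passed to it are assumptions that depend on how the bot has been published.

```python
def converse(client, bot_name, bot_alias, user_id, utterances):
    """Send each utterance to the bot in turn and collect (dialogState, message) pairs."""
    replies = []
    for text in utterances:
        response = client.post_text(
            botName=bot_name,
            botAlias=bot_alias,
            userId=user_id,
            inputText=text,
        )
        replies.append((response.get("dialogState"), response.get("message")))
    return replies


# Usage, assuming AWS credentials, the us-east-1 Region, and a usable alias for the bot:
#   import boto3
#   lex = boto3.client("lex-runtime", region_name="us-east-1")
#   for state, message in converse(lex, "translatorBot", "$LATEST", "test-user",
#                                  ["Translate from English to French", "Good morning"]):
#       print(state, message)
```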

 

Step 4. Amazon Lex web UI integration

For the user interface, I’ll use the Sample Amazon Lex Web Interface to create the interactive web-based UI for Lex. This is an easy way to rapidly add a customizable web-based front end to the Amazon Lex chat bot. The interface is deployed via a CloudFormation template and to make this easy, there is a Launch Stack button that will get you going in just one click.

By default, the CloudFormation template creates a sample Amazon Lex bot and an Amazon Cognito Identity Pool to get you started. It copies the chatbot UI web application to an Amazon S3 bucket including a dynamically created configuration file. The CloudFormation stack outputs links to the demo and related configuration after it’s deployed.

Deploying the CloudFormation stack

Starting on the GitHub readme page for the Amazon Lex web UI, I choose Launch Stack. My browser opens the AWS CloudFormation page with the template field pre-populated. I verify that I’m in the same Region as my chatbot (N. Virginia).

  1. On Select Template, leave all as default and choose Next
  2. On Specify Details, scroll to Lex Bot Configuration Parameters and enter the name of the Amazon Lex bot that I created earlier (translatorBot)
  3. Scroll further to Web Application Parameters to customize the text that appears when the bot is first started
    1. Set WebAppConfBotInitialText to: Welcome to the translator bot. You can ask me to translate between following languages: Arabic, Simplified Chinese, Traditional Chinese, Czech, English, French, German, Italian, Japanese, Portuguese, Russian, Spanish, and Turkish. To start please set the source and target languages. For example, to translate from English to French, enter: Translate from English to French
    2. Set WebAppConfBotInitialSpeech to: Welcome to the translator bot. To start please set the source and target languages. For example, to translate from English to French, say: Translate from English to French
    3. Set WebAppConfToolbarTitle to: Amazon Translator Bot
  4. Choose Next
  5. On Options leave defaults and choose Next
  6. On Review, I ensure I’m referencing the correct Amazon Lex bot and I’m happy with the translation prompts, select the check box to acknowledge the creation of IAM resources with custom names, and then choose Create

After the CloudFormation stack has been created, I select the master CloudFormation stack, then choose the Outputs tab. I scroll down to the key named WebAppUrl; its value is the URL of my web chatbot. I can click this link and test my new translator chatbot.
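The URL can also be read programmatically from the stack outputs. The following is a minimal sketch that picks one output out of the response returned by the describe_stacks operation of the boto3 CloudFormation client; the helper name find_stack_output is hypothetical, and the stack name in the usage comment is a placeholder for whatever name was chosen at launch.

```python
def find_stack_output(describe_stacks_response, output_key="WebAppUrl"):
    """Return the value of output_key from a describe_stacks response, or None if absent."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output.get("OutputValue")
    return None


# Usage, assuming AWS credentials and the name given to the master stack at launch:
#   import boto3
#   cloudformation = boto3.client("cloudformation", region_name="us-east-1")
#   response = cloudformation.describe_stacks(StackName="<your-stack-name>")
#   print(find_stack_output(response))
```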

Conclusions

This post shows how to create a web-based chatbot that can perform translations. You are now able to build a chatbot that can take in any text outside of slot values and pass those values through to Amazon Translate. Future expansion of the bot could include use of DynamoDB to store commonly requested translations to reduce cost of translation. As usual, it is a good practice to test out the application and make sure that it works as expected before deploying for production. Now dive in and try this yourself!
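As a starting point for the caching idea mentioned above, the lookup can be sketched without committing to a storage layer. In the example below a plain dictionary stands in for a DynamoDB table, and the class name CachedTranslator and the translate_fn callable are hypothetical; a real implementation would also need to decide on expiry and on how to normalize phrases.

```python
class CachedTranslator:
    """Reuse earlier translations so repeated phrases are not translated again."""

    def __init__(self, translate_fn, store=None):
        # translate_fn is any callable(phrase, source_code, target_code) returning a string
        self.translate_fn = translate_fn
        # A dictionary stands in for a persistent store such as a DynamoDB table
        self.store = {} if store is None else store

    def translate(self, phrase, source_code, target_code):
        key = (source_code, target_code, phrase.strip().lower())
        if key not in self.store:
            self.store[key] = self.translate_fn(phrase, source_code, target_code)
        return self.store[key]
```

Wrapping the translate_text call from the Lambda function in translate_fn would mean that only the first request for a given phrase and language pair reaches Amazon Translate.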


 


About the Author

Steve is an Enterprise Solution Architect at Amazon Web Services in the UK, with a special interest in machine learning, artificial intelligence, and deep learning. He helps enterprise and startup customers across a variety of sectors including fintech, travel & transport, retail and media build highly scalable, flexible and resilient cloud architectures.