AWS Marketplace

Automating Know Your Customer workflow using a ready-to-use model package from AWS Marketplace for Machine Learning

by Nhat (Jeff) Hoang, VP of Product Design, GTRIIP and John-Michael Floyd, Business Development Manager, AWS Marketplace

AWS Marketplace for Machine Learning makes it easy for AWS customers to automate business decisions with hundreds of curated, performance-optimized machine learning algorithms and model packages. You can discover machine learning solutions from over 50 categories serving industries such as finance, healthcare, manufacturing, media, construction, and more.

In this blog post, we will demonstrate how to leverage machine learning to deliver efficiencies in one aspect of the travel and hospitality compliance process. In these industries, Know Your Customer (KYC) is a common compliance procedure, where companies are required to capture specific user information during airline or hotel check-in. We will show you how to automate and scale the KYC process using GTRIIP’s Passport data page validation model package from AWS Marketplace for Machine Learning.

This model enables you to validate a user’s passport image and extract that user’s information from the image for your own data capture verification process. Here is a step-by-step guide to subscribing to and using the model.

Step 1: Find a suitable model package in AWS Marketplace

As shown in the image below, you can discover machine learning algorithms and model packages in AWS Marketplace by selecting Amazon SageMaker under “Delivery Methods.”

Then type “passport data page validation” in the search box at the top of the page to find the GTRIIP passport data validation model package page, as shown in the screenshot below.

Step 2: Subscribe to GTRIIP’s Passport Data Page Detection model package

Select any of the available products to go to the product detail page, where you can learn more about the product, read usage instructions, and compare pricing. From the product detail page of the product you wish to purchase, click Continue to Subscribe, as shown in the image below. Then review the End User License Agreement (EULA) and pricing, and click the “Accept Offer” button to complete the subscription process.

Step 3: Deploy your product on Amazon SageMaker

Now you can deploy the selected product on Amazon SageMaker. You can use the notebook provided by GTRIIP or use the Amazon SageMaker console to create an endpoint for your model package.
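If you prefer to script the deployment rather than use the notebook or the console, the same endpoint can be created with the AWS SDK for Python (Boto3). The following is a minimal sketch, not GTRIIP’s own notebook code. The instance type `ml.m5.xlarge`, the variant name `AllTraffic`, and the resource name `passport-validation` are assumptions for illustration; check the product detail page for the instance types the model package actually supports, and supply your own model package ARN and IAM role ARN.

```python
def build_deployment_requests(model_package_arn, role_arn, name,
                              instance_type="ml.m5.xlarge", instance_count=1):
    """Build the keyword arguments for the three SageMaker API calls.

    instance_type is an assumed default; use a type listed as supported
    on the product detail page.
    """
    model = {
        "ModelName": name,
        # A subscribed model package is referenced by its ARN.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        "ExecutionRoleArn": role_arn,
        # AWS Marketplace model packages run with network isolation enabled.
        "EnableNetworkIsolation": True,
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # assumed variant name
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy(model_package_arn, role_arn, name="passport-validation"):
    """Create the model, endpoint configuration, and endpoint, then wait."""
    import boto3  # imported here so the request builder works without AWS access

    sm = boto3.client("sagemaker")
    model, endpoint_config, endpoint = build_deployment_requests(
        model_package_arn, role_arn, name)
    sm.create_model(**model)
    sm.create_endpoint_config(**endpoint_config)
    sm.create_endpoint(**endpoint)
    # Block until the endpoint reports the InService status.
    sm.get_waiter("endpoint_in_service").wait(EndpointName=name)
    return name
```

Calling `deploy(...)` with your two ARNs creates billable resources, so delete the endpoint when you no longer need it.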

Step 4: Prepare the image for processing

GTRIIP’s Passport Data Page Detection model package allows you to validate a user’s passport image and extract that user’s information from the passport image. To ensure the best results, make sure to follow the input guidelines below:

  1. Image size: minimum of 800 x 600 pixels. The size of the passport page inside your image must be at least 800 x 600 pixels. This is to ensure that text data is visible and clear enough for the engine to recognize and extract the information from.
  2. Passport must be the largest object in the image. Users photograph their passports in many different ways, especially with a phone camera, so the engine must first recognize the passport page and extract it from the background before doing any further image processing to validate the data. Keeping the passport as the largest object in the frame makes that step reliable.
  3. Clear background and a clearly visible passport page. For the best results, photograph the passport against a clear, uncluttered background, with nothing covering the passport page and no reflections or glare on it.
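The first guideline can be partly checked before an image is sent to the endpoint. The sketch below is an illustration rather than part of the model package: it assumes the Pillow library is installed for reading image dimensions, and it assumes that either orientation is acceptable (long side at least 800 pixels, short side at least 600). Because the guideline applies to the passport page inside the image, passing this whole-image check is necessary but not sufficient.

```python
MIN_LONG_SIDE = 800   # pixels, from the input guidelines
MIN_SHORT_SIDE = 600  # pixels, from the input guidelines


def meets_minimum_size(width, height):
    """Return True if the dimensions are at least 800 x 600 in either orientation.

    Accepting both orientations is an assumption; the guidelines only
    state "800 x 600".
    """
    long_side, short_side = max(width, height), min(width, height)
    return long_side >= MIN_LONG_SIDE and short_side >= MIN_SHORT_SIDE


def check_image_file(path):
    """Open an image file and check its overall dimensions."""
    from PIL import Image  # Pillow; imported here so the size check has no dependencies

    with Image.open(path) as img:
        width, height = img.size
    return meets_minimum_size(width, height)
```

A client application could call `check_image_file` right after the user takes the photo and ask for a retake when it returns `False`, which avoids paying for an inference that is likely to fail.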

Step 5: Send the request to your model package deployed on Amazon SageMaker

After preparing your inputs, you can send them to the Amazon SageMaker endpoint to analyze the images and extract user information from them. Here we invoke the “passport-validation” endpoint using the AWS SDK for Python (Boto3) and display the result:

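import json

import boto3

# Placeholder values: replace with your own image path and endpoint name.
file_name = 'passport.jpg'
endpoint_name = 'passport-validation'

# Client for the SageMaker runtime API, which serves deployed endpoints.
runtime = boto3.client('sagemaker-runtime')

# Read the passport image as raw bytes.
with open(file_name, 'rb') as f:
    payload = f.read()

response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType='application/x-image',
                                   Body=payload)

# The response body is a stream that contains the JSON result.
result = json.loads(response['Body'].read())
print(json.dumps(result, indent=2))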

The result has the following format:

{
  "valid": true,
  "reason": "",
  "data": {
    "country": "China",
    "number": "KOOOOOOOO",
    "date_of_birth": "08/08/1980",
    "expiration_date": "05/02/2017",
    "nationality": "CHN",
    "sex": "Female",
    "names": "KWOK SUM",
    "surname": "CHUNG"
  }
}

In the above result, the product successfully extracted the country, passport number, date of birth, expiration date, nationality, sex, given names, and surname from the passport image. With this data extracted, you can apply the model to any use case in your business where you need to quickly and easily validate a new user’s identity.
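A response in this format can be turned into a simple accept-or-reject decision in application code. The sketch below is an assumption about how you might consume the result, not part of the model package: the function name `parse_validation_result` and the exception `PassportRejected` are hypothetical, and the code assumes that `reason` carries an explanation when `valid` is false.

```python
import json


class PassportRejected(Exception):
    """Raised when the model reports that the passport image is not valid."""


def parse_validation_result(raw):
    """Return the extracted fields, or raise PassportRejected with the reason.

    raw is the JSON response body as bytes or str.
    """
    result = json.loads(raw)
    if not result.get("valid", False):
        # Assumption: "reason" explains the failure when "valid" is false.
        raise PassportRejected(result.get("reason") or "no reason given")
    return result.get("data", {})


# Sample response, using the values from the result shown in this post.
sample = b'''{
  "valid": true,
  "reason": "",
  "data": {
    "country": "China",
    "number": "KOOOOOOOO",
    "date_of_birth": "08/08/1980",
    "expiration_date": "05/02/2017",
    "nationality": "CHN",
    "sex": "Female",
    "names": "KWOK SUM",
    "surname": "CHUNG"
  }
}'''

data = parse_validation_result(sample)
print(data["surname"], data["names"])
```

Raising an exception for rejected images keeps the check-in flow explicit: the calling code either receives a dictionary of extracted fields or has to handle the rejection, for example by asking the user to retake the photo.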

Conclusion

In this post, we showed you how to use GTRIIP’s Passport Data Page Detection model package, available in AWS Marketplace for Machine Learning, to automate passport image validation and data extraction.

Next steps

To find out more, browse GTRIIP’s products in AWS Marketplace, read the buyer’s guide for AWS Marketplace for Machine Learning, ask a question in the AWS Marketplace forum, or contact the AWS Marketplace team if you cannot find a solution that fits your needs!