AWS Machine Learning Blog
Extracting handwritten information through Amazon Textract
This post is guest authored by Vibhav Sangam Gupta, a Client Solutions Partner at Quantiphi, Inc. Quantiphi is an award-winning applied AI and data science software and services company driven by the desire to solve transformational problems at the heart of business. We are passionate about our customers and obsessed with problem-solving to make products smarter, customer experiences frictionless, processes autonomous and businesses safer.
Over the past few years, businesses have experienced a steep rise in the volume of documents they have to deal with. These include a wide range of structured and unstructured text spread across different document formats. Processing these documents and extracting data is labor-intensive and costly. It involves complex operations that can easily go wrong, leading to regulatory breaches and hefty fines. Many digitally mature organizations, therefore, have started using intelligent document processing solutions.
Quantiphi has been a part of this transformation and has witnessed higher adoption of QDox, our document processing solution built on top of Amazon Textract. It extracts information to gain business insights and automate downstream business processes. We have helped customers across insurance, healthcare, financial services, and manufacturing automate loan processing, patient onboarding, and compliance management, to name a few.
Although these solutions have solved some crucial problems for businesses in reducing their manual efforts, extracting information from handwritten text has been a challenge. This is primarily because handwritten texts come with their own set of complexities, such as:
- Differences in handwriting styles
- Poor quality or illegible handwriting
- Joined or cursive handwritten text
- Compression or expansion of text
These challenges make it difficult for companies to capture data correctly and gain meaningful insights from it.
Use case: Insurance provider
Recently, one of our customers, a large supplemental insurance provider based in the US, was facing a similar challenge in extracting vital information from doctors’ handwritten notes, which accounted for 20% of their total documents. Initially, their staff manually sifted through the documents to decide on each claim payout, which took 5–6 days per claim, so they asked us to automate the process. As part of this effort, we built a solution to extract printed and handwritten text from several supporting documents to verify the claim. To ease the process for policyholders, we built a user interface with a conversational agent that could request the supporting documents needed to process the claim. The solution extracted information from the supporting documents, such as the claim application, doctor’s notes, and invoices, to validate the claim.
The following diagram illustrates the process flow.
The solution reduced manual intervention by over 70%, but extracting and validating information from a doctor’s handwritten note was still a challenge. Extraction accuracy on these notes was low, so a person had to validate the information, which reduced process efficiency.
Solution: Amazon Textract
As an AWS partner, we reached out to the Amazon Textract product team with a request for handwriting recognition support. They assured us they were developing a solution to address such challenges. When Amazon Textract came out with a beta version of the product for handwritten text, we were among the first to get private access. The Textract team worked closely with us and iterated quickly to improve the accuracy for a wide variety of documents. Below is an example of one of our documents that Textract recognized. Our customers are also pleased that it performed even better than the other handwriting recognition services we tested for them.
We used the Amazon Textract handwriting beta version with a few sample customer documents, and we saw it improved the accuracy of the entire process by over 90%, while reducing manual efforts significantly. This enabled us to expand the scope of our platform to additional offices of our client.
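For readers who want to try something similar, the following is a minimal sketch, not the QDox implementation. It assumes the generally available Amazon Textract `DetectDocumentText` API called through boto3, whose `WORD` blocks carry a `TextType` of `PRINTED` or `HANDWRITING` and a `Confidence` score. The helper names, the `review_below` threshold, and the sample response are hypothetical and only illustrate how handwritten words might be separated and routed to a human reviewer.

```python
from collections import defaultdict


def detect_text(image_bytes):
    """Call Amazon Textract on one image (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helper below works offline

    client = boto3.client("textract")
    return client.detect_document_text(Document={"Bytes": image_bytes})


def split_words_by_text_type(response, review_below=80.0):
    """Group WORD blocks by TextType and collect low-confidence words.

    review_below is a hypothetical threshold, not a value from this post.
    """
    words = defaultdict(list)
    needs_review = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue  # skip PAGE and LINE blocks
        text_type = block.get("TextType", "UNKNOWN")
        words[text_type].append(block["Text"])
        if block.get("Confidence", 0.0) < review_below:
            needs_review.append(block["Text"])
    return dict(words), needs_review


# Hand-written sample shaped like a Textract response (not real output).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Diagnosis: sprain"},
        {"BlockType": "WORD", "Text": "Diagnosis:",
         "TextType": "PRINTED", "Confidence": 99.1},
        {"BlockType": "WORD", "Text": "sprain",
         "TextType": "HANDWRITING", "Confidence": 72.4},
    ]
}

words, needs_review = split_words_by_text_type(sample_response)
print(words)
print(needs_review)
```

In a real pipeline, `detect_text` would supply the response, and the words in `needs_review` would go to a person for validation rather than straight into the claim decision.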
Armed with the success of our customers, we’re planning to implement the Amazon Textract handwriting solution in different processes across industries. As the product is set to launch, we believe that implementation will become much easier and the results will improve considerably.
Summary
Overall, our partnership with AWS has helped us solve some challenging business problems and bring real value to our customers. We plan to continue working with AWS to take on more of them.
There are many ways to get started with Amazon Textract: reach out to our AWS Partner Quantiphi, reach out to your account manager or solutions architect, or visit our Amazon Textract product page and learn more about the resources available.
About the Author
Vibhav Sangam Gupta is a Client Solutions Partner at Quantiphi, Inc., an applied AI and machine learning software and services company focused on helping organizations translate the big promise of big data and machine learning technologies into quantifiable business impact.