Posted On: Nov 2, 2022

Amazon Textract is a machine learning service that automatically extracts text, handwriting, and data from any document or image. We continuously improve the underlying machine learning models based on customer feedback to provide even better accuracy. Today, we are pleased to announce quality enhancements to our text and forms extraction feature available via the AnalyzeDocument API.

Amazon Textract now provides improved key-value pair extraction accuracy, particularly for single-character boxed forms commonly found in documents such as tax and immigration forms. These documents have traditionally been challenging to extract information from because of the way words are captured in boxes. Textract now uses its knowledge of these single-character boxed forms to provide higher accuracy in key-value pair extraction.

Additionally, we are pleased to announce support for E13B fonts commonly found in deposit checks/cheques, as well as accuracy improvements in detecting International Bank Account Numbers (IBANs) found in banking documents and long words (e.g., email addresses), all via the AnalyzeDocument API. Customers across industries like insurance, healthcare, and banking use these documents in their business processes and will automatically see the benefits of this update when they use Textract’s AnalyzeDocument API.

This update is available in the US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney), Canada (Central), Europe (Frankfurt, Ireland, London, Paris), and AWS GovCloud (US-East, US-West) Regions as of October 31.

To get started, log on to the Amazon Textract console to try out the new feature. To learn more about Textract capabilities, please visit the Amazon Textract website, developer guide, or resources page.