Posted On: Nov 1, 2022

Amazon Textract is a machine learning service that automatically extracts printed text, handwriting, and data from any document or image. AnalyzeExpense is a specialized API within Textract that understands the context of invoices and receipts and automatically extracts relevant data such as vendor name and invoice number. Today, we are pleased to announce major enhancements to AnalyzeExpense that include support for new fields and higher accuracy for existing fields.

The latest AnalyzeExpense API supports more than 40 normalized fields. The newly supported normalized fields include both summary fields, such as Vendor Address, and line item fields, such as Product Code. With this capability, customers can extract the information they need directly and avoid writing and maintaining complex post-processing code. In addition to supporting new fields, we have improved accuracy for fields that the previous version already supported, such as Vendor Name and Total.
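As a rough illustration of how normalized fields can replace custom post-processing, the sketch below walks an AnalyzeExpense-style response and groups values by normalized field type. The helper name `summarize_expense` and the hand-written `SAMPLE_RESPONSE` (including its vendor and amounts) are assumptions made for illustration; the key names follow the documented response shape (`ExpenseDocuments`, `SummaryFields`, `LineItemGroups`), but a real response carries many more attributes, such as confidence scores and geometry.

```python
from collections import defaultdict

# Hand-written sample in the shape of an AnalyzeExpense response.
# All values are made up for illustration only.
SAMPLE_RESPONSE = {
    "ExpenseDocuments": [
        {
            "SummaryFields": [
                {"Type": {"Text": "VENDOR_NAME"}, "ValueDetection": {"Text": "Example Corp"}},
                {"Type": {"Text": "VENDOR_ADDRESS"}, "ValueDetection": {"Text": "1 Main St"}},
                {"Type": {"Text": "TOTAL"}, "ValueDetection": {"Text": "$42.00"}},
            ],
            "LineItemGroups": [
                {
                    "LineItems": [
                        {
                            "LineItemExpenseFields": [
                                {"Type": {"Text": "PRODUCT_CODE"}, "ValueDetection": {"Text": "SKU-1"}},
                                {"Type": {"Text": "PRICE"}, "ValueDetection": {"Text": "$42.00"}},
                            ]
                        }
                    ]
                }
            ],
        }
    ]
}


def _type_and_value(field):
    """Return (normalized type, detected text) for one expense field."""
    field_type = field.get("Type", {}).get("Text", "OTHER")
    value = field.get("ValueDetection", {}).get("Text", "")
    return field_type, value


def summarize_expense(response):
    """Group summary fields and line items by normalized type, per document."""
    results = []
    for doc in response.get("ExpenseDocuments", []):
        # A type can occur more than once, so collect values in lists.
        summary = defaultdict(list)
        for field in doc.get("SummaryFields", []):
            field_type, value = _type_and_value(field)
            summary[field_type].append(value)

        # Each line item becomes one dict keyed by normalized type.
        line_items = []
        for group in doc.get("LineItemGroups", []):
            for item in group.get("LineItems", []):
                row = dict(_type_and_value(f) for f in item.get("LineItemExpenseFields", []))
                line_items.append(row)

        results.append({"summary": dict(summary), "line_items": line_items})
    return results


parsed = summarize_expense(SAMPLE_RESPONSE)
print(parsed[0]["summary"].get("VENDOR_NAME"))
print(parsed[0]["line_items"])
```

With the AWS SDK for Python, a live response could be obtained with `boto3.client("textract").analyze_expense(Document={"Bytes": image_bytes})` and passed to the same helper; credentials, region, and the input document are left to the reader.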

Along with normalized key-value pairs and regular key-value pairs, AnalyzeExpense now returns the entire OCR output in the API response. Customers can obtain both the key-value pairs and the raw OCR text through a single API request.
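The following is a minimal sketch of reading the raw OCR text out of the same response, assuming the OCR output appears as a `Blocks` list on each expense document and that text lines carry `BlockType` set to `LINE`. The helper name `ocr_lines` and the sample data are hypothetical.

```python
# Hand-written sample: one expense document with OCR blocks.
# All values are made up for illustration only.
SAMPLE_OCR_RESPONSE = {
    "ExpenseDocuments": [
        {
            "Blocks": [
                {"BlockType": "PAGE"},
                {"BlockType": "LINE", "Text": "Example Corp"},
                {"BlockType": "WORD", "Text": "Example"},
                {"BlockType": "WORD", "Text": "Corp"},
                {"BlockType": "LINE", "Text": "Total $42.00"},
            ]
        }
    ]
}


def ocr_lines(response):
    """Return the text of every LINE block, in response order."""
    lines = []
    for doc in response.get("ExpenseDocuments", []):
        for block in doc.get("Blocks", []):
            # WORD blocks repeat the same text at a finer granularity,
            # so only LINE blocks are kept here.
            if block.get("BlockType") == "LINE":
                lines.append(block.get("Text", ""))
    return lines


print("\n".join(ocr_lines(SAMPLE_OCR_RESPONSE)))
```

Because the OCR blocks and the key-value pairs come back together, this helper can run on the same response object used for field extraction, with no second request.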

This update is available in US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney), Canada (Central), Europe (Frankfurt, Ireland, London, Paris), and AWS GovCloud (US-East, US-West) as of October 31st.

To get started, log on to the Amazon Textract console to try out the new feature. To learn more about Textract capabilities, please visit the Amazon Textract website, developer guide, or resources page.