Posted On: May 29, 2019
We are excited to announce the general availability of Amazon Textract, which has been in preview since re:Invent 2018. Amazon Textract is a managed machine learning service that automatically extracts text and structured data from virtually any document. Customers use Amazon Textract to quickly automate document workflows, processing millions of document pages in a few hours.
Amazon Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms, information stored in tables, and the context in which the information is presented. Amazon Textract’s API accepts multiple kinds of input, such as scans, PDFs, and photos, and customers can use it with other AWS machine learning services like Amazon Comprehend, Amazon Comprehend Medical, and Amazon Translate to derive deeper meaning from the extracted text and data. The extracted text and data can also be used to build smart searches on large archives of documents, or they can be loaded into a database for use by applications such as accounting, auditing, and compliance software. To learn more about Amazon Textract, please visit the Amazon Textract website.
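
As an illustration of the form extraction described above, here is a minimal Python sketch that assumes the AWS SDK for Python (boto3) and its `analyze_document` operation. The helper names (`analyze_image`, `form_fields`) and the default region are assumptions made for this example and are not part of the service. The sketch requests form and table analysis for a single PNG or JPEG image, then pairs each detected form key with its value by following the relationships between the returned blocks.

```python
from typing import Dict, List


def analyze_image(image_bytes: bytes, region: str = "us-east-1") -> dict:
    """Send one PNG or JPEG image to Textract (requires AWS credentials).

    The function name and default region are illustrative assumptions.
    """
    import boto3  # imported here so the parsing helpers below run without boto3

    client = boto3.client("textract", region_name=region)
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["FORMS", "TABLES"],
    )


def _child_text(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the text of a block's WORD children, in the order listed."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(response: dict) -> Dict[str, str]:
    """Map each form key found in a response to the text of its value."""
    by_id = {block["Id"]: block for block in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # A KEY block points at its VALUE block(s) by Id.
                value_text = " ".join(
                    _child_text(by_id[value_id], by_id) for value_id in rel["Ids"]
                )
        fields[_child_text(block, by_id)] = value_text
    return fields
```

With valid credentials, calling `form_fields(analyze_image(open("form.png", "rb").read()))` would return a dictionary of field labels and values, where `form.png` is a placeholder file name. The resulting dictionary could then be passed to a downstream service or written to a database, as described above.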