Amazon Textract launches Layout feature to extract paragraphs, titles, and more from documents

Posted On: Sep 28, 2023

Amazon Textract is a machine learning service that automatically extracts printed text, handwriting, and data from any document or image. Today, we are pleased to announce Layout, a new Amazon Textract feature that enables customers to extract layout elements such as paragraphs, titles, lists, headers, footers, and more from documents. Layout will be a new feature type in the Analyze Document API. Customers can use Layout as a stand-alone feature or in combination with other Analyze Document feature types.
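As a rough illustration of calling Layout through the Analyze Document API, the sketch below uses the AWS SDK for Python (boto3). The helper names `analyze_layout` and `layout_elements` are hypothetical, as is the default Region. The parsing assumes the usual Analyze Document response shape: a `Blocks` list in which layout blocks have a `BlockType` starting with `LAYOUT_` and point to their `LINE` children through `CHILD` relationships.

```python
from typing import Any, Dict, List, Tuple


def analyze_layout(image_bytes: bytes, region_name: str = "us-east-1") -> Dict[str, Any]:
    """Call AnalyzeDocument with the LAYOUT feature type (needs AWS credentials)."""
    import boto3  # imported lazily so the parsing helper below works without boto3

    client = boto3.client("textract", region_name=region_name)
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["LAYOUT"],  # may be combined with others, e.g. ["LAYOUT", "TABLES"]
    )


def layout_elements(response: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (block type, text) pairs for every LAYOUT_* block, in response order.

    Assumption: the text of a layout block is the concatenation of its LINE children.
    Blocks whose children are not LINE blocks (for example a list that contains
    other layout blocks) come back with empty text here.
    """
    blocks = response.get("Blocks", [])
    by_id = {block["Id"]: block for block in blocks}
    elements = []
    for block in blocks:
        if not block["BlockType"].startswith("LAYOUT_"):
            continue
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]
        lines = [
            by_id[child_id].get("Text", "")
            for child_id in child_ids
            if child_id in by_id and by_id[child_id]["BlockType"] == "LINE"
        ]
        elements.append((block["BlockType"], " ".join(lines)))
    return elements
```

A typical use would be `layout_elements(analyze_layout(open("page.png", "rb").read()))`, where `page.png` is a placeholder file name. Only `analyze_layout` contacts the service; `layout_elements` works on any saved response.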

Layout is pre-trained on a wide variety of documents from the financial services, legal, insurance, medical, media, and other industries. With Layout, customers will be able to extract layout elements directly from documents, reducing their reliance on developing and maintaining complex post-processing code. In turn, we expect Layout to improve efficiencies for document processing operations such as creating search indices, embeddings for Retrieval Augmented Generation (RAG) applications, and more.
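To make the search-index and RAG use case concrete, here is a minimal sketch of one possible way to turn layout elements into heading-scoped text chunks before embedding. The function name `chunk_by_heading`, the input format of `(block type, text)` pairs, and the choice to drop headers, footers, and page numbers are all assumptions made for illustration, not part of the Textract API.

```python
from typing import Dict, List, Tuple

# Assumed policy: running headers, footers, and page numbers add noise to
# retrieval, so they are skipped. Titles and section headers start a new chunk.
SKIPPED_TYPES = {"LAYOUT_HEADER", "LAYOUT_FOOTER", "LAYOUT_PAGE_NUMBER"}
HEADING_TYPES = {"LAYOUT_TITLE", "LAYOUT_SECTION_HEADER"}


def chunk_by_heading(elements: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Group (block type, text) pairs into chunks, one per heading."""
    chunks: List[Dict[str, str]] = []
    heading = ""
    body: List[str] = []

    def flush() -> None:
        # Emit the chunk collected so far, unless it is completely empty.
        if heading or body:
            chunks.append({"heading": heading, "text": " ".join(body)})

    for block_type, text in elements:
        if block_type in SKIPPED_TYPES or not text:
            continue
        if block_type in HEADING_TYPES:
            flush()
            heading, body = text, []
        else:
            body.append(text)
    flush()
    return chunks
```

Each resulting chunk could then be passed to an embedding model or indexed as a search document, with the heading kept as metadata.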

This feature will be available in US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney), Canada (Central), Europe (Frankfurt, Ireland, London, Paris), and AWS GovCloud (US-East, US-West) starting September 29th.

To get started, log on to the Amazon Textract console to try out the new feature. To learn more about Textract capabilities, please visit the Amazon Textract website, developer guide, or resources page.
