AWS Machine Learning Blog

Category: Amazon Textract

Enable Amazon Kendra search for a scanned or image-based text document

Amazon Kendra is an intelligent search service powered by machine learning (ML). Amazon Kendra reimagines search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization. Amazon Kendra supports a variety of document formats, […]

Build a traceable, custom, multi-format document parsing pipeline with Amazon Textract

Organizational forms serve as a primary business tool across industries—from financial services, to healthcare, and more. Consider, for example, tax filing forms in the tax management industry, where new forms come out each year with largely the same information. AWS customers across sectors need to process and store information in forms as part of their […]

Automate digitization of transactional documents with human oversight using Amazon Textract and Amazon A2I

In this post, we present a solution for digitizing transactional documents using Amazon Textract and incorporate a human review using Amazon Augmented AI (A2I). You can find the solution source at our GitHub repository. Organizations must frequently process scanned transactional documents with structured text so they can perform operations such as fraud detection or financial […]

Enable the visually impaired to hear documents using Amazon Textract and Amazon Polly

At the 2021 AWS re:Invent conference in Las Vegas, we demoed Read For Me at the AWS Builders Fair—a website that helps the visually impaired hear documents. For better quality, view the video here. Adaptive technology and accessibility features are often expensive, if they’re available at all. Audio books help the visually impaired read. Audio […]

Extract entities from insurance documents using Amazon Comprehend named entity recognition

Intelligent document processing (IDP) is a common use case for customers on AWS. You can utilize Amazon Comprehend and Amazon Textract for a variety of use cases ranging from document extraction, data classification, and entity extraction. One specific industry that uses IDP is insurance. They use IDP to automate data extraction for common use cases such as claims intake, […]

Announcing support for extracting data from identity documents using Amazon Textract

Creating efficiencies in your business is at the top of your list. You want your employees to be more productive, have them focus on high impact tasks, or find ways to implement better processes to improve the outcomes to your customers. There are various ways to solve this problem, and more companies are turning to […]

Postprocessing with Amazon Textract: Multi-page table handling

Amazon Textract is a machine learning (ML) service that automatically extracts printed text, handwriting, and other data from scanned documents that goes beyond simple optical character recognition (OCR) to identify and extract data from forms and tables. Currently, thousands of customers are using Amazon Textract to process different types of documents. Many include tables across […]

Augment search with metadata by chaining Amazon Textract, Amazon Comprehend, and Amazon Kendra

Amazon Kendra is an intelligent search service powered by machine learning (ML). Amazon Kendra reimagines enterprise search for your websites and applications so your employees and customers can easily find the content they’re looking for, even when it’s scattered across multiple locations and content repositories within your organization. With Amazon Kendra, you can stop searching […]

Intelligently split multi-form document packages with Amazon Textract and Amazon Comprehend

Many organizations spanning different sizes and industry verticals still rely on large volumes of documents to run their day-to-day operations. To solve this business challenge, customers are using intelligent document processing services from AWS such as Amazon Textract and Amazon Comprehend to help with extraction and process automation. Before you can extract text, key-value pairs, […]

Bring structure to diverse documents with Amazon Textract and transformer-based models on Amazon SageMaker

From application forms, to identity documents, recent utility bills, and bank statements, many business processes today still rely on exchanging and analyzing human-readable documents—particularly in industries like financial services and law. In this post, we show how you can use Amazon SageMaker, an end-to-end platform for machine learning (ML), to automate especially challenging document analysis […]