AWS Machine Learning Blog

Category: Amazon Textract

Deploying and using the Document Understanding Solution

Based on our day to day experience, the information we consume is entirely digital. We read the news on our mobile devices far more than we do from printed copy newspapers. Tickets for sporting events, music concerts, and airline travel are stored in apps on our phones. One could go weeks or longer without needing […]

This month in AWS Machine Learning: October edition

Every day there is something new going on in the world of AWS Machine Learning—from launches to new to use cases to interactive trainings. We’re packaging some of the not-to-miss information from the ML Blog and beyond for easy perusing each month. Check back at the end of each month for the latest roundup. Launches […]

zomato digitizes menus using Amazon Textract and Amazon SageMaker

This post is co-written by Chiranjeev Ghai, ML Engineer at zomato. zomato is a global food-tech company based in India. Are you the kind of person who has very specific cravings? Maybe when the mood hits, you don’t want just any kind of Indian food—you want Chicken Chettinad with a side of paratha, and nothing […]

Building an end-to-end intelligent document processing solution using AWS

July 2023: This post was reviewed and updated for accuracy. The AWS CloudFormation template was updated. As organizations grow larger in size, so does the need for having better document processing. In industries such as healthcare, legal, insurance, and banking, the continuous influx of paper-based or PDF documents (like invoices, health charts, and insurance claims) […]

Improved OCR and structured data extraction with Amazon Textract

Optical character recognition (OCR) technology, which enables extracting text from an image, has been around since the mid-20th century, and continues to be a research topic today. OCR and document understanding are still vibrant areas of research because they’re both valuable and hard problems to solve. AWS has been investing in improving OCR and document […]

How Kabbage improved the PPP lending experience with Amazon Textract

This is a guest post by Anthony Sabelli, Head of Data Science at Kabbage, a data and technology company providing small business cash flow solutions. Kabbage is a data and technology company providing small business cash flow solutions. One way in which we serve our customers is by providing them access to flexible lines of […]

Translating PDF documents using Amazon Translate and Amazon Textract

September 2024: This post was reviewed and updated for accuracy. In 1993, the Portable Document Format or the PDF was born and released to the world. Since then, companies across various industries have been creating, scanning, and storing large volumes of documents in this digital format. These documents and the content within them are vital […]

Using Amazon Textract with AWS PrivateLink

Amazon Textract now supports Amazon Virtual Private Cloud (Amazon VPC) endpoints via AWS PrivateLink so you can securely initiate API calls to Amazon Textract from within your VPC and avoid using the public internet. In this post, we show you how to access Amazon Textract APIs from within your VPC without traversing the public internet, […]

Amazon Textract now available in Asia Pacific (Mumbai) and EU (Frankfurt) Regions 

You can now use Amazon Textract, a machine learning (ML) service that quickly and easily extracts text and data from forms and tables in scanned documents, for workloads in the AWS Asia Pacific (Mumbai) and EU (Frankfurt) Regions. Amazon Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms, […]

Processing PDF documents with a human loop using Amazon Textract and Amazon Augmented AI

Businesses across many industries, including financial, medical, legal, and real estate, process a large number of documents for different business operations. Healthcare and life science organizations, for example, need to access data within medical records and forms to fulfill medical claims and streamline administrative processes. Amazon Textract is a machine learning (ML) service that makes […]