AWS Machine Learning Blog

Category: Amazon Comprehend

Building an end-to-end intelligent document processing solution using AWS

As organizations grow, so does their need for better document processing. In industries such as healthcare, legal, insurance, and banking, the continuous influx of paper-based or PDF documents (like invoices, health charts, and insurance claims) has pushed businesses to consider evolving their document processing capabilities. In such scenarios, businesses and organizations […]


Active learning workflow for Amazon Comprehend custom classification models – Part 1

The Amazon Comprehend Custom Classification API enables you to easily build custom text classification models using your business-specific labels without learning ML. For example, your customer support organization can use Custom Classification to automatically categorize inbound requests by problem type based on how the customer has described the issue. You can use custom classifiers to automatically label […]


Detecting and redacting PII using Amazon Comprehend

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning (ML) to find insights and relationships like people, places, sentiments, and topics in unstructured text. You can now use Amazon Comprehend ML capabilities to detect and redact personally identifiable information (PII) in customer emails, support tickets, product reviews, social media, and more. […]


Securing Amazon Comprehend API calls with AWS PrivateLink

Amazon Comprehend now supports Amazon Virtual Private Cloud (Amazon VPC) endpoints via AWS PrivateLink so you can securely initiate API calls to Amazon Comprehend from within your VPC and avoid using the public internet. Amazon Comprehend is a fully managed natural language processing (NLP) service that uses machine learning (ML) to find meaning and insights […]


Setting up human review of your NLP-based entity recognition models with Amazon SageMaker Ground Truth, Amazon Comprehend, and Amazon A2I

Update Aug 12, 2020 – New features: Amazon Comprehend adds five new languages (Spanish, French, German, Italian, and Portuguese); read here. Amazon Comprehend increased the limit on the number of entities per custom entity model from 12 to 25; read here. Organizations across industries have a lot of unstructured data that you can evaluate to get entity-based […]


Extracting custom entities from documents with Amazon Textract and Amazon Comprehend

Amazon Textract is a machine learning (ML) service that makes it easy to extract text and data from scanned documents. Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms and information stored in tables. This allows you to use Amazon Textract to instantly “read” virtually any type of […]


Query drug adverse effects and recalls based on natural language using Amazon Comprehend Medical

In this post, we demonstrate how to use Amazon Comprehend Medical to extract medication names and medical conditions to monitor drug safety and adverse events. Amazon Comprehend Medical is a natural language processing (NLP) service that uses machine learning (ML) to easily extract relevant medical information from unstructured text. We query the OpenFDA API (an open-source API published by […]


Announcing the launch of Amazon Comprehend custom entity recognition real-time endpoints

Update Sep 28, 2020 – New features: Amazon Comprehend custom entity recognition real-time endpoints now support application auto scaling. Please refer to the section Auto Scaling with real-time endpoints in this post to learn more. Update Aug 12, 2020 – New features: Amazon Comprehend adds five new languages (Spanish, French, German, Italian, and Portuguese); read here. Amazon […]


Deriving conversational insights from invoices with Amazon Textract, Amazon Comprehend, and Amazon Lex

Organizations across industries have a large number of physical documents such as invoices that they need to process. It is difficult to extract information from a scanned document when it contains tables, forms, paragraphs, and check boxes. Organizations have been addressing these problems with manual effort or custom code, or by using Optical Character Recognition […]


Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend

Update October 2020: Amazon Comprehend now supports Amazon SageMaker Ground Truth to help label your datasets for Comprehend’s Custom Model training. For Custom EntityRecognizer, check out the Annotations documentation for more details. For Custom MultiClass and MultiLabel Classifier, check out the MultiClass and MultiLabel documentation, respectively, for more details. Named entity recognition (NER) involves sifting through text data to locate noun phrases […]
