AWS Machine Learning Blog

Category: Amazon Comprehend

How Reveal’s Logikcull used Amazon Comprehend to detect and redact PII from legal documents at scale

Today, personally identifiable information (PII) is everywhere. PII is in emails, slack messages, videos, PDFs, and so on. It refers to any data or information that can be used to identify a specific individual. PII is sensitive in nature and includes various types of personal data, such as name, contact information, identification numbers, financial information, […]

Automatically redact PII for machine learning using Amazon SageMaker Data Wrangler

Customers increasingly want to use deep learning approaches such as large language models (LLMs) to automate the extraction of data and insights. For many industries, data that is useful for machine learning (ML) may contain personally identifiable information (PII). To ensure customer privacy and maintain regulatory compliance while training, fine-tuning, and using deep learning models, […]

Improve prediction quality in custom classification models with Amazon Comprehend

In this post, we explain how to build and optimize a custom classification model using Amazon Comprehend. We demonstrate this using an Amazon Comprehend custom classification to build a multi-label custom classification model, and provide guidelines on how to prepare the training dataset and tune the model to meet performance metrics such as accuracy, precision, recall, and F1 score.

Generative AI and multi-modal agents in AWS: The key to unlocking new value in financial markets

Multi-modal data is a valuable component of the financial industry, encompassing market, economic, customer, news and social media, and risk data. Financial organizations generate, collect, and use this data to gain insights into financial operations, make better decisions, and improve performance. However, there are challenges associated with multi-modal data due to the complexity and lack […]

Build a classification pipeline with Amazon Comprehend custom classification (Part I)

In first part of this multi-series blog post, you will learn how to create a scalable training pipeline and prepare training data for Comprehend Custom Classification models. We will introduce a custom classifier training pipeline that can be deployed in your AWS account with few clicks.

Visualize an Amazon Comprehend analysis with a word cloud in Amazon QuickSight

Searching for insights in a repository of free-form text documents can be like finding a needle in a haystack. A traditional approach might be to use word counting or other basic analysis to parse documents, but with the power of Amazon AI and machine learning (ML) tools, we can gather deeper understanding of the content. […]

Semantic image search for articles using Amazon Rekognition, Amazon SageMaker foundation models, and Amazon OpenSearch Service

Digital publishers are continuously looking for ways to streamline and automate their media workflows in order to generate and publish new content as rapidly as they can. Publishers can have repositories containing millions of images and in order to save money, they need to be able to reuse these images across articles. Finding the image that best matches an article in repositories of this scale can be a time-consuming, repetitive, manual task that can be automated. It also relies on the images in the repository being tagged correctly, which can also be automated (for a customer success story, refer to Aller Media Finds Success with KeyCore and AWS). In this post, we demonstrate how to use Amazon Rekognition, Amazon SageMaker JumpStart, and Amazon OpenSearch Service to solve this business problem.

Intelligent Document Processing Pipeline with Generative AI

Enhancing AWS intelligent document processing with generative AI

Data classification, extraction, and analysis can be challenging for organizations that deal with volumes of documents. Traditional document processing solutions are manual, expensive, error prone, and difficult to scale. AWS intelligent document processing (IDP), with AI services such as Amazon Textract, allows you to take advantage of industry-leading machine learning (ML) technology to quickly and […]

Safe image generation and diffusion models with Amazon AI content moderation services

Generative AI technology is improving rapidly, and it’s now possible to generate text and images based on text input. Stable Diffusion is a text-to-image model that empowers you to create photorealistic applications. You can easily generate images from text using Stable Diffusion models through Amazon SageMaker JumpStart. The following are examples of input texts and […]

Identify objections in customer conversations using Amazon Comprehend to enhance customer experience without ML expertise

According to a PWC report, 32% of retail customers churn after one negative experience, and 73% of customers say that customer experience influences their purchase decisions. In the global retail industry, pre- and post-sales support are both important aspects of customer care. Numerous methods, including email, live chat, bots, and phone calls, are used to […]