Amazon Comprehend launches new trust and safety features

Posted on: Nov 9, 2023

Amazon Comprehend announced new features to help organizations enhance trust and safety for text-based content. Amazon Comprehend is a natural language processing (NLP) service that provides pre-trained and custom APIs to derive insights from text. With the new Toxicity Detection and Prompt Safety Classification features, customers can now apply guardrails to moderate user-generated and machine-generated content.

Today, organizations need to manage content generated by both generative AI applications and online users via chats, comments, and forum discussions. Users with malicious intent can either generate content or prompt generative AI models to create content that contains toxic language and sensitive data. To help ensure the safety of users, organizations need to quickly and intelligently moderate this content. Starting today, organizations can use Comprehend’s Toxicity Detection and Prompt Safety Classification features to moderate text-based content in a responsive, scalable, and cost-efficient manner.

Toxicity Detection is an NLP-powered capability that identifies toxic content by classifying text across seven categories: sexual harassment, hate speech, threat, abuse, profanity, insult, and graphic. The Prompt Safety Classifier flags unsafe prompts to prevent inappropriate use of generative AI applications. Both APIs return confidence scores, which can be used to automatically redact inappropriate content that exceeds a set threshold, or to augment human moderation workflows.
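
As an illustration of the threshold-based approach, the following is a minimal Python sketch that uses the boto3 `detect_toxic_content` call and replaces any text segment whose overall toxicity score meets a cutoff. The 0.5 threshold, the `[removed]` placeholder, the default Region, and the helper names `flag_segments`, `redact`, and `moderate` are assumptions chosen for this example and are not part of the announcement. It also assumes AWS credentials are already configured.

```python
from typing import Any, Dict, List

# Assumed cutoff for this sketch; tune it for your own application.
TOXICITY_THRESHOLD = 0.5


def flag_segments(response: Dict[str, Any],
                  threshold: float = TOXICITY_THRESHOLD) -> List[bool]:
    """Return one flag per segment: True if its overall Toxicity score meets the threshold."""
    return [result["Toxicity"] >= threshold for result in response["ResultList"]]


def redact(segments: List[str], flags: List[bool],
           placeholder: str = "[removed]") -> List[str]:
    """Replace each flagged segment with a placeholder (hypothetical policy)."""
    return [placeholder if flagged else text
            for text, flagged in zip(segments, flags)]


def moderate(segments: List[str], region: str = "us-east-1") -> List[str]:
    """Call Amazon Comprehend and redact segments that meet the threshold."""
    import boto3  # imported here so the helpers above work without the AWS SDK

    client = boto3.client("comprehend", region_name=region)
    response = client.detect_toxic_content(
        TextSegments=[{"Text": text} for text in segments],
        LanguageCode="en",  # both APIs support English
    )
    return redact(segments, flag_segments(response))
```

Calling `moderate(["first comment", "second comment"])` returns the same list with flagged segments replaced. For a human moderation workflow, the per-category scores in each result's `Labels` list could be routed to a review queue instead of being redacted automatically.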

Both APIs support the English language and are available in the US East (N. Virginia), US West (Oregon), Europe (Ireland), and Asia Pacific (Sydney) Regions. Customers can access these features using the AWS CLI and SDKs. To get started, visit the Amazon Comprehend Trust and Safety documentation.
