Amazon Comprehend now supports Amazon SageMaker Ground Truth training datasets for custom model training

Posted on: Sep 22, 2020

You can now train Custom Named Entity Recognition and Custom Classification models in Amazon Comprehend using training datasets from Amazon SageMaker Ground Truth. You can use Comprehend’s Custom Named Entity Recognition to identify terms that are specific to your industry or organization. For example, you can extract product names, financial entities, or any other term relevant to you from text data. Similarly, you can use Comprehend’s Custom Classification to assign text data to categories relevant to your use case.

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. It provides pre-trained models for recognizing entities, key phrases, sentiment, and other common elements in a document. You can also build custom models with Amazon Comprehend to recognize custom entities and classify documents.

Amazon SageMaker Ground Truth helps you quickly build highly accurate training datasets for custom Comprehend models. Using SageMaker Ground Truth, you can send labeling jobs to your own labelers, or you can access a workforce of over 500,000 independent contractors who are already performing machine learning-related tasks through Amazon Mechanical Turk. If your data requires confidentiality or special skills, you can use vendors pre-screened by AWS for quality and security procedures, including iVision, CapeStart Inc., Cogito, and iMerit. Using AutoML, Comprehend learns from the training dataset and then trains a private, custom model. No machine learning experience is required.

Amazon Comprehend’s support for SageMaker Ground Truth is available in all AWS Regions where Amazon Comprehend is available. To try the new feature, log in to the Amazon Comprehend console for a code-free experience, or download the AWS SDK.
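
For readers taking the SDK route, the following is a minimal sketch of how a custom entity recognizer might be trained from a Ground Truth output manifest using the AWS SDK for Python (boto3). It is an illustration under stated assumptions, not an official sample: the recognizer name, IAM role ARN, S3 URI, labeling-job attribute name, entity types, and Region are all hypothetical placeholders, and the helper function names are made up for this example. The request fields follow the Comprehend CreateEntityRecognizer API, where a Ground Truth dataset is passed as an augmented manifest.

```python
def build_recognizer_request(name, role_arn, manifest_s3_uri,
                             attribute_names, entity_types,
                             language_code="en"):
    """Build the parameters for Comprehend's CreateEntityRecognizer call.

    All argument values are supplied by the caller; nothing here is
    specific to a real account or labeling job.
    """
    return {
        "RecognizerName": name,
        "DataAccessRoleArn": role_arn,
        "LanguageCode": language_code,
        "InputDataConfig": {
            # Tells Comprehend the data is a Ground Truth augmented manifest
            # rather than the default CSV format.
            "DataFormat": "AUGMENTED_MANIFEST",
            "EntityTypes": [{"Type": t} for t in entity_types],
            "AugmentedManifests": [
                {
                    "S3Uri": manifest_s3_uri,
                    # Name(s) of the label attribute written by the
                    # labeling job into each manifest line.
                    "AttributeNames": list(attribute_names),
                }
            ],
        },
    }


def start_training(request, region_name="us-east-1"):
    """Submit the training request; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("comprehend", region_name=region_name)
    response = client.create_entity_recognizer(**request)
    return response["EntityRecognizerArn"]


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only.
    request = build_recognizer_request(
        name="example-recognizer",
        role_arn="arn:aws:iam::123456789012:role/ExampleComprehendRole",
        manifest_s3_uri="s3://example-bucket/output/manifests/output.manifest",
        attribute_names=["example-labeling-job"],
        entity_types=["PRODUCT", "FINANCIAL_ENTITY"],
    )
    print(request)
    # start_training(request)  # uncomment to submit with real values
```

Separating the request builder from the API call keeps the parameters easy to inspect before any training job is submitted. Custom Classification follows a similar pattern through the CreateDocumentClassifier API.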