Amazon Comprehend lowers annotation limits for training custom entity recognition models

Posted on: Aug 3, 2022

Amazon Comprehend is making it easier for customers to get started with custom entity recognition by reducing the annotation requirements for training their models. Amazon Comprehend is a natural language processing (NLP) service that provides APIs to extract key phrases, contextual entities, events, and sentiment from text. Entities are the things mentioned in your documents, such as people, places, organizations, and credit card numbers. Custom entity recognition (CER) in Amazon Comprehend lets you train models on entities unique to your business in a few steps. You can identify almost any kind of entity by providing enough annotated examples to train your model effectively.

Until today, training an Amazon Comprehend custom entity recognizer required a minimum of 250 documents and 100 annotations per entity. Starting today, we are reducing the minimum requirement for training an Amazon Comprehend custom entity recognition model to 25 annotations per entity type. With improved modeling behind the scenes, you can now start running experiments with as few as three annotated documents, analyze preliminary results, and iterate by adding more annotations and documents. The reduced limits apply only to custom entity recognition models for plain-text documents.
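
As a rough illustration of the new minimums, the sketch below checks an annotations file locally before you submit a training job. It is not part of Amazon Comprehend: the function name `check_annotation_minimums` is hypothetical, the CSV columns (`File`, `Line`, `Begin Offset`, `End Offset`, `Type`) are assumed to follow the annotations format for plain-text documents, and each distinct file and line pair is assumed to be one document (one document per line).

```python
import csv
import io
from collections import Counter

# Thresholds taken from the announcement above.
MIN_ANNOTATIONS_PER_ENTITY = 25
MIN_ANNOTATED_DOCUMENTS = 3


def check_annotation_minimums(csv_text):
    """Return a list of problems; an empty list means the minimums are met.

    Hypothetical helper. Assumes the columns File, Line, Begin Offset,
    End Offset, Type, and one document per line of each input file.
    """
    per_type = Counter()
    documents = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        per_type[row["Type"]] += 1
        # Assumption: a (file, line) pair identifies one annotated document.
        documents.add((row["File"], row["Line"]))

    problems = []
    if len(documents) < MIN_ANNOTATED_DOCUMENTS:
        problems.append(
            f"only {len(documents)} annotated document(s), "
            f"need at least {MIN_ANNOTATED_DOCUMENTS}"
        )
    for entity_type, count in sorted(per_type.items()):
        if count < MIN_ANNOTATIONS_PER_ENTITY:
            problems.append(
                f"entity type {entity_type!r} has {count} annotation(s), "
                f"need at least {MIN_ANNOTATIONS_PER_ENTITY}"
            )
    return problems


# Usage: a deliberately tiny sample that falls short of both minimums.
sample = (
    "File,Line,Begin Offset,End Offset,Type\n"
    "docs.txt,0,0,5,WIDGET\n"
    "docs.txt,1,10,15,WIDGET\n"
)
for problem in check_annotation_minimums(sample):
    print(problem)
```

A check like this only confirms the counts. Whether a model trained at the minimum is accurate enough is something to judge from the preliminary results before adding more annotations and documents.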

To learn more and get started, visit the Amazon Comprehend launch blog post, product page, or our documentation.
