Amazon SageMaker Ground Truth helps you build training datasets for machine learning. Ground Truth labels your content (images, audio, text, etc.) by guiding a human labeler step by step through a process called a workflow. Three groups of humans can provide labels using these workflows: Amazon Mechanical Turk workers, your employees, or third-party vendors. Ground Truth can also learn from these labels and label objects automatically.
You pay for each labeled object (which can be an image, an audio recording, a section of text, etc.) whether it’s labeled automatically by Ground Truth or by a human labeler. If you use a vendor or Mechanical Turk to provide labels, you pay an additional cost per labeled object. If you use your employees for labeling, there is no additional cost per labeled object.
You are charged for the number of dataset objects that are labeled. A dataset object is defined as an atomic unit of data and can include images, video frames, text documents, audio files, etc.
Built-in workflow pricing for labeling with Amazon Mechanical Turk
As part of the AWS Free Tier, you can get started with Amazon SageMaker Ground Truth for free. For the first two months after the first use of Amazon SageMaker, your first 500 objects labeled per month are free (excluding any additional costs incurred by using a labeling vendor or Amazon Mechanical Turk).
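As a rough illustration of the free tier rule above, the following sketch computes how many objects would be billable in a given month. The function name, its parameters, and the convention of counting whole months elapsed since first use (starting at 0) are assumptions made for this example, not part of any AWS API, and vendor or Mechanical Turk costs are not covered.

```python
def billable_objects(labeled_this_month, months_since_first_use,
                     free_per_month=500, free_months=2):
    """Objects charged in one month after applying the free tier.

    months_since_first_use is 0 for the first month of use (assumed convention).
    """
    if months_since_first_use < free_months:
        # Inside the free period: the first 500 objects of the month are free.
        return max(0, labeled_this_month - free_per_month)
    # After the free period every labeled object is billable.
    return labeled_this_month


print(billable_objects(1200, 0))  # first month of use
print(billable_objects(300, 1))   # second month, below the free allowance
print(billable_objects(1200, 2))  # third month, free tier no longer applies
```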
Using internal employees for human-labeling
A manufacturing company uses machine learning to classify images of their products. To train their model, they label 40,000 images with product names. Using the built-in workflow for image classification, their employees label all 40,000 images.
Because the company used internal employees, there is no additional cost per labeled object. The price for the 40,000 human-labeled images is $0.08 per image.
Total Cost = 40,000 human-labeled images x $0.08 per image = $3,200
Using Mechanical Turk for human-labeling with a custom workflow
An advertising company uses machine learning to determine both the sentiment and content of social media posts. To train their model, they decide that they need to label 85,000 posts. They build and upload a custom workflow and set a payment of $0.036 per post. They also decide to have each post labeled 3 times to improve the accuracy of the labels. Using SageMaker Ground Truth, humans label all 85,000 posts.
Because the company used Mechanical Turk, the cost includes an additional charge of $0.036 for each human-labeled post to pay the labeler.
Total Cost = (50,000 posts x $0.08 per post) + (35,000 posts x $0.04 per post) + (85,000 human-labeled posts x $0.036 per post x 3 labelers per object) = $14,580
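The arithmetic in this example can be sketched in Python. The tier boundaries used here (the first 50,000 objects at $0.08 each, any further objects at $0.04 each) are inferred from the formula above rather than taken from a published price table, and the names `ASSUMED_TIERS`, `object_charge`, and `worker_charge` are hypothetical.

```python
from decimal import Decimal

# Tiers inferred from the worked example (an assumption, not an official price list).
# Each entry is (objects in tier, price per object); None means "all remaining objects".
ASSUMED_TIERS = [(50_000, Decimal("0.08")), (None, Decimal("0.04"))]


def object_charge(num_objects, tiers=ASSUMED_TIERS):
    """Per-object Ground Truth charge, applied tier by tier."""
    remaining = num_objects
    total = Decimal("0")
    for size, price in tiers:
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * price
        remaining -= in_tier
        if remaining == 0:
            break
    return total


def worker_charge(num_human_labeled, price_per_label, labelers_per_object):
    """Additional Mechanical Turk payment: each labeler is paid for each object."""
    return num_human_labeled * Decimal(price_per_label) * labelers_per_object


posts = 85_000
total = object_charge(posts) + worker_charge(posts, "0.036", 3)
print(total)
```

`Decimal` is used instead of floating point so that fractional prices such as $0.036 add up exactly.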
Using Mechanical Turk for human-labeling with a built-in workflow
A publishing company uses machine learning to build a natural language processing application to classify newspaper articles. To train their model, they label 200,000 articles. They select the built-in text classification workflow and decide to have each article labeled 3 times to improve the accuracy of the labels. Using SageMaker Ground Truth, humans label 40,000 articles and 160,000 are labeled automatically.
Because the company used Mechanical Turk, the text classification workflow included an additional charge of $0.012 for each human-labeled article to pay the labeler.
Total Cost = (50,000 articles x $0.08 per article) + (150,000 articles x $0.04 per article) + (40,000 human-labeled articles x $0.012 per article x 3 labelers per object) + Amazon SageMaker training & inference costs** = $11,440 + Amazon SageMaker training & inference costs**
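A short Python sketch of this breakdown makes one point explicit: the per-object charge covers all 200,000 articles, human-labeled and automatically labeled alike, while the Mechanical Turk payment applies only to the 40,000 human-labeled ones. The function name, its parameters, and the two-tier defaults are assumptions read off the formula above. Training and inference costs are left out because the example does not quantify them.

```python
from decimal import Decimal


def ground_truth_estimate(human_labeled, auto_labeled, worker_price,
                          labelers_per_object,
                          first_tier_size=50_000,
                          first_tier_price=Decimal("0.08"),
                          next_tier_price=Decimal("0.04")):
    """Cost breakdown, excluding SageMaker training and inference costs."""
    total_objects = human_labeled + auto_labeled  # every labeled object is charged
    in_first = min(total_objects, first_tier_size)
    in_next = total_objects - in_first
    return {
        "first_tier": in_first * first_tier_price,
        "next_tier": in_next * next_tier_price,
        # Only human-labeled objects incur the per-labeler worker payment.
        "worker_payments": human_labeled * Decimal(worker_price) * labelers_per_object,
    }


breakdown = ground_truth_estimate(40_000, 160_000, "0.012", 3)
subtotal = sum(breakdown.values())
for item, amount in breakdown.items():
    print(item, amount)
print("subtotal", subtotal)
```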
**These costs depend on a variety of factors, including the type of dataset being used, the type of labeling task, and the resolution of the images in your dataset.