Amazon SageMaker Ground Truth

Build highly accurate training datasets using machine learning and reduce data labeling costs by up to 70%

Amazon SageMaker Ground Truth helps you build highly accurate training datasets for machine learning quickly. SageMaker Ground Truth offers easy access to public and private human labelers and provides them with built-in workflows and interfaces for common labeling tasks. Additionally, SageMaker Ground Truth can lower your labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently.

Successful machine learning models depend on large volumes of high-quality training data. But the process of creating that training data is often expensive, complicated, and time-consuming. Most models created today require a human to manually label data in a way that allows the model to learn how to make correct decisions. For example, building a computer vision system that is reliable enough to identify objects such as traffic lights, stop signs, and pedestrians requires thousands of hours of video recordings consisting of hundreds of millions of video frames. In each of these frames, all of the important elements, like the road, other cars, and signage, must be labeled by a human before any work can begin on the model you want to develop.

Amazon SageMaker Ground Truth significantly reduces the time and effort required to create training datasets, which in turn reduces costs. These savings are achieved by using machine learning to automatically label data. The labeling model gets progressively better over time by continuously learning from labels created by human labelers.

Where the labeling model has high confidence in its results based on what it has learned so far, it will automatically apply labels to the raw data. Where the labeling model has lower confidence in its results, it will pass the data to humans to do the labeling. The human-generated labels are provided back to the labeling model for it to learn from and improve. Over time, SageMaker Ground Truth can label more and more data automatically and substantially speed up the creation of training datasets. 
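The routing described above can be illustrated with a minimal sketch. This is not the service's implementation: the `Prediction` type, the `route_items` function, the toy model, and the 0.9 threshold are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Prediction:
    label: str
    confidence: float  # model's certainty in the label, between 0 and 1


def route_items(
    items: List[str],
    predict: Callable[[str], Prediction],
    threshold: float = 0.9,  # hypothetical cutoff, not a documented service value
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split items into auto-labeled pairs and items that need a human."""
    auto_labeled: List[Tuple[str, str]] = []
    needs_human: List[str] = []
    for item in items:
        pred = predict(item)
        if pred.confidence >= threshold:
            # High confidence: apply the machine-generated label directly.
            auto_labeled.append((item, pred.label))
        else:
            # Lower confidence: send the item to human labelers.
            needs_human.append(item)
    return auto_labeled, needs_human


# Toy stand-in for a labeling model (hypothetical scores).
_TOY_SCORES = {
    "frame_001": Prediction("stop sign", 0.97),
    "frame_002": Prediction("pedestrian", 0.55),
    "frame_003": Prediction("traffic light", 0.92),
}


def toy_predict(item: str) -> Prediction:
    return _TOY_SCORES[item]


auto_labeled, needs_human = route_items(list(_TOY_SCORES), toy_predict)
```

In a full loop, the labels that humans produce for the `needs_human` items would be added to the training data, the model retrained, and the routing repeated, so that fewer items fall below the threshold in later rounds.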

Benefits

Reduce data labeling costs by up to 70%

SageMaker Ground Truth uses a machine learning model to automatically label raw data to produce high-quality training datasets quickly at a fraction of the cost of manual labeling. Data is only routed to humans if the active learning model cannot confidently label it. The human-labeled data is then used to train the model to improve its capabilities. Less data is then sent to humans in the next round of labeling, lowering your costs. 

Work with public and private human labelers

You can use your own team of labelers and route labeling requests directly to them. If you need to scale up, the Amazon SageMaker Ground Truth console provides options for working with labelers outside your organization. You can access a public workforce of over 500,000 labelers through integration with Amazon Mechanical Turk. Alternatively, if your data requires confidentiality or special skills, you can use professional labeling companies pre-screened by Amazon.

Achieve accurate results quickly

Amazon SageMaker Ground Truth helps build high-quality and accurate training datasets quickly. Machine-generated labels provide consistent results with a confidence score for each label so that you can easily understand how certain the service is that the label is correct. Human-labeled results are automatically scored against criteria you provide to help ensure that more data is sent to high-quality labelers and low-quality labelers are deemphasized.
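One way to picture the scoring of human labelers is sketched below. It assumes, purely for illustration, that the criteria you provide take the form of items with known correct answers, and that each labeler's share of future work is proportional to their accuracy on those items. The function names and data are hypothetical and do not describe how the service actually scores labelers.

```python
from typing import Dict


def accuracy_against_gold(worker_labels: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of a worker's labels that match the known answers."""
    scored = [item for item in worker_labels if item in gold]
    if not scored:
        return 0.0
    correct = sum(worker_labels[item] == gold[item] for item in scored)
    return correct / len(scored)


def assignment_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn quality scores into shares of future work that sum to 1."""
    total = sum(scores.values())
    if total == 0:
        # No signal yet: split work evenly.
        return {worker: 1 / len(scores) for worker in scores}
    return {worker: score / total for worker, score in scores.items()}


# Hypothetical known answers and worker submissions.
gold = {"a": "cat", "b": "dog", "c": "cat", "d": "dog"}
submissions = {
    "alice": {"a": "cat", "b": "dog", "c": "cat", "d": "dog"},
    "bob": {"a": "cat", "b": "cat", "c": "dog", "d": "dog"},
}

scores = {worker: accuracy_against_gold(labels, gold) for worker, labels in submissions.items()}
weights = assignment_weights(scores)
```

Here the labeler who matches every known answer receives twice the share of new work as the labeler who matches half of them, which mirrors the idea of sending more data to high-quality labelers.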

How it works

[Diagram: How Amazon SageMaker Ground Truth works]

Check out Amazon SageMaker Ground Truth features

Refer to the documentation to learn how Amazon SageMaker Ground Truth can help you build highly accurate training datasets and reduce data labeling costs by up to 70%.

Sign up for a free account

Instantly get access to the AWS Free Tier. 

Start building in the console

Get started building with Amazon SageMaker Ground Truth in the AWS Management Console.
