Amazon SageMaker Data Labeling

Create high-quality datasets for training machine learning models

Receive high-quality labeled data quickly

Choose your data labeling workforce

Increase visibility of data labeling operations

Generate high-quality datasets to customize generative AI models

Amazon SageMaker enables you to label raw data, such as images, text files, and videos, and generate labeled synthetic data to create high-quality datasets for training machine learning (ML) models. SageMaker offers two options: Amazon SageMaker Ground Truth Plus, in which an expert workforce creates and manages data labeling workflows on your behalf, and Amazon SageMaker Ground Truth, in which you build and manage your own data labeling workflows.

Amazon SageMaker Ground Truth Plus

SageMaker Ground Truth Plus is a fully managed service that allows you to create high-quality training datasets without having to build labeling applications or manage labeling workforces on your own. SageMaker Ground Truth Plus provides an expert workforce that is trained on ML tasks and can help meet your data security, privacy, and compliance requirements, while helping you reduce data labeling costs by up to 40%. You upload your data, and then SageMaker Ground Truth Plus creates and manages data labeling workflows and the workforce on your behalf.

SageMaker Ground Truth Plus can create high-quality datasets to fine-tune foundation models for generative AI tasks, from answering questions to generating images and videos. It also allows skilled human workforces to review model outputs to ensure that they are aligned with human preferences. Additionally, SageMaker Ground Truth Plus enables application builders to customize models using their industry or company data so that their applications reflect their preferred voice and style.

Amazon SageMaker Ground Truth

If you want the flexibility to build and manage your own data labeling workflows and workforce, you can use SageMaker Ground Truth. SageMaker Ground Truth is a self-service offering that makes it easy to label data and gives you the option to use human annotators through Amazon Mechanical Turk, third-party vendors, or your own private workforce.
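As a rough illustration of the self-service path: a Ground Truth labeling job reads its input from a manifest file in JSON Lines format, where each line is an object with either a `source` key (inline data, such as text) or a `source-ref` key (an Amazon S3 URI). The Python sketch below builds such a manifest. The helper name `build_input_manifest` and the sample sentences are hypothetical and chosen only for illustration; uploading the file and creating the labeling job are not shown.

```python
import json


def build_input_manifest(items, key="source"):
    """Return JSON Lines text with one JSON object per data item.

    key="source" embeds the data inline (for example, text to classify);
    key="source-ref" is for items referenced by an Amazon S3 URI.
    """
    if key not in ("source", "source-ref"):
        raise ValueError("key must be 'source' or 'source-ref'")
    return "\n".join(json.dumps({key: item}) for item in items)


# Hypothetical sample data, for illustration only.
manifest = build_input_manifest([
    "The battery lasts all day.",
    "The screen cracked within a week.",
])
print(manifest)
```

Each printed line is one item to be labeled. For images or videos stored in Amazon S3, the same helper could be called with `key="source-ref"` and a list of S3 URIs.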

You can also generate labeled synthetic data without manually collecting or labeling real-world data. SageMaker Ground Truth can generate hundreds of thousands of automatically labeled synthetic images on your behalf.

How it works

  • Label data with SageMaker Ground Truth Plus
    Amazon SageMaker Ground Truth Plus helps you to create high-quality training datasets without having to build labeling applications or manage a labeling workforce.

    How Amazon SageMaker Ground Truth Plus works
  • Label data with SageMaker Ground Truth
    Amazon SageMaker Ground Truth helps you build and manage your own data labeling workflows and data labeling workforce.

    How Amazon SageMaker Ground Truth works
  • Generate labeled synthetic data
    Amazon SageMaker Ground Truth helps you generate labeled synthetic data.

    Generate labeled synthetic data

Use cases

Support for Generative AI Applications

Create high-quality datasets to fine-tune and customize foundation models.

Natural Language Processing

Classify text or label named entities (named entity recognition, or NER) with specific labels to generate your training dataset.

Learn more about natural language processing »

Computer Vision

Classify images and videos, perform semantic segmentation for highly detailed object recognition, and detect and track objects with a full suite of image and video annotation tools.

Learn more about computer vision »

3D LIDAR Navigation

Detect and track objects, and perform semantic segmentation for highly detailed object recognition within LIDAR 3D point cloud data.

Learn more about 3D LIDAR navigation »

How to get started

Get started with data labeling

Set up your own labeling workflow with SageMaker Ground Truth.

Learn how with a tutorial »

Access a data labeling workforce

Offload your labeling operations to AWS.

Connect with our team »

Learn more about SageMaker Ground Truth

Access additional resources, documentation, and learning materials.

Visit getting started »