AWS Machine Learning Blog

Category: Amazon SageMaker Ground Truth

Use a data-centric approach to minimize the amount of data required to train Amazon SageMaker models

As machine learning (ML) models have improved, data scientists, ML engineers, and researchers have shifted more of their attention to defining and improving data quality. This has led to the emergence of a data-centric approach to ML and various techniques to improve model performance by focusing on data requirements. Applying these techniques allows ML practitioners […]

Get to production-grade data faster by using new built-in interfaces with Amazon SageMaker Ground Truth Plus

Launched at AWS re:Invent 2021, Amazon SageMaker Ground Truth Plus helps you create high-quality training datasets by removing the undifferentiated heavy lifting associated with building data labeling applications and managing the labeling workforce. All you do is share data along with labeling requirements, and Ground Truth Plus sets up and manages your data labeling workflow […]

Image augmentation pipeline for Amazon Lookout for Vision

Amazon Lookout for Vision provides a machine learning (ML)-based anomaly detection service to identify normal images (i.e., images of objects without defects) vs anomalous images (i.e., images of objects with defects), types of anomalies (e.g., missing piece), and the location of these anomalies. Therefore, Lookout for Vision is popular among customers that look for automated […]

Create high-quality data for ML models with Amazon SageMaker Ground Truth

Machine learning (ML) has improved business across industries in recent years—from the recommendation system on your Prime Video account, to document summarization and efficient search with Alexa’s voice assistance. However, the question remains of how to incorporate this technology into your business. Unlike traditional rule-based methods, ML automatically infers patterns from data so as to […]

Identify rooftop solar panels from satellite imagery using Amazon Rekognition Custom Labels

Renewable resources like sunlight provide a sustainable and carbon-neutral mechanism to generate power. Governments in many countries are providing incentives and subsidies to households to install solar panels as part of small-scale renewable energy schemes. This has created a huge demand for solar panels. Reaching out to potential customers at the right time, through […]

LiDAR 3D point cloud labeling with Velodyne LiDAR sensor in Amazon SageMaker Ground Truth

LiDAR is a key enabling technology in growing autonomous markets, such as robotics, industrial, infrastructure, and automotive. LiDAR delivers precise 3D data about its environment in real time to provide “vision” for autonomous solutions. For autonomous vehicles (AVs), nearly every carmaker uses LiDAR to augment camera and radar systems for a comprehensive perception stack capable […]

Inspect your data labels with a visual, no code tool to create high-quality training datasets with Amazon SageMaker Ground Truth Plus

Launched at AWS re:Invent 2021, Amazon SageMaker Ground Truth Plus helps you create high-quality training datasets by removing the undifferentiated heavy lifting associated with building data labeling applications and managing the labeling workforce. All you do is share data along with labeling requirements, and Ground Truth Plus sets up and manages your data labeling workflow […]

Build a custom Q&A dataset using Amazon SageMaker Ground Truth to train a Hugging Face Q&A NLU model

In recent years, natural language understanding (NLU) has increasingly found business value, fueled by model improvements as well as the scalability and cost-efficiency of cloud-based infrastructure. Specifically, the Transformer deep learning architecture, often implemented in the form of BERT models, has been highly successful, but training, fine-tuning, and optimizing these models have proven to be […]

Build an MLOps sentiment analysis pipeline using Amazon SageMaker Ground Truth and Databricks MLflow

As more organizations move to machine learning (ML) to drive deeper insights, two key stumbling blocks they run into are labeling and lifecycle management. Labeling means identifying data and adding labels that provide context so an ML model can learn from it. Labels might indicate a phrase in an audio file, a car […]

Label text for aspect-based sentiment analysis using SageMaker Ground Truth

This blog post was last reviewed and updated in August 2022 with revised sample document links. The Amazon Machine Learning Solutions Lab (MLSL) recently created a tool for annotating text with named-entity recognition (NER) and relationship labels using Amazon SageMaker Ground Truth. Annotators use this tool to label text with named entities and link their relationships, thereby […]