AWS Machine Learning Blog
Category: Amazon SageMaker Ground Truth
Real-time data labeling pipeline for ML workflows using Amazon SageMaker Ground Truth
High-quality machine learning (ML) models depend on accurately labeled, high-quality training, validation, and test data. As ML and deep learning models are increasingly integrated into production environments, it’s becoming more important than ever to have customizable, real-time data labeling pipelines that can continuously receive and process unlabeled data. For example, you may want to create […]
zomato digitizes menus using Amazon Textract and Amazon SageMaker
This post is co-written by Chiranjeev Ghai, ML Engineer at zomato. zomato is a global food-tech company based in India. Are you the kind of person who has very specific cravings? Maybe when the mood hits, you don’t want just any kind of Indian food—you want Chicken Chettinad with a side of paratha, and nothing […]
Processing auto insurance claims at scale using Amazon Rekognition Custom Labels and Amazon SageMaker Ground Truth
Computer vision uses machine learning (ML) to build applications that process images or videos. With Amazon Rekognition, you can use pre-trained computer vision models to identify objects, people, text, activities, or inappropriate content. Our customers have use cases that span every industry, including media, finance, manufacturing, sports, and technology. Some of these use cases require […]
Streamlining data labeling for YOLO object detection in Amazon SageMaker Ground Truth
Object detection is a common task in computer vision (CV), and the YOLOv3 model is state-of-the-art in terms of accuracy and speed. In transfer learning, you obtain a model trained on a large but generic dataset and retrain the model on your custom dataset. One of the most time-consuming parts of transfer learning is collecting […]
Setting up human review of your NLP-based entity recognition models with Amazon SageMaker Ground Truth, Amazon Comprehend, and Amazon A2I
Update Aug 12, 2020 – New features: Amazon Comprehend adds five new languages (Spanish, French, German, Italian, and Portuguese) and increases the limit on the number of entities per custom entity model from 12 to 25. Organizations across industries have a lot of unstructured data that you can evaluate to get entity-based […]
Building a custom Angular application for labeling jobs with Amazon SageMaker Ground Truth
As a data scientist attempting to solve a problem using supervised learning, you usually need a high-quality labeled dataset before starting your model building. Amazon SageMaker Ground Truth makes dataset building for a range of different tasks, like text classification and object detection, easier and more accessible to everyone. Ground Truth also helps you build […]
Developing NER models with Amazon SageMaker Ground Truth and Amazon Comprehend
Update October 2020: Amazon Comprehend now supports Amazon SageMaker Ground Truth to help label your datasets for Comprehend’s Custom Model training. For Custom EntityRecognizer, check out the Annotations documentation for more details. For the Custom MultiClass and MultiLabel classifiers, check out the MultiClass and MultiLabel documentation, respectively. Named entity recognition (NER) involves sifting through text data to locate noun phrases […]
Labeling data for 3D object tracking and sensor fusion in Amazon SageMaker Ground Truth
Amazon SageMaker Ground Truth now supports labeling 3D point cloud data. For more information about the launched feature set, see this AWS News Blog post. In this blog post, we specifically cover how to perform the required data transformations of your 3D point cloud data to create a labeling job in SageMaker Ground Truth for […]
Bring your own model for Amazon SageMaker labeling workflows with active learning
With Amazon SageMaker Ground Truth, you can easily and inexpensively build accurately labeled machine learning (ML) datasets. To decrease labeling costs, SageMaker Ground Truth uses active learning to differentiate between data objects (like images or documents) that are difficult and easy to label. Difficult data objects are sent to human workers to be annotated and […]
Identifying worker labeling efficiency using Amazon SageMaker Ground Truth
A critical success factor in machine learning (ML) is the cleanliness and accuracy of training datasets. Training with mislabeled or inaccurate data can lead to a poorly performing model. But how can you easily determine if the labeling team is accurately labeling data? One way is to manually sift through the results one worker at […]