Training a custom single class object detection model with Amazon Rekognition Custom Labels

Customers often need to analyze their images to find objects that are unique to their business needs. In many cases, this may be a single object, like identifying the company’s logo, finding a particular industrial or agricultural defect, or locating a specific event like a hurricane in satellite scans. In this post, we showcase how to train a custom model to detect a single object using Amazon Rekognition Custom Labels.

Amazon Rekognition is a fully managed service that provides computer vision (CV) capabilities for analyzing images and video at scale, using deep learning technology without requiring machine learning (ML) expertise. Amazon Rekognition Custom Labels, an automated machine learning (ML) feature of Amazon Rekognition, lets you quickly train a custom CV models specific to your business needs, simply by bringing labeled images. With the latest update to support single object training, Amazon Rekognition Custom Labels now lets you create a custom object detection model with single object classes.

Solution overview

To show you how the single class object detection feature works, let us create a custom model to detect pizzas. With this new feature, we don’t need to create a second label “not pizza” or other food types.

To create our custom model, we follow these steps:

Create a project in Amazon Rekognition Custom Labels.
Create a dataset with images containing one or more pizzas.
Label the images by applying bounding boxes on all pizzas in the images using the user interface provided by Amazon Rekognition Custom Labels.
Train the model and evaluate the performance.
Test the new custom model using the automatically generated API endpoint.

Amazon Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end process.

Creating your project

To create your pizza-detection project, complete the following steps:

On the Amazon Rekognition console, choose Custom Labels.
Choose Get Started.
For Project name, enter PizzaDetection.
Choose Create project

You can also create a project on the Projects page. You can access the Projects page via the left navigation pane.

Creating your dataset

To create your pizza model, you first need to create a dataset to train the model with. For this post, our dataset is composed of 39 images that contain pizza. We sourced our images from pexels.com.

To create your dataset:

Choose Create dataset.
Select Upload images from your computer.

Choose Add Images.
Upload your images. You can always add more images later.

Labeling the images with bounding boxes

You’re now ready to label the images by applying bounding boxes on all images with pizza.

Add Pizza as a label to your dataset via the labels list on the left side of the gallery.
Apply the label to the pizzas in the images by selecting all the images with pizza and choosing Draw Bounding Box.

You can use the Shift key to automatically select multiple images between the first and last selected images.

Make sure to draw a bounding box that covers the pizza as tightly as possible.

Training your model

After you label your images, you’re ready to train your model.

Choose Train Model.
For Choose project, choose your PizzaDetection project.
For Choose training dataset, choose your PizzaImages dataset.

As part of model training, Amazon Rekognition Custom Labels requires a labeled test dataset. Amazon Rekognition Custom Labels uses the test dataset to verify how well your trained model predicts the correct labels and generate evaluation metrics. Images in the test dataset are not used to train your model and should represent the same types of images you will use your model to analyze.

For Create test set, choose how you want to provide your test dataset.

Amazon Rekognition Custom Labels provides three options:

Choose an existing test dataset
Create a new test dataset
Split training dataset

For this post, we select Split training dataset and let Amazon Rekognition hold back 20% of the images for testing and use the remaining 80% of the images to train the model.

Our model took approximately 1 hour to train. The training time required for your model depends on many factors, including the number of images provided in the dataset and the complexity of the model.

When training is complete, Amazon Rekognition Custom Labels outputs key quality metrics including F1 score, precision, recall, and the assumed threshold for each label. For more information about metrics, see Metrics for Evaluating Your Model.

Looking at our evaluation results, our model has a precision of 1.0, which means that no objects were mistakenly identified as pizza (false positives) in our test set. Our model did miss some pizzas in our test set (false negatives), which is reflected in our recall score of 0.81. You can often use the F1 score as an overall quality score because it takes both precision and recall into account. Finally, we see that our assumed threshold to generate the F1 score, precision, and recall metrics for Pizza is 0.61. By default, our model returns predictions above this assumed threshold. We can increase the recall for this model if we lower the confidence threshold. However, this would most likely cause a drop in precision.

We can also choose View Test Results to see how our model performed on each test image. The following screenshot shows an example of a correctly identified image of pizza during the model testing (true positive).

Testing your model

Your custom pizza detection model is now ready for use. Amazon Rekognition Custom Labels provides the API calls for starting, using and stopping your model; you don’t need to manage any infrastructure. The following screenshot shows the API calls for using the model.

By using the API, we tried our model on a new test set of images from pexels.com.

For example, the following image shows a pizza on a table with other objects.

The model detects the pizza with a confidence of 91.72% and a correct bounding box. The following code is the JSON response received by the API call:

{
    "CustomLabels": [
        {
            "Name": "Pizza",
            "Confidence": 91.7249984741211,
            "Geometry": {
                "BoundingBox": {
                    "Width": 0.7824199795722961,
                    "Height": 0.3644999861717224,
                    "Left": 0.11868999898433685,
                    "Top": 0.37672001123428345
                }
            }
        }
    ]
}

The following image has a confidence score of 98.40.

The following image has a confidence score of 96.51.

The following image has an empty JSON result, as expected, because the image doesn’t contain pizza.

The following image also has an empty JSON result.

In addition to using the API, you can also use the Custom Labels Demonstration. This AWS CloudFormation template enables you to set up a custom, password-protected UI where you can start and stop your models and run demonstration inferences.

Conclusion

In this post, we showed you how to create a single class object detection model with Amazon Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?

About the Author

Woody Borraccino is a Senior AI Solutions Architect at AWS.

Anushri Mainthia is the Senior Product Manager for Amazon Rekognition and product lead for Amazon Rekognition Custom Labels. Outside of work, Anushri loves to cook, spend time with her family, and binge watch British mystery shows.