AWS Machine Learning Blog

Amazon Rekognition Labels adds 600 new labels, including landmarks, and now detects dominant colors

Amazon Rekognition offers pre-trained and customizable computer vision capabilities to extract information and insights from images and videos. One such capability is Amazon Rekognition Labels, which detects objects, scenes, actions, and concepts in images. Customers such as Synchronoss, Shutterstock, and Nomad Media use Amazon Rekognition Labels to automatically add metadata to their content library and enable content-based search results. TripleLift uses Amazon Rekognition Labels to determine the best moments to dynamically insert ads that complement the viewing experience for the audience. VidMob uses Amazon Rekognition Labels to extract metadata from ad creatives to understand the unique role of creative decision-making in ad performance, so marketers can produce ads that impact key objectives they care about most. Additionally, thousands of other customers use Amazon Rekognition Labels to support many other use cases, such as classifying trail or hiking photos, detecting people or vehicles in security camera footage, and classifying identity document pictures.

Amazon Rekognition Labels delivers 600 new labels, such as landmarks and activities. It also improves the accuracy of over 2,000 existing labels, including those with localized bounding boxes. In addition, Amazon Rekognition Labels now supports Image Properties to detect the dominant colors of the entire image, the image foreground, the image background, and objects with localized bounding boxes. Image Properties also measures image brightness, sharpness, and contrast. Lastly, Amazon Rekognition Labels now organizes label results using two additional fields, aliases and categories, supports filtering of those results, and optionally returns video segment timestamps and durations for when the detected labels appear. In the following sections, we review the new capabilities and their benefits in more detail with some examples.

New labels

Amazon Rekognition Labels has added over 600 new labels, expanding the list of supported labels. The following are some examples of the new labels:

  • Popular landmarks – Brooklyn Bridge, Colosseum, Eiffel Tower, Machu Picchu, Taj Mahal, etc.
  • Activities – Applause, Cycling, Celebrating, Jumping, Walking Dog, etc.
  • Damage detection – Car Dent, Car Scratch, Corrosion, Home Damage, Roof Damage, Termite Damage, etc.
  • Text and documents – Bar Chart, Boarding Pass, Flow Chart, Notebook, Invoice, Receipt, etc.
  • Sports – Baseball Game, Cricket Bat, Figure Skating, Rugby, Water Polo, etc.
  • Many more – Boat Racing, Fun, Cityscape, Village, Wedding Proposal, Banquet, etc.

With these labels, customers in image sharing, stock photography, or broadcast media can automatically add new metadata to their content library to improve their search capabilities.

Let’s look at a label detection example for the Brooklyn Bridge.

The following table shows the labels and confidence scores returned in the API response.

Labels            Confidence Scores
Brooklyn Bridge   95.6
Bridge            95.6
Landmark          95.6

Note: The bounding box shown in the image above is for the Bridge label.  Bounding boxes are not returned for landmarks.
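
As a rough illustration of how an application might read such a result, the following sketch works on a hand-written dictionary shaped like a label detection response. The label names and confidence scores come from the preceding table; the bounding box numbers and the helper name split_by_bounding_box are made up for illustration. It separates labels that carry bounding boxes from labels that do not, such as landmarks.

```python
# Hand-written sample shaped like a label detection response.
# The bounding box values are illustrative, not real API output.
sample_response = {
    "Labels": [
        {"Name": "Brooklyn Bridge", "Confidence": 95.6, "Instances": []},
        {
            "Name": "Bridge",
            "Confidence": 95.6,
            "Instances": [
                {
                    "BoundingBox": {"Left": 0.1, "Top": 0.3, "Width": 0.8, "Height": 0.4},
                    "Confidence": 95.6,
                }
            ],
        },
        {"Name": "Landmark", "Confidence": 95.6, "Instances": []},
    ]
}


def split_by_bounding_box(response):
    """Return (localized, unlocalized) lists of label names."""
    localized, unlocalized = [], []
    for label in response.get("Labels", []):
        # A label is localized when it has at least one instance.
        if label.get("Instances"):
            localized.append(label["Name"])
        else:
            unlocalized.append(label["Name"])
    return localized, unlocalized


localized, unlocalized = split_by_bounding_box(sample_response)
print("With bounding boxes:", localized)
print("Without bounding boxes:", unlocalized)
```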

Improved labels

Amazon Rekognition Labels has also improved the accuracy for over 2,000 labels, including those with localized bounding boxes. The following are some examples of the improved labels:

  • Activities – Diving, Driving, Reading, Sitting, Standing, etc.
  • Apparel and accessories – Backpack, Belt, Blouse, Hoodie, Jacket, Shoe, etc.
  • Home and indoors – Swimming Pool, Potted Plant, Pillow, Fireplace, Blanket, etc.
  • Technology and computing – Headphones, Mobile Phone, Tablet Computer, Reading, Laptop, etc.
  • Vehicles and automotive – Truck, Wheel, Tire, Bumper, Car Seat, Car Mirror, etc.
  • Text and documents – Passport, Driving License, Business Card, Document, etc.
  • Many more – Person, Dog, Kangaroo, Town Square, Festival, Laughing, etc.

Customers in advertising, for example, can use these improved labels, especially the ones detected with bounding boxes, to localize more objects of interest.

Image Properties for dominant color detection and image quality

Image Properties is a new capability of Amazon Rekognition Labels for images, and can be used with or without the label detection functionality. Note: Image Properties is priced separately from Amazon Rekognition Labels, and is only available with the updated SDKs. Image Properties is not available for videos.

Dominant color detection

Image Properties identifies dominant colors in an image based on pixel percentages. These dominant colors are mapped to the 140-color CSS palette, RGB values, hex codes, and 12 simplified colors (green, pink, black, red, yellow, cyan, brown, orange, white, purple, blue, grey). By default, the API returns up to 10 dominant colors unless you specify the number of colors to return. The maximum number of dominant colors the API can return is 12.

When used standalone, Image Properties detects the dominant colors of an entire image as well as its foreground and background. When used together with label detection functionalities, Image Properties also identifies the dominant colors of detected objects with bounding boxes.
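
The two modes described above can be sketched as a small request builder. This is a minimal sketch, not the service's own tooling: the function name build_image_properties_request is hypothetical, the bucket and object names are placeholders, and the limit of 12 colors is taken from this post.

```python
MAX_DOMINANT_COLORS_LIMIT = 12  # maximum stated in this post


def build_image_properties_request(bucket, name, max_colors=10, with_labels=False):
    """Build request parameters for Image Properties, alone or with label detection."""
    if max_colors > MAX_DOMINANT_COLORS_LIMIT:
        raise ValueError(
            f"max_colors must be at most {MAX_DOMINANT_COLORS_LIMIT}, got {max_colors}"
        )
    features = ["IMAGE_PROPERTIES"]
    if with_labels:
        # Label detection adds dominant colors for objects with bounding boxes.
        features.insert(0, "GENERAL_LABELS")
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": name}},
        "Features": features,
        "Settings": {"ImageProperties": {"MaxDominantColors": max_colors}},
    }


# Standalone: colors for the whole image, foreground, and background.
standalone = build_image_properties_request("bucket", "input.jpg")
# Combined: also colors for detected objects with bounding boxes.
combined = build_image_properties_request("bucket", "input.jpg", max_colors=12, with_labels=True)
print(standalone["Features"], combined["Features"])
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the Rekognition client's detect_labels call.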

Customers in image sharing or stock photography can use dominant color detection to enrich their image library metadata to improve content discovery, allowing their end-users to filter by color or search objects with specific colors, such as “blue chair” or “red shoes.” Additionally, customers in advertising can determine ad performance based on the colors of their creative assets.
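
A search such as "red shoes" could be approximated as follows. This is a sketch under assumptions: the sample data is hand-written, the 25 percent pixel threshold is arbitrary, and matches_color_query is a hypothetical helper. It relies on each detected object instance carrying a list of dominant colors with a simplified color name and a pixel percentage.

```python
# Hand-written sample: labels with per-instance dominant colors (illustrative values).
sample_labels = [
    {
        "Name": "Chair",
        "Instances": [
            {
                "DominantColors": [
                    {"SimplifiedColor": "red", "PixelPercent": 62.0},
                    {"SimplifiedColor": "black", "PixelPercent": 20.5},
                ]
            }
        ],
    },
    {
        "Name": "Shoe",
        "Instances": [
            {"DominantColors": [{"SimplifiedColor": "blue", "PixelPercent": 48.3}]}
        ],
    },
]


def matches_color_query(labels, color, label_name, min_pixel_percent=25.0):
    """Return True if any instance of label_name has `color` as a dominant color."""
    for label in labels:
        if label["Name"].lower() != label_name.lower():
            continue
        for instance in label.get("Instances", []):
            for dominant in instance.get("DominantColors", []):
                if (
                    dominant.get("SimplifiedColor") == color
                    and dominant.get("PixelPercent", 0.0) >= min_pixel_percent
                ):
                    return True
    return False


print(matches_color_query(sample_labels, "red", "chair"))
print(matches_color_query(sample_labels, "red", "shoe"))
```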

Image quality

In addition to dominant color detection, Image Properties also measures image qualities through brightness, sharpness, and contrast scores. Each of these scores ranges from 0–100. For example, a very dark image will return low brightness values, whereas a brightly lit image will return high values.

With these scores, customers in image sharing, advertising, or ecommerce can perform quality inspection and filter out images with low brightness and sharpness to reduce false label predictions.
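
A quality gate of this kind might look like the following sketch. The thresholds are arbitrary placeholders to be tuned per use case, the sample scores are invented, and passes_quality_check is a hypothetical helper that reads the brightness and sharpness scores from the image-level quality data.

```python
def passes_quality_check(image_properties, min_brightness=30.0, min_sharpness=40.0):
    """Return True if the image meets both minimum quality scores (0-100 scale)."""
    quality = image_properties.get("Quality", {})
    # Missing scores are treated as 0 so incomplete data fails the check.
    return (
        quality.get("Brightness", 0.0) >= min_brightness
        and quality.get("Sharpness", 0.0) >= min_sharpness
    )


# Invented scores for two images.
well_lit = {"Quality": {"Brightness": 82.4, "Sharpness": 75.1, "Contrast": 60.3}}
too_dark = {"Quality": {"Brightness": 12.7, "Sharpness": 68.9, "Contrast": 35.0}}

print(passes_quality_check(well_lit))
print(passes_quality_check(too_dark))
```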

The following image shows an example with the Eiffel Tower.

The following table is an example of Image Properties data returned in the API response.

Note: The bounding box shown in the image above is for the Tower label. Bounding boxes are not returned for landmarks.

The following image is an example for a red chair.

The following is an example of Image Properties data returned in the API response.


The following image is an example for a dog with a yellow background.

The following is an example of Image Properties data returned in the API response.


New aliases and categories fields

Amazon Rekognition Labels now returns two new fields, aliases and categories, in the API response. Aliases are other names for the same label, and categories group individual labels together based on 40 common themes, such as Food and Beverage and Animals and Pets.

With the label detection model update, some existing labels are no longer returned in the primary list of label names. Instead, they are returned in the new aliases field of their primary labels. For example, Amazon Rekognition Labels previously returned aliases like Cell Phone in the same list of primary label names that contained Mobile Phone. Now, because Cell Phone is an alias of Mobile Phone, Amazon Rekognition Labels returns Cell Phone in the aliases field for Mobile Phone.

Customers in photo sharing, ecommerce, or advertising can use aliases and categories to organize their content metadata taxonomy to further enhance content search and filtering:

  • Aliases example – Because Car and Automobile are aliases, you can add metadata to an image with Car and Automobile at the same time
  • Categories example – You can use categories to create a category filter or display all images related to a particular category, such as Food and Beverage, without having to explicitly add metadata to each image with Food and Beverage
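
Both examples can be sketched as a small search index. The image IDs and the build_search_index helper are hypothetical; the label, alias, and category names are ones that appear in this post. Each image is indexed under its primary label names and their aliases, and separately under its categories.

```python
from collections import defaultdict

# Hypothetical image IDs mapped to hand-written label results.
images = {
    "img1": [
        {
            "Name": "Car",
            "Aliases": [{"Name": "Automobile"}],
            "Categories": [{"Name": "Vehicles and Automotive"}],
        }
    ],
    "img2": [
        {
            "Name": "Bicycle",
            "Aliases": [{"Name": "Bike"}],
            "Categories": [{"Name": "Hobbies and Interests"}],
        }
    ],
}


def build_search_index(images):
    """Index image IDs by label name or alias (lowercased) and by category."""
    by_term = defaultdict(set)
    by_category = defaultdict(set)
    for image_id, labels in images.items():
        for label in labels:
            terms = [label["Name"]] + [alias["Name"] for alias in label.get("Aliases", [])]
            for term in terms:
                by_term[term.lower()].add(image_id)
            for category in label.get("Categories", []):
                by_category[category["Name"]].add(image_id)
    return by_term, by_category


by_term, by_category = build_search_index(images)
print(sorted(by_term["automobile"]))
print(sorted(by_category["Hobbies and Interests"]))
```

Searching for either "car" or "automobile" then returns the same image, and a category filter needs no per-image tagging beyond what the response already provides.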

The following image shows a label detection example with aliases and categories for a diver.

The following table shows the labels, confidence scores, aliases, and categories returned in the API response.

Labels               Confidence Scores   Aliases       Categories
Nature               99.9                -             Nature and Outdoors
Water                99.9                -             Nature and Outdoors
Scuba Diving         99.9                Aqua Scuba    Travel and Adventure
Person               99.9                Human         Person Description
Leisure Activities   99.9                Recreation    Travel and Adventure
Sport                99.9                Sports        Sports

The following image is an example for a cyclist.

The following table contains the labels, confidence scores, aliases, and categories returned in the API response.

Labels     Confidence Scores   Aliases                 Categories
Sky        99.9                -                       Nature and Outdoors
Outdoors   99.9                -                       Nature and Outdoors
Person     98.3                Human                   Person Description
Sunset     98.1                Dusk, Dawn              Nature and Outdoors
Bicycle    96.1                Bike                    Hobbies and Interests
Cycling    85.1                Cyclist, Bike Cyclist   Actions

Inclusion and exclusion filters

Amazon Rekognition Labels introduces new inclusion and exclusion filtering options in the API input parameters to narrow down the specific list of labels returned in the API response. You can provide an explicit list of labels or categories that you want to include or exclude. Note: These filters are available with the updated SDKs.

Customers can use inclusion and exclusion filters to obtain specific labels or categories they are interested in without having to create additional logic in their application. For example, customers in insurance can use LabelCategoriesInclusionFilter to only include label results in the Damage Detection category.

The following code is an API sample request with inclusion and exclusion filters:

{
    "Image": {
        "S3Object": {
            "Bucket": "bucket",
            "Name": "input.jpg" 
        } 
    },
    "MaxLabels": 10, 
    "MinConfidence": 75,
    "Features": [ "GENERAL_LABELS", "IMAGE_PROPERTIES" ],
    "Settings": {
        "GeneralLabels": {
            "LabelsInclusionFilter": [<Label(s)>],
            "LabelsExclusionFilter": [<Label(s)>],
            "LabelCategoriesInclusionFilter": [<Category Name(s)>],
            "LabelCategoriesExclusionFilter": [<Category Name(s)>] 
        },
        "ImageProperties": {
            "MaxDominantColors":10
        }
    }
}

The following are examples of how inclusion and exclusion filters work:

  • If you only want to detect Person and Car, and don’t care about other labels, you can specify ["Person", "Car"] in LabelsInclusionFilter.
  • If you want to detect all labels except for Clothing, you can specify ["Clothing"] in LabelsExclusionFilter.
  • If you want to detect only labels within the Animals and Pets category except for Dog and Cat, you can specify ["Animals and Pets"] in LabelCategoriesInclusionFilter, with ["Dog", "Cat"] in LabelsExclusionFilter.

Note: Aliases cannot be used as values for these filters.
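
The filter combinations above, together with the restriction on aliases, can be sketched as a small settings builder. This is an illustration under assumptions: the key names are copied from the sample request in this post and should be checked against the API reference for your SDK version, the alias table is a hand-written subset using pairs mentioned in this post, and build_general_labels_settings is a hypothetical helper.

```python
# Key names copied from the sample request in this post; confirm the exact
# spelling against the API reference for the SDK version you use.
LABELS_INCLUDE = "LabelsInclusionFilter"
LABELS_EXCLUDE = "LabelsExclusionFilter"
CATEGORIES_INCLUDE = "LabelCategoriesInclusionFilter"
CATEGORIES_EXCLUDE = "LabelCategoriesExclusionFilter"

# Hand-written alias -> primary label pairs (illustrative subset).
KNOWN_ALIASES = {"Cell Phone": "Mobile Phone", "Automobile": "Car", "Bike": "Bicycle"}


def build_general_labels_settings(
    include_labels=(), exclude_labels=(), include_categories=(), exclude_categories=()
):
    """Build the general labels filter settings, rejecting alias names."""
    for value in list(include_labels) + list(exclude_labels):
        if value in KNOWN_ALIASES:
            raise ValueError(
                f"{value!r} is an alias; use the primary label {KNOWN_ALIASES[value]!r}"
            )
    candidates = {
        LABELS_INCLUDE: include_labels,
        LABELS_EXCLUDE: exclude_labels,
        CATEGORIES_INCLUDE: include_categories,
        CATEGORIES_EXCLUDE: exclude_categories,
    }
    # Only emit the filters that were actually provided.
    return {key: list(values) for key, values in candidates.items() if values}


# Only labels in the Animals and Pets category, except Dog and Cat.
settings = build_general_labels_settings(
    include_categories=["Animals and Pets"], exclude_labels=["Dog", "Cat"]
)
print(settings)
```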

Video segments for detected labels

Amazon Rekognition Labels introduces an additional API input parameter for videos, AggregateBy = SEGMENTS, to organize label results by video segment when a label is detected across multiple consecutive frames. A video segment is defined by a start timestamp, an end timestamp, and a duration. Customers in media, for example, can use this capability to add video segment metadata to their media content and quickly search for segments that contain the desired labels for the desired length. They can now do this without developing custom post-processing logic based on the individual label timestamps returned in the default API response. Note: This input parameter is only available for videos with the updated SDKs.

The following code is an API sample request and response with AggregateBy = SEGMENTS enabled:

# Sample Request
{
    "JobId": "5a6e690e-c750-460a-9d59-c992e0ec8638",
    "MaxResults": 10,
    "SortBy": "TIMESTAMP",
    "AggregateBy": "SEGMENTS"
}

# Sample Response
{  
    "JobStatus": "SUCCEEDED",
    "LabelModelVersion": "3.0",
    "Labels": [ 
        {
            "StartTimestampMillis": 225,
            "EndTimestampMillis": 3578,
            "DurationMillis": 3353,
            "Label": {
                "Name": "Car",
                "Categories": [
                  {
                    "Name": "Vehicles and Automotive"
                  }
                ],
                "Aliases": [
                  {
                    "Name": "Automobile"
                  }
                ],
                "Parents": [
                  {
                    "Name": "Vehicle"
                  }
                ],
                "Confidence": 99.9364013671875
            }
        },
        {
            "StartTimestampMillis": 7578,
            "EndTimestampMillis": 12371,
            "DurationMillis": 4793,
            "Label": {
                "Name": "Kangaroo",
                "Categories": [
                  {
                    "Name": "Animals and Pets"
                  }
                ],
                "Aliases": [
                  {
                    "Name": "Wallaby"
                  }
                ],
                "Parents": [
                  {
                    "Name": "Mammal"
                  }
                ],
                "Confidence": 99.9364013671875
            }
        },
        {
            "StartTimestampMillis": 22225,
            "EndTimestampMillis": 22578,
            "DurationMillis": 353,
            "Label": {
                "Name": "Bicycle",
                "Categories": [
                  {
                    "Name": "Hobbies and Interests"
                  }
                ],
                "Aliases": [
                  {
                    "Name": "Bike"
                  }
                ],
                "Parents": [
                  {
                    "Name": "Vehicle"
                  }
                ],
                "Confidence": 99.9364013671875
            }
        }
    ],
    "VideoMetadata": {
        "ColorRange": "FULL",
        "DurationMillis": 5000,
        "Format": "MP4",
        "FrameWidth": 1280,
        "FrameHeight": 720,
        "FrameRate": 24
    }
}
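
To show how segment results might be searched, the following sketch filters a trimmed, hand-copied subset of the sample response by label name and minimum duration. The find_segments helper is hypothetical, and only the fields it needs are kept.

```python
# Trimmed subset of the sample response above (only the fields used here).
sample_response = {
    "Labels": [
        {
            "StartTimestampMillis": 225,
            "EndTimestampMillis": 3578,
            "DurationMillis": 3353,
            "Label": {"Name": "Car"},
        },
        {
            "StartTimestampMillis": 7578,
            "EndTimestampMillis": 12371,
            "DurationMillis": 4793,
            "Label": {"Name": "Kangaroo"},
        },
    ]
}


def find_segments(response, label_name, min_duration_ms=0):
    """Return (start_ms, end_ms) pairs for segments of label_name that last long enough."""
    matches = []
    for item in response.get("Labels", []):
        if item["Label"]["Name"] != label_name:
            continue
        if item["DurationMillis"] >= min_duration_ms:
            matches.append((item["StartTimestampMillis"], item["EndTimestampMillis"]))
    return matches


# Segments where a car is visible for at least 3 seconds.
print(find_segments(sample_response, "Car", min_duration_ms=3000))
# Kangaroo segments of at least 5 seconds (none in this sample).
print(find_segments(sample_response, "Kangaroo", min_duration_ms=5000))
```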

Conclusion

Amazon Rekognition Labels delivers 600 new labels, improves accuracy for over 2,000 existing labels, and enhances bounding box detections. Along with these updates, Amazon Rekognition Labels now supports Image Properties, aliases and categories, inclusion and exclusion filters, and video segment timestamps and durations for detected labels.

To try the new label detection model with its new features, log in to your AWS account and check out the Amazon Rekognition console for label detection and image properties. To learn more, visit Detecting labels.


About the authors

Maria Handoko is a Senior Product Manager at AWS. She focuses on helping customers solve their business challenges through machine learning and computer vision. In her spare time, she enjoys hiking, listening to podcasts, and exploring different cuisines.

Shipra Kanoria is a Principal Product Manager at AWS. She is passionate about helping customers solve their most complex problems with the power of machine learning and artificial intelligence. Before joining AWS, Shipra spent over 4 years at Amazon Alexa, where she launched many productivity-related features on the Alexa voice assistant.

Davide Modolo is an Applied Science Manager at AWS AI Labs. He has a PhD in computer vision from the University of Edinburgh (UK) and is passionate about developing new scientific solutions for real-world customer problems. Outside of work, he enjoys traveling and playing any kind of sport, especially soccer.