Object Bounding Boxes and More Accurate Object and Scene Detection Are Now Available for Amazon Rekognition Video

Posted on: Jan 18, 2019

Amazon Rekognition Video is a deep learning-based video analysis service that can identify objects, people, text, scenes, and activities, as well as detect unsafe content. Object and scene detection, also called label detection, can identify thousands of common objects and scenes in a video and report the timestamp at which each label appears. Amazon Rekognition Video has been updated to provide significantly improved accuracy for all existing labels across a variety of use cases. In addition, label detection can now specify the location of objects such as dogs, people, and cars in a video by returning a bounding box for each object. A bounding box is a set of coordinates that indicates the location of a specific object in a video frame. Customers can use the bounding box information to count objects ("3 cars") and to understand the relationship between objects ("person next to a car") at a particular timestamp in a video. Lastly, for each label found, Amazon Rekognition Video now returns its parent labels in a hierarchical list. For example, the label 'Dog' has the parents 'Mammal', 'Canine', and 'Animal'. This metadata lets customers group labels by parent-child relationship to improve categorization, and makes it easier to map labels to in-house taxonomies. No machine learning experience is required to get started.
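As a rough illustration of the counting and grouping use cases described above, the following Python sketch works on a dictionary shaped like a label detection result. The field names (Labels, Timestamp, Label, Name, Instances, BoundingBox, Parents) are assumed to follow the documented response layout, so check them against the technical documentation. The sample values and the two helper functions are invented for this example and are not part of the service.

```python
from collections import Counter, defaultdict

# Hand-written sample data; every value here is illustrative, not real output.
sample_response = {
    "Labels": [
        {"Timestamp": 0, "Label": {
            "Name": "Car", "Confidence": 98.0,
            "Instances": [
                {"BoundingBox": {"Left": 0.05, "Top": 0.5, "Width": 0.2, "Height": 0.2}, "Confidence": 97.0},
                {"BoundingBox": {"Left": 0.40, "Top": 0.5, "Width": 0.2, "Height": 0.2}, "Confidence": 95.0},
                {"BoundingBox": {"Left": 0.70, "Top": 0.5, "Width": 0.2, "Height": 0.2}, "Confidence": 91.0},
            ],
            "Parents": [{"Name": "Vehicle"}, {"Name": "Transportation"}],
        }},
        {"Timestamp": 0, "Label": {
            "Name": "Dog", "Confidence": 93.0,
            "Instances": [
                {"BoundingBox": {"Left": 0.30, "Top": 0.6, "Width": 0.1, "Height": 0.15}, "Confidence": 92.0},
            ],
            "Parents": [{"Name": "Mammal"}, {"Name": "Canine"}, {"Name": "Animal"}],
        }},
        {"Timestamp": 500, "Label": {
            "Name": "Car", "Confidence": 96.0,
            "Instances": [
                {"BoundingBox": {"Left": 0.10, "Top": 0.5, "Width": 0.2, "Height": 0.2}, "Confidence": 94.0},
            ],
            "Parents": [{"Name": "Vehicle"}, {"Name": "Transportation"}],
        }},
    ]
}


def count_instances_by_timestamp(response):
    """Hypothetical helper: map each timestamp (ms) to a Counter of label -> box count."""
    counts = defaultdict(Counter)
    for item in response.get("Labels", []):
        label = item["Label"]
        instances = label.get("Instances", [])
        if instances:  # labels without bounding boxes (e.g. scenes) are skipped
            counts[item["Timestamp"]][label["Name"]] += len(instances)
    return dict(counts)


def group_by_parent(response):
    """Hypothetical helper: map each parent label name to the set of its child labels."""
    groups = defaultdict(set)
    for item in response.get("Labels", []):
        label = item["Label"]
        for parent in label.get("Parents", []):
            groups[parent["Name"]].add(label["Name"])
    return dict(groups)


if __name__ == "__main__":
    print(count_instances_by_timestamp(sample_response))
    print(group_by_parent(sample_response))
```

Counting bounding boxes per timestamp gives the "3 cars" style of answer, and the parent grouping is one possible starting point for mapping labels onto an in-house taxonomy.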

Bounding boxes, hierarchical metadata, and improved label detection accuracy are available today in all AWS Regions where Amazon Rekognition Video is offered, except for AWS GovCloud (US). You can get started today via the Amazon Rekognition Console. Refer to the technical documentation for more information. Amazon Rekognition already supports these features for images.

  • Try this 10-minute tutorial to learn how to analyze video and extract rich metadata with Amazon Rekognition Video.