Introducing the AWS Content Analysis Solution

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.

The AWS Content Analysis solution is a fully automated content-based video search engine. It quantifies video content using AI services from AWS for computer vision and speech analysis, then catalogs videos so users can browse video collections according to specified search criteria. This solution provides automation that can dramatically reduce the human involvement needed to catalog video archives for search.

This solution is also useful for seeing, at a glance, the insights that AWS AI services generate for your own content, and for understanding whether those services provide sufficient domain knowledge for your use cases.

With the AWS Content Analysis solution, users can explore questions like:

Does Amazon Rekognition provide labels for the objects I’m looking for?
Does Amazon Transcribe recognize the speech in my videos?
Does Amazon Translate accurately interpret the transcribed speech in my videos?

This solution processes videos using the following AWS services:

Thumbnail and audio extraction using AWS Elemental MediaConvert
Object, celebrity, face detection, face search, and explicit content detection using Amazon Rekognition
Transcript generation using Amazon Transcribe
Translation of the transcript using Amazon Translate
Key phrase detection and other textual analysis of the transcript using Amazon Comprehend

Prior to uploading videos in the AWS Content Analysis web application, users can select which AWS AI services to enable.

The entire set of selectable services is shown in this table:

Image of table showing selectable AI services

Video analytics

The AWS Content Analysis solution integrates the data generated by those services into interactive visualizations that allow users to see bounding boxes for selected objects, survey objects in video timelines, read auto-generated transcripts, generate translations, and more.

Image showing AWS Content Analysis workflow

The fidelity of the data collected for videos using this solution facilitates detailed analysis on a granular level. The following chart shows the amount of data recorded for a two-minute scene in one of my favorite movies, The Big Lebowski. To give some perspective on the quantity of data used for cataloging videos, this two-minute clip produced a total of about 18,000 data records:

Image showing data points recorded from a two minute clip

The AWS Content Analysis solution can also process full-length movies. For example, the movie Amélie, which is two hours long, produced a total of 652,000 data records, as depicted in the following chart:

Chart showing data records produced from the movie Amelie
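
To put the two record counts above on a common scale, the following minimal sketch converts each into an average rate. The function name records_per_second is hypothetical, and the durations are the rounded figures quoted in the text (two minutes and two hours), so the results are approximate.

```python
def records_per_second(total_records: int, duration_seconds: float) -> float:
    """Average number of data records produced per second of video."""
    return total_records / duration_seconds

# Figures quoted in the text; durations are rounded, so rates are approximate.
clip_rate = records_per_second(18_000, 2 * 60)         # two-minute clip
movie_rate = records_per_second(652_000, 2 * 60 * 60)  # two-hour movie

print(f"clip:  {clip_rate:.1f} records per second")
print(f"movie: {movie_rate:.1f} records per second")
```

Under these assumptions the clip averages about 150 records per second and the full movie about 91, which suggests that data density varies with how much is happening on screen.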

Video search

Videos are indexed and cataloged in an Amazon Elasticsearch Service domain. Everything you see when analyzing videos in the GUI is searchable using the standard Elasticsearch query string syntax, which is based on the Lucene query syntax. This section provides a few examples of common search patterns.

Full text search

Full text queries enable you to search any data in the video catalog. For example, the Amazon Rekognition celebrity detection service will return the full names of celebrities detected in a video. You can search for a celebrity simply by typing their name, as shown in these screenshots:

Image showing the AWS Content Analysis window, and the Media Collection search banner

Image of actor John Krasinski being detected with Amazon Rekognition

Search high confidence data

The labels returned by Amazon Rekognition are each assigned a confidence value that indicates how sure you can be that the label is accurate. You can use that value to filter search results. For example, Violence AND Confidence:>80 will search for videos containing violence with a confidence above 80%.
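
If you wanted to run a filter like this outside the web application, one option is to wrap the query string in an Elasticsearch query_string query. The following is a minimal sketch that only builds the request body. The helper name build_search_body is hypothetical, and querying the search domain directly, rather than through the GUI, is an assumption for illustration and not a documented interface of this solution.

```python
import json

def build_search_body(query: str, size: int = 10) -> dict:
    """Wrap a Lucene-style query string in an Elasticsearch query_string body.

    Hypothetical helper: it only builds the JSON payload and sends nothing.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {"query_string": {"query": query}},
    }

# Note that ">80" is a strict comparison: a confidence of exactly 80 does not match.
body = build_search_body("Violence AND Confidence:>80")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be posted to the search endpoint of your own deployment. The endpoint and index names depend on how the solution was deployed, so they are left out here.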

Search data from individual operators

Searches will query the entire metadata catalog in Elasticsearch. A basic search for Violence would match videos containing “Violence” labels from content moderation, but it would also match videos with transcripts that contain the word “Violence.” You can restrict your search to only content moderation results with operator names, like this: Operator:content_moderation AND (Name:Violence AND Confidence:>80).

The following is a full list of operator names you can use to filter search queries:

label_detection
celebrity_detection
content_moderation
face_detection
face_search
transcribe
key_phrases
entities
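
As a sketch of how the operator names above could be used to assemble queries in code, the following helper checks the operator against that list before building the query string. The function name operator_query and its parameters are hypothetical, and it assumes single-word label names that need no quoting.

```python
from typing import Optional

# Operator names as listed in this post.
OPERATORS = frozenset({
    "label_detection", "celebrity_detection", "content_moderation",
    "face_detection", "face_search", "transcribe", "key_phrases", "entities",
})

def operator_query(operator: str, name: str,
                   min_confidence: Optional[float] = None) -> str:
    """Build a query string restricted to the results of one operator."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    clause = f"Name:{name}"
    if min_confidence is not None:
        # Strictly greater than the threshold, matching the Confidence:>N syntax.
        clause = f"({clause} AND Confidence:>{min_confidence:g})"
    return f"Operator:{operator} AND {clause}"

print(operator_query("content_moderation", "Violence", 80))
print(operator_query("label_detection", "Gun"))
```

Validating the operator name up front catches typos that would otherwise produce an empty result set without any error.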

Search related concepts across multiple operators

As an example of a compound search that uses multiple operator names, this query will return videos in which content moderation identified “Violence” with a confidence above 80, as well as videos in which label detection identified “Gun” or “Weapon”: (Operator:content_moderation AND Name:Violence AND Confidence:>80) OR (Operator:label_detection AND (Name:Gun OR Name:Weapon))
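
Compound queries like the one above are easy to get wrong by hand because of the nested parentheses. The following minimal sketch composes the same query from small pieces. The helper names all_of and any_of are hypothetical and only perform string assembly.

```python
from typing import Iterable

def any_of(field: str, values: Iterable[str]) -> str:
    """Match a field against any of the given values, for example (Name:A OR Name:B)."""
    return "(" + " OR ".join(f"{field}:{value}" for value in values) + ")"

def all_of(*clauses: str) -> str:
    """Require every clause to match, for example (A AND B AND C)."""
    return "(" + " AND ".join(clauses) + ")"

moderation = all_of("Operator:content_moderation", "Name:Violence", "Confidence:>80")
labels = all_of("Operator:label_detection", any_of("Name", ["Gun", "Weapon"]))

# A video matches if either operator-specific clause matches.
query = f"{moderation} OR {labels}"
print(query)
```

Wrapping every group in its own parentheses keeps operator precedence explicit, so the query does not depend on how AND and OR are prioritized by the parser.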

Takeaway

The AWS Content Analysis solution is now generally available. It is designed to help organizations that are currently challenged with maintaining large video collections leverage the power of search for video retrieval. This solution can also help individuals test-drive AI services from AWS with their own video content to better understand the scenarios for which these services can be applied.

For more information about AWS Content Analysis, visit the solution page.

AWS for M&E Blog