AWS for M&E Blog
Introducing the AWS Content Analysis Solution
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details.
The AWS Content Analysis solution is a fully automated content-based video search engine. It quantifies video content using AI services from AWS for computer vision and speech analysis, then catalogs videos so users can browse video collections according to specified search criteria. This solution provides automation that can dramatically reduce the human involvement needed to catalog video archives for search.
This solution is also useful for seeing, at a glance, the insights that AWS AI services generate for your own content, and for understanding whether those services provide sufficient domain knowledge for your use cases.
With the AWS Content Analysis solution, users can explore questions like:
- Does Amazon Rekognition provide labels for the objects I’m looking for?
- Does Amazon Transcribe recognize the speech in my videos?
- Does Amazon Translate accurately interpret the transcribed speech in my videos?
This solution processes videos using the following AWS services:
- Thumbnail and audio extraction using AWS Elemental MediaConvert
- Object detection, celebrity detection, face detection, face search, and explicit content detection using Amazon Rekognition
- Transcript generation using Amazon Transcribe
- Translation of the transcript using Amazon Translate
- Key phrase detection and other textual analysis of the transcript using Amazon Comprehend
Prior to uploading videos in the AWS Content Analysis web application, users can select which AWS AI services to enable.
The entire set of selectable services is shown in this table:
Video analytics
The AWS Content Analysis solution integrates the data generated by those services into interactive visualizations that allow users to see bounding boxes for selected objects, survey objects in video timelines, read auto-generated transcripts, generate translations, and more.
The fidelity of the data collected for videos using this solution facilitates detailed analysis on a granular level. To give some perspective on the quantity of data used for cataloging videos, the following chart shows the amount of data recorded for a two-minute scene in one of my favorite movies, The Big Lebowski. That clip produced a total of about 18,000 data records:
The AWS Content Analysis solution can also process full-length movies. For example, the movie Amélie, which is two hours long, produced a total of 652,000 data records, as depicted in the following chart:
Video search
Videos are indexed and cataloged in an Amazon Elasticsearch Service domain. Everything you see when analyzing videos in the GUI is searchable using the standard Elasticsearch query string syntax, which is based on the Lucene query syntax. This section provides a few examples of common search patterns.
Full text search
Full text queries enable you to search any data in the video catalog. For example, the Amazon Rekognition celebrity detection service will return the full names of celebrities detected in a video. You can search for a celebrity simply by typing their name, as shown in these screenshots:
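For readers who want to run the same kind of search programmatically, the following is a minimal sketch of an Elasticsearch search body that uses the query_string query type. The helper name full_text_body, the size default, and the name "Jane Doe" are illustrative assumptions, not part of the solution. The sketch wraps the name in quotes so it is matched as a phrase, which is a stricter choice than typing the bare name into the search box.

```python
import json


def full_text_body(text: str, size: int = 10) -> dict:
    """Wrap free text in an Elasticsearch query_string search body.

    Hypothetical helper for illustration; the solution's own search
    API may build its requests differently.
    """
    # Escape backslashes and double quotes so the text is safe inside a phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "size": size,  # maximum number of hits to return
        "query": {"query_string": {"query": f'"{escaped}"'}},
    }


# "Jane Doe" is a placeholder name, not a real search result.
body = full_text_body("Jane Doe")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized to JSON and sent to a search endpoint of an Elasticsearch index. The index name and endpoint depend on how the solution is deployed and are not shown here.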
Search high confidence data
The labels returned by Amazon Rekognition are each assigned a confidence value that indicates how likely the label is to be accurate. You can use that value to filter search results. For example, Violence AND Confidence:>80 will search for videos containing violence with a confidence above 80%.
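A confidence filter like the one above can be assembled with a small helper. This is a sketch under stated assumptions: the function name with_min_confidence and its inclusive flag are hypothetical, while the Confidence field name comes from the example query. In the query string syntax, > is a strict comparison and >= also includes the boundary value.

```python
def with_min_confidence(term: str, min_confidence: float, inclusive: bool = False) -> str:
    """Append a confidence range clause to a search term.

    Hypothetical helper; only the Confidence field name is taken
    from the example query in the text.
    """
    if not 0 <= min_confidence <= 100:
        raise ValueError("min_confidence must be between 0 and 100")
    # ">" excludes the threshold itself, ">=" includes it.
    op = ">=" if inclusive else ">"
    return f"{term} AND Confidence:{op}{min_confidence:g}"


query = with_min_confidence("Violence", 80)
print(query)
```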
Search data from individual operators
Searches query the entire metadata catalog in Elasticsearch. A basic search for Violence would match videos containing “Violence” labels from content moderation, but it would also match videos with transcripts that contain the word “Violence.” You can restrict your search to only content moderation results with operator names, like this: Operator:content_moderation AND (Name:Violence AND Confidence:>80)
The following is a full list of operator names you can use to filter search queries:
- label_detection
- celebrity_detection
- content_moderation
- face_detection
- face_search
- transcribe
- key_phrases
- entities
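Because a mistyped operator name would simply match nothing, it can help to validate names against the list above before building a query. The following sketch assumes a hypothetical helper named operator_query. The operator names and the Operator, Name, and Confidence fields are taken from this post.

```python
# Operator names as listed in this post.
OPERATORS = frozenset({
    "label_detection",
    "celebrity_detection",
    "content_moderation",
    "face_detection",
    "face_search",
    "transcribe",
    "key_phrases",
    "entities",
})


def operator_query(operator: str, name: str, min_confidence=None) -> str:
    """Build a query restricted to one operator (hypothetical helper)."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    clause = f"Name:{name}"
    if min_confidence is not None:
        # Group the name and confidence clauses so they apply together.
        clause = f"({clause} AND Confidence:>{min_confidence:g})"
    return f"Operator:{operator} AND {clause}"


query = operator_query("content_moderation", "Violence", 80)
print(query)
```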
Search related concepts across multiple operators
As an example of a compound search that uses multiple operator names, this query returns “Violence” identified by content moderation and “Gun” or “Weapon” identified by label detection: (Operator:content_moderation AND Name:Violence AND Confidence:>80) OR (Operator:label_detection AND (Name:Gun OR Name:Weapon))
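Compound queries like this are easier to get right when the parentheses are generated rather than typed by hand. The sketch below uses two hypothetical helpers, all_of and any_of, to rebuild the same query string from its parts. The field names and values are the ones used in the query above.

```python
def all_of(*clauses: str) -> str:
    """Join clauses with AND and wrap the group in parentheses."""
    return "(" + " AND ".join(clauses) + ")"


def any_of(*clauses: str) -> str:
    """Join clauses with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


# Content moderation results labeled Violence with confidence above 80.
violence = all_of("Operator:content_moderation", "Name:Violence", "Confidence:>80")
# Label detection results named either Gun or Weapon.
weapons = all_of("Operator:label_detection", any_of("Name:Gun", "Name:Weapon"))

# Either group is enough for a video to match.
query = " OR ".join([violence, weapons])
print(query)
```

Wrapping every group in its own parentheses keeps the precedence of AND and OR explicit, so the query does not depend on the parser's default operator handling.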
Takeaway
The AWS Content Analysis solution is now generally available. It is designed to help organizations that are currently challenged with maintaining large video collections leverage the power of search for video retrieval. This solution can also help individuals test-drive AI services from AWS with their own video content to better understand the scenarios for which these services can be applied.
For more information about AWS Content Analysis, visit the solution page.