Important: Starting on August 30, 2023, Content Analysis on AWS will no longer be supported and the GitHub repository will be archived. Existing deployments will continue to run. If you have deployed Content Analysis on AWS by cloning the open source code from GitHub, you may continue to use the solution.
The functionality provided by Content Analysis on AWS will be superseded by functionality in Media2Cloud on AWS and Content Localization on AWS. We encourage you to explore these solutions.
Overview

The Content Analysis on AWS solution helps you perform automated video content analysis using a serverless application model, generating meaningful insights from machine learning (ML) generated metadata. This solution provides access to a variety of AWS AI services that you can apply to your media libraries, then use the resulting insights and metadata to automate manual processes. The solution includes a web-based user interface for uploading and searching your video libraries.
The Content Analysis on AWS solution combines Amazon Rekognition, Amazon Transcribe, Amazon Translate, and Amazon Comprehend to offer a comprehensive suite of capabilities for analyzing a customer’s video content. The solution is a tailored application built on the Media Insights on AWS development framework.
The solution provides a single application for applying multiple machine learning services, making it easier for customers to get started with those services. It also automates manual processes such as metadata generation, and lets you search metadata from multiple machine learning services in a single location.
Benefits

Get highly accurate object, scene, and activity detection; person identification and pathing; and celebrity recognition in videos.
Upload, analyze, and browse video collections immediately using a simple web-based user interface.
Media Insights on AWS provides a framework that makes it easier for developers to build applications that transform or analyze videos on AWS.
Automate metadata generation and other manual processes using a single application. Dramatically reduce the human involvement needed to catalog video archives for search.
Technical details

The diagram below presents the serverless architecture flow you can automatically deploy using the solution's implementation guide and accompanying AWS CloudFormation template.
Step 1
An Amazon CloudFront distribution to serve the static Content Analysis web application.
Step 2
An Amazon Simple Storage Service (Amazon S3) web source bucket for hosting the static web application.
Step 3
An Amazon Cognito user pool to provide a user directory.
Step 4
An Amazon Cognito identity pool to provide federation with AWS Identity and Access Management (IAM) for authentication and authorization to the web UI.
Step 5
An Amazon API Gateway REST API for the control plane to proxy file uploads and orchestrate workflow operations from the web UI to Amazon S3 and AWS Step Functions. AWS IAM roles are created for the API to operate.
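For illustration, a workflow run could be started programmatically by signing a request to this API with IAM credentials. The sketch below is only a rough example: the endpoint URL, resource path, workflow name, and request payload are assumptions, not the solution's documented API.

```python
# Illustrative sketch: start a workflow through the control plane REST API.
# The endpoint URL, resource path, workflow name, and payload shape are hypothetical.
import json

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

REGION = "us-east-1"
ENDPOINT = "https://abc123.execute-api.us-east-1.amazonaws.com/api/workflow/execution"  # hypothetical

payload = json.dumps({
    "Name": "ContentAnalysisWorkflow",  # hypothetical workflow name
    "Input": {"Media": {"Video": {"S3Bucket": "dataplane-bucket", "S3Key": "uploads/sample.mp4"}}},
})

# Sign the request with the caller's IAM credentials (SigV4 for API Gateway).
credentials = boto3.Session().get_credentials()
aws_request = AWSRequest(method="POST", url=ENDPOINT, data=payload,
                         headers={"Content-Type": "application/json"})
SigV4Auth(credentials, "execute-api", REGION).add_auth(aws_request)

response = requests.post(ENDPOINT, data=payload, headers=dict(aws_request.headers))
print(response.status_code, response.json())
```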
Step 6
An AWS Lambda API handler function to support the control plane REST API.
Step 7
Amazon DynamoDB tables to store system parameters, workflow definitions, workflow status, workflow execution history and other workflow-related data.
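As a rough illustration of how a workflow record might be read back from one of these tables, the following sketch uses the AWS SDK for Python (boto3); the table name and key attribute are assumptions rather than the solution's actual schema.

```python
# Illustrative sketch: read a workflow execution record from DynamoDB.
# The table name ("WorkflowExecutionTable") and key attribute ("Id") are assumptions.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("WorkflowExecutionTable")

def get_workflow_status(execution_id: str) -> str:
    """Return the stored status (for example Queued, Started, or Complete) for one execution."""
    item = table.get_item(Key={"Id": execution_id}).get("Item", {})
    return item.get("Status", "UNKNOWN")

print(get_workflow_status("example-execution-id"))
```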
Step 8
Amazon Simple Queue Service (Amazon SQS) resources to limit the total number of concurrently running workflows to a configurable maximum.
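In principle, this kind of throttling works by leaving new workflow requests on a queue and dequeuing them only while the number of running workflows is below the configured limit, as in the hedged sketch below; the queue URL, limit, and running-workflow lookup are assumptions for illustration.

```python
# Illustrative sketch: drain queued workflow requests only while capacity remains.
# The queue URL, MAX_CONCURRENT value, and count_running_workflows() are assumptions.
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/111122223333/workflow-queue"  # hypothetical
MAX_CONCURRENT = 5  # configurable maximum


def count_running_workflows() -> int:
    """Placeholder for a lookup of currently running workflows (for example, from DynamoDB)."""
    return 0


def drain_queue():
    while count_running_workflows() < MAX_CONCURRENT:
        messages = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1).get("Messages", [])
        if not messages:
            break
        request = json.loads(messages[0]["Body"])
        print("Would start workflow:", request)  # hand off to Step Functions here
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=messages[0]["ReceiptHandle"])
```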
Step 9
A Lambda function for checking and recording the run status of workflows in DynamoDB.
Step 10
Two AWS Step Functions workflows consisting of Lambda functions that run media analysis jobs in Amazon Rekognition, Amazon Transcribe, Amazon Translate, AWS Elemental MediaConvert, and Amazon Comprehend. These Lambda functions also interact with the data plane to store and retrieve media objects and metadata returned by media analysis jobs.
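To give a sense of what one of these operator functions does, the sketch below starts an asynchronous Amazon Rekognition label detection job against a video in Amazon S3 and polls for the results; the bucket and key names are assumptions, and a deployed operator would typically check job status in a separate invocation rather than block.

```python
# Illustrative sketch: start an asynchronous Rekognition label detection job for a video
# stored in S3 and poll until it finishes. The bucket and key names are assumptions.
import time

import boto3

rekognition = boto3.client("rekognition")

job = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "dataplane-bucket", "Name": "uploads/sample.mp4"}}
)
job_id = job["JobId"]

# Poll for completion (a real operator would usually re-check status in a later invocation).
while True:
    result = rekognition.get_label_detection(JobId=job_id)
    if result["JobStatus"] in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(10)

for detection in result.get("Labels", [])[:10]:
    print(detection["Timestamp"], detection["Label"]["Name"])
```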
Step 11
An API Gateway REST API for CRUD functionality in the data plane.
Step 12
A Lambda API handler function to support the data plane REST API.
Step 13
A DynamoDB table to record relationships between metadata, media objects, and user-specified media files.
Step 14
An Amazon S3 bucket to store uploaded video files, derived metadata results, and derived media objects like thumbnails, audio files, and transcoded video files.
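As a simple illustration, a source video could be placed in such a bucket with the AWS SDK; the bucket name and key prefix below are assumptions, and in the deployed solution uploads normally flow through the web UI and control plane API.

```python
# Illustrative sketch: upload a source video to the data plane bucket.
# The bucket name and key prefix are assumptions.
import boto3

s3 = boto3.client("s3")
s3.upload_file("sample.mp4", "dataplane-bucket", "uploads/sample.mp4")

# Derived artifacts (thumbnails, audio files, transcodes, metadata JSON) later appear
# under solution-managed prefixes in the same bucket.
for obj in s3.list_objects_v2(Bucket="dataplane-bucket", Prefix="uploads/").get("Contents", []):
    print(obj["Key"])
```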
Step 15
Amazon Kinesis Data Streams resources to provide an interface for Amazon OpenSearch Service to access media metadata via a change data capture stream that reflects CRUD operations to the DynamoDB table.
Step 16
A Lambda function to extract, transform, and load media metadata from the DynamoDB table into an Amazon OpenSearch Service cluster.
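The sketch below outlines the general shape of such a function: it decodes change records from the stream and indexes the extracted metadata documents into OpenSearch. The domain endpoint, index name, and record layout are assumptions, and the example relies on the opensearch-py and requests-aws4auth packages.

```python
# Illustrative sketch of a stream-to-OpenSearch loader. The domain endpoint, index name,
# and record layout are assumptions; opensearch-py and requests-aws4auth are required.
import base64
import json

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

REGION = "us-east-1"
HOST = "search-content-analysis-example.us-east-1.es.amazonaws.com"  # hypothetical endpoint

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, REGION, "es",
                   session_token=credentials.token)

client = OpenSearch(hosts=[{"host": HOST, "port": 443}], http_auth=awsauth,
                    use_ssl=True, verify_certs=True, connection_class=RequestsHttpConnection)


def handler(event, context):
    """Decode Kinesis records carrying metadata changes and index them into OpenSearch."""
    for record in event.get("Records", []):
        document = json.loads(base64.b64decode(record["kinesis"]["data"]))
        client.index(index="media-metadata", id=document.get("AssetId"), body=document)
```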
Step 17
An Amazon OpenSearch Service cluster to index media metadata.
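Once metadata is indexed, a free-text search like the one behind the web UI's search box can be expressed as a simple query. The sketch below reuses the same assumed endpoint and index name as the loader sketch above.

```python
# Illustrative sketch: free-text search over indexed media metadata.
# The endpoint and index name are the same assumptions used in the loader sketch above.
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

REGION = "us-east-1"
HOST = "search-content-analysis-example.us-east-1.es.amazonaws.com"  # hypothetical endpoint

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, REGION, "es",
                   session_token=credentials.token)
client = OpenSearch(hosts=[{"host": HOST, "port": 443}], http_auth=awsauth,
                    use_ssl=True, verify_certs=True, connection_class=RequestsHttpConnection)

# Find assets whose metadata mentions "basketball" anywhere.
results = client.search(index="media-metadata",
                        body={"query": {"query_string": {"query": "basketball"}}})
for hit in results["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```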
Related content

We’re pleased to announce the availability of AWS Media Intelligence (AWS MI) solutions, a combination of services that empower you to easily integrate AI into your media content workflows.
This course provides an explanation of how to create a workflow to automate the generation of captions, alternate language subtitles, and alternate language audio tracks using Amazon’s AI Services: Amazon Transcribe, Amazon Translate, and Amazon Polly.
After taking this set of courses, you’ll understand how Artificial Intelligence (AI) led to Machine Learning (ML), which then led to Deep Learning (DL).