Overview

This solution creates subtitles for your video-on-demand (VOD) content. Localized video with high-quality transcriptions, subtitles, and translations can extend the reach of video content to new audiences and help all viewers understand it. However, producing accurate, multi-language subtitles for video is both complex and labor-intensive: teams often spend many hours transcribing, subtitling, translating, and reviewing media assets. Using AWS artificial intelligence (AI) services to assist in the subtitle creation process reduces this manual effort.
Benefits

Upload and analyze videos, and work with automatically generated video subtitles using a simple web-based user interface.
Automatically extract valuable metadata from video files using Amazon Rekognition, Amazon Transcribe, Amazon Translate, and Amazon Comprehend.
Review subtitles and make corrections within the application. Once you are satisfied with the subtitles, rerun the workflow using the corrected input to regenerate downstream results.
Use the application to generate Amazon Transcribe custom vocabularies and Amazon Translate custom terminologies using the corrections you make to the subtitles. Provide these customizations when you upload a video and configure the automated workflow.
Media Insights on AWS is a framework that makes it easier for developers to build serverless applications that process video, images, audio, and text with AI and multimedia services on AWS.
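As an illustration of the custom terminology benefit described above, the following minimal sketch shows how subtitle corrections could be turned into a terminology file. It assumes the two-column CSV layout used for Amazon Translate custom terminologies (a header row of language codes followed by one row per term pair). The function name `build_terminology_csv`, the shape of the `corrections` list, and the sample terms are hypothetical and are not taken from the solution's code.

```python
import csv
import io


def build_terminology_csv(corrections, source_lang="en", target_lang="es"):
    """Render (source_term, corrected_translation) pairs as terminology CSV text.

    The first row holds the language codes; each later row maps one source
    term to its preferred translation. A later correction for the same
    source term overrides an earlier one.
    """
    latest = {}
    for source_term, target_term in corrections:
        source_term, target_term = source_term.strip(), target_term.strip()
        if source_term and target_term:
            latest[source_term] = target_term

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    for source_term, target_term in latest.items():
        writer.writerow([source_term, target_term])
    return buffer.getvalue()


# Hypothetical corrections collected while reviewing Spanish subtitles.
corrections = [
    ("Media Insights", "Media Insights"),  # keep the product name untranslated
    ("data plane", "plano de datos"),
    ("data plane", "plano de datos "),  # duplicate with stray whitespace
]
terminology_csv = build_terminology_csv(corrections)
print(terminology_csv)
```

In this sketch the resulting text would still need to be uploaded as a custom terminology before a workflow could use it; that upload step is intentionally left out.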
Technical details

This architecture depends on the Media Insights on AWS development framework, which must be present in the AWS account where you deploy the solution. You can deploy Media Insights on AWS separately or, optionally, together with this solution.
The diagram below presents the serverless architecture you can automatically deploy using the solution's implementation guide and accompanying AWS CloudFormation template.
Step 1
The AWS CloudFormation template deploys an instance of the Media Insights on AWS solution.
Step 2
An Amazon CloudFront distribution serves the solution’s web application.
Step 3
An Amazon Simple Storage Service (Amazon S3) web source bucket hosts the static web application.
Step 4
An Amazon Cognito user pool provides a user directory.
Step 5
An Amazon Cognito identity pool provides federation with AWS Identity and Access Management (IAM) for authentication and authorization to the web application.
Step 6
Amazon API Gateway provides endpoints for the Media Insights on AWS workflow API, the Media Insights on AWS data plane API, and the Amazon OpenSearch Service API.
Step 7
Media Insights on AWS creates an AWS Step Functions workflow. This content localization workflow consists of AWS Lambda functions that run jobs in Amazon Transcribe, Amazon Translate, AWS Elemental MediaConvert, and Amazon Polly. These Lambda functions also interact with the Media Insights on AWS data plane API to store and retrieve media objects and the metadata returned by media analysis jobs. Optionally, the workflow can also run Amazon Rekognition and Amazon Comprehend to provide additional analysis of the input.
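To make the subtitle output of this workflow more concrete, the following minimal sketch renders timed text segments as a WebVTT document. The source does not state which subtitle format the workflow produces, so WebVTT is an assumption here. The segment shape (`start`, `end`, `text`) and both function names are also hypothetical and do not represent the solution's actual Lambda code.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_webvtt(segments):
    """Render timed text segments as a WebVTT document."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for segment in segments:
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"{start} --> {end}")
        lines.append(segment["text"].strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical segments, as a transcription or translation step might yield.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome to the demo."},
    {"start": 2.5, "end": 61.04, "text": "Let's get started."},
]
vtt = segments_to_webvtt(segments)
print(vtt)
```

Keeping the rendering step separate from transcription and translation means that, in a design like this, corrected text could be re-rendered without rerunning the earlier jobs.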
Step 8
A Lambda function extracts, transforms, and loads media metadata from the Media Insights on AWS data pipeline into an Amazon OpenSearch Service cluster. The Media Insights on AWS data plane DynamoDB stream invokes this function whenever asset metadata is modified in the data plane.
Step 9
An Amazon OpenSearch Service cluster indexes media metadata.
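The extract, transform, and load flow in Steps 8 and 9 can be sketched as follows. This is a simplified illustration, not the solution's Lambda function: it converts DynamoDB stream records from their typed attribute format into plain documents and returns a list of index or delete actions instead of calling a cluster. The key name `AssetId`, the sample attributes, and the action dictionaries are assumptions made for this example.

```python
from decimal import Decimal


def from_dynamodb(value):
    """Convert one DynamoDB attribute value (e.g. {"S": "x"}) to plain Python."""
    type_tag, raw = next(iter(value.items()))
    if type_tag == "S":
        return raw
    if type_tag == "N":  # numbers arrive as strings
        number = Decimal(raw)
        return int(number) if number == number.to_integral_value() else float(number)
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {key: from_dynamodb(item) for key, item in raw.items()}
    if type_tag == "L":
        return [from_dynamodb(item) for item in raw]
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def stream_to_index_actions(event, key_name="AssetId"):
    """Translate DynamoDB stream records into index or delete actions."""
    actions = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = from_dynamodb(change["Keys"][key_name])
        if record["eventName"] == "REMOVE":
            actions.append({"op": "delete", "id": doc_id})
        else:  # INSERT or MODIFY; NewImage requires a stream view that includes it
            image = change.get("NewImage", {})
            document = {name: from_dynamodb(attr) for name, attr in image.items()}
            actions.append({"op": "index", "id": doc_id, "doc": document})
    return actions


def handler(event, context):
    """Lambda-style entry point; a real function would send actions to the cluster."""
    return stream_to_index_actions(event)


# Hypothetical stream event with one modified asset and one removed asset.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"AssetId": {"S": "asset-123"}},
                "NewImage": {
                    "AssetId": {"S": "asset-123"},
                    "Language": {"S": "en"},
                    "DurationSeconds": {"N": "61.5"},
                },
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"AssetId": {"S": "asset-456"}}}},
    ]
}
actions = handler(sample_event, None)
print(actions)
```

Returning actions rather than writing to the cluster directly keeps the transformation logic testable without network access.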