This Guidance demonstrates how to accelerate your content analysis workflows by automating video metadata extraction, intelligence gathering, and content moderation. It enables you to efficiently process large volumes of video content, extract valuable insights, and make data-driven decisions through customizable policy evaluations. By automating these traditionally manual tasks, you can reduce operational costs, improve accuracy, and scale your content analysis capabilities while maintaining secure and reliable operations.
Please note: [Disclaimer]
Architecture Diagram

-
Extraction of generic metadata
This architecture diagram shows how to use generative AI to extract generic metadata from videos and demonstrates dynamic policy evaluation against that metadata.
Step 1
Media analysts access the front-end static website through an Amazon CloudFront distribution. The static content is hosted on Amazon Simple Storage Service (Amazon S3).
Step 2
Users log in to the front-end web application, authenticated by an Amazon Cognito user pool.
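As an illustration only, the sketch below shows one way a client could authenticate against an Amazon Cognito user pool with boto3 and obtain tokens for later API calls; the region, app client ID, and credentials are placeholders, and the deployed web application may instead use a hosted OAuth flow.

```python
# Hypothetical sketch: authenticate a user against the Amazon Cognito user pool
# and obtain JWT tokens for subsequent API Gateway calls. The client ID and
# credentials are placeholders; USER_PASSWORD_AUTH must be enabled on the app client.
import boto3

cognito = boto3.client("cognito-idp", region_name="us-east-1")

response = cognito.initiate_auth(
    ClientId="YOUR_APP_CLIENT_ID",
    AuthFlow="USER_PASSWORD_AUTH",
    AuthParameters={"USERNAME": "analyst@example.com", "PASSWORD": "********"},
)

tokens = response["AuthenticationResult"]
id_token = tokens["IdToken"]        # sent as the Authorization header to API Gateway
access_token = tokens["AccessToken"]
```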
Step 3
Users upload video(s) to Amazon S3 directly from the browser using multi-part, pre-signed Amazon S3 URLs managed by the UI application.
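The following sketch outlines how pre-signed multipart upload URLs could be produced with boto3, in the spirit of what the UI application manages behind the scenes; the bucket, key, and part count are assumptions.

```python
# Hypothetical sketch: create a multipart upload and generate one pre-signed URL
# per part so the browser can PUT each part directly to Amazon S3.
# Bucket and key names are placeholders.
import boto3

s3 = boto3.client("s3")
bucket, key = "video-upload-bucket", "uploads/sample-video.mp4"

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
upload_id = mpu["UploadId"]

part_urls = [
    s3.generate_presigned_url(
        "upload_part",
        Params={"Bucket": bucket, "Key": key, "UploadId": upload_id, "PartNumber": n},
        ExpiresIn=3600,
    )
    for n in range(1, 4)  # e.g. a 3-part upload
]

# After the browser uploads the parts, the upload is completed with the returned ETags:
# s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
#                              MultipartUpload={"Parts": [...]})
```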
Step 4
The front-end UI interacts with the extraction service (a microservice) through a RESTful interface provided by Amazon API Gateway. This interface offers Create, Read, Update, Delete (CRUD) features for managing video extraction tasks. The extraction service can be deployed and used independently of the other components.
Step 5
An AWS Step Functions state machine orchestrates the analysis process. It transcribes audio using Amazon Transcribe, samples image frames from the video using moviepy, uses multimodal models on Amazon Bedrock to analyze the frames, and uses Amazon Rekognition for additional insights. It also generates text and multimodal embeddings at the frame level. Users can customize the logic in this Guidance to integrate their preferred generative AI models.
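As a rough sketch of the frame-sampling and embedding stage, the code below samples one frame every few seconds with moviepy and generates a multimodal embedding through Amazon Bedrock; the sampling interval and the choice of the Titan Multimodal Embeddings model are assumptions, since the Guidance lets you plug in your preferred models.

```python
# Hypothetical sketch: sample frames with moviepy and embed each frame with an
# Amazon Bedrock multimodal embedding model (Titan Multimodal Embeddings assumed here).
import base64
import io
import json

import boto3
from moviepy.editor import VideoFileClip  # moviepy 1.x import path
from PIL import Image

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
clip = VideoFileClip("sample-video.mp4")

frame_embeddings = []
for t in range(0, int(clip.duration), 5):  # sample one frame every 5 seconds (assumed interval)
    frame = Image.fromarray(clip.get_frame(t))
    buf = io.BytesIO()
    frame.save(buf, format="JPEG")
    image_b64 = base64.b64encode(buf.getvalue()).decode("utf-8")

    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",
        body=json.dumps({"inputImage": image_b64}),
    )
    embedding = json.loads(response["body"].read())["embedding"]
    frame_embeddings.append((t, embedding))
```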
Step 6
Amazon DynamoDB stores media processing task metadata and extracted video information in text format. An Amazon OpenSearch Service cluster stores vector embeddings and supports search and discovery.
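The sketch below illustrates how frame-level embeddings might be stored in an OpenSearch Service k-NN index using opensearch-py; the domain endpoint, index name, field names, and vector dimension are placeholders, and authentication is omitted for brevity.

```python
# Hypothetical sketch: create a k-NN vector index and store a frame-level embedding
# in Amazon OpenSearch Service. Endpoint, index, fields, and dimension are placeholders;
# a real domain would also require SigV4 or basic authentication.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "my-opensearch-domain", "port": 443}], use_ssl=True)

index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "task_id": {"type": "keyword"},
            "timestamp_sec": {"type": "integer"},
            "frame_embedding": {"type": "knn_vector", "dimension": 1024},
        }
    },
}
client.indices.create(index="video-frames", body=index_body)

client.index(
    index="video-frames",
    body={"task_id": "task-123", "timestamp_sec": 5, "frame_embedding": [0.01] * 1024},
)
```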
Step 7
Using the solution UI, users select and customize existing prompt templates, then initiate a policy evaluation that uses Amazon Bedrock large language models (LLMs) against the extracted video metadata.
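For illustration, a policy evaluation call could look like the following sketch, which sends a customized prompt plus the extracted metadata to an Amazon Bedrock LLM through the Converse API; the model ID and prompt wording are assumptions rather than the solution's actual templates.

```python
# Hypothetical sketch: evaluate extracted video metadata against a customizable
# policy prompt with an Amazon Bedrock LLM. Model ID and prompt are illustrative.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

policy_prompt = (
    "You are a content moderation reviewer. Given the video transcript and "
    "frame-level labels below, decide whether the content violates the policy: "
    "'No depiction of tobacco products.' Answer with a verdict and a short rationale.\n\n"
    "Transcript: {transcript}\nFrame labels: {labels}"
)

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": policy_prompt.format(transcript="...", labels="...")}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```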
-
RESTful APIs of the extraction service
This architecture diagram illustrates the key RESTful APIs of the extraction service, served through Amazon API Gateway. The UI uses these APIs to retrieve data, and users can integrate the extraction service into existing workflows.
Step 1
The /start_task endpoint serves as the core of the extraction service, managing the video metadata extraction process and maintaining the results.
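A call to /start_task might look like the following sketch; the base URL, authorization header, request fields, and response field are placeholders, so consult the deployed API for the exact schema.

```python
# Hypothetical sketch: start an extraction task through the API Gateway endpoint.
# The base URL, request fields, and response shape are placeholders.
import requests

API_BASE = "https://abc123.execute-api.us-east-1.amazonaws.com/prod"  # placeholder
ID_TOKEN = "<Cognito ID token from the login step>"                   # placeholder

payload = {
    "video_s3_uri": "s3://video-upload-bucket/uploads/sample-video.mp4",
    "task_name": "product-launch-review",
    "sample_interval_seconds": 5,
}
response = requests.post(
    f"{API_BASE}/start_task",
    json=payload,
    headers={"Authorization": ID_TOKEN},
)
task_id = response.json().get("task_id")  # placeholder field name
```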
Step 2
DynamoDB stores the extracted metadata. The raw results from generative AI models are saved as JSON or text files in Amazon S3. Amazon OpenSearch Service indexes store frame-level embeddings to support search.
Step 3
The process invokes Amazon Transcribe to generate audio transcriptions, samples image frames from the video at a specified interval, and removes near-duplicate frames by generating multimodal embeddings and applying similarity comparison. For each image frame, the service applies AI or generative AI features to extract metadata. Additionally, the service generates text and multimodal embeddings for each frame to enable vector search capabilities.
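The near-duplicate removal step can be pictured with the following sketch, which keeps a frame only if its embedding differs enough from the previously kept frame; the cosine-similarity measure and threshold are assumptions, not the exact logic shipped in the Guidance.

```python
# Hypothetical sketch: drop near-duplicate frames by comparing multimodal embeddings.
# The similarity measure and threshold are assumed tuning choices.
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def deduplicate_frames(frame_embeddings, threshold=0.95):
    """Keep a frame only if it is sufficiently different from the last kept frame."""
    kept = []
    for timestamp, embedding in frame_embeddings:
        if kept and cosine_similarity(kept[-1][1], embedding) >= threshold:
            continue  # too similar to the previously kept frame; skip it
        kept.append((timestamp, embedding))
    return kept
```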
Step 4
Amazon Simple Notification Service (Amazon SNS) notifies downstream workflows of task completion.
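A completion notification could be published as in this sketch; the topic ARN and message shape are placeholders.

```python
# Hypothetical sketch: publish a task-completion message to an Amazon SNS topic
# so downstream workflows can react. Topic ARN and message fields are placeholders.
import json

import boto3

sns = boto3.client("sns")
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:111122223333:video-task-completed",
    Subject="Video extraction task completed",
    Message=json.dumps({"task_id": "task-123", "status": "COMPLETED"}),
)
```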
Step 5
The /get_task endpoint retrieves video task information using a unique task ID. The data is fetched from the DynamoDB tables.
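A /get_task lookup might resemble the following sketch, mirroring the placeholders used in the /start_task example above.

```python
# Hypothetical sketch: fetch a task's state by its ID through /get_task.
# The base URL, token, and parameter name are placeholders.
import requests

API_BASE = "https://abc123.execute-api.us-east-1.amazonaws.com/prod"  # placeholder
ID_TOKEN = "<Cognito ID token from the login step>"                   # placeholder

response = requests.get(
    f"{API_BASE}/get_task",
    params={"task_id": "task-123"},
    headers={"Authorization": ID_TOKEN},
)
print(response.json())
```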
Step 6
The /delete_task endpoint deletes video tasks using a unique task ID. It deletes all task-related state from the DynamoDB tables, Amazon S3, and the OpenSearch Service indexes.
Step 7
The /search_task endpoint searches for tasks matching the provided criteria. It supports keyword searches against the task name and description stored in DynamoDB, as well as semantic and multimodal embedding searches using the OpenSearch Service vector index.
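A semantic search against the vector index could look like this sketch; the domain endpoint, index and field names, and vector dimension are placeholders, and authentication is omitted.

```python
# Hypothetical sketch: run a k-NN (semantic) search against the frame-level vector
# index in Amazon OpenSearch Service. Endpoint, index, and field names are placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "my-opensearch-domain", "port": 443}], use_ssl=True)

# In practice the query embedding would be produced by the same Bedrock embedding
# model used during extraction; a dummy vector stands in here.
query_embedding = [0.01] * 1024

query = {
    "size": 5,
    "query": {"knn": {"frame_embedding": {"vector": query_embedding, "k": 5}}},
}
hits = client.search(index="video-frames", body=query)["hits"]["hits"]
for hit in hits:
    print(hit["_source"]["task_id"], hit["_source"]["timestamp_sec"], hit["_score"])
```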
Get Started

Deploy this Guidance
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
Amazon CloudWatch provides logging and insights for the services running in AWS Lambda and Step Functions. This Guidance pushes metrics to CloudWatch at various stages to provide observability into the infrastructure, such as Lambda functions, AI/ML services, and S3 buckets.
-
Security
This Guidance implements least-privilege AWS Identity and Access Management policies and encrypts S3 data using AWS Key Management Service (AWS KMS) keys. User authentication is handled through Amazon Cognito using OAuth patterns for both web application login and API Gateway calls. The OpenSearch service cluster is deployed in an Amazon Virtual Private Cloud (Amazon VPC) private subnet, accessible only to authorized Lambda functions.
-
Reliability
Amazon S3 provides robust data management through version control, deletion prevention, and cross-region replication capabilities. Serverless services like API Gateway, Lambda, Step Functions, and Amazon Simple Queue Service (Amazon SQS) offer built-in scalability and high availability. The OpenSearch Service deployment supports high availability through multiple Availability Zones, featuring redundant data nodes with replicated shards, helping ensure data persistence and recovery capabilities.
-
Performance Efficiency
Lambda and Step Functions enable efficient parallel processing through concurrent execution of functions and workflow steps. This parallel processing capability improves overall throughput and reduces execution time. The serverless architecture automatically handles the complexity of scaling workloads on AWS for optimal performance for media processing tasks.
-
Cost Optimization
Amazon S3 storage classes and lifecycle policies optimize video storage costs, while serverless and AI/ML services operate on a pay-as-you-go model, so you only pay for the services you use. The event-driven architecture helps ensure charges apply only to resources actually consumed, letting you configure and tailor your media workflows cost-effectively while using S3 lifecycle policies to store and archive ingested content, proxies, and metadata.
-
Sustainability
AWS serverless services and AI/ML components allocate compute resources dynamically based on demand, eliminating over-provisioning and reducing resource waste. This approach minimizes energy consumption compared to traditional on-premises servers, while maximizing the efficiency of AWS AI services to reduce the environmental impact of backend operations.
Related Content

Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.