TwelveLabs’ Marengo Embed 3.0 for advanced video understanding now in Amazon Bedrock
TwelveLabs’ Marengo Embed 3.0 is now available on Amazon Bedrock, bringing advanced video-native multimodal embedding capabilities to developers and organizations working with video content. Marengo embedding models unify videos, images, audio, and text into a single representation space, enabling you to build sophisticated applications for video search, content analysis, any-to-any retrieval, recommendation systems, and other multimodal tasks with industry-leading performance.
Marengo 3.0 delivers several key enhancements:

- Extended video processing capacity: process up to 4 hours of video and audio content and files up to 6 GB, double the capacity of previous versions, making it ideal for analyzing full sporting events, extended training videos, and complete film productions.
- Enhanced sports analysis: the model delivers significantly better understanding of gameplay dynamics, player movements, and event detection.
- Global multilingual support: expanded language coverage from 12 to 36 languages, enabling global organizations to build unified search and retrieval systems that work seamlessly across diverse regions and markets.
- Multimodal search precision: combine an image and descriptive text in a single embedding request, merging visual similarity with semantic understanding to deliver more accurate, contextually relevant search results, as sketched below.
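For example, a combined image-and-text request can be issued through the Bedrock Runtime InvokeModel API. The following Python sketch uses boto3 and is illustrative only: the model ID and the request and response field names are assumptions modeled on earlier Marengo releases, so confirm the exact 3.0 schema in the Bedrock documentation.

```python
import base64
import json

import boto3

# Bedrock Runtime client in a Region where the model is available.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Assumed model ID for Marengo Embed 3.0; confirm in the Bedrock console.
MODEL_ID = "twelvelabs.marengo-embed-3-0-v1:0"

# Encode a local image to send inline with the request.
with open("product-photo.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Hypothetical combined image-plus-text payload; the field names follow
# earlier Marengo request schemas and may differ in the published 3.0 schema.
body = {
    "inputType": "image",
    "mediaSource": {"base64String": image_b64},
    "inputText": "red trail-running shoe on a rocky mountain path",
}

response = bedrock.invoke_model(modelId=MODEL_ID, body=json.dumps(body))
result = json.loads(response["body"].read())
print(result)  # embedding payload; exact response shape per the 3.0 docs
```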
AWS is the first cloud provider to offer TwelveLabs’ Marengo 3.0 model, now available in US East (N. Virginia), Europe (Ireland), and Asia Pacific (Seoul). The model supports synchronous inference for low-latency text and image embeddings, and asynchronous inference for processing video, audio, and large-scale image files. To get started, visit the Amazon Bedrock console. To learn more, read the product page and documentation.
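For long-form content, the asynchronous flow writes embeddings to an S3 location you specify. Below is a minimal boto3 sketch using the Bedrock Runtime StartAsyncInvoke and GetAsyncInvoke APIs; as above, the model ID and the modelInput field names are assumptions based on earlier Marengo versions, not the confirmed 3.0 schema, and the bucket names are placeholders.

```python
import time

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Assumed model ID for Marengo Embed 3.0; confirm in the Bedrock console.
MODEL_ID = "twelvelabs.marengo-embed-3-0-v1:0"

# Start an asynchronous embedding job for a video stored in S3.
# The modelInput field names follow earlier Marengo releases and are
# assumptions here; check the Bedrock docs for the published 3.0 schema.
job = bedrock.start_async_invoke(
    modelId=MODEL_ID,
    modelInput={
        "inputType": "video",
        "mediaSource": {
            "s3Location": {"uri": "s3://amzn-s3-demo-bucket/full-game.mp4"}
        },
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/embeddings/"}
    },
)

# Poll until the job finishes; results land in the output S3 prefix.
invocation_arn = job["invocationArn"]
status = "InProgress"
while status == "InProgress":
    time.sleep(30)
    status = bedrock.get_async_invoke(invocationArn=invocation_arn)["status"]
print(status)  # "Completed" on success; embeddings are in the S3 prefix
```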