- By Twelve Labs
Twelve Labs' multimodal foundation models create vector embeddings that enable downstream applications. Our Marengo model understands video natively and can identify and interpret movements, actions, objects, individuals, sounds, on-screen text, and spoken words just like humans,...