AWS for M&E Blog

How to: Use Amazon Transcribe and Amazon Kendra to make your media files searchable

Amazon Machine Learning services help you find answers and extract valuable insights from the content of your audio, video, and text files. The ability to extract data from various types of media is becoming even more important as customer demand for content of all types grows and organizations use different content types to engage with their audiences. For example, product documentation is often published as video instead of text, and podcasts are available in place of blog posts. Virtual workplaces have also led to more audio and video sharing through recorded meetings, calls, and voicemails.

MediaSearch, a new open-source solution, is built to make your media files searchable and consumable in search results. MediaSearch uses Amazon Transcribe to convert media audio tracks to text, and Amazon Kendra to provide intelligent search. Users can find the content they’re looking for, even when it’s embedded in the soundtrack of your audio or video files. The solution also provides an enhanced Amazon Kendra query application that lets users play the relevant section of the original media files directly from the search results page.
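To make the "play the relevant section" idea concrete, here is a minimal sketch of how a search hit could be mapped back to a playback offset. It is not taken from the MediaSearch code base. It assumes a transcript in the JSON shape that Amazon Transcribe writes for a completed job (a `results.items` list in which spoken words carry `start_time` values), and the function names, sample data, and media URL are hypothetical.

```python
from typing import Optional


def find_phrase_start(transcribe_result: dict, phrase: str) -> Optional[float]:
    """Return the start time in seconds of the first occurrence of phrase, or None."""
    # Keep only spoken words; punctuation items carry no timestamps.
    words = [
        (item["alternatives"][0]["content"].lower(), float(item["start_time"]))
        for item in transcribe_result["results"]["items"]
        if item["type"] == "pronunciation"
    ]
    target = phrase.lower().split()
    if not target:
        return None
    # Slide a window over the word list and compare it with the phrase.
    for i in range(len(words) - len(target) + 1):
        window = [word for word, _ in words[i:i + len(target)]]
        if window == target:
            return words[i][1]
    return None


def playback_url(media_url: str, start_seconds: float) -> str:
    """Append a media fragment (#t=...) so a player can seek to the offset."""
    return f"{media_url}#t={start_seconds}"


# Hypothetical transcript fragment in Amazon Transcribe's output shape.
sample = {
    "results": {
        "transcripts": [{"transcript": "Welcome to the demo."}],
        "items": [
            {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
             "alternatives": [{"content": "Welcome"}]},
            {"type": "pronunciation", "start_time": "0.9", "end_time": "1.2",
             "alternatives": [{"content": "to"}]},
            {"type": "pronunciation", "start_time": "1.2", "end_time": "1.4",
             "alternatives": [{"content": "the"}]},
            {"type": "pronunciation", "start_time": "1.4", "end_time": "1.9",
             "alternatives": [{"content": "demo"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ],
    }
}

start = find_phrase_start(sample, "the demo")
if start is not None:
    print(playback_url("https://example.com/video.mp4", start))
```

A real deployment would let Amazon Kendra select the relevant passage rather than matching an exact phrase; the sketch only illustrates why word-level timestamps make it possible to jump straight to the matching moment in the media file.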

Watch the overview video and demo, read the full post, and see the GitHub repository to learn more.

Use the MediaSearch solution as a starting point for your own workflow, and help make it better by contributing back fixes and features via GitHub pull requests. For expert assistance, reach out to AWS Professional Services or work with an AWS Partner.