Posted On: Feb 17, 2020
Amazon Rekognition is a deep learning-based image and video analysis service that can identify objects, people, text, and scenes, and support content moderation by detecting unsafe content. Starting today, you can detect text in videos and get back the detection confidence, bounding box location, and timestamp for each text detection. In addition, text detection in both images and videos now provides options to filter out words by region of interest (ROI), word bounding box size, and word confidence score.
Text detection in videos can be leveraged for multiple use cases, particularly in media and entertainment applications. First, you can search for videos or video timestamps where specific keywords appear on screen, for example, 'Breaking News'. Second, for internationalization of content, you can quickly find all instances of text on a program video timeline so that it can be replaced with text in another language. Third, for compliance and moderation use cases, you can detect the presence of accidental text such as burnt-in subtitles, or flag text containing profanities and hate speech by checking words against a dictionary of blacklisted words and phrases. Lastly, you can use the bounding box location to study the impact of text size and location on a marketing campaign's performance, or to position other graphic elements correctly.
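As an illustration of the keyword search use case, the following is a minimal Python sketch that scans text detection results for a phrase and returns the timestamps where it appears. It assumes each result carries a `Timestamp` in milliseconds and a `TextDetection` entry with `DetectedText`, `Type`, and `Confidence` fields, which is how the video text detection results are understood to be shaped. The helper name `find_keyword_timestamps` and the sample data are hypothetical and made up for this example.

```python
from typing import Dict, List


def find_keyword_timestamps(text_detections: List[Dict], keyword: str) -> List[int]:
    """Return sorted, de-duplicated timestamps (ms) of LINE detections containing keyword."""
    needle = keyword.casefold()
    hits = set()
    for item in text_detections:
        detection = item.get("TextDetection", {})
        # Match on whole lines so that multi-word phrases can be found;
        # WORD entries would only ever contain one word each.
        if detection.get("Type") != "LINE":
            continue
        if needle in detection.get("DetectedText", "").casefold():
            hits.add(item["Timestamp"])
    return sorted(hits)


# Hypothetical sample results, hand-written for illustration only.
sample = [
    {"Timestamp": 0,
     "TextDetection": {"DetectedText": "Weather Update", "Type": "LINE", "Confidence": 98.1}},
    {"Timestamp": 1500,
     "TextDetection": {"DetectedText": "BREAKING NEWS", "Type": "LINE", "Confidence": 99.3}},
    {"Timestamp": 1500,
     "TextDetection": {"DetectedText": "BREAKING", "Type": "WORD", "Confidence": 99.4}},
    {"Timestamp": 3000,
     "TextDetection": {"DetectedText": "Breaking News: storm", "Type": "LINE", "Confidence": 97.0}},
]

print(find_keyword_timestamps(sample, "Breaking News"))  # [1500, 3000]
```

The same loop could be adapted for moderation by checking each detected word against a list of disallowed terms instead of a single phrase.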
Filtering by text region, size, and confidence score gives you additional flexibility to control your text detection output. By using ROIs, you can easily limit text detection to the regions that are relevant to you, for example, the bottom third of the frame for on-screen graphics or the top left corner for reading the scoreboard in a soccer game. The word bounding box size filter can be used to exclude small background text, which may be noisy or irrelevant. Lastly, the word confidence filter lets you remove results that may be unreliable because the text is blurry or smudged.
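The three filters can be combined in a single request. The following Python sketch builds a filter dictionary in the shape that the `Filters` parameter of the AWS SDK for Python (`detect_text` for images, `start_text_detection` for videos) is understood to accept, with regions and minimum box sizes expressed as ratios of the frame dimensions and confidence on a 0 to 100 scale. The helper name `build_text_filters`, the `BOTTOM_THIRD` region, and the threshold values are hypothetical choices for illustration, not recommended settings.

```python
from typing import Dict, List, Optional

_BOX_KEYS = ("Left", "Top", "Width", "Height")


def build_text_filters(
    regions: Optional[List[Dict[str, float]]] = None,
    min_confidence: Optional[float] = None,
    min_box_height: Optional[float] = None,
    min_box_width: Optional[float] = None,
) -> Dict:
    """Assemble a Filters-style dictionary from optional ROI and word filters."""
    word_filter: Dict[str, float] = {}
    if min_confidence is not None:
        if not 0 <= min_confidence <= 100:
            raise ValueError("min_confidence must be between 0 and 100")
        word_filter["MinConfidence"] = float(min_confidence)
    for key, value in (("MinBoundingBoxHeight", min_box_height),
                       ("MinBoundingBoxWidth", min_box_width)):
        if value is not None:
            if not 0 <= value <= 1:
                raise ValueError(f"{key} must be a ratio between 0 and 1")
            word_filter[key] = float(value)

    filters: Dict = {}
    if word_filter:
        filters["WordFilter"] = word_filter
    if regions:
        for region in regions:
            # Each region is a box given as ratios of the frame width and height.
            if any(not 0 <= region[k] <= 1 for k in _BOX_KEYS):
                raise ValueError("region values must be ratios between 0 and 1")
        filters["RegionsOfInterest"] = [
            {"BoundingBox": {k: float(r[k]) for k in _BOX_KEYS}} for r in regions
        ]
    return filters


# Hypothetical region covering roughly the bottom third of the frame.
BOTTOM_THIRD = {"Left": 0.0, "Top": 0.67, "Width": 1.0, "Height": 0.33}

filters = build_text_filters(
    regions=[BOTTOM_THIRD],
    min_confidence=80,     # drop words detected with low confidence
    min_box_height=0.05,   # drop words shorter than 5% of the frame height
)
print(filters)
```

The resulting dictionary would then be passed as the `Filters` argument of the text detection call; leaving an argument unset simply omits that filter from the request.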