AWS Media Blog

Guest post: Using machine learning to get the most from business video

Guest post by Steve Vonder Haar, senior analyst with Wainhouse Research covering the enterprise video industry. The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Low-hanging fruit always makes for the easiest pickings.

Count me “guilty” of this truism in my initial blog post for Amazon Web Services (Machine Learning and Corporate Videos in Today’s Business World) in which I mused about how machine learning can be employed to boost the viewership of corporate video.

Don’t chide me too much for taking the easy route. After all, video is successful only when it draws eyeballs. Leveraging machine learning to create automated subtitles paves the way for on-screen text that enriches the video experience for native speakers and opens the door to translations that make video accessible to viewers around the globe.

Machine learning solutions, however, can be employed to do much more than boost viewership. When applied well, machine learning can play a vital role in mining information and insight that would otherwise be trapped within the confines of large, unwieldy video archives.

Search is the Must-Have Feature in Video Workflows

Whether enabling more targeted searches via speech-to-text conversion or digging up relevant information via image recognition tools, machine learning can be employed to unlock insight from video in a fraction of the time that would be required to analyze video content manually.

At times, the result of blending machine learning capabilities with video can be jaw-dropping. Consider, for instance, the emergence of advanced systems that can sift through hours and hours of security camera footage, scan images of criminal suspects and leverage face recognition technologies to determine their identity and then alert on-site personnel of potential threats.

In other cases, machine learning can be used for comparatively mundane applications. Using speech-to-text conversion, for instance, it becomes easier to find specific video passages by conducting a basic search of the transcript. Simply ask for the passage where the CEO says that “everyone is getting a bonus this year,” for example, and a search of auto-generated transcriptions can point you directly to the place in the video archive where the CEO can be seen uttering the fateful phrase.
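To make the idea concrete, here is a minimal sketch of a transcript search in Python. It assumes a speech-to-text service has already produced timestamped segments; the `Segment` shape, the function names and the sample transcript are illustrative assumptions, not the output format of any particular product.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of an auto-generated transcript (hypothetical shape)."""
    start: float  # seconds from the start of the video
    end: float
    text: str


def normalize(text: str) -> str:
    # Lowercase and drop punctuation so the match tolerates formatting differences.
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def find_phrase(segments, phrase):
    """Return the segments whose text contains the phrase as whole words."""
    needle = f" {normalize(phrase)} "
    return [seg for seg in segments if needle in f" {normalize(seg.text)} "]


# Made-up transcript used only for illustration.
transcript = [
    Segment(0.0, 6.5, "Good morning and welcome to the all-hands meeting."),
    Segment(6.5, 12.0, "I'm pleased to say everyone is getting a bonus this year."),
    Segment(12.0, 18.0, "Now let's turn to the product roadmap."),
]

hits = find_phrase(transcript, "everyone is getting a bonus this year")
for seg in hits:
    print(f"Found at {seg.start:.1f}s: {seg.text}")
```

The start time of each hit is what a video player would use to jump straight to the passage. A production system would likely rely on a search index rather than a linear scan, but the principle is the same.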

Even if such a basic search may seem rudimentary when compared with the process of identifying a stranger’s face in the middle of a digital video haystack, it addresses a problem significant to many users of corporate video. An employee’s desire to “find stuff” embedded in business videos should never be underestimated.

Indeed, Wainhouse Research end-user survey results suggest that finding relevant videos is a top priority for business users. Exactly half of the 2,002 respondents participating in a fourth-quarter 2018 end-user survey fielded by Wainhouse Research describe the ability to “search content to find relevant videos” as a “very important” influence on the streaming technology purchase decision. As such, it was cited as a key influencer on streaming technology purchase decisions more often than any other content management issue, including the commonly cited issue of being able to secure content from unauthorized viewers (Figure 1).

But finding the “right” video or snippet within a video can be harder than it looks. This is particularly true as the size of video archives swells. As the pile of content grows, tracking down a relevant video passage becomes increasingly difficult and time-consuming. In fact, according to WR survey results, the level of end-user concern over the issue of “searchability of video content” increases with the size of video archives to which they have access.

As illustrated in Figure 2, almost three-quarters (72%) of those working at organizations with video archives with at least 100 hours of content describe the issue of being able to “search content to find relevant videos” as a “very important” purchase decision influence. Among those at organizations that use archives but have less than 10 hours of stored content, only 52% cite the searchability issue as “very important.” As would be expected, concerns over content searchability dip even further among those at organizations with no archived video.

Search-powered Workflow Optimization

Ultimately, the ability to search content does more than just help employees find what they are looking for. When put to work in business video realms, effective search tools also make it possible to manage content in ways that make it even more useful in day-to-day corporate settings.

Consider the case of a company that needs to eliminate references to a specific phrase from its video archives on short notice. Machine learning makes it possible for organizations to engage in a search for offending content and trigger instructions to automatically delete the scene or the video associated with the phrase.
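As a rough sketch of the first half of that workflow, the Python below turns phrase matches in a timestamped transcript into a list of time ranges to cut. The `(start, end, text)` tuple layout, the `padding` parameter and the phrase “Project Falcon” are all hypothetical; the actual deletion would be handed off to whatever editing or archive tool an organization uses.

```python
def ranges_to_remove(segments, phrase, padding=1.0):
    """Return merged (start, end) time ranges, in seconds, that mention the phrase.

    segments: iterable of (start, end, text) tuples from a transcript.
    padding:  extra seconds trimmed on each side of a match (an assumption).
    """
    needle = phrase.lower()
    flagged = sorted(
        (max(0.0, start - padding), end + padding)
        for start, end, text in segments
        if needle in text.lower()
    )

    merged = []
    for start, end in flagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range, so extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up transcript segments used only for illustration.
segments = [
    (0.0, 5.0, "Welcome to the Project Falcon kickoff."),
    (5.0, 9.0, "Project Falcon ships in the spring."),
    (9.0, 20.0, "Here is the budget overview."),
    (40.0, 45.0, "Thanks again to the Project Falcon team."),
]

cuts = ranges_to_remove(segments, "project falcon")
print(cuts)
```

Merging overlapping ranges keeps the cut list short, so two back-to-back mentions become a single edit rather than two.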

Similarly, machine learning can be used to identify and re-package content based on tags automatically generated for video content. Using facial recognition, for instance, machine learning systems can simplify the process of pulling together all presentations or in-meeting statements made by a particular individual during a specified time period.
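Once tags exist, the re-packaging step is largely a filtering exercise. The sketch below assumes a face recognition step has already attached a set of recognized people to each clip; the dictionary keys, clip titles and role labels are invented for illustration.

```python
from datetime import date


def clips_for_person(clips, person, start, end):
    """Return clips tagged with `person` and recorded between start and end, oldest first."""
    matches = [
        clip for clip in clips
        if person in clip["people"] and start <= clip["recorded"] <= end
    ]
    return sorted(matches, key=lambda clip: clip["recorded"])


# Made-up archive entries; "people" stands in for automatically generated tags.
clips = [
    {"title": "Q3 town hall", "recorded": date(2018, 10, 4), "people": {"CEO", "CFO"}},
    {"title": "Sales kickoff", "recorded": date(2019, 1, 15), "people": {"CEO"}},
    {"title": "Engineering review", "recorded": date(2018, 11, 20), "people": {"CTO"}},
    {"title": "Q4 town hall", "recorded": date(2018, 12, 12), "people": {"CEO"}},
]

q4_ceo = clips_for_person(clips, "CEO", date(2018, 10, 1), date(2018, 12, 31))
for clip in q4_ceo:
    print(clip["recorded"], clip["title"])
```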

In short, video becomes nimbler with machine learning. Video content can be searched, sorted and re-packaged based on information automatically culled from the videos themselves. In the process, video becomes more than something we watch. Rather, it becomes a tool for capturing the conversations and input that go into making business decisions and then memorializing that data in a highly accurate manner that is easy to access.

The Future State of Evolved Video Workflows

Over time, the age of nimble video may open even more opportunities for boosting the value of this type of content in the workplace. Data derived from “traditional” video presentations, for instance, may be collected and re-purposed for use in augmented reality or virtual reality settings.

Consider the example in which you would record video of a tour guide in New York’s Times Square. Stories told about specific landmarks could be captured and converted to text using machine learning. One place for re-purposing the guide’s information could be an augmented reality application offering self-guided tours of Times Square. In this digital tour, the guide’s text commentary would appear on-screen to match the images from scenes framed in the device’s viewfinder.

Beyond tourism, the integration of video and machine learning could broaden the scope of information available in any business setting. On the factory floor, for instance, experienced engineers could be recorded as they address specific repair issues. Machine learning could then be used to link specific tools, repair modules and even experts to the repair of that equipment, speeding similar repairs in the future. Generalizing this example, machine learning offers a new way to streamline the process of creating, processing and sharing institutional knowledge within an organization.

It is no overstatement to say that the prospect of marrying video and machine learning holds the promise to dramatically change the information landscape over time.

Of course, business executives must regularly deal with the implications of emerging technologies. At the dawn of the Internet itself, corporations had to adapt to the accelerated flow of information enabled by common, interconnected networks.

Today, machine learning is speeding up the manner in which we extract and share information found in videos. Forward-looking organizations are beginning to realize the untapped potential associated with capturing more video and leveraging digital tools that squeeze more information and insight from that video.

The pace of change fostered by the marriage of machine learning and corporate video is only going to accelerate during the next several years. Just as growing awareness of the “World Wide Web” in the mid-1990s ultimately led to remarkable advances in information sharing, the rise of nimble, smart video holds the potential to revolutionize how corporate information is captured and processed.

Ignore this trend at your own peril. Dismissing the implications of blending machine learning and video today would be as unwise as ignoring the impact of the pioneering Netscape web browser in 1995. We may not know the exact destination of this journey, but it’s going to be quite a ride.

AWS Editorial note: for information on AWS’ MediaAnalysis solution which helps customers process, analyze and extract meaningful metadata from their audio, image, and video files in a simple web-based user interface, click here.

John Lai

John Lai is an Industry Solutions Marketing Manager for AWS Elemental