AWS for M&E Blog

Delivering low-latency captions and voice translation for live sports, news, and OTT platforms with SyncWords and AWS

This post was co-authored by Giovanni Galvez, VP of Business Development and Strategy, SyncWords.

Customers have asked us how best to enable low-latency captions, especially for live sports and news, while taking advantage of Automatic Speech Recognition (ASR) technologies in the cloud. AWS and SyncWords took on this challenge and built a secure, low-latency caption embedding workflow that uses the Secure Reliable Transport (SRT) protocol and SyncWords to embed captions and generate audio translations for live broadcasts, without additional hardware.

Prior blog posts describe implementations that Amazon Web Services (AWS) and SyncWords developed to automate caption and subtitle creation for live broadcasts that use HTTP Live Streaming (HLS) workflows. Two of them, “Multi-language automatic captions and audio dubbing made possible for live events with AWS Media Services and SyncWords” and “Translate live sports automatically to reach international fans with AWS Media Services and SyncWords”, introduce these implementations and are worth reading first for more context.

The following workflow diagram depicts an automated, low-latency caption embedding and voice translation deployment that uses SyncWords’ SRT-based solution, AWS Elemental MediaConnect for video transport, AWS Elemental MediaLive for video processing, AWS Elemental MediaPackage as an origin and packaging service, and Amazon CloudFront for delivery. By supporting SRT ingest and egress, SyncWords gives users a direct, secure, cloud-based caption and voice embedding solution that natively integrates with AWS Media Services for video processing workflows in the cloud.

Cloud-based live captioning solution with SyncWords and AWS Media Services.
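
As a rough illustration of the MediaConnect leg of this workflow, the following Python sketch builds the parameters for a MediaConnect flow with an SRT listener source. The flow name, port, CIDR block, and latency value are hypothetical, and the direction of the SRT connection between SyncWords and MediaConnect is an assumption. Check the SyncWords integration documentation and the boto3 `mediaconnect` reference before relying on the field names used here.

```python
from typing import Any, Dict


def build_srt_flow_request(
    name: str,
    ingest_port: int,
    allowed_cidr: str,
    min_latency_ms: int = 200,  # hypothetical value; tune for your network
) -> Dict[str, Any]:
    """Build keyword arguments for MediaConnect create_flow (SRT listener source)."""
    if not 1 <= ingest_port <= 65535:
        raise ValueError("ingest_port must be a valid port number")
    return {
        "Name": name,
        "Source": {
            "Name": f"{name}-srt-source",
            "Protocol": "srt-listener",
            "IngestPort": ingest_port,
            # Only senders in this CIDR block may push to the flow.
            "WhitelistCidr": allowed_cidr,
            "MinLatency": min_latency_ms,
        },
    }


def create_flow(request: Dict[str, Any]) -> str:
    """Create the flow and return its ARN. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("mediaconnect")
    response = client.create_flow(**request)
    return response["Flow"]["FlowArn"]


# Example with placeholder values (no AWS call is made here).
flow_request = build_srt_flow_request("captioned-feed", 5000, "203.0.113.10/32")
```

Separating the request builder from the API call keeps the configuration easy to review and test before any resources are created.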

With the addition of SRT caller input support in MediaLive, you can directly ingest SyncWords captioned live streams into MediaLive, potentially bypassing MediaConnect if all components are in the same AWS Region and MediaConnect is not needed for other workflow requirements. Below is a simplified workflow diagram using MediaLive SRT caller input.

Cloud-based live captioning with SyncWords and MediaLive.
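
The simplified workflow can be sketched in Python as a MediaLive input of type `SRT_CALLER` that calls out to an SRT listener. This is a minimal sketch, not SyncWords' or AWS's reference configuration: the listener addresses, ports, stream ID, and latency are hypothetical placeholders, and passing one listener endpoint per pipeline is an assumption. The field names follow the MediaLive `CreateInput` API as exposed by boto3; verify them against the current reference.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_srt_caller_input_request(
    name: str,
    listeners: List[Tuple[str, int]],
    stream_id: Optional[str] = None,
    min_latency_ms: int = 2000,  # hypothetical value
    passphrase_secret_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for MediaLive create_input (SRT caller)."""
    if not listeners:
        raise ValueError("at least one SRT listener endpoint is required")
    sources = []
    for address, port in listeners:
        source: Dict[str, Any] = {
            "SrtListenerAddress": address,
            "SrtListenerPort": str(port),  # the API models the port as a string
            "MinimumLatency": min_latency_ms,
        }
        if stream_id is not None:
            source["StreamId"] = stream_id
        if passphrase_secret_arn is not None:
            # The passphrase is read from AWS Secrets Manager by ARN.
            source["Decryption"] = {
                "Algorithm": "AES128",
                "PassphraseSecretArn": passphrase_secret_arn,
            }
        sources.append(source)
    return {
        "Name": name,
        "Type": "SRT_CALLER",
        "SrtSettings": {"SrtCallerSources": sources},
    }


def create_input(request: Dict[str, Any]) -> str:
    """Create the input and return its ID. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    response = boto3.client("medialive").create_input(**request)
    return response["Input"]["Id"]


# Example with placeholder endpoints (no AWS call is made here).
input_request = build_srt_caller_input_request(
    "syncwords-captioned",
    [("198.51.100.10", 9000), ("198.51.100.11", 9000)],
    stream_id="example-stream",
)
```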

Because the workflow integrates with MediaConnect and runs on the AWS global network, you can use cloud-based Automatic Speech Recognition (ASR) together with cost-effective caption embedding and voice translation services from SyncWords, and then distribute the captioned stream to customers or business partners around the globe over secure, low-latency MediaConnect transport.

Cross-region video transport using AWS Elemental MediaConnect.

You can take advantage of the following benefits when deploying this workflow on AWS.

  1. Native support for multiple resolutions and codecs

Captions can be inserted into SRT streams as CEA-608/CTA-708 caption data for SD, HD, and UHD (4K) resolutions, and the process is compatible with the HEVC and AVC video codecs. No transcoding is needed: caption data is inserted into the video elementary stream in a sub-second process.

  2. Captioning at scale with Amazon Elastic Kubernetes Service (Amazon EKS)

The SyncWords captioning and voice translation service is built on AWS and uses Amazon EKS to scale automatically. It integrates with MediaLive to support dual-pipeline redundancy. Customers can focus on provisioning captioning services without the burden of managing and provisioning on-premises hardware. The redundancy and scalability of the services help you deliver high-quality, reliable live experiences to your customers.

  3. Monetization using MediaConnect and SRT

Broadcasters can now take full advantage of MediaConnect and the SRT protocol to securely transport captioned content and monetize their live streams, such as live news and sports, for global audiences in their native spoken languages. Additionally, you can use MediaConnect entitlements to distribute content to business partners and customers across the globe.
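
To make the entitlement step more concrete, here is a minimal Python sketch that builds a request for the MediaConnect `GrantFlowEntitlements` operation, which lets another AWS account use a flow as a source. The flow ARN, entitlement name, and account ID are hypothetical placeholders, and the 12-digit account ID check is an illustrative safeguard added for this example rather than part of the API.

```python
from typing import Any, Dict, List


def build_entitlement_request(
    flow_arn: str,
    entitlement_name: str,
    subscriber_account_ids: List[str],
    subscriber_fee_percent: int = 0,
) -> Dict[str, Any]:
    """Build keyword arguments for MediaConnect grant_flow_entitlements."""
    for account_id in subscriber_account_ids:
        # AWS account IDs are 12-digit numbers.
        if not (account_id.isdigit() and len(account_id) == 12):
            raise ValueError(f"invalid AWS account ID: {account_id!r}")
    if not 0 <= subscriber_fee_percent <= 100:
        raise ValueError("subscriber_fee_percent must be between 0 and 100")
    return {
        "FlowArn": flow_arn,
        "Entitlements": [
            {
                "Name": entitlement_name,
                "Subscribers": list(subscriber_account_ids),
                # Share of data transfer fees billed to the subscriber.
                "DataTransferSubscriberFeePercent": subscriber_fee_percent,
            }
        ],
    }


def grant_entitlement(request: Dict[str, Any]) -> List[str]:
    """Grant the entitlement and return the new ARNs. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    response = boto3.client("mediaconnect").grant_flow_entitlements(**request)
    return [e["EntitlementArn"] for e in response["Entitlements"]]


# Example with placeholder values (no AWS call is made here).
entitlement_request = build_entitlement_request(
    "arn:aws:mediaconnect:us-west-2:111122223333:flow:example",  # hypothetical ARN
    "partner-captioned-feed",
    ["444455556666"],
)
```

The subscriber account can then create its own flow that uses the granted entitlement as its source.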

Conclusion

This blog describes how to enable automatic captioning for live events, with resilient, low latency SRT streaming workflows using AWS services and SyncWords. To learn more about creating a live event pipeline using AWS, please refer to our Live Streaming on AWS solution. In addition, if you’re interested in getting started with AWS Media Services, please visit the product page. For live event captioning solutions and integration documentation, please visit SyncWords.

Chris Zhang

Chris Zhang is a Solutions Architect for AWS Elemental.