AWS for M&E Blog

Integrating the Content Localization on AWS solution into your media creation and distribution workflow

Content creators and distributors seeking to deliver video content to worldwide customers have more options than ever before. However, even as tools and services for simplifying the distribution of video streams have become more readily available, the process of localizing content to make it understandable to a worldwide multi-lingual audience remains a challenge for many creators and distributors.

Subtitles are the transcribed or translated spoken text displayed over a video. And although subtitle text files themselves are simple, the process of creating multi-lingual subtitle files presents logistical, resource, and timeline challenges, particularly when using traditional and highly manual subtitle creation workflows.

Customers producing and distributing video content have long had the need for a simpler, faster, more manageable, and cost-effective way to generate subtitles. To answer this need, Amazon Web Services (AWS) built the Content Localization on AWS solution. This solution taps into automatic speech recognition (ASR) capabilities offered by Amazon Transcribe to convert spoken word within video files to text. The solution then uses Amazon Translate and its neural machine translation technology to convert transcribed text into other languages. This solution automates the generation and orchestration of subtitles and simplifies the review and editing process with an easy-to-use browser-based interface. And best of all, you can deploy and start using this solution in just a few mouse clicks.

This post walks through the steps content creators and distributors can take to begin using the Content Localization on AWS solution within their existing workflows.

A screenshot of the Content Localization on AWS web application interface

Challenges of localizing video content

Subtitles are an effective way to achieve an additional return on investment (ROI) in your content by maximizing reach to a worldwide audience. A disparity exists between the pace of innovation in the realm of content creation and distribution and the traditional workflows often used for localization. To illustrate this disparity, content creators today can use a service such as Amazon Nimble Studio to deploy and begin operating a virtual production studio in a matter of hours. And video distributors can spin up a simple content delivery network (CDN) using Amazon Simple Storage Service (Amazon S3) and Amazon CloudFront in less than an hour. However, the content localization process, depending on the length of content, can still take days to weeks to complete.

High-quality subtitle deliverables from the traditional creation process can come at a cost. The significant length of time required for content localization puts pressure on the scheduling, budgeting, production, and status tracking of content. Complicating the situation further, it’s not always clear that the level of effort and resources required in the traditional process is the right fit for all types of content. Using the Content Localization on AWS solution to shrink end-to-end timelines, reduce budget pressures, and automate the overall creation process allows teams to flexibly determine what level of human-based review is necessary. Customers can begin using the capabilities of Content Localization on AWS in less than 30 minutes, as the following walk-through describes.

The traditional localization process

A traditional localization process typically relies on manual effort during the creation and status tracking of subtitles. While the workflow will differ from organization to organization, many workflows feature some variation of the following steps:

  1. A video creator generates a reference audio/video file
  2. The reference audio/video file is uploaded and delivered to a transcriber
  3. The transcriber reviews the video and creates a timed-text output in the original language
  4. A copy editor proofs the transcription for errors
  5. The source language subtitle is distributed to localization teams who translate the spoken word to the desired languages
  6. The translations are proofread, in parallel, by proofreaders for each language
  7. The final subtitle files are received and uploaded by the content editorial team

A visualized legacy subtitling and translation workflow and waterfall chart

The graphic example in Figure 2 visualizes a generalized localization process based on manual effort. In this example, each step in the process is dependent on the completion of the prior step. This chain of dependencies is one reason for the extended end-to-end timeline of the traditional process. For example, the translation process cannot begin until the transcription and proofreading process is complete. This end-to-end workflow can take days to weeks to complete depending on the length, complexity, backlog of work, and level of review required. For creators facing the pain points detailed previously, the Content Localization on AWS solution offers the ability to automate, accelerate, and simplify the subtitle creation workflow.
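To make the effect of these dependencies concrete, the following Python sketch computes when each step of a waterfall-style process can finish. The step names mirror the traditional workflow described earlier, but every duration is a hypothetical placeholder chosen only for illustration, not a measurement, and the `finish_times` helper is an assumed name.

```python
from typing import Dict, List, Tuple

# Each step maps to (duration in working hours, list of prerequisite steps).
Steps = Dict[str, Tuple[float, List[str]]]


def finish_times(steps: Steps) -> Dict[str, float]:
    """Return the earliest finish time of every step, given its prerequisites."""
    cache: Dict[str, float] = {}

    def finish(name: str) -> float:
        if name not in cache:
            duration, deps = steps[name]
            # A step can only start once its slowest prerequisite has finished.
            start = max((finish(dep) for dep in deps), default=0.0)
            cache[name] = start + duration
        return cache[name]

    return {name: finish(name) for name in steps}


# Hypothetical durations (placeholders, not real figures).
manual_workflow: Steps = {
    "transcribe": (8.0, []),
    "copy_edit": (4.0, ["transcribe"]),
    "translate_es": (16.0, ["copy_edit"]),
    "translate_fr": (20.0, ["copy_edit"]),
    "proofread_es": (4.0, ["translate_es"]),
    "proofread_fr": (6.0, ["translate_fr"]),
    "upload": (1.0, ["proofread_es", "proofread_fr"]),
}

timeline = finish_times(manual_workflow)
print(f"End-to-end: {timeline['upload']} working hours")
```

Even though the per-language translation and proofreading steps run in parallel, the final upload still waits on the slowest chain, which is why shortening the early transcription and translation steps has the largest effect on the overall timeline.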

A visualization of how the Content Localization on AWS solution can fit into the subtitling and translation workflow

Using the Content Localization on AWS solution

To begin using the solution, you must have the following prerequisites:

Getting started

Review the solution implementation guide for the latest step-by-step deployment, security considerations, and estimates on costs for this solution. The included AWS CloudFormation infrastructure-as-code template handles the infrastructure building. You will answer several prompts within the browser including: an administrator’s email address, node size selection for Amazon OpenSearch Service, and an acknowledgement that the template will create AWS Identity and Access Management (IAM) resources and may require AWS CloudFormation capabilities. The entire deployment process typically takes less than 30 minutes. The administrator will receive an email with a temporary password when the process is complete. Follow the “Identify the URL” instructions in the implementation guide to access the web application URL.

Using the Content Localization on AWS web application

Once logged in, a user can upload a source audio/video file to transcribe and translate.

A screenshot showing the web application “Upload Content” interface

  1. Drag and drop the input video file from the local computer into the gray box that says “Drop files here to upload”
  2. Click the “Configure Workflow” button
  3. Optionally, select any of the video operator configuration checkboxes
  4. Configure the workflow to set the source language and target languages for translation
  5. Click the “Upload and Run Workflow” button

An example configuration for a Content Localization on AWS job

The application will now initiate an AWS Step Functions workflow based on the built-in Media Insights on AWS operators that include:

  • Generating a lower bitrate video proxy and thumbnail image
  • Creating an audio-only file using AWS Elemental MediaConvert
  • Transcribing the audio file to text using Amazon Transcribe
  • Translating the transcript to destination languages using Amazon Translate
  • Converting the transcribed and translated text outputs into subtitle files in *.srt or *.vtt format

The workflow status is viewable by selecting the appropriate job under “Execution History” on the page and selecting “Started”. Once the workflow shows as “Complete”, select “Collection” in the menu bar of the application.

“Collection” is located in the upper corner of the menu bar.

The web application will now load a list of all previously analyzed assets. To view the resulting subtitles and translations, click the “Analyze” link.

The “Analyze” button will reveal the subtitles and translations

Next, click the “Speech Recognition” tab and then the “Subtitles” tab on the left-hand side of the screen.

The “Subtitles” section that allows for preview and editing of subtitles.

Human-in-the-loop review

Once logged in, you can make any necessary revisions directly within the web browser. For teams looking to leverage human-based review, or human-in-the-loop review, systems administrators can add authorized users to the application. Any changes made to the source language transcript are automatically translated again. Users have the option to save their domain-specific words or phrases as an Amazon Transcribe custom vocabulary. Once created, custom vocabularies are selectable as part of the workflow to improve ASR accuracy for future videos. Similarly, changes to any translations can be saved as Amazon Translate custom terminologies within the solution. Custom terminologies allow users to ensure that brand names, character names, model names, and other unique content are translated exactly the way they are needed. When all of the necessary changes are complete, users can download their subtitles in the industry-standard *.srt or *.vtt formats. The subtitles are now ready to upload to their destination video platform for multi-lingual subtitle distribution.
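
For readers curious what a custom terminology looks like outside the web application, the following sketch builds one in the CSV layout that Amazon Translate accepts: a header row of language codes followed by one row per term. The `build_terminology_csv` helper and the sample terms are hypothetical, and the solution manages terminologies for you, so treat this as an illustration of the file format rather than a required step.

```python
import csv
import io
from typing import Dict, List


def build_terminology_csv(
    source_lang: str,
    target_langs: List[str],
    terms: Dict[str, Dict[str, str]],
) -> bytes:
    """Build a terminology CSV: language codes first, then one row per term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, *target_langs])
    for source_term, translations in terms.items():
        # Each row holds the source term followed by its required translations.
        writer.writerow([source_term] + [translations[lang] for lang in target_langs])
    return buffer.getvalue().encode("utf-8")


# Hypothetical terms: a show title that must stay untranslated, and a phrase
# with a preferred translation per language.
sample_terms = {
    "Late Night Live": {"es": "Late Night Live", "fr": "Late Night Live"},
    "season finale": {"es": "final de temporada", "fr": "finale de la saison"},
}
terminology_bytes = build_terminology_csv("en", ["es", "fr"], sample_terms)
print(terminology_bytes.decode("utf-8"))

# If you were importing this yourself with boto3 (not shown or executed here),
# the bytes would be passed as TerminologyData={"File": ..., "Format": "CSV"}
# to the Amazon Translate import_terminology operation.
```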

Reviewing the benefits of the automated workflow

At this point, a standard media deliverable can be transcoded, transcribed, translated, and readied for review within the same day thanks to the AWS services underlying the Content Localization on AWS solution. A potential reduction in timeline from weeks to a single day gives content creators and distributors the flexibility they need. With machine-generated subtitle files ready for review, creators and distributors can decide what level of human-based review is necessary based on content type. In some cases, topical content may be publishable as-is while other scripted or sensitive content may require multiple reviews and added subtitle elements. Automation shortens the end-to-end timeline, which can in turn reduce the total cost of content localization.

A visualization of the flexible proofreading timeline.

Content localization transformed

This post demonstrates how the Content Localization on AWS solution can transform a legacy localization workflow. The legacy content localization process previously took days to weeks to complete the initial translation. With the solution, the initial translation can be completed the same day. The reduction in timeline is due to the transcription and translation power of Amazon Transcribe and Amazon Translate paired with the orchestration capability within Content Localization on AWS.

Users can take advantage of more advanced capabilities and integrations. Some examples include starting workflows from the command line, starting workflows with an AWS Lambda function, advanced search capabilities, custom operators, and workflows initiated from Amazon S3 uploads. And because the solution is released under the Apache 2.0 open-source license, customers and partners can customize the solution to meet their needs. AWS has independent software vendor (ISV) and systems integrator (SI) partners that can work with the Content Localization on AWS solution and help you get your subtitle workflow started. And since every localization requirement is different, the Content Localization on AWS solution offers the flexibility to augment and accelerate your content subtitling needs in an easy-to-use, automated, controllable, and cost-effective way.
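
As a rough sketch of the S3-initiated pattern mentioned above, the following Python shows how an AWS Lambda handler might turn an Amazon S3 upload event into a workflow request. The request body shape and the `ContentLocalizationWorkflow` name are assumptions modeled on the Media Insights on AWS workflow API, so confirm both against the implementation guide for your deployed version. The signed HTTP call to the workflow API is deliberately left out.

```python
import json
from urllib.parse import unquote_plus

# Assumed workflow name; verify against your deployment.
WORKFLOW_NAME = "ContentLocalizationWorkflow"


def build_workflow_request(s3_event: dict, workflow_name: str = WORKFLOW_NAME) -> dict:
    """Build a workflow request body from the first record of an S3 event."""
    record = s3_event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    return {
        "Name": workflow_name,
        "Input": {"Media": {"Video": {"S3Bucket": bucket, "S3Key": key}}},
    }


def lambda_handler(event, context):
    """Hypothetical handler: returns the request instead of sending it."""
    request_body = build_workflow_request(event)
    # Sending the request to the workflow API (with IAM-signed credentials)
    # is omitted from this sketch.
    return {"statusCode": 200, "body": json.dumps(request_body)}


# Minimal sample event with a hypothetical bucket and key.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+video.mp4"}}}
    ]
}
print(lambda_handler(sample_event, None)["body"])
```

Keeping the request-building logic in its own function makes it straightforward to test locally before wiring the handler to a real bucket notification.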

Wrap up and clean up

If you finish testing the solution and want to avoid incurring future costs, follow the instructions for how to uninstall this solution in the implementation guide.

Conclusion

With the Content Localization on AWS solution, users can now leverage the depth and breadth of AWS services not only for content creation and distribution, but also to create, revise, and deploy subtitles that help their content reach global audiences. The solution allows users to modernize and transform their localization workflow to overcome some of the steepest timeline, workflow, and resource obstacles. Users looking to dive deeper can explore creating subtitles natively within Amazon Transcribe as well as the expanded capabilities offered by Media Insights on AWS. Good luck in your exploration and experimentation as you reimagine your content localization workflow.

Jason O'Malley

Jason O’Malley is a Sr. Partner Solutions Architect at AWS supporting partners architecting media, communications, and technology industry solutions. Before joining AWS, Jason spent 13 years in the media and entertainment industry at companies including Conan O’Brien’s Team Coco, WarnerMedia, and Media.Monks. Jason started his career in television production and post-production before building media workloads on AWS. When Jason isn’t creating solutions for partners and customers, he can be found adventuring with his wife and son, or reading about sustainability.