Amazon Transcribe adds support for automatic language identification

Posted on: Sep 15, 2020

Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capabilities to your applications. Today we are excited to announce automatic language identification in Amazon Transcribe. Until now, you had to specify the dominant language of each audio recording yourself before you could use the Transcribe APIs. You can now simply provide the audio files, and Transcribe will detect the dominant language from the speech signal and generate the transcription in the identified language.
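As a rough illustration of what this looks like from the AWS SDK for Python (boto3), the sketch below builds a batch transcription request that sets `IdentifyLanguage` instead of a fixed `LanguageCode`, and reads the detected language back from a job description. The helper names, the job name, the S3 URI, and the default region are placeholders chosen for this example, not values from the announcement, and the call that contacts AWS assumes that boto3 is installed and credentials are configured.

```python
from typing import Any, Dict, Optional, Tuple


def build_language_id_request(job_name: str, media_uri: str) -> Dict[str, Any]:
    """Build StartTranscriptionJob parameters that ask Transcribe to detect the language.

    No LanguageCode is supplied; IdentifyLanguage=True requests detection instead.
    """
    return {
        "TranscriptionJobName": job_name,        # placeholder name, must be unique per account
        "Media": {"MediaFileUri": media_uri},    # placeholder S3 location of the audio file
        "IdentifyLanguage": True,
    }


def start_language_id_job(params: Dict[str, Any], region_name: str = "us-east-1") -> Dict[str, Any]:
    """Submit the job. Imported lazily so the helpers above work without boto3."""
    import boto3  # assumption: boto3 is installed and AWS credentials are available

    client = boto3.client("transcribe", region_name=region_name)
    return client.start_transcription_job(**params)


def identified_language(job_description: Dict[str, Any]) -> Tuple[Optional[str], Optional[float]]:
    """Extract the detected language and its score from a GetTranscriptionJob response."""
    job = job_description.get("TranscriptionJob", {})
    return job.get("LanguageCode"), job.get("IdentifiedLanguageScore")
```

A typical flow would be to call `start_language_id_job(build_language_id_request(...))`, poll `get_transcription_job` until the job completes, and then pass the response to `identified_language`.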

If you operate in a country with multiple official languages or across multiple regions, your audio files may be in different languages. Given a minimum of 30 seconds of audio, Transcribe can identify the spoken language and generate the transcript in that language, without anyone having to specify it in advance. This applies to various use cases such as transcribing customer calls, converting voicemails to text, capturing meeting interactions, tracking user forum communications, or monitoring media content production and localization workflows.
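If you want to screen recordings against the 30-second minimum before submitting them, a local check along the following lines is one option. This is only a sketch: it handles uncompressed WAV files through Python's standard `wave` module, the function names are made up for this example, and other audio formats would need a different way to read the duration.

```python
import wave

# Minimum audio length for language identification, as stated in the announcement.
MIN_SECONDS_FOR_LANGUAGE_ID = 30.0


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file, given a path or a binary file object."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def long_enough_for_language_id(duration_seconds: float,
                                minimum: float = MIN_SECONDS_FOR_LANGUAGE_ID) -> bool:
    """True when the recording meets the minimum length."""
    return duration_seconds >= minimum
```

For example, `long_enough_for_language_id(wav_duration_seconds("call.wav"))` could gate whether a file is submitted with language identification or routed to a job with an explicit language.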

Automatic language identification for batch transcriptions is available at no additional cost for all 31 languages that Amazon Transcribe currently supports, in the following AWS Regions: United States (N. Virginia, Ohio, N. California, Oregon, GovCloud US-West), Canada (Central), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo), Europe (Frankfurt, Ireland, London, Paris), Middle East (Bahrain), and South America (São Paulo). To get started with this feature, see the Amazon Transcribe automatic language identification blog post or visit the Amazon Transcribe documentation page.