Posted On: Dec 7, 2022

Today, we are excited to announce that Amazon Transcribe Custom Language Models (CLM) now support German and Japanese in both batch and streaming modes. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capabilities to your applications. CLM allows you to use text data you already have to build a custom speech engine for your specific batch and streaming transcription use cases. No prior machine learning experience is required to create your CLM.

CLM uses text data that you already possess, such as website content, instruction manuals, and other assets that cover your domain’s unique lexicon and vocabulary. Upload your training dataset to create a CLM, then run transcription jobs using your new CLM. Amazon Transcribe CLM is meant for customers who operate in domains as diverse as law, finance, hospitality, insurance, and media. CLMs are designed to improve transcription accuracy for domain-specific speech, meaning any content outside of what you would hear in normal, everyday conversations. For example, if you're transcribing the proceedings of a scientific conference, a standard transcription is unlikely to recognize many of the scientific terms used by presenters. Using Amazon Transcribe CLM, you can train a custom language model to recognize the specialized terms used in your discipline.
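The create-then-transcribe workflow described above can be sketched with the AWS SDK for Python (boto3). This is a minimal illustration, not an official sample: the helper function names, model name, bucket paths, and IAM role ARN are all hypothetical placeholders, and it assumes the training text has already been uploaded to Amazon S3. The helpers take a Transcribe client as an argument, so the request shapes can be inspected without AWS credentials.

```python
import time
from typing import Any, Dict


def build_language_model_request(model_name: str, language_code: str,
                                 training_s3_uri: str, role_arn: str,
                                 base_model: str = "WideBand") -> Dict[str, Any]:
    """Build keyword arguments for the CreateLanguageModel API call."""
    return {
        "LanguageCode": language_code,   # e.g. "de-DE" or "ja-JP"
        "BaseModelName": base_model,     # "WideBand" (>= 16 kHz audio) or "NarrowBand"
        "ModelName": model_name,
        "InputDataConfig": {
            "S3Uri": training_s3_uri,    # S3 prefix holding your training text
            "DataAccessRoleArn": role_arn,
        },
    }


def build_transcription_job_request(job_name: str, language_code: str,
                                    media_uri: str, model_name: str) -> Dict[str, Any]:
    """Build keyword arguments for StartTranscriptionJob using a CLM."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,   # must match the CLM's language
        "Media": {"MediaFileUri": media_uri},
        "ModelSettings": {"LanguageModelName": model_name},
    }


def wait_until_trained(client: Any, model_name: str,
                       poll_seconds: float = 60, max_polls: int = 600) -> None:
    """Poll DescribeLanguageModel until training completes or fails."""
    for _ in range(max_polls):
        response = client.describe_language_model(ModelName=model_name)
        status = response["LanguageModel"]["ModelStatus"]
        if status == "COMPLETED":
            return
        if status == "FAILED":
            raise RuntimeError(f"Training failed for model {model_name!r}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Model {model_name!r} was not ready after {max_polls} polls")


def train_and_transcribe(client: Any, model_request: Dict[str, Any],
                         job_request: Dict[str, Any],
                         poll_seconds: float = 60) -> None:
    """Create the CLM, wait for training to finish, then start a batch job."""
    client.create_language_model(**model_request)
    wait_until_trained(client, model_request["ModelName"], poll_seconds)
    client.start_transcription_job(**job_request)


# Example usage (requires boto3 and AWS credentials; all values are placeholders):
#
#   import boto3
#   client = boto3.client("transcribe")
#   model_req = build_language_model_request(
#       "my-german-clm", "de-DE",
#       "s3://my-example-bucket/training-text/",
#       "arn:aws:iam::123456789012:role/MyTranscribeDataAccessRole")
#   job_req = build_transcription_job_request(
#       "my-first-clm-job", "de-DE",
#       "s3://my-example-bucket/audio/meeting.wav", "my-german-clm")
#   train_and_transcribe(client, model_req, job_req)
```

Model training is asynchronous and can take hours, so in practice you might create the model once and start transcription jobs separately rather than blocking in a single script.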

CLM now supports German and Japanese for batch and streaming transcriptions and is available in all AWS Regions where Amazon Transcribe operates. To start building your own custom speech recognition model, sign in to the Amazon Transcribe console. For more details about the CLM feature, see the blog post “Building custom language models to supercharge speech-to-text performance for Amazon Transcribe”. To learn more, visit the Amazon Transcribe documentation page.