Amazon Transcribe Medical now supports custom vocabulary

Posted on: Apr 29, 2020

Amazon Transcribe Medical is a HIPAA-eligible automatic speech recognition (ASR) service that makes it easy for developers to add medical speech-to-text capabilities to their healthcare and life science applications. Starting today, you can give Amazon Transcribe Medical more information about how to process speech in your audio content by creating a custom vocabulary. A custom vocabulary is a list of specific words and phrases that you want Amazon Transcribe Medical to recognize. These are typically domain-specific terms, such as medicine names, healthcare brands, or procedure-related terms that the service does not already recognize out of the box.

Using a custom vocabulary is straightforward. Create a list of custom terms or phrases in a plain text file and upload it to an Amazon S3 bucket. Then, before starting a transcription job with Amazon Transcribe Medical, point the service to that custom vocabulary. A custom vocabulary not only lets you add out-of-lexicon terms, but also lets you specify a custom pronunciation for each term by using the International Phonetic Alphabet (IPA). Additionally, you can designate exactly how a custom term should be displayed when it is transcribed (e.g. “adenosine triphosphate” as “ATP”) by using the built-in custom display forms capability.
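As a rough illustration of the steps above, the following Python sketch builds a vocabulary file and the parameters for a batch transcription job that references it. It is a minimal sketch under stated assumptions, not an official sample: the four-column tab-separated layout and the hyphen-for-space convention in the Phrase column reflect our reading of the table format and should be checked against the current documentation, and the helper names, bucket names, and URIs are hypothetical. The AWS calls assume the boto3 SDK and that the vocabulary file has already been uploaded to Amazon S3.

```python
import time
from typing import Dict, List, Optional, Tuple

# Assumed column layout for the vocabulary table; verify against the docs.
HEADER = ("Phrase", "IPA", "SoundsLike", "DisplayAs")


def build_vocabulary_table(terms: List[Tuple[str, Optional[str]]]) -> str:
    """Return tab-separated vocabulary text from (phrase, display_as) pairs."""
    rows = ["\t".join(HEADER)]
    for phrase, display_as in terms:
        # Assumption: multi-word phrases are joined with hyphens in the Phrase column.
        key = "-".join(phrase.split())
        # IPA and SoundsLike are left empty in this sketch.
        rows.append("\t".join([key, "", "", display_as or ""]))
    return "\n".join(rows) + "\n"


def build_job_request(job_name: str, media_uri: str,
                      output_bucket: str, vocabulary_name: str) -> Dict:
    """Build keyword arguments for a batch job that uses the custom vocabulary."""
    return {
        "MedicalTranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,
        "Specialty": "PRIMARYCARE",
        "Type": "DICTATION",
        "Settings": {"VocabularyName": vocabulary_name},
    }


def submit(vocabulary_name: str, vocabulary_uri: str, request: Dict) -> None:
    """Register the uploaded vocabulary file, wait for it, then start the job."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    client = boto3.client("transcribe")
    client.create_medical_vocabulary(
        VocabularyName=vocabulary_name,
        LanguageCode="en-US",
        VocabularyFileUri=vocabulary_uri,
    )
    # The vocabulary must finish processing before a job can reference it.
    while True:
        state = client.get_medical_vocabulary(
            VocabularyName=vocabulary_name)["VocabularyState"]
        if state == "READY":
            break
        if state == "FAILED":
            raise RuntimeError("vocabulary creation failed")
        time.sleep(5)
    client.start_medical_transcription_job(**request)
```

For example, build_vocabulary_table([("adenosine triphosphate", "ATP")]) produces the text you would save and upload to your bucket, and submit() would then be called with the S3 URI of that file and the dictionary returned by build_job_request().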

Custom vocabulary is available for both the synchronous (streaming) API and the asynchronous (batch) API of Amazon Transcribe Medical. The feature is available in all AWS Regions where Amazon Transcribe Medical is available. Try out the new custom vocabulary feature in the Amazon Transcribe Medical console, or learn more in the technical documentation.