Amazon Transcribe is a speech-to-text service that makes it easy for developers to add voice AI to their applications. It is designed to process audio from a variety of sources, such as microphones and audio or video files, and to provide high-quality transcriptions for search and analysis.
Punctuation and number normalization
Amazon Transcribe automatically adds punctuation and formatting so that the output closely matches the quality of manual transcription at a fraction of the time and expense. Numbers are also transcribed into digits or “normal form” instead of words. Learn more »
Streaming transcription
You can process your existing audio recordings or stream the audio for real-time transcription. Using a secure connection, you can send a live audio stream to the service, and receive a stream of text in response. Learn more »
Timestamp generation
Amazon Transcribe returns a timestamp for each word, so that you can easily find a word or phrase in the original recording or add subtitles to video. Learn more »
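As a rough illustration of how word-level timestamps can be used, the sketch below looks up a phrase in a transcript and reports where it starts and ends in the recording. It assumes the batch JSON output layout in which `results.items` holds one entry per word, with `start_time`, `end_time`, and `alternatives[0].content`. The sample data and the helper name `find_phrase` are made up for this example.

```python
# Hypothetical, trimmed-down transcript in the shape of the batch JSON output.
SAMPLE = {
    "results": {
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.35",
             "alternatives": [{"content": "Thank"}]},
            {"type": "pronunciation", "start_time": "0.35", "end_time": "0.5",
             "alternatives": [{"content": "you"}]},
            {"type": "pronunciation", "start_time": "0.5", "end_time": "0.68",
             "alternatives": [{"content": "for"}]},
            {"type": "pronunciation", "start_time": "0.68", "end_time": "1.1",
             "alternatives": [{"content": "calling"}]},
            # Punctuation items carry no timestamps, so they are skipped below.
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ]
    }
}


def find_phrase(transcript, phrase):
    """Return (start_seconds, end_seconds) for each occurrence of phrase."""
    target = phrase.lower().split()
    if not target:
        return []
    words = [item for item in transcript["results"]["items"]
             if item["type"] == "pronunciation"]
    spoken = [w["alternatives"][0]["content"].lower() for w in words]
    hits = []
    for i in range(len(spoken) - len(target) + 1):
        if spoken[i:i + len(target)] == target:
            start = float(words[i]["start_time"])
            end = float(words[i + len(target) - 1]["end_time"])
            hits.append((start, end))
    return hits


print(find_phrase(SAMPLE, "for calling"))
```

The same start and end times could drive subtitle cues or a jump-to-position control in a media player.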
Custom vocabulary
You can add new words to the base vocabulary to generate more accurate transcriptions for domain-specific words and phrases like product names, technical terminology, or names of individuals. Learn more »
Vocabulary filtering
You can specify a list of words to remove from transcripts. For example, you can specify a list of profane or offensive words and Amazon Transcribe removes them from transcripts automatically. Learn more »
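For a batch job, custom vocabularies and vocabulary filters are referenced by name in the `Settings` field of the `StartTranscriptionJob` request. The sketch below only builds that field. The helper name `vocabulary_settings` and the resource names are hypothetical, and the vocabulary and filter are assumed to have been created beforehand.

```python
# Filter methods accepted by the VocabularyFilterMethod setting.
FILTER_METHODS = {"remove", "mask", "tag"}


def vocabulary_settings(vocabulary_name=None, filter_name=None,
                        filter_method="remove"):
    """Build the Settings portion of a transcription job request."""
    settings = {}
    if vocabulary_name:
        settings["VocabularyName"] = vocabulary_name
    if filter_name:
        if filter_method not in FILTER_METHODS:
            raise ValueError(f"unknown filter method: {filter_method}")
        settings["VocabularyFilterName"] = filter_name
        settings["VocabularyFilterMethod"] = filter_method
    return settings


# "product-names" and "profanity" are placeholder resource names.
settings = vocabulary_settings("product-names", "profanity")
print(settings)

# With boto3 installed and credentials configured, the result could be passed as:
#   boto3.client("transcribe").start_transcription_job(..., Settings=settings)
```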
Recognize multiple speakers
Speaker changes are automatically recognized and attributed in the text to capture scenarios like telephone calls, meetings, and television shows accurately. Learn more »
Channel identification
Contact centers can submit a single audio file to Amazon Transcribe, and the service will automatically produce a single transcript annotated with channel labels. Learn more »
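Speaker recognition and channel identification are both switched on through the job's `Settings`, and a single request can use one or the other but not both. A minimal sketch of that choice follows; the helper name `separation_settings` and its `mode` argument are invented for this example.

```python
def separation_settings(mode, max_speakers=2):
    """Return Settings for either speaker labels or channel labels."""
    if mode == "speakers":
        # Speaker labels require an upper bound on the number of speakers.
        return {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers}
    if mode == "channels":
        # For recordings where each party is on a separate audio channel.
        return {"ChannelIdentification": True}
    raise ValueError("mode must be 'speakers' or 'channels'")


print(separation_settings("speakers", max_speakers=4))
print(separation_settings("channels"))
```

Channel identification tends to fit call recordings where the agent and the customer are captured on separate channels, while speaker labels fit single-channel audio such as a meeting recorded on one microphone.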
Automatic content redaction
When instructed, Amazon Transcribe can identify and redact sensitive personally identifiable information (PII) from transcripts in supported languages. This allows contact centers to easily review and share transcripts for customer experience insights and agent training. Learn more »
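Redaction is requested per job through the `ContentRedaction` field of the `StartTranscriptionJob` request. The sketch below assembles such a request as a plain dictionary. The job name, the S3 URI, the `en-US` language code, and the helper name `redacted_job_request` are placeholders chosen for illustration.

```python
def redacted_job_request(job_name, media_uri, keep_unredacted=False):
    """Build a transcription job request that asks for PII redaction."""
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",  # assumed here; use a language that supports redaction
        "Media": {"MediaFileUri": media_uri},
        "ContentRedaction": {
            "RedactionType": "PII",
            # Optionally keep an unredacted copy alongside the redacted one.
            "RedactionOutput": (
                "redacted_and_unredacted" if keep_unredacted else "redacted"
            ),
        },
    }


request = redacted_job_request("support-call-001", "s3://example-bucket/call.wav")
print(request["ContentRedaction"])

# With boto3 installed and credentials configured:
#   boto3.client("transcribe").start_transcription_job(**request)
```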
Custom language models
When needed, you can build and train your own custom language model (CLM) by submitting a corpus of text data to Amazon Transcribe. Using that data and our underlying speech recognition models, Amazon Transcribe generates a CLM tailored to your use case and domain. A CLM is well suited to improving speech recognition accuracy when you have a large amount of text from the same domain as your audio. This may include archived transcripts of call center interactions, subtitled videos, customers’ websites, in-house training manuals, textbooks, and many other data sources. Learn more »
Automatic language identification
Amazon Transcribe can automatically identify the dominant language in an audio file and generate a transcription in that language. This is useful when your media library contains audio files in different languages. You can also use this feature to classify media content and to verify that the main spoken language in your videos and podcasts is correctly labeled. Learn more »
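For batch jobs, language identification is enabled by setting `IdentifyLanguage` instead of passing a fixed `LanguageCode`. The sketch below builds such a request and reads the detected language back from a job description. The helper names, the S3 URI, and the sample response are hypothetical; the response is trimmed to the two fields the example reads.

```python
def language_id_request(job_name, media_uri, candidate_languages=None):
    """Build a job request that lets the service pick the language."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "IdentifyLanguage": True,  # replaces a fixed LanguageCode
    }
    if candidate_languages:
        # Narrowing the candidates is optional.
        request["LanguageOptions"] = list(candidate_languages)
    return request


def identified_language(job_description):
    """Extract the detected language and its score from a job description."""
    job = job_description["TranscriptionJob"]
    return job["LanguageCode"], job.get("IdentifiedLanguageScore")


request = language_id_request(
    "podcast-ep-12", "s3://example-bucket/ep12.mp3", ["en-US", "es-US"]
)

# Made-up stand-in for what get_transcription_job might return for a finished job.
sample_response = {
    "TranscriptionJob": {"LanguageCode": "es-US", "IdentifiedLanguageScore": 0.93}
}
print(identified_language(sample_response))
```

Comparing the detected language with the label stored in a media catalog is one way to approach the labeling check described above.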
Amazon Transcribe Medical
Amazon Transcribe Medical is a HIPAA-eligible automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capabilities to their healthcare and life science applications.
Dictation mode
Accurately transcribe single-speaker audio commonly found in medical dictation use cases. Learn more »
Conversational mode
Accurately transcribe multi-speaker conversational audio between clinicians and patients. Learn more »
Medical specialties
Transcribe speech to text across a diverse range of medical specialties. Learn more »
Batch API
Transcribe recorded medical audio files at scale with high concurrency. Learn more »
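A batch medical job is started with `StartMedicalTranscriptionJob`, where the choice between dictation and conversational mode is made through the `Type` field. The sketch below builds the request only. The job name, bucket names, and the helper name `medical_job_request` are placeholders, and `PRIMARYCARE` is used as an example specialty.

```python
def medical_job_request(job_name, media_uri, output_bucket,
                        conversational=False):
    """Build a request for a batch medical transcription job."""
    return {
        "MedicalTranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,  # bucket that receives the transcript
        "Specialty": "PRIMARYCARE",         # example specialty
        # DICTATION for a single speaker, CONVERSATION for clinician and patient.
        "Type": "CONVERSATION" if conversational else "DICTATION",
    }


request = medical_job_request(
    "visit-notes-0001",
    "s3://example-input-bucket/visit-0001.wav",
    "example-output-bucket",
)
print(request["Type"])

# With boto3 installed and credentials configured:
#   boto3.client("transcribe").start_medical_transcription_job(**request)
```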
Streaming API
Transcribe audio streams in near real time via either WebSocket Secure or HTTP/2 protocols. Learn more »
Custom vocabulary
Boost transcription accuracy by using custom vocabulary for potentially out-of-lexicon terminology. Learn more »
Channel identification
Concurrently transcribe multi-channel audio at no extra charge. Get one final coherent transcript. Learn more »
Speaker diarization
Separate speech from different speakers within any mono-channel audio. Learn more »

Get started building with Amazon Transcribe in the AWS Management Console.