Amazon Polly is a service that turns text into lifelike speech.
Amazon Polly provides an API that lets you quickly integrate speech synthesis into your application. You send the text you want converted into speech to the Polly API, and Polly immediately returns the audio stream so that your application can play it directly or store it in a standard audio file format, such as MP3.
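|Text||Sample code|
|"Hi. My name is Joanna."||from boto3 import client
polly = client("polly", region_name="us-east-1")
response = polly.synthesize_speech(
    Text="Hi. My name is Joanna.",
    OutputFormat="mp3",  # assumed; the original sample is cut off after Text
    VoiceId="Joanna",  # assumed from the sample sentence
)|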
Amazon Polly includes 47 lifelike voices and support for 24 languages, so you can select the ideal voice and distribute your speech-enabled applications in many countries.
|Portuguese - Iberic||Inês||Cristiano|
|Spanish - Castilian||Conchita||Enrique|
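Rather than hard-coding voice names, an application can query the voices available for a language at run time. The following is a minimal sketch, assuming the boto3 `describe_voices` call and the `Voices` list in its response. The helper names `group_by_gender` and `fetch_voices` are hypothetical, and `SAMPLE_VOICES` is hand-written data that only mirrors the Castilian Spanish row above.

```python
from collections import defaultdict


def group_by_gender(voices):
    """Group voice names by gender from a list of describe_voices entries."""
    grouped = defaultdict(list)
    for voice in voices:
        grouped[voice["Gender"]].append(voice["Name"])
    return dict(grouped)


def fetch_voices(language_code, region_name="us-east-1"):
    """Ask Polly for the voices of one language (needs AWS credentials)."""
    import boto3  # imported lazily so group_by_gender works without the SDK

    polly = boto3.client("polly", region_name=region_name)
    response = polly.describe_voices(LanguageCode=language_code)
    return response["Voices"]


# Hand-written sample shaped like a describe_voices response (illustrative only).
SAMPLE_VOICES = [
    {"Name": "Conchita", "Gender": "Female", "LanguageCode": "es-ES"},
    {"Name": "Enrique", "Gender": "Male", "LanguageCode": "es-ES"},
]

if __name__ == "__main__":
    print(group_by_gender(SAMPLE_VOICES))
```

With credentials configured, `group_by_gender(fetch_voices("es-ES"))` would apply the same grouping to the live voice list. A long list may be paginated, which this sketch does not handle.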
With Amazon Polly, you can stream all kinds of information through your application to users in near real time. You can also choose from various sampling rates to optimize bandwidth and audio quality for your application. Amazon Polly supports MP3, Vorbis, and raw PCM audio stream formats.
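The trade-off between bandwidth and audio quality comes down to two request parameters, `OutputFormat` and `SampleRate`. The sketch below is one way to validate that pair before calling `synthesize_speech`. The table of accepted rates reflects my reading of the Polly API documentation and should be checked against the current reference; `build_request` and `synthesize_to_file` are hypothetical helper names.

```python
# Sample rates (in Hz, passed as strings) assumed valid for each output format.
VALID_SAMPLE_RATES = {
    "mp3": {"8000", "16000", "22050", "24000"},
    "ogg_vorbis": {"8000", "16000", "22050", "24000"},
    "pcm": {"8000", "16000"},
}


def build_request(text, voice_id="Joanna", output_format="mp3", sample_rate=None):
    """Return keyword arguments for synthesize_speech after basic validation."""
    if output_format not in VALID_SAMPLE_RATES:
        raise ValueError(f"unsupported output format: {output_format!r}")
    request = {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}
    if sample_rate is not None:
        if sample_rate not in VALID_SAMPLE_RATES[output_format]:
            raise ValueError(
                f"sample rate {sample_rate!r} is not valid for {output_format!r}"
            )
        request["SampleRate"] = sample_rate
    return request


def synthesize_to_file(path, text, region_name="us-east-1", **options):
    """Synthesize speech and write the audio to path (needs AWS credentials)."""
    import boto3  # imported lazily so build_request works without the SDK

    polly = boto3.client("polly", region_name=region_name)
    response = polly.synthesize_speech(**build_request(text, **options))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    # Low-bandwidth request: 8 kHz raw PCM instead of the default MP3 settings.
    print(build_request("Hi. My name is Joanna.", output_format="pcm", sample_rate="8000"))
```

Leaving `sample_rate` unset omits the parameter, so the service default for the chosen format applies.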
Amazon Polly supports Speech Synthesis Markup Language (SSML), a W3C standard, XML-based markup language for speech synthesis applications, and supports common SSML tags for phrasing, emphasis, and intonation. This flexibility helps you create lifelike speech that will attract and hold the attention of your audience.
|This is how I speak normally.||(none)|
|I can speak in a higher pitched voice, or I can speak in a lower pitched voice.||<speak>I can speak in a <prosody pitch="high">higher pitched voice</prosody>, or I can speak <prosody pitch="low">in a lower pitched voice</prosody></speak>|
|I can speak really slowly, or I can speak really fast.||<speak>I can speak <prosody rate="x-slow">really slowly</prosody>, or I can speak <prosody rate="x-fast">really fast</prosody></speak>|
|I can also speak very loudly, or I can speak very quietly.||<speak>I can also speak <prosody volume="x-loud">very loudly</prosody>, or I can speak <prosody volume="x-soft">very quietly</prosody>.</speak>|
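SSML like the rows above can be assembled in code and sent with `TextType="ssml"`. The following is a minimal sketch, not a full SSML builder. The helpers `prosody`, `speak`, and `synthesize_ssml` are hypothetical names, and the voice and output format are assumptions.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, **attrs):
    """Wrap text in a <prosody> tag, escaping the text and quoting attributes."""
    attr_text = "".join(f" {name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<prosody{attr_text}>{escape(text)}</prosody>"


def speak(*parts):
    """Join fragments into one SSML document.

    Fragments are inserted as-is, so plain text containing &, < or >
    should be passed through escape() first.
    """
    return "<speak>" + "".join(parts) + "</speak>"


SSML = speak(
    "I can speak ",
    prosody("really slowly", rate="x-slow"),
    ", or I can speak ",
    prosody("really fast", rate="x-fast"),
    ".",
)


def synthesize_ssml(ssml, voice_id="Joanna", region_name="us-east-1"):
    """Send SSML to Polly and return the MP3 bytes (needs AWS credentials)."""
    import boto3  # imported lazily so the SSML helpers work without the SDK

    polly = boto3.client("polly", region_name=region_name)
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",  # tells Polly to interpret Text as SSML, not plain text
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()


if __name__ == "__main__":
    print(SSML)
```

Escaping the spoken text matters because a stray `&` or `<` would make the SSML malformed, and the request would be rejected.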
Amazon Polly supports all the programming languages included in the AWS SDK (Java, Node.js, .NET, PHP, Python, Ruby, Go, and C++) and AWS Mobile SDK (iOS/Android). Polly also supports an HTTP API so you can implement your own access layer.
Amazon Polly can be accessed via the Polly API (and various language-specific SDKs), AWS Management Console, and the AWS command-line interface (CLI). You have full control over all the capabilities of Polly, whether you use the service through the console, the API, or the CLI.
With Amazon Polly’s custom lexicons, or vocabularies, you can modify the pronunciation of particular words, such as company names, acronyms, foreign words and neologisms (e.g., “ROTFL”, “C’est la vie” when spoken in a non-French voice). To customize these pronunciations, you upload an XML file with lexical entries. For example, you can customize the pronunciation of Nguyen by providing a phoneme using this XML: