Get Started with Amazon Polly

Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Polly includes 47 lifelike voices spread across 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

Amazon Polly delivers the consistently fast response times required to support real-time, interactive dialog. You can cache and save Polly’s speech audio to replay offline or redistribute. And Polly is easy to use. You simply send the text you want converted into speech to the Polly API, and Polly immediately returns the audio stream to your application so your application can play it directly or store it in a standard audio file format, such as MP3.

With Polly, you only pay for the number of characters you convert to speech, and you can save and replay Polly’s generated speech. Polly’s low cost per character converted, and lack of restrictions on storage and reuse of voice output, make it a cost-effective way to enable Text-to-Speech everywhere.

Introducing Amazon Polly
Amazon Polly: AWS re:Invent 2016

Announcing Speech Marks and Whispering

2-minute overview of the new Speech Marks and whispering voice features in Amazon Polly (April 2017)

Language | Female Voice | Male Voice | Sample Text
English | Joanna | Joey | Hello. Do you speak a foreign language? One language is never enough.
Danish | Naja | Mads | Hej. Taler du et fremmed sprog? Et sprog er aldrig nok.
Brazilian Portuguese | Vitória | Ricardo | Oi. Você fala algum idioma estrangeiro? Somente um idioma nunca é bastante.
Spanish | Penélope | Miguel | Hola. ¿Hablas algún idioma extranjero? Un solo idioma no es suficiente.
Icelandic | Dóra | Karl | Halló, Hæ talar þú erlent tungumál? Eitt tungumál er aldrei nóg.
Natural Sounding Voices

Amazon Polly provides 47 lifelike voices and supports 24 languages, including a wide range of male and female voices with a variety of accents. Polly’s fluid pronunciation of text in multiple languages enables you to deliver high-quality voice output and create applications for global users.
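As a rough illustration of choosing a voice, the sketch below asks Polly which voices it offers for one language. It assumes the AWS SDK for Python (boto3) and its `describe_voices` call; the helper name `voices_for_language` and the `da-DK` language code are chosen here for illustration and do not come from this page.

```python
def voices_for_language(polly_client, language_code):
    """Return sorted (voice_id, gender) pairs that Polly reports for one language code.

    polly_client is assumed to be a boto3 Polly client, for example:
        import boto3
        polly_client = boto3.client("polly")
    """
    response = polly_client.describe_voices(LanguageCode=language_code)
    return sorted((voice["Id"], voice["Gender"]) for voice in response["Voices"])

# Example call (requires AWS credentials):
#     voices_for_language(polly_client, "da-DK")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object when no AWS credentials are available.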

Easy Integration

Amazon Polly makes it easy to add voice to your website, mobile app, or device. With Polly, you just send the text you want converted to speech to the Polly API and it immediately returns the audio stream. Unlike other solutions that require a lengthy approval process, Polly doesn’t require you to describe how you will use Polly’s speech in your application, and there are no distribution agreements to sign, so you can start right away.
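The request-and-response flow described here might look roughly like the following with the AWS SDK for Python (boto3). This is a minimal sketch, not an official sample: the function name `synthesize_to_file` and the default voice are assumptions made for illustration, while `synthesize_speech` and its `Text`, `VoiceId`, and `OutputFormat` parameters are the SDK's own.

```python
def synthesize_to_file(polly_client, text, path, voice_id="Joanna", output_format="mp3"):
    """Send text to Polly and write the returned audio stream to a file.

    polly_client is assumed to be a boto3 Polly client, for example:
        import boto3
        polly_client = boto3.client("polly")
    Returns the number of audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat=output_format,  # "mp3", "ogg_vorbis", or "pcm"
    )
    audio = response["AudioStream"].read()  # the audio arrives as a byte stream
    with open(path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)

# Example call (requires AWS credentials):
#     synthesize_to_file(polly_client, "Hello. Do you speak a foreign language?", "hello.mp3")
```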

Store and Redistribute Speech

Unlike other solutions that require a royalty or charge a fee every time you replay previously generated audio, Amazon Polly allows for unlimited replays without any additional fees. These free replays extend to offline use as well. You can create speech files in a variety of standard formats, such as MP3 and OGG, and store these on devices such as mobile phones or Internet of Things (IoT) devices for offline playback.


Low Cost

Amazon Polly’s pay-as-you-go pricing, low cost per character converted, and unlimited replays make it a cost-effective way to enable speech synthesis in virtually any application.
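Because billing is based on characters converted, a back-of-the-envelope estimate is simple arithmetic. The sketch below is illustrative only: the helper name is hypothetical, the per-million-character price is a parameter you must supply from the current pricing page, and the count is a plain `len()` of each text, which may not match how the service meters every request.

```python
def estimate_cost(texts, price_per_million_chars):
    """Return (total_characters, estimated_cost) for a batch of texts.

    price_per_million_chars is an input supplied by the caller; no real price
    is assumed here. Replays of stored audio are not counted, since only
    conversion is charged.
    """
    total_chars = sum(len(text) for text in texts)
    cost = total_chars * price_per_million_chars / 1_000_000
    return total_chars, cost

# Example call with a placeholder price (not a quoted rate):
#     estimate_cost(["Hello", "world!"], price_per_million_chars=4.00)
```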

Fast Response

Delivering lifelike voices and conversational user experiences requires consistently fast response times. Voice-enabled applications need to play synthesized speech without delay. Consider apps that provide spoken directions for navigation, eLearning applications that provide verbal instruction to students, and apps that engage the user through real-time dialog. These apps are most effective when responses can start without perceived delays in the conversational flow. Even when you send lengthy text to Polly’s API, it returns the audio to your application as a stream so you can play the voices immediately. These kinds of dynamic, spoken responses require access to a much larger quantity of speech audio than is typically available to store on users’ devices. Amazon Polly is in the cloud, so you have access to a wide variety of synthesized speech. With Polly, your application can provide even more valuable responses that include real-time data.
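One way to take advantage of the streamed response is to hand audio to a player in small chunks instead of waiting for the whole file. The generator below is a sketch under the assumption that you use the AWS SDK for Python (boto3), whose `synthesize_speech` response exposes the audio as a file-like `AudioStream`; the function name and chunk size are illustrative choices.

```python
def stream_speech(polly_client, text, voice_id="Joanna", chunk_size=4096):
    """Yield audio chunks as they are read from Polly's response stream.

    polly_client is assumed to be a boto3 Polly client (boto3.client("polly")).
    """
    response = polly_client.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat="mp3"
    )
    stream = response["AudioStream"]
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            yield chunk
    finally:
        stream.close()

# Example use (requires AWS credentials; `player` is a hypothetical audio sink):
#     for chunk in stream_speech(polly_client, long_text):
#         player.write(chunk)
```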

Amazon Polly makes it easy to add speech to your video, presentation, or online training course. Polly can generate speech in 24 languages, making it easy to add voice to applications with a global audience. With Polly you can read your RSS feed, news, or email, and store synthesized speech in the form of audio files.

Content Creation

“Amazon Polly gives GoAnimate users the ability to immediately give voice to the characters they animate using our platform. This is especially helpful in scenarios where live voice-over is either resource or time prohibitive, such as when developing a video in many languages or within pre-production to speed the approval process. The speech is integrated seamlessly with our rich set of pre-animated assets, which reinforces GoAnimate’s ease-of-use and affords our customers both efficiency and speed to market.”

– Alvin Hung, CEO and founder, GoAnimate

Amazon Polly enables developers to provide their applications with an enhanced visual experience such as speech-synchronized facial animation or karaoke-style word highlighting. Amazon Polly makes it easy to request an additional stream of metadata with information about when particular sentences, words and sounds are being pronounced. Using this metadata stream alongside the synthesized speech audio stream, customers can animate avatars and highlight text as it is spoken in their app.
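A karaoke-style highlighter could be built on this metadata roughly as follows. The sketch assumes the AWS SDK for Python (boto3), where Speech Marks are requested with `OutputFormat="json"` and `SpeechMarkTypes`, and arrive as one JSON object per line with `time` (milliseconds), `type`, and `value` fields. The helper names are hypothetical.

```python
import json

def fetch_word_marks(polly_client, text, voice_id="Joanna"):
    """Request word-level Speech Marks instead of audio (boto3 client assumed)."""
    response = polly_client.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat="json", SpeechMarkTypes=["word"]
    )
    return parse_speech_marks(response["AudioStream"].read())

def parse_speech_marks(raw_bytes):
    """Parse the line-delimited JSON metadata stream into a list of dicts."""
    lines = raw_bytes.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]

def word_at(marks, elapsed_ms):
    """Return the word being spoken at elapsed_ms, or None before the first word.

    Assumes marks are ordered by their "time" field.
    """
    current = None
    for mark in marks:
        if mark["type"] == "word" and mark["time"] <= elapsed_ms:
            current = mark["value"]
    return current
```

An app would request the audio in a second call with the same text and voice, then call `word_at` with the player's current position to decide which word to highlight.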

“We strive to make the cloud-driven classroom more engaging and effective for everyone, including users with reading and writing disabilities. Amazon Polly enhances our learning platform by integrating high-quality Text-to-Speech voices with our suite of AppWriter products. It’s absolutely essential to our users to see real-time highlighting of the text while it is being read aloud. With Speech Marks from Polly, AppWriter can deliver an enhanced reading experience which truly levels the playing field for anyone struggling with reading and writing.”

– Stefan Pal, COO, Wizkids

Amazon Polly makes it easy to add voice to your mobile apps and games. With Polly, you can store standard speech responses on the device, and also enable dynamic, real-time responses such as in-game character dialog, leaderboard rankings, and game invitations.
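Storing standard responses while still synthesizing new ones on demand amounts to a small on-device cache. The following is one possible sketch, assuming the AWS SDK for Python (boto3); the function name, the SHA-256 file naming scheme, and the cache directory layout are illustrative assumptions rather than anything Polly prescribes.

```python
import hashlib
import os

def cached_speech(polly_client, text, cache_dir, voice_id="Joanna"):
    """Return the path of an MP3 for text, calling Polly only on a cache miss.

    polly_client is assumed to be a boto3 Polly client (boto3.client("polly")).
    """
    # Key the file on both voice and text so different voices do not collide.
    key = hashlib.sha256(f"{voice_id}:{text}".encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".mp3")
    if not os.path.exists(path):
        response = polly_client.synthesize_speech(
            Text=text, VoiceId=voice_id, OutputFormat="mp3"
        )
        with open(path, "wb") as audio_file:
            audio_file.write(response["AudioStream"].read())
    return path
```

Fixed phrases are synthesized once and replayed from disk afterwards, while dynamic text such as leaderboard rankings simply produces new cache entries as it appears.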

Mobile and Desktop Apps
The Washington Post

“We’ve long been interested in providing audio versions of our more than 1,200 daily stories, but we found that previously existing text-to-speech solutions were not cost-effective for the speech quality they offered. With the arrival of Amazon Polly and its high-quality voices, we look forward to offering readers more rich and versatile ways to experience our content.”

– Joseph Price, Senior Product Manager, The Washington Post

With Amazon Polly, your customer contact centers can respond with natural sounding voices. You can replay Polly’s speech output through your interactive voice response (IVR) systems. Additionally, you can leverage Polly’s API to deliver automated real-time information such as service status, account and billing inquiries, addresses, and contact information.

Customer Contact Center

Amazon Polly enables new Internet of Things (IoT) use cases by making it easy and inexpensive to add speech to IoT devices. IoT devices can use speech to provide natural responses and notifications, making applications more accessible and allowing users to consume information without having to rely on a screen. With Polly you can generate speech files and store them on your devices for offline playback.

Use AWS Lambda to generate pre-signed Polly URLs based on events from the AWS IoT rules engine, then use Device Gateway to send these URLs to your IoT devices to allow them to request lifelike speech.

Internet of Things (IoT)

Amazon Polly can be used to improve the usability of applications that teach people how to speak new languages. For example, end users can type foreign language phrases into your application, then hear them spoken by a native speaker. Polly supports 24 languages, giving teachers and students plenty of options.

Language Learning

“I can't think of many use cases where accurate pronunciation is more important than when you're learning a new language. We have found that the Amazon Polly voices are not just high in quality, but are as good as natural human speech for teaching a language.”

– Severin Hacker, CTO, Duolingo

With Amazon Polly you can create and distribute accessible information in the form of synthesized speech for visually impaired people. This way you can help people with sight loss to consume various content like news, books or email messages.

Royal National Institute of Blind People

“We are currently using Amazon’s Speech-to-Text technology to create and distribute accessible information in the form of synthesized audio content for our many B2B and B2C customers, including utility companies, financial institutions, and media companies, as well as other customer-facing material such as magazines and publications. With the announcement of Amazon Polly, we’re excited about the ability to provide an even better experience to these customers by delivering incredibly lifelike voices that will captivate and engage our audience.”

– John Worsfold, Solutions Implementation Manager, Royal National Institute of Blind People

It's easy to get started with Polly. Sign in to the console to start generating speech from your own text in a few clicks.
