Deploy high-quality, natural-sounding human voices in dozens of languages.

Customize and control speech output with lexicons and Speech Synthesis Markup Language (SSML) tags.

Store and redistribute speech in standard formats like MP3 and OGG.

Quickly deliver lifelike voices and conversational user experiences with consistently fast response times.

How it works

Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech, so you can convert text such as articles to speech. It offers dozens of lifelike voices across a broad set of languages, which you can use to build speech-activated applications.
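As a rough illustration of converting text to speech, the sketch below uses the AWS SDK for Python (boto3) and its synthesize_speech call. The helper names build_request and synthesize_to_file, the voice "Joanna", and the file stem are assumptions chosen for this example, and the code presumes that AWS credentials and a region are already configured.

```python
from pathlib import Path

# File extensions for the output formats this sketch handles.
EXTENSIONS = {"mp3": ".mp3", "ogg_vorbis": ".ogg", "pcm": ".pcm"}


def build_request(text, voice_id="Joanna", output_format="mp3"):
    """Return keyword arguments for Polly's SynthesizeSpeech operation."""
    if output_format not in EXTENSIONS:
        raise ValueError(f"unsupported output format: {output_format}")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text, stem, **options):
    """Call Amazon Polly and write the returned audio next to the given stem."""
    import boto3  # imported lazily so build_request works without the AWS SDK

    request = build_request(text, **options)
    response = boto3.client("polly").synthesize_speech(**request)
    target = Path(stem).with_suffix(EXTENSIONS[request["OutputFormat"]])
    # AudioStream is a streaming body; read() returns the encoded audio bytes.
    target.write_bytes(response["AudioStream"].read())
    return target
```

With credentials in place, a call such as synthesize_to_file("Welcome to our site.", "welcome", output_format="ogg_vorbis") would be expected to write welcome.ogg, which can then be stored or redistributed like any other audio file.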

Use cases

Generate speech in dozens of languages

Add speech to applications with a global audience, such as RSS feeds, websites, or videos.

Engage customers with a natural-sounding voice

Store and replay Amazon Polly speech output to prompt callers through interactive or automated voice response systems.

Adjust speaking style, speech rate, pitch, and loudness

Use SSML, a W3C-standard XML-based markup language for speech synthesis applications, to control phrasing, emphasis, and intonation with common tags.
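To make the SSML adjustments concrete, here is a minimal sketch that wraps plain text in speak and prosody tags and optionally appends a pause. The helper name build_ssml and its parameters are hypothetical, and support for individual tags and attributes varies by voice, so treat the values as illustrative.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, rate="medium", pitch="medium", volume="medium", pause_ms=0):
    """Wrap text in <speak>/<prosody> tags, with an optional trailing pause."""
    # quoteattr adds the surrounding quotes and escapes attribute values.
    attrs = f"rate={quoteattr(rate)} pitch={quoteattr(pitch)} volume={quoteattr(volume)}"
    pause = f'<break time="{int(pause_ms)}ms"/>' if pause_ms else ""
    # escape() protects characters such as & and < in the spoken text.
    return f"<speak><prosody {attrs}>{escape(text)}</prosody>{pause}</speak>"


ssml = build_ssml("Terms & conditions apply.", rate="slow", pitch="+10%",
                  volume="loud", pause_ms=500)
print(ssml)
```

The resulting string would be sent as the Text parameter of a synthesize_speech request with TextType set to "ssml" so that the service interprets the tags rather than reading them aloud.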

