AWS Machine Learning Blog
Create softer speech with the new Amazon Polly phonation tag
Speech Synthesis Markup Language (SSML) is a standardized markup language that enables developers to modify Text-to-Speech (TTS) audio. With SSML, you can control various vocal characteristics of TTS output, such as pronunciation, speech rate, and other elements, to produce a more natural-sounding voice experience.
Today, we are excited to announce a new phonation SSML tag that you can use with Amazon Polly. The new phonation tag lets you produce softer-sounding speech.
Using the new phonation tag
The amazon:effect tag, used with the new phonation="soft" attribute, tells Amazon Polly to generate softer speech. Note that amazon:effect requires a closing tag. Text outside the tag is spoken in the normal voice, whereas text wrapped in the tag is spoken more softly.
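As a rough illustration of the markup described above, the following Python sketch builds an SSML string in which only the second sentence is wrapped in the phonation effect. The helper names soft_ssml and synthesize, the voice Matthew, and the output path soft.mp3 are assumptions made for this example. The synthesize helper is one possible way to send the markup to Amazon Polly through boto3, and it needs AWS credentials to run.

```python
from xml.sax.saxutils import escape


def soft_ssml(normal_text: str, soft_text: str) -> str:
    """Return SSML in which only soft_text is wrapped in the phonation effect."""
    return (
        "<speak>"
        f"{escape(normal_text)} "
        # amazon:effect needs an explicit closing tag.
        f'<amazon:effect phonation="soft">{escape(soft_text)}</amazon:effect>'
        "</speak>"
    )


def synthesize(ssml: str, voice_id: str = "Matthew", path: str = "soft.mp3") -> None:
    """Send SSML to Amazon Polly and save the audio (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so soft_ssml works without boto3 installed

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="SSML",  # tell Polly the input is SSML, not plain text
        OutputFormat="mp3",
        VoiceId=voice_id,  # hypothetical choice; substitute any voice you prefer
    )
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    print(soft_ssml("This is my normal voice.", "This is my softer voice."))
```

The printed string is plain SSML, so it can also be pasted directly into the Amazon Polly console instead of being sent through synthesize.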
To try it, paste SSML that uses the phonation tag into the Amazon Polly console and test it with any of the Amazon Polly voices.
Amazon Polly supports standard SSML tags such as prosody, which lets you control the volume, rate, and pitch of the spoken text. Amazon Polly also has unique tags you can use for cool effects, such as whispered voice, dynamic range compression, and vocal tract length, which further enhance your ability to modify Amazon Polly voices to best suit your needs.
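The tags mentioned above can be combined in a single document. The sketch below is illustrative only: the element and demo_ssml helpers, the sample sentences, and the attribute values (loud, slow, low, +15%) are assumptions chosen for the example, not recommended settings.

```python
from xml.sax.saxutils import escape, quoteattr


def element(tag: str, attrs: dict, inner: str) -> str:
    """Wrap already-escaped inner markup in <tag ...>...</tag>."""
    rendered = "".join(f" {name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<{tag}{rendered}>{inner}</{tag}>"


def demo_ssml() -> str:
    """Build one SSML document that exercises each effect once."""
    parts = [
        # Standard SSML: volume, rate, and pitch of delivery.
        element("prosody", {"volume": "loud", "rate": "slow", "pitch": "low"},
                escape("Slow, loud, and low.")),
        # Amazon Polly extensions, all expressed through amazon:effect.
        element("amazon:effect", {"name": "whispered"}, escape("A whispered aside.")),
        element("amazon:effect", {"name": "drc"}, escape("Dynamic range compression.")),
        element("amazon:effect", {"vocal-tract-length": "+15%"},
                escape("A longer vocal tract.")),
    ]
    return element("speak", {}, " ".join(parts))


if __name__ == "__main__":
    print(demo_ssml())
```

Keeping each effect in its own element makes it easy to listen to them one at a time before nesting or combining them.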
About the Author
Binny Peh is a Sr. Product Marketing Manager for AWS machine learning solutions. In her spare time, she indulges in too much television and is an aspiring foodie. Binny’s glass is always half-full, and she believes in the power of positive thinking.