Create audio for content in multiple languages with the same TTS voice persona in Amazon Polly

June 2025: This post was reviewed and updated for accuracy.

Amazon Polly is a leading cloud-based service that converts text into lifelike speech. Following the adoption of Neural Text-to-Speech (NTTS), we have continuously expanded our portfolio of available voices in order to provide a wide selection of distinct speakers in supported languages. On top of the four previous additions: Pedro speaking US Spanish, Daniel speaking German, Liam speaking Canadian French, and Arthur speaking British English, we are pleased to announce three new additions: Andrés speaking Mexican Spanish, Sergio speaking European Spanish and Rémi speaking French. As with all the Neural voices in our portfolio, these voices offer fluent, native pronunciation in their target languages. However, what is unique about these seven voices is that they are all based on the same voice persona.

Pedro, Daniel, Liam, Arthur, Andrés, Sergio and Rémi are all modelled on the existing US English Matthew voice. While customers continue to appreciate Matthew for his naturalness and professional-sounding quality, the voice has so far exclusively served English-speaking traffic. Now, using deep-learning methods, we have decoupled language from speaker identity, which allowed us to preserve native-like fluency across many languages without having to obtain multilingual data from the same speaker. In practice, this means that we transferred the vocal characteristics of the US English Matthew voice to US Spanish, German, Canadian French, British English, Mexican Spanish, European Spanish and French, opening up new opportunities for Amazon Polly customers.

Having a similar-sounding voice available in eight locales unlocks great potential for business growth. First of all, customers with a global footprint can create a consistent user experience across languages and regions. For example, an interactive voice response (IVR) system that supports multiple languages can now serve different customer segments without changing the feel of the brand. The same goes for all other TTS use cases, such as voicing news articles, education materials, or podcasts.

Secondly, the voices are a good fit for Amazon Polly customers who are looking for a native pronunciation of foreign phrases in any of the eight supported locales.

Thirdly, releasing Pedro, Daniel, Liam, Arthur, Andrés, Sergio and Rémi serves our customers who like Amazon Polly NTTS in US Spanish, German, Canadian French, British English, Mexican Spanish, European Spanish and French but are looking for a high-quality masculine voice. They can use these voices to create audio for monolingual content and expect quality that is on par with other NTTS voices in these languages.

Lastly, the technology we have developed to create the new male NTTS voices can also be used for Brand Voices. Thanks to this, Brand Voice customers can not only enjoy a unique NTTS voice that is tailored to their brand, but also keep a consistent experience while serving an international audience.

Example use case

Let’s explore an example use case to demonstrate what this means in practice. Amazon Polly customers familiar with Matthew can still use this voice in the usual way by choosing Matthew on the Amazon Polly console and entering any text they want to hear spoken in US English. In the following scenario, we generate audio samples for an IVR system (“For English, please press one”):

Thanks to this release, you can now expand the use case to deliver a consistent audio experience in different languages. All the new voices are natural-sounding and maintain a native-like accent.

To generate speech in British English, choose Arthur (“For English, please press one”):
To use a US Spanish speaker, choose Pedro (“Para español, por favor marque dos”):
Daniel offers support in German (“Für Deutsch drücken Sie bitte die Drei”):
You can synthesize text in Canadian French by choosing Liam (“Pour le français, veuillez appuyer sur le quatre”):
To synthesize text in Mexican Spanish, choose Andrés (“Para español, por favor marque cinco”):
To synthesize text in European Spanish, choose Sergio (“Para español, por favor marque seis”):
To synthesize text in French, choose Rémi (“Pour le français, veuillez appuyer sur le sept”):
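
The same choices can be made programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and configured AWS credentials. The helper names `PERSONA_VOICES`, `build_request`, and `synthesize` are hypothetical and not part of the original post. Note that the API voice IDs use ASCII spellings (`Andres`, `Remi`).

```python
# Locale -> voice ID for the eight voices that share the Matthew persona.
# Voice IDs use the ASCII spellings expected by the Polly API.
PERSONA_VOICES = {
    "en-US": "Matthew",
    "en-GB": "Arthur",
    "es-US": "Pedro",
    "de-DE": "Daniel",
    "fr-CA": "Liam",
    "es-MX": "Andres",
    "es-ES": "Sergio",
    "fr-FR": "Remi",
}


def build_request(language_code, text, engine="neural"):
    """Return keyword arguments for synthesize_speech (hypothetical helper)."""
    return {
        "Engine": engine,
        "LanguageCode": language_code,
        "OutputFormat": "mp3",
        "Text": text,
        "VoiceId": PERSONA_VOICES[language_code],
    }


def synthesize(language_code, text, engine="neural"):
    """Call Amazon Polly and return MP3 bytes. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without the SDK

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(language_code, text, engine))
    return response["AudioStream"].read()
```

For example, `synthesize("de-DE", "Für Deutsch drücken Sie bitte die Drei")` would request the German prompt from Daniel.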

Note that apart from speaking with a different accent, the British English Arthur voice will localize the input text differently than the US English Matthew voice. For example, “1/2/22” will be read by Arthur as “the 1st of February 2022,” whereas Matthew will read it as “January 2nd 2022.”
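
If the same date must be read the same way regardless of voice, one option is to state the field order explicitly with the SSML `say-as` tag. The sketch below only builds the SSML string. The helper name `date_ssml` is hypothetical, and the post does not confirm how each voice renders a tagged date, so treat this as a starting point to test rather than a guaranteed result.

```python
from xml.sax.saxutils import escape


def date_ssml(date_text, date_format):
    """Wrap a date in a say-as tag so the field order (e.g. mdy, dmy) is explicit."""
    return (
        "<speak>"
        f'<say-as interpret-as="date" format="{date_format}">'
        f"{escape(date_text)}"
        "</say-as>"
        "</speak>"
    )


# Month-day-year order, independent of the voice's locale defaults.
ssml = date_ssml("1/2/22", "mdy")
```

The resulting string would be sent with `TextType="ssml"` in the synthesis request.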

Now let’s combine these prompts:
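
The post does not say how the combined sample was produced. One simple approach, shown here as an assumption, is to synthesize each prompt to MP3 with the same settings and join the resulting bytes in order. The names `build_menu` and `IVR_PROMPTS` are hypothetical, and `synthesize` stands for any callable that takes a language code and text and returns audio bytes.

```python
# A few of the prompts from the post, paired with their locale.
IVR_PROMPTS = [
    ("en-GB", "For English, please press one"),
    ("es-US", "Para español, por favor marque dos"),
    ("de-DE", "Für Deutsch drücken Sie bitte die Drei"),
    ("fr-CA", "Pour le français, veuillez appuyer sur le quatre"),
]


def build_menu(prompts, synthesize):
    """Synthesize each (language_code, text) prompt and join the audio in order."""
    return b"".join(synthesize(language_code, text) for language_code, text in prompts)
```

Joining MP3 bytes this way is a shortcut that most players tolerate. For production audio, a dedicated audio tool is the safer choice.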

Conclusion

Pedro, Daniel, Andrés, Sergio and Rémi are available as both Generative and Neural TTS voices, whereas Arthur and Liam are available as Neural TTS voices only. To use them, select the Generative or Neural engine in an AWS Region that supports that engine. The fact that their personas are consistent across languages is an additional benefit, which we hope will delight customers working with content in multiple languages. For more details, review our full list of Amazon Polly text-to-speech voices, Neural TTS pricing, service limits, and FAQs, and visit our pricing page.
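
Engine support can also be checked at run time rather than hard-coded. The sketch below filters a `DescribeVoices` response by engine. The helper names are hypothetical, the live call assumes boto3 and AWS credentials, and `SAMPLE_RESPONSE` is hand-written in the shape of the API response to mirror the statement above, not real service output.

```python
def voices_supporting(describe_voices_response, engine):
    """Return sorted voice IDs whose SupportedEngines include the given engine."""
    return sorted(
        voice["Id"]
        for voice in describe_voices_response.get("Voices", [])
        if engine in voice.get("SupportedEngines", [])
    )


def live_voices_supporting(engine, region_name=None):
    """Query Amazon Polly in one Region (first page of results only)."""
    import boto3  # imported lazily so the filter above works without the SDK

    polly = boto3.client("polly", region_name=region_name)
    return voices_supporting(polly.describe_voices(Engine=engine), engine)


# Hand-written sample, not real output: two voices as described in the post.
SAMPLE_RESPONSE = {
    "Voices": [
        {"Id": "Pedro", "SupportedEngines": ["generative", "neural"]},
        {"Id": "Arthur", "SupportedEngines": ["neural"]},
    ]
}
```

With the sample above, filtering on `"generative"` keeps only Pedro, while filtering on `"neural"` keeps both voices.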

About the Authors

Patryk Wainaina is a Language Engineer working on text-to-speech for English, German, and Spanish. With a background in speech and language processing, his interests lie in machine learning as applied to TTS front-end solutions, particularly in low-resource settings. In his free time, he enjoys listening to electronic music and learning new languages.

Marta Smolarek is a Senior Program Manager in the Amazon Text-to-Speech team, where she is focused on the Contact Center TTS use case. She defines Go-to-Market initiatives, uses customer feedback to build the product roadmap and coordinates TTS voice launches. Outside of work, she loves to go camping with her family.

Liang Pan is a Solutions Architect at AWS with an extensive background in AI/ML. He is a builder at heart coming from a Software Engineering and Product Management background and a passionate technologist. He loves building and tinkering with new technologies. Outside of work, he likes to stay active and travel.

Nishant Dhiman is a Senior Solutions Architect at AWS with an extensive background in Serverless, Generative AI, Security and Mobile platform offerings. He is a voracious reader and a passionate technologist. He loves to interact with customers and believes in giving back to community by learning and sharing. Outside of work, he likes to keep himself engaged with podcasts, calligraphy and music.

Audit History

Last reviewed and updated in June 2025 by Liang Pan | Solutions Architect and Nishant Dhiman | Sr. Solutions Architect
