Make R speak

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Every wanted to make R talk to you? Now you can, with the mscstts package by John Muschelli. It provides an interface to the Microsoft Cognitive Services Text-to-Speech API (hence the name) in Azure, and you can use it to convert any short piece of text to a playable audio file, rendering it as speech using a number of different voices.

Before you can generate speech yourself, you'll need a Bing Speech API key. If you don't already have an Azure account, you can generate a free 7-day API key in seconds by visiting the Try Cognitive Services page, selecting “Speech APIs”, and clicking “Get API Key” for “Bing Speech”:

Bing speech

No credit card is needed; all you need is a Microsoft, Facebook, LinkedIn or Github account. (If you need permanent keys, you can create an Azure account here which you can use for 5000 free Speech API calls per month, and also get $200 in free credits for use with any Azure service.)

Once you have your key (you'll actually get two, but you can use either one) you will call the ms_synthesize function to convert text up to 1,000 characters or so into mp3 data, which you can then save to a file. (See lines 8-10 in the speakR.R script, below.) Then, you can play the file with any MP3 player on your system. On Windows, the easiest way to do that is to call start on the MP3 file itself, which will use your system default. (You'll need to modify line 11 of the script to work non-Windows systems.)

Saving the data to a file and then invoking a player got a bit tedious for me, so I created the say function in the script below to automate the process. Let's see it in action:

Note that you can choose from a number of accents and spoken languages (including British English, Canadian French, Chinese and Japanese), and the gender of the voice (though both female and male voices aren't available for all languages).  You can even modify volume, pitch, speaking rate, and even the pronunciation of individual words using the SSML standard. (This does mean you can't use characters recognized as SSML in your text, which is why the say function below filters out < > and / first.) 

The mscstts package is available for download now from your favorite CRAN mirror, and you can find the latest development version on Github. Many thanks to John Muschelli for putting this handy package together!

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)