by DancingSpotlights » Mon Mar 28, 2011 1:13 pm
I use Cepstral for shorthand practice. Typing a passage and using TTS is much easier than recording a passage at a target speed and using a sound editor to get speeds on either side. We're about halfway through automating it.
The group I'm in needs speeds from 20wpm (beginners) to 200wpm. We usually bracket our target speed with two speeds on either side (usually 10 and 20wpm above and below).
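For anyone who wants to try the bracketing idea, here is a rough sketch in Python. The function name, the default offsets, and the 20 to 200wpm limits are just my reading of the description above, not part of our actual program.

```python
def bracket_speeds(target_wpm, offsets=(10, 20), floor=20, ceiling=200):
    """Return the target speed plus the speeds 10 and 20 wpm either side.

    Speeds outside the floor..ceiling range are dropped. All names and
    defaults here are illustrative assumptions.
    """
    speeds = {target_wpm}
    for off in offsets:
        for candidate in (target_wpm - off, target_wpm + off):
            if floor <= candidate <= ceiling:
                speeds.add(candidate)
    return sorted(speeds)


if __name__ == "__main__":
    print(bracket_speeds(100))
```

A target near either end of the range simply gets fewer bracketing speeds.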
My husband wrote a program that adds <break= > tags to a text file between each word. That worked well enough for my personal use, but shorthand writers usually think in terms of wpm.
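This is not his program, just a minimal sketch of the same idea. It assumes the standard SSML form of the tag, `<break time="250ms"/>`, and a hypothetical function name `add_breaks`. Words are XML-escaped so a stray `&` or `<` in the passage does not confuse the SSML parser.

```python
from xml.sax.saxutils import escape


def add_breaks(text, break_ms):
    """Insert an SSML break tag between every pair of words.

    break_ms is the pause length in milliseconds. Splitting on whitespace
    is a simplification; punctuation stays attached to its word.
    """
    words = [escape(word) for word in text.split()]
    tag = f' <break time="{break_ms}ms"/> '
    return tag.join(words)


if __name__ == "__main__":
    print(add_breaks("Dear Sir: We have your letter", 250))
```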
We're now experimenting with break times and voice rates to get specific speeds in wpm. Simply setting a low rate results in words that are dragged out to incomprehensibility. Simply lengthening the break time results in staccato words separated by long silences.
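One way to think about the trade-off: at W wpm each word gets a slot of 60000/W milliseconds, and that slot has to be split between the spoken word and the break after it. The sketch below picks a voice rate within limits and gives whatever is left of the slot to the break. Every number in it (350ms per word at normal rate, the rate limits, the 60% share for speech) is a placeholder assumption and would need calibrating by timing real output from your voice.

```python
def plan_timing(target_wpm, natural_ms_per_word=350.0,
                min_rate=0.7, max_rate=2.0, speech_share=0.6):
    """Return (rate_multiplier, break_ms) for a target speed in wpm.

    natural_ms_per_word: assumed average word length at normal rate.
    min_rate / max_rate: assumed limits beyond which the voice is unusable.
    speech_share: assumed fraction of each word slot filled with speech.
    """
    slot_ms = 60000.0 / target_wpm              # time available per word
    desired_word_ms = slot_ms * speech_share    # how long we want a word to last
    rate = natural_ms_per_word / desired_word_ms
    rate = max(min_rate, min(max_rate, rate))   # keep the voice intelligible
    word_ms = natural_ms_per_word / rate        # word length at the chosen rate
    break_ms = max(0.0, slot_ms - word_ms)      # the rest of the slot is silence
    return rate, round(break_ms)


if __name__ == "__main__":
    for wpm in (20, 60, 100, 200):
        print(wpm, plan_timing(wpm))
```

At slow speeds the rate hits its lower limit and the break absorbs the extra time, which avoids dragging the words out. At fast speeds the rate rises and the break shrinks.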
The end goal is a program that, when given a passage and target speeds, will create .txt (or .ssml) files and a .bat to send them all to Swift to create a sound file for each speed.
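As a rough illustration of that end goal (not the finished program), the sketch below writes one .ssml file per speed plus a .bat that calls Swift on each. The `swift -f input -o output.wav` command line is my understanding of the Swift options and should be checked against your installed version. The 300ms average word length and the file naming are assumptions made up for the example.

```python
from pathlib import Path
from xml.sax.saxutils import escape

ASSUMED_WORD_MS = 300  # hypothetical average spoken word length; calibrate per voice


def write_practice_files(passage, speeds, out_dir, name="passage"):
    """Write <name>_<wpm>wpm.ssml for each speed and a <name>.bat to render them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    words = [escape(word) for word in passage.split()]
    bat_lines = ["@echo off"]
    for wpm in speeds:
        # Whatever is left of each word's time slot becomes the break.
        break_ms = max(0, round(60000 / wpm - ASSUMED_WORD_MS))
        tag = f' <break time="{break_ms}ms"/> '
        ssml_name = f"{name}_{wpm}wpm.ssml"
        (out / ssml_name).write_text(tag.join(words) + "\n", encoding="utf-8")
        # Assumed Swift usage: -f reads the input file, -o names the output wav.
        bat_lines.append(f'swift -f "{ssml_name}" -o "{name}_{wpm}wpm.wav"')
    bat_path = out / f"{name}.bat"
    # Batch files conventionally use CRLF line endings.
    bat_path.write_bytes(("\r\n".join(bat_lines) + "\r\n").encode("ascii"))
    return bat_path


if __name__ == "__main__":
    print(write_practice_files("Dear Sir: We have your letter", [80, 90, 100], "out"))
```

Running the generated .bat from the output folder would then produce one .wav per speed, assuming Swift is on the PATH and a licensed voice is installed.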
Unless there is huge demand (unlikely), it will be up to the end user to license their own voices. They will also have to provide their own text files, as most shorthand books are still under copyright. Still, licensing their own voices and typing the passages is inexpensive and easy compared to the other ways we've tried.
Many thanks to the Cepstral support person several years ago who suggested we check the SSML page on Cepstral's site. None of the other TTS companies bothered to respond.