Voices Garbled?
Posted:
Sun Apr 01, 2007 5:20 pm
by nabaztalk
Okay, I downloaded the demo and tried the Diane voice, but the speech comes out garbled.
Can someone tell me if this is what it's supposed to sound like, or is there something on my machine not encoding at the best quality?
http://nabaztalk.com/audio/woodchuck.wav
Thanks,
Roy
Posted:
Wed Apr 04, 2007 12:55 pm
by hailong
The Cepstral TTS engine is based on a statistical model. Although it performs quite well in most cases, some specific text input may produce low-quality synthesis.
If you are not satisfied with Diane, please try Callie and David. Cepstral has more than 30 voices; I believe one of them might suit you.
Posted:
Wed Apr 04, 2007 6:59 pm
by nabaztalk
Thanks for the tip. I did some testing on the website with Callie and here's what I came up with.
Here's the text...
"The Economist magazine has an article on Flying wind farms. Mind you, we're not talking about ordinary terrestrial windmills."
Here's the audio...
http://nabaztalk.com/audio/cepstral.wav
It does pretty well until it gets to "talking" and "ordinary", where some strange high-pitched parts sneak in.
What is causing that?
Posted:
Wed May 16, 2007 11:57 am
by Alex
Hello Nabaztalk,
We at Cepstral strive to perfect our voices and are continually working on them to make them better. The February 4.2.0 release actually fixed a large amount of these types of issues for Callie, but one can still find them here and there, as you have with these.
The issue arises when joined speech units are not as similar as desired, and the difference between them is audible.
Through complex tuning of a voice's units and through advancement of the engine we will continue to improve the voices, and you should see fewer and fewer of these small wavers with each passing release!
For now, one way to immediately improve the synthesis (on text you control) is to experiment with different phrasing. Often, adding a comma in an appropriate place can bring out entirely new units that join better.
Thanks for trying out Cepstral voices!
-Alex
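Editor's note: the phrasing experiment Alex describes could be sketched roughly as below. This is only an illustration in Python; the function name `comma_variants` and the `max_variants` parameter are made up for this sketch and are not part of any Cepstral tool. It only produces alternative versions of a sentence, each with one extra comma, which you would then synthesize and compare by ear.

```python
def comma_variants(sentence, max_variants=10):
    """Return copies of `sentence`, each with one extra comma after a word.

    Hypothetical helper for trying out different phrasings; it does not
    call any TTS engine.
    """
    words = sentence.split()
    variants = []
    # Skip the last word: a comma at the very end changes nothing useful.
    for i in range(len(words) - 1):
        word = words[i]
        # Leave words alone if they already end in punctuation.
        if word[-1] in ",.;:!?":
            continue
        candidate = words[:i] + [word + ","] + words[i + 1:]
        variants.append(" ".join(candidate))
        if len(variants) >= max_variants:
            break
    return variants


if __name__ == "__main__":
    for text in comma_variants("Mind you, we're not talking about ordinary terrestrial windmills."):
        print(text)
```

Each printed line is a candidate to paste into the demo page; which one joins best is something only listening can tell you.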
Posted:
Wed May 16, 2007 12:14 pm
by nabaztalk
Alex,
Good to hear (pun intended) that you have made improvements to your text-to-speech engine. My implementation idea was to use your product for encoding RSS feeds to audio, so the same words would not recur often enough to make defining custom pronunciation rules practical.
Why not post the same sentence I provided previously in this thread, using your new version?
Roy (Nabaztalk)
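Editor's note: the RSS-to-audio idea needs plain text pulled out of each feed item before anything reaches a TTS engine. Here is a minimal sketch of that step, assuming a standard RSS 2.0 feed with `item`, `title`, and `description` elements. The function name `rss_items_to_text` is hypothetical, and the sketch stops short of the synthesis call, since that depends on the engine you use.

```python
import html
import re
import xml.etree.ElementTree as ET


def rss_items_to_text(rss_xml):
    """Return one plain-text string per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    texts = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        description = (item.findtext("description") or "").strip()
        # Descriptions often carry HTML: drop tags, then unescape entities.
        description = html.unescape(re.sub(r"<[^>]+>", " ", description))
        # Collapse runs of whitespace left behind by the removed tags.
        description = " ".join(description.split())
        texts.append(". ".join(part for part in (title, description) if part))
    return texts


if __name__ == "__main__":
    sample = (
        '<rss version="2.0"><channel><item>'
        "<title>Flying wind farms</title>"
        "<description>&lt;p&gt;Not ordinary windmills.&lt;/p&gt;</description>"
        "</item></channel></rss>"
    )
    for text in rss_items_to_text(sample):
        print(text)
```

Because feed text changes with every item, any cleanup has to happen in a generic step like this one rather than through per-word rules.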
Posted:
Thu May 17, 2007 12:24 pm
by Alex
Roy,
The version you were listening to on our demo page is 4.2.0, so it would still have the error. I'll see if I can get a sample WAV from the current development version up for you here soon.
Good luck with the TTS-RSS, that certainly adds some flavor to it!
-Alex
Posted:
Thu May 17, 2007 6:37 pm
by nabaztalk
Okay, I am officially confused. Didn't you say version 4.2.0 is the new version, with the improvements?
Posted:
Fri May 18, 2007 10:12 am
by Alex
Sorry for the confusion!
Here's the relevant bit:
The February 4.2.0 release actually fixed a large amount of these types of issues for Callie, but one can still find them here and there, as you have with these.
That is to say that they have improved in the past, and will in the future.
Here's that sample from the in-development version that I had promised:
http://www.cepstral.com/forum/audio/callie_forum01.wav
It sounds a little better to me.
Hope this clarifies,
-Alex
Posted:
Sat May 26, 2007 8:45 pm
by nabaztalk
Thanks for posting the sample. Yep, some of the high-pitched 'burps' are gone. So, I'll wait until that version comes out of development and give it another shot.