
Text-To-Speech

I attempt to get text-to-speech up and running on Linux and/or FreeBSD... an ongoing saga, and not a pretty one at the moment.

Unless you like computer-generated chipmunk sounds.

Giving A Voice To The Robot:

Installing and Playing With Festival on Linux and FreeBSD


By Chris Snyder

Installation Notes

First, boy do you need this in package form. Get the RPM, package, port -- whatever you need to not compile from source by yourself. It took me a good half-hour to break down and install the RPM.

Loving the RPM (the one I had, anyway):

If you still use Red Hat 8, you might follow this path, like I did. Oops.
[root@galactron src]# rpm -Uvh /home/csnyder/files/RPMS/festival-1.4.2-12.i386.rpm
Preparing... ########################################### [100%]
1:festival ########################################### [100%]
[root@galactron src]# text2wave -h
/usr/bin/text2wave: line 2: /usr/src/build/137727-i386/BUILD/festival/bin/festival: No such file or directory
/usr/bin/text2wave: line 2: exec: /usr/src/build/137727-i386/BUILD/festival/bin/festival: cannot execute: No such file or directory
[root@galactron src]# mkdir -p /usr/src/build/137727-i386/BUILD/festival/bin/
[root@galactron src]# cd /usr/src/build/137727-i386/BUILD/festival/bin/
[root@galactron bin]# which festival
/usr/bin/festival
[root@galactron bin]# ln -s /usr/bin/festival festival
[root@galactron bin]# text2wave -h
text2wave [options] textfile
Convert a textfile to a waveform
Options
-mode Explicit tts mode.
-o ofile File to save waveform (default is stdout).
-otype Output waveform type: ulaw, snd, aiff, riff, nist etc.
(default is riff)
-F Output frequency.
-scale Volume factor
-eval File or lisp s-expression to be evaluated before
synthesis.
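
The failure above comes from a build-time path baked into the text2wave wrapper script. Here is a minimal sketch for spotting that before it bites, assuming the baked-in path is absolute and ends in /festival, as it does in the error message. The function names hardcoded_festival and check_wrapper are made up for illustration; they are not part of Festival.

```shell
# Print the festival path hard-coded in a wrapper script such as text2wave.
# Assumption: the path is absolute and ends in /festival.
hardcoded_festival() {
    grep -o '/[^ "]*/festival' "$1" | head -n 1
}

# Report whether that hard-coded path is actually executable on this machine.
check_wrapper() {
    target=$(hardcoded_festival "$1")
    if [ -x "$target" ]; then
        echo "ok: $target"
    else
        echo "missing: $target"
    fi
}
```

Something like check_wrapper /usr/bin/text2wave would then tell you whether you need the symlink trick (or a different package).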


Speech Good. Chipmunk Voice Bad.

Your new friend isn't named festival, actually, s/he's named text2wave, and s/he translates text files (or piped input) into waveform data that you, yes you, can save -- or pipe out to an Ogg Vorbis encoder if you're me, yes me.

But here's a simple test:
cd ~/
cat .bashrc | text2wave -o test.wav
play test.wav
Sounds fine! Graduate to Ogg Vorbis:
cat /tmp/speek.txt | text2wave | oggenc - -o testvox.ogg
Nope. Or not on my system. text2wave seems to lie about its sampling rate -- it claims 16000 Hz when it's really 8000 Hz. Neat. Not only that, every time I try to play the Ogg Vorbis file, the audio player hangs. Neat neat. I can't help but notice that the RPM didn't contain the most recent release; maybe that's the problem? MP3s encoded with lame succumb to the same speedup, no matter how I try to override the sample rate. Ah, Linux audio, so fun.
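
For anyone chasing the same problem, here is a rough way to see what a wav file claims about itself without trusting the player. It is only a sketch, and it assumes the plain 44-byte PCM RIFF header, where the channel count sits at byte offset 22 and the sample rate at offset 24, both little-endian. The helper names le_uint, wav_channels and wav_rate are hypothetical.

```shell
# Read an unsigned little-endian integer of NBYTES bytes at OFFSET in FILE.
le_uint() {
    file=$1 offset=$2 nbytes=$3
    total=0 shift_by=0
    # od prints each byte as an unsigned decimal, lowest byte first.
    for byte in $(od -An -t u1 -j "$offset" -N "$nbytes" "$file"); do
        total=$(( total + (byte << shift_by) ))
        shift_by=$(( shift_by + 8 ))
    done
    echo "$total"
}

# Assumption: canonical PCM header layout (fmt chunk directly after "WAVE").
wav_channels() { le_uint "$1" 22 2; }
wav_rate()     { le_uint "$1" 24 4; }
```

Running wav_rate test.wav and wav_channels test.wav shows what the header says, which is not necessarily what the encoder or player ends up doing with it.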

Screw this. Off I go to a FreeBSD system to build the port.

RPM Was Shady, How About FreeBSD Port?

cd /usr/ports/audio/festival
WITH_OGI=1 make configure

Wouldn't you know it -- some dependency requires X11. I don't want X11; this is a server, for crying out loud. Did I mention that nothing is ever easy?

Tune in next week when I actually get this thing working. :-/

Back to Building From Source...

http://festvox.org/packed/festival/1.4.3/
Found a list of recommended files besides the festival and speech-tools sources:
festlex_CMU.tar.gz
festlex_OALD.tar.gz
festlex_POSLEX.tar.gz
festvox_don.tar.gz
festvox_kedlpc16k.tar.gz
festvox_rablpc16k.tar.gz

General instructions: unpack everything, then build the speech tools first and Festival second:
cd speech_tools; ./configure; gmake
cd ../festival; ./configure; gmake

Then test: gmake test
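
The build steps can be strung together roughly like this. It is a sketch, not a tested installer: it assumes all the tarballs sit in one directory and unpack into speech_tools and festival subdirectories, and the run and build_festival helpers (plus the DRY_RUN switch) are my own invention for illustration.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unpack every tarball in the given directory, then build and test.
# The body runs in a subshell so the caller's working directory is untouched.
build_festival() (
    cd "${1:-.}" || exit 1
    for tarball in *.tar.gz; do
        [ -e "$tarball" ] || continue
        run tar xzf "$tarball" || exit 1
    done
    # Speech tools first; Festival builds against them.
    run cd speech_tools && run ./configure && run gmake || exit 1
    run cd ../festival && run ./configure && run gmake || exit 1
    run gmake test
)
```

Try it with DRY_RUN=1 first to see the command list before committing to the long build.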

Yes, it takes a long time to build. Find something else to do; at least you're not building XFree86 from source like with the port.

The Result

Well, all the tests passed and everything seemed to work. But here you go -- this wav plays at double speed on my machine. Maybe it's just a Red Hat thing; sound never has worked quite right on this box.

Crap. I've made it work before. But not tonight.

Breaking News!

I got it to work! The output wav needs to be converted to stereo through sox -- something like:
sox -c 1 output.wav -c 2 outputx2.wav

Then (and only then, it seems) it can be encoded with lame (mp3) or oggenc (ogg vorbis).
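
Put together, the working chain looks something like the sketch below: text2wave writes a mono wav, sox doubles it to stereo, and oggenc encodes the result. The text_to_ogg function name and the temp-file handling are my own assumptions, and I let sox read the input channel count from the wav header instead of passing -c 1.

```shell
# Hypothetical helper: speak a text file into a stereo Ogg Vorbis file.
# Usage: text_to_ogg input.txt output.ogg
text_to_ogg() {
    work=$(mktemp -d) || return 1
    # 1. synthesize mono speech, 2. convert to two channels, 3. encode.
    text2wave -o "$work/mono.wav" "$1" &&
        sox "$work/mono.wav" -c 2 "$work/stereo.wav" &&
        oggenc -Q -o "$2" "$work/stereo.wav"
    status=$?
    rm -rf "$work"
    return "$status"
}
```

For example, text_to_ogg /tmp/speek.txt testvox.ogg. Swapping the oggenc line for a lame call should give an mp3 the same way.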

As Pat wrote in to point out, this problem with mono audio on Linux is documented in the Festival FAQ.

By Chris Snyder on August 6, 2003 at 11:16pm
