Perform text-to-speech in Ubuntu Linux using eSpeak, a compact open source software speech synthesizer for English and other languages. eSpeak converts written text into spoken audio.
eSpeak uses a “formant synthesis” method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings.
eSpeak is available as:
- A command line program (Linux and Windows) to speak text from a file or from stdin.
- A shared library version for use by other programs. (On Windows this is a DLL).
- A SAPI5 version for Windows, so it can be used with screen-readers and other programs that support the Windows SAPI5 interface.
- Ports to other platforms, including Android, Mac OS X and Solaris.
Features:
- Includes different voices, whose characteristics can be altered.
- Can produce speech output as a WAV file.
- SSML (Speech Synthesis Markup Language) is supported (though not completely), as is HTML.
- Compact size. The program and its data, including many languages, total about 2 MB.
- Can be used as a front end to MBROLA diphone voices. In this mode eSpeak converts text to phonemes with pitch and length information.
- Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
- Potential for other languages. Several are included in varying stages of progress. Help from native speakers for these or other languages is welcome.
- Development tools are available for producing and tuning phoneme data.
- Written in C.
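Several of the features above map onto command line options. The following is a minimal sketch, assuming eSpeak is already installed and on the PATH. The helper names `speak_to_wav` and `show_phonemes` and the file name `hello.wav` are illustrative only and are not part of eSpeak.

```shell
#!/bin/sh
# Illustrative helpers (the function names are hypothetical, not part of eSpeak).

# Write speech to a WAV file instead of playing it (-w),
# using a chosen voice (-v) and a speed in words per minute (-s).
speak_to_wav() {
    # $1 = text, $2 = output file, $3 = voice (default: en), $4 = speed (default: 150)
    espeak -v "${3:-en}" -s "${4:-150}" -w "$2" "$1"
}

# Print the phoneme mnemonics for the text (-x) without speaking it (-q).
show_phonemes() {
    espeak -q -x "$1"
}

# Run the examples only when espeak is actually available.
if command -v espeak >/dev/null 2>&1; then
    speak_to_wav "Hello from eSpeak." hello.wav en 150
    show_phonemes "Hello from eSpeak."
fi
```

Wrapping the options in small functions keeps the voice and speed defaults in one place. The defaults chosen here (`en`, 150 words per minute) are assumptions for the example, not eSpeak's own defaults.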
Run the following commands in terminal to install eSpeak:
$ sudo apt-get update
$ sudo apt-get install espeak
Linux users can also install eSpeak from http://espeak.sourceforge.net/
Once installed, eSpeak is easy to use. Simply pipe a text string to it with the echo command, as in the example below:
$ echo "This is an example voice." | espeak
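Besides reading from stdin, the command line program can speak text from a file, as noted in the list of available forms. The sketch below assumes eSpeak is installed. The function name `speak_file` and the file name `notes.txt` are hypothetical.

```shell
#!/bin/sh
# Speak the contents of a text file using the -f option.
# The function name is illustrative, not part of eSpeak.
speak_file() {
    # $1 = path to a text file, $2 = voice (default: en)
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
    espeak -v "${2:-en}" -f "$1"
}

# Usage (plays audio through the default sound device):
#   speak_file notes.txt
#   speak_file notes.txt en-us
```

The readability check is there so that a mistyped path produces a clear message before eSpeak is invoked.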