Section: New Software and Platforms

SOJA

Speech Synthesis platform in JAva

Keywords: Speech Synthesis - Audio

Scientific Description: SOJA relies on a non-uniform unit selection algorithm. Phonetic and linguistic features are extracted and computed from the text to drive the selection of speech units in a recorded corpus. The selected units are concatenated to obtain the speech signal corresponding to the input text.
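
The selection step described above can be illustrated with a small dynamic-programming sketch. It is not SOJA's actual code: the class name `UnitSelectionSketch`, the `Unit` type, and the cost values are assumptions made for illustration. Each target position has a list of candidate units, each with a precomputed target cost, and a hypothetical join cost is zero when two units are adjacent in the corpus. That zero cost is what makes the search prefer longer contiguous stretches, i.e. non-uniform units.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UnitSelectionSketch {

    /** A candidate unit: its position in the recorded corpus and a precomputed target cost (both illustrative). */
    public static final class Unit {
        public final int corpusIndex;
        public final double targetCost;

        public Unit(int corpusIndex, double targetCost) {
            this.corpusIndex = corpusIndex;
            this.targetCost = targetCost;
        }
    }

    /** Hypothetical join cost: free when the units are consecutive in the corpus, a flat penalty otherwise. */
    static double joinCost(Unit left, Unit right) {
        return right.corpusIndex == left.corpusIndex + 1 ? 0.0 : 1.0;
    }

    /** Viterbi search returning one unit per target position with minimal total target + join cost. */
    static List<Unit> select(List<List<Unit>> candidates) {
        int n = candidates.size();
        List<Unit> path = new ArrayList<>();
        if (n == 0) {
            return path;
        }
        double[][] best = new double[n][];
        int[][] back = new int[n][];
        for (int t = 0; t < n; t++) {
            List<Unit> current = candidates.get(t);
            if (current.isEmpty()) {
                throw new IllegalArgumentException("No candidate unit at position " + t);
            }
            best[t] = new double[current.size()];
            back[t] = new int[current.size()];
            for (int j = 0; j < current.size(); j++) {
                Unit unit = current.get(j);
                if (t == 0) {
                    best[t][j] = unit.targetCost;
                    back[t][j] = -1;
                    continue;
                }
                // Find the cheapest predecessor for this candidate.
                List<Unit> previous = candidates.get(t - 1);
                double min = Double.POSITIVE_INFINITY;
                int argMin = -1;
                for (int i = 0; i < previous.size(); i++) {
                    double cost = best[t - 1][i] + joinCost(previous.get(i), unit);
                    if (cost < min) {
                        min = cost;
                        argMin = i;
                    }
                }
                best[t][j] = min + unit.targetCost;
                back[t][j] = argMin;
            }
        }
        // Pick the cheapest final candidate, then follow the back-pointers.
        int index = 0;
        for (int j = 1; j < best[n - 1].length; j++) {
            if (best[n - 1][j] < best[n - 1][index]) {
                index = j;
            }
        }
        for (int t = n - 1; t >= 0; t--) {
            path.add(candidates.get(t).get(index));
            index = back[t][index];
        }
        Collections.reverse(path);
        return path;
    }
}
```

In a real system the target cost would be derived from the phonetic and linguistic features of the input text, and the join cost from acoustic measurements at the unit boundaries; both are reduced to constants here to keep the sketch short.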

Functional Description: SOJA is a software package for Text-To-Speech (TTS) synthesis. It performs all steps from text input to speech signal output. A set of associated tools is available for building a corpus for a TTS system (transcription, alignment, etc.). Currently, the corpus contains about 3 hours of speech recorded by a female speaker. Most of the modules are written in Java; some are written in C. SOJA runs under Windows and Linux. It can be launched with a graphical user interface, integrated directly into Java code, or used in client-server mode.

Release Functional Description: Version 3.0 integrates a phonetization module based on a deep learning algorithm. In addition, the phonetization step is exposed through a REST API (client/server mode). The NLP part outputs descriptors in a format that can be used by the HTS and Merlin systems.
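
As an illustration of the client/server mode, the sketch below shows how a Java client might call a phonetization service over HTTP using the standard `java.net.http` client (Java 11+). The endpoint URL, the use of a plain-text POST body, and the plain-text response are all assumptions: the actual SOJA REST interface is not described here and may differ.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class PhonetizerClientSketch {

    // Hypothetical endpoint: the real host, port, and path are not given in this document.
    static final String ENDPOINT = "http://localhost:8080/phonetize";

    /** Builds a POST request carrying the text to phonetize as a UTF-8 plain-text body (assumed format). */
    static HttpRequest buildRequest(String text) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "text/plain; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(text, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the response body, assumed to contain the phoneme sequence. */
    static String phonetize(String text) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(text), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Keeping request construction separate from sending makes the client easy to check without a running server.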

News Of The Year: The latest version can use the LORIA-PHON grapheme-to-phoneme converter, which is based on deep learning, through a web API.

  • Participants: Alexandre Lafosse and Vincent Colotte

  • Contact: Vincent Colotte