
List:       festlang-talk
Subject:    [festival-talk] Telugu Festival Text-to-Speech System 0.3 Released
From:       b.williams () bangor ! ac ! uk (Briony Williams)
Date:       2006-02-17 15:15:15
Message-ID: 43F5E883.8000208 () bangor ! ac ! uk

message from Briony Williams <b.williams at bangor.ac.uk> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Chaitanya Kamisetty wrote:
> festival does not have native support to handle Unicode strings. Which
> means that when given a utf8 encoded string, some string handling
> functions in siod interpreter like symbolexplode, length etc will return
> incorrect values. But this does not really impede us from using utf8
> strings which are valid C strings. This means, functions for string
> comparison, regex matching will work correctly. The major challenge for
> us was really in writing the letter-to-sound (lts) rules. Here we had to
> split a 3 byte telugu unicode character into its components bytes and
> write the rules. You could be doing the same instead of converting into
> transliterated form.

In developing a Festival-based diphone synthesiser for Welsh, we (at Bangor, 
in Wales) had a similar problem in getting Festival to handle UTF-8 input. In 
particular, Welsh has w-circumflex and y-circumflex.  So a separate C program 
was written, and a hook used within Festival to call that program before 
anything else takes place. Any non-7-bit characters are converted into a 
7-bit transliteration, and this is the format handled by the lexicon and 
letter-to-sound rules.  This solution seems to work very well, and can easily 
be adapted to other symbols used by other languages.
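
To make the idea concrete, here is a minimal sketch in Python (not the actual 
C program) of this kind of pre-processing step. The transliteration table and 
the function name to_seven_bit are illustrative assumptions only; the scheme 
used in the real code may well differ. The sketch also shows why raw UTF-8 
confuses byte-oriented string functions: w-circumflex occupies two bytes, and 
each Telugu character three.

```python
# Hypothetical 7-bit transliterations (assumed for illustration; not the
# mapping used by the real Welsh front end).
TRANSLIT = {
    "\u0175": "w+",  # w-circumflex (lower case)
    "\u0177": "y+",  # y-circumflex (lower case)
    "\u0174": "W+",  # W-circumflex (upper case)
    "\u0176": "Y+",  # Y-circumflex (upper case)
}


def to_seven_bit(text: str) -> str:
    """Replace each non-ASCII character with its 7-bit transliteration."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)            # plain ASCII passes through unchanged
        elif ch in TRANSLIT:
            out.append(TRANSLIT[ch])  # known symbol: use the table entry
        else:
            # Fail loudly rather than pass unknown bytes to the LTS rules.
            raise ValueError(f"no transliteration for U+{ord(ch):04X}")
    return "".join(out)


# Byte-oriented code sees more than one unit per character:
welsh_bytes = len("\u0175".encode("utf-8"))   # w-circumflex
telugu_bytes = len("\u0C24".encode("utf-8"))  # Telugu letter TA

seven_bit = to_seven_bit("d\u0175r")          # the Welsh word for "water"
```

Adapting this to another language would mainly mean extending the table, 
provided the chosen transliterations cannot clash with ordinary 7-bit text 
that the lexicon and letter-to-sound rules already handle.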

>>We have come-up
>>with a rule-based algorithm for Syllabification. We hope to develop and
>>integrate these modules as Scheme codes(or we may have to override Festival's
>> methods by writing them in C). We believe studying how you have handled such
>>cases in your Telugu TTS, is very useful. 
> 
> We are yet to work on proper syllabification and pos tagging.

I have been working on rules for carrying out Welsh syllabification, since 
the current Festival method ("lex.syllabify.phstress") does not give very 
good results for Welsh.  We are considering how to integrate these rules into 
Festival, for use both at run-time (whenever the LTS rules are called) and 
also during development (when a new lexicon is compiled). We would be 
interested in learning of any similar work done by others.

>>We hope to implement TOBI(CART) for Intonation and Duration. But, still in the
>>literature survey. So, you might be able to give some step by step guidelines
>>on how to do that. ie. selection of text to build speech corpus, labeling
>>process, format the labels as required by festival, training with wagon etc..

We have built a CART tree for segmental durations using a hand-labelled Welsh 
speech database. The main work is in labelling the database, and extracting 
the data in a format suitable to be used for training a CART.  The actual 
CART training is quite fast.

>>Perhaps, you might have already developed components which we will be able to
>>re-use. Specially, the Unicode Tokenizer, LTS/Lexicon module etc. We will be
>>very much thankful to you if you could provide us any documents, comments,
>>suggestions which will be useful in developing our TTS too.

Our UTF-tokeniser code is available to anyone who is interested. See
http://bedwyr-redhat.bangor.ac.uk/svn/repos/WISPR/Software/Festival/WISPR/Patch/Trunk/
or, for the merged version, see
http://bedwyr-redhat.bangor.ac.uk/svn/repos/WISPR/Software/Festival/WISPR/Merged/Trunk/festival/

Please get in touch if you decide to use it, and let us know what you're 
doing with it.

> Most Indian languages are phonetic in nature. They do not require a
> lexical lookup to convert words to phones. 

The same is true of Welsh. However, a lexicon is needed for POS information, 
and we have been training a phrase break algorithm that uses POS information 
(using an existing POS-tagged text corpus).

> Once we have a TTS ready, I wrote something about how to use festival TTS
> with assistive technologies like Gnopernicus on localized desktops to
> read out the text. You can have a look at it at
> http://telugu.sarovar.org/wiki/index.php/SpeechAssistance

Thank you - that's very interesting.

Our team have managed to slim down Festival enough to create a version of a 
Welsh diphone voice that will run on Windows under MSAPI, so it can be 
called by any Windows application that uses the MSAPI engine (e.g. word 
processors, web browsers).

Best regards

Briony Williams
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
=    University of Edinburgh's Festival Speech Synthesis System       =
= http://festvox.org/festival      Sent Via festival-talk at festvox.org =
=                           To unsubscribe mail majordomo at festvox.org =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =


