Hi Folks:

I agree with Will. Although I used Gary's recent example of writing a
speech-to-text service to discuss how KDE development of ATs could
interoperate with AT-SPI, it is probably not a great candidate for a new
cross-platform API. It makes more sense to do that for modalities and
services where we already have more than one working implementation.

Bill

On Thu, 2006-02-23 at 16:57, Willie Walker wrote:
> Hi All:
>
> I just want to jump in on the speech recognition stuff. Having
> participated in several standards efforts in this area (e.g., JSAPI and
> VoiceXML/SSML/SGML), having developed a number of speech recognition
> applications, having seen the trials and tribulations of inconsistent
> SAPI implementations, and having led the Sphinx-4 effort, I'd like to
> offer my unsolicited opinion :-).
>
> In my opinion, there are enough differences among the various speech
> recognition systems and their APIs that I'm not sure efforts are best
> spent charging at the "one API for all" windmill. One could spend years
> trying to come up with yet another standard-but-not-very-useful API in
> this space, and all we would have in the end is that API with perhaps
> one buggy implementation on one speech engine. We would also be
> repeating work and making the same mistakes that have been made time
> and time again.
>
> As an alternative, I'd suggest picking an available recognition engine
> and designing the assistive technology first. Get your feet wet with
> that and use it as a vehicle to better understand the problems you will
> face with any speech recognition task for the desktop. Examples
> include:
>
> o how to dynamically build a grammar from what you can get through
>   the AT-SPI (see the first sketch at the end of this message)
> o how to deal with confusable words, or discover that recognition for
>   a particular grammar is just plain failing and you need to tweak it
>   dynamically (see the second sketch at the end of this message)
> o how to deal with unspeakable words
> o how to deal with deictic references
> o how to deal with compound utterances
> o how to handle dictation vs. command and control
> o how to taper and restructure prompts based upon recognition
>   success or failure
> o how to allow the user to recover from misrecognitions
> o how to handle custom profiles per user
> o (MOST IMPORTANTLY) just what is a compelling speech interaction
>   experience for the desktop?
>
> Once you have a better understanding of the real problems and have
> developed a working assistive technology, then look at generalizing a
> useful layer across multiple engines. You will probably end up with a
> useful assistive technology sooner, and with an API that is known to
> work for at least one assistive technology.
>
> Will
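
To make Will's first bullet concrete, here is a minimal sketch of
building a command grammar from what AT-SPI exposes. It assumes the
pyatspi Python bindings and emits JSGF, the grammar format Sphinx-4
consumes; SPEAKABLE_ROLES, collect_labels, and build_grammar are
illustrative names for this sketch, not part of any existing API.

# Sketch: build a JSGF command grammar from the widgets an application
# exposes over AT-SPI. Assumes the pyatspi bindings; SPEAKABLE_ROLES,
# collect_labels, and build_grammar are illustrative, not a real API.
import pyatspi

SPEAKABLE_ROLES = (
    pyatspi.ROLE_PUSH_BUTTON,
    pyatspi.ROLE_MENU_ITEM,
    pyatspi.ROLE_CHECK_BOX,
)

def collect_labels(acc, labels):
    # Recursively gather the names of widgets a user could speak.
    try:
        if acc.getRole() in SPEAKABLE_ROLES and acc.name:
            labels.add(acc.name.lower())
        children = list(acc)
    except Exception:
        return  # stale or dead accessible; skip it
    for child in children:
        if child is not None:
            collect_labels(child, labels)

def build_grammar(app_name):
    # Return a JSGF grammar accepting "click <label>" for each widget.
    desktop = pyatspi.Registry.getDesktop(0)
    labels = set()
    for app in desktop:
        if app is not None and app.name == app_name:
            collect_labels(app, labels)
    alternatives = " | ".join(sorted(labels)) or "<NULL>"
    return ("#JSGF V1.0;\n"
            "grammar commands;\n"
            "public <command> = click ( %s );" % alternatives)

print(build_grammar("gedit"))

A real AT would regenerate this on window:activate and focus events and
swap the new grammar into the recognizer; that regeneration loop is
where the "dynamic" part of the problem actually bites, since labels
change and menus come and go while the engine has to keep up.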
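
And for the confusable-words bullet: before handing a grammar to any
engine, the AT can cheaply flag label pairs that are likely to collide.
This sketch uses plain string similarity from the standard library as a
stand-in for real phonetic distance (a metaphone or pronunciation-
lexicon comparison would be better); the 0.8 cutoff is an arbitrary
illustrative threshold.

# Sketch: flag command labels a recognizer is likely to confuse.
# String similarity stands in for real phonetic distance here, and
# the 0.8 cutoff is an arbitrary illustrative threshold.
from difflib import SequenceMatcher
from itertools import combinations

def confusable_pairs(labels, cutoff=0.8):
    # Return label pairs whose similarity meets or exceeds the cutoff.
    pairs = []
    for a, b in combinations(sorted(labels), 2):
        if SequenceMatcher(None, a, b).ratio() >= cutoff:
            pairs.append((a, b))
    return pairs

# "clone" and "close" get flagged, while "open" and "save" are safely
# distinct; prints [('clone', 'close')].
print(confusable_pairs({"open", "close", "clone", "save", "save as"}))

When a pair is flagged, the AT can rewrite one side before compiling
the grammar (say, "clone" becomes "clone button") instead of waiting
for misrecognitions to surface at runtime.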