Hello Jessica,

nice to meet you.

On Thursday, March 20, 2014 01:51:09 PM Jessica Horst wrote:
> My colleague told me that speech recognition software works by having a
> threshold of similarity. For example, when I tell my mobile phone “call
> home” the software compares what I said to what I have said before and if it
> is similar enough (above threshold) it will recognise my speech. I’m
> hopeful that I could use the same kind of principle here (how similar is
> the child’s speech to the adult speech (what was said before), but I would
> want a numerical value instead of just knowing if it was above or below
> threshold.

I am sorry to say, but you have been slightly misinformed. In practice, the
process works somewhat differently.

(Disclaimer: the following explanation contains a few simplifications.)

The decoder produces the most likely path through the space of alternatives
(the allowed sentences, if you are doing grammar-based decoding). The question
answered by decoding is: given the observations (the recording), which of the
possibilities (sentences) is the most likely?

To determine the most likely candidate, there is an internal scoring process,
but these scores are entirely relative to each other and are not compared to a
fixed threshold. Most decoders implement some form of confidence scoring that
tells you how confident the system is in its results, but these scores will
likely not be what you want: differences that appear substantial to the
human ear will not necessarily have a big impact on the confidence score, and
vice versa.

Depending on your use case, a dedicated classifier will probably yield better
results. What exactly do you want to do?

Best regards,
Peter

_______________________________________________
kde-accessibility mailing list
kde-accessibility@kde.org
https://mail.kde.org/mailman/listinfo/kde-accessibility
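
To illustrate the point about relative scores, here is a minimal sketch in
Python. It is not how any particular decoder is implemented: the candidate
sentences and their log-scores are made up, and decode() is a hypothetical
helper. It simply picks the best-scoring candidate and derives a
confidence-like value from the score differences, with no fixed threshold
anywhere.

```python
import math


def decode(scores):
    """Return (best_candidate, confidence) for a dict of candidate -> log-score.

    Hypothetical helper for illustration only; real decoders search a much
    larger space and compute confidence in more elaborate ways.
    """
    # The winner is simply the candidate with the highest score.
    best = max(scores, key=scores.get)
    top = scores[best]
    # Normalise against the other candidates (a softmax): only the
    # differences between scores matter, not their absolute values.
    total = sum(math.exp(s - top) for s in scores.values())
    confidence = 1.0 / total
    return best, confidence


# Made-up log-scores for three sentences allowed by a grammar.
scores = {"call home": -120.0, "call Tom": -123.0, "cancel": -131.0}
best, conf = decode(scores)

# Shifting every score by the same amount changes neither the winner
# nor the confidence, which is why a fixed threshold is not meaningful.
shifted = {sentence: s - 500.0 for sentence, s in scores.items()}
best_shifted, conf_shifted = decode(shifted)

print(best, best_shifted, math.isclose(conf, conf_shifted))
```

Note that the confidence here only says how far the winner is ahead of the
other allowed sentences. It says nothing about how close the recording is to
some reference pronunciation, which is why it is unlikely to serve as a
similarity measure.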