List:       kde-commits
Subject:    kdenonbeta/kttsd/kttsd
From:       Gary Cramblitt <garycramblitt () comcast ! net>
Date:       2004-08-24 2:38:47
Message-ID: 20040824023847.978756DA9 () office ! kde ! org

CVS commit by cramblitt: 

Better sentence parsing.


  M +13 -7     speechdata.cpp   1.19


--- kdenonbeta/kttsd/kttsd/speechdata.cpp  #1.18:1.19
@@ -251,15 +251,21 @@ QStringList SpeechData::parseText(const 
         sentenceDelimiter = QRegExp(sentenceDelimiters[appId]);
     else
-        sentenceDelimiter = QRegExp("([\\.\\?\\!\\:\\;])\\s");
+        sentenceDelimiter = QRegExp("([\\.\\?\\!\\:\\;]\\s)|(\\n *\\n)");
     QString temp = text;
-    // Replace sentence delimiters with double newline.
-    temp.replace(sentenceDelimiter, "\\1\n\n");
+    // Replace spaces, tabs, and formfeeds with a single space.
+    temp.replace(QRegExp("[ \\t\\f]+"), " ");
+    // Replace sentence delimiters with tab.
+    temp.replace(sentenceDelimiter, "\\1\t");
+    // Replace remaining newlines with spaces.
+    temp.replace("\n"," ");
+    temp.replace("\r"," ");
     // Remove leading spaces.
-    temp.replace(QRegExp("\\n[ \\t]+"), "\n");
+    temp.replace(QRegExp("\\t +"), "\t");
     // Remove trailing spaces.
-    temp.replace(QRegExp("[ \\t]+\\n"), "\n");
+    temp.replace(QRegExp(" +\\t"), "\t");
     // Remove blank lines.
-    temp.replace(QRegExp("\n\n\n+"),"\n\n");
-    QStringList tempList = QStringList::split("\n\n", temp, false);
+    temp.replace(QRegExp("\t\t+"),"\t");
+    // Split into sentences.
+    QStringList tempList = QStringList::split("\t", temp, false);
 /*
     // This should be something better, like "[a-zA-Z]\. " (a regexp of course) The dot (.) is used for more than ending a sentence.
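
For readers without a Qt 3 build at hand, the new parsing steps in this patch can be approximated in standard C++. This is only a sketch under stated assumptions: it swaps QRegExp for std::regex (ECMAScript syntax, so "$1" stands in for Qt's "\\1"), it uses only the default delimiter pattern and ignores the per-application sentenceDelimiters lookup, and splitSentences is a hypothetical name that does not appear in speechdata.cpp.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical standalone helper (not part of kttsd). It mirrors the order of
// the replace() calls in the patch, using std::regex in place of QRegExp.
std::vector<std::string> splitSentences(const std::string& text)
{
    // Default delimiter from the patch: punctuation followed by whitespace,
    // or a blank line (paragraph break).
    const std::regex delimiter(R"(([.?!:;]\s)|(\n *\n))");

    // Collapse runs of spaces, tabs, and formfeeds into a single space.
    // After this step no tab remains, so a tab is safe as a sentence marker.
    std::string temp = std::regex_replace(text, std::regex(R"([ \t\f]+)"), " ");

    // Mark each sentence boundary with a tab. For a paragraph break, group 1
    // did not participate in the match, so "$1" expands to nothing.
    temp = std::regex_replace(temp, delimiter, "$1\t");

    // Remaining newlines and carriage returns become spaces.
    temp = std::regex_replace(temp, std::regex(R"([\n\r])"), " ");

    // Strip spaces after and before each marker, then merge repeated markers.
    temp = std::regex_replace(temp, std::regex(R"(\t +)"), "\t");
    temp = std::regex_replace(temp, std::regex(R"( +\t)"), "\t");
    temp = std::regex_replace(temp, std::regex(R"(\t\t+)"), "\t");

    // Split on tab, skipping empty entries (as QStringList::split does when
    // allowEmptyEntries is false).
    std::vector<std::string> sentences;
    std::string::size_type start = 0;
    while (start <= temp.size()) {
        std::string::size_type end = temp.find('\t', start);
        if (end == std::string::npos)
            end = temp.size();
        if (end > start)
            sentences.push_back(temp.substr(start, end - start));
        start = end + 1;
    }
    return sentences;
}
```

Calling splitSentences("Title\n\nBody text") should yield two entries, "Title" and "Body text", which is the case the added "(\n *\n)" alternative is meant to cover. A single newline inside a sentence is folded into a space rather than treated as a break.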

