Shivam,

This is a very interesting project. Could you go into a bit of detail about
the technical aspects of it? Another developer is working on using libtextcat
to detect language and switch the language of the KDE text-to-speech system
(Jovie) based on the detected language. It sounds to me like there could be
some overlap between his work and yours.

Thanks,
Jeremy

On Mon, Nov 25, 2013 at 3:53 PM, Christoph Feck wrote:
> On Monday 25 November 2013 23:32:15 Shivam Makkar wrote:
>> [...]
>> So, I request you to upload as many articles as you can in various
>> languages (or at least one in your native language) so that it can
>> be detected by the algorithm.
>
> Many NLP researchers simply use Wikipedia text. Regarding topic
> coverage, peer-reviewed grammar and spelling, you will have a hard
> time to beat it. You can find the raw XML as .bz2 downloads at the
> Wikipedia sites. Stripping the XML/Wiki formatting away and leaving
> only the text is a simple task for any Perl script coder.
>
> Christoph Feck (kdepepo)
>
>>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<
_______________________________________________
KDE PIM mailing list kde-pim@kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/