From kde-core-devel Thu Sep 25 19:54:31 2003
From: Malte Starostik
Date: Thu, 25 Sep 2003 19:54:31 +0000
To: kde-core-devel
Subject: Re: Textfile classification (encoding, languages etc.)
X-MARC-Message: https://marc.info/?l=kde-core-devel&m=106451979032292

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 25 September 2003 21:42, Zack Rusin wrote:
> On Thursday 25 September 2003 15:06, Malte Starostik wrote:
> > PS: any comments on making KSpell use libaspell or pspell instead of
> > an external process if available?
>
> Oh, yeah, I'll be rewriting it once I get some more time. Laurent
> wrote kospell, which kind of does this but keeps the KSpell API and
> makes creating new backends rather a pain. I like Enchant, but I'm
> still not too keen on the Glib dependency. I like how, instead of using
> the ispell process, they simply wrote it as a library and are using it.
> We should do the same so that instead of using kprocess we use the
> libraries directly.
> So, we might meet on IRC or start a discussion at some point and decide
> whether we want to write a completely new implementation - we have
> enough use cases, and after spending too much time with kspell and
> other spell checkers I know what's needed, so I'd vote for that. We can
> also use Enchant. The problem with that is that we would have to write
> our frontend to it anyway, which would pretty much end up with #1 but
> with Enchant as the only backend.
> But anyway, what algorithm are you using to detect the languages? Is it
> regexp based or is it something more fun? You definitely got my full
> attention.

I didn't know Enchant; it looks interesting, provided our frontend to the
frontend would stay reasonably small.
I've based the implementation on the Lingua::Ident Perl module, which uses
tri- and bigrams. "Based on" means a bit more than a plain Perl-to-C++
translation and a bit less than a complete rewrite. It's damn small but
reliable.
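
To illustrate the general idea of bigram/trigram based language
identification, here is a minimal sketch. It is not the implementation
discussed in this thread and not a translation of Lingua::Ident: the class
name NgramIdentifier, the byte-wise n-grams and the add-one smoothing are
all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <map>
#include <string>

// Hypothetical helper for illustration only; not part of KDE or Lingua::Ident.
class NgramIdentifier {
public:
    // Count every bigram and trigram (byte-wise) of a sample text
    // and add the counts to the profile of the given language.
    void train(const std::string &language, const std::string &sample) {
        assert(!language.empty());
        Model &model = models_[language];
        for (std::size_t n = 2; n <= 3; ++n) {
            for (std::size_t i = 0; i + n <= sample.size(); ++i) {
                ++model.counts[sample.substr(i, n)];
                ++model.total;
            }
        }
    }

    // Return the trained language whose profile fits the text best,
    // or an empty string if no language has been trained.
    std::string identify(const std::string &text) const {
        std::string best;
        double bestScore = -std::numeric_limits<double>::infinity();
        for (const auto &entry : models_) {
            const double s = score(entry.second, text);
            if (s > bestScore) {
                bestScore = s;
                best = entry.first;
            }
        }
        return best;
    }

private:
    struct Model {
        std::map<std::string, unsigned long> counts;
        unsigned long total = 0;
    };

    // Sum of log relative frequencies of the text's bigrams and trigrams.
    // Add-one smoothing (an assumption of this sketch) keeps n-grams that
    // were never seen in training from producing log(0).
    static double score(const Model &model, const std::string &text) {
        const double denom =
            static_cast<double>(model.total + model.counts.size() + 1);
        double sum = 0.0;
        for (std::size_t n = 2; n <= 3; ++n) {
            for (std::size_t i = 0; i + n <= text.size(); ++i) {
                const auto it = model.counts.find(text.substr(i, n));
                const double count =
                    (it == model.counts.end()) ? 0.0 : static_cast<double>(it->second);
                sum += std::log((count + 1.0) / denom);
            }
        }
        return sum;
    }

    std::map<std::string, Model> models_;
};
```

Usage would be to call train() once per language with a sample text and
then identify() on the text to classify. Real profiles need far more
training text than a sentence or two, and inputs shorter than two bytes
carry no n-grams, so the result for them is arbitrary.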
- -Malte
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/c0f6VDF3RdLzx4cRAgq4AJ923CAnhc2Yke13iUXdiEWXLrwtzwCghPRg
lXMjryIthxJ3CQikmznFEyI=
=g1BP
-----END PGP SIGNATURE-----