On January 23, 2004 4:20 pm, Martijn Klingens wrote:
> I want first and foremost to have accurate and autodetected conversion.

This is impossible :P

> The user's setting should be TRIED first, but not FORCED. If it is broken
> utf8 we know it will break the parser, it makes no sense to obey the user
> at all.

So you mean, if the user chose UTF8 then we check isUTF8, and if it is not, then replace with ? characters wherever needed? I would go for that I guess.

> - make sure that whenever Utf8 is being used isUtf8() is called first and
>   if it fails forget about using Utf8

No. See, this is the problem. You are assuming that you should try UTF first, and that if UTF fails you'll then be able to guess something. This is backwards. UTF is the only codec that gives no failure; it's also the only one we have to scan over *twice* (isUTF8() and then the conversion), so it's the most expensive. And on top of all this, hardly anyone uses it. So it's the most error prone, the most expensive, and no one uses it. It *definitely* should be the last check.

*UNLESS* the user chose it. If the user chose UTF, then attempt isUTF8; if that fails, then *maybe* try latin1; if that fails, just clean up wherever possible. There's no point trying local8bit, it's bound to fail.

> Not really. Generally contact lists tend to consist of people from mostly
> the same country.

Eh huh? Not from my experience... I have people from here, from Europe, from Asia. Anyway, contact lists don't really have much to do with it, especially on IRC. Anyone could message you from anywhere out of the blue.
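
To make the user-chose-UTF case above a bit more concrete, here is a rough standalone C++ sketch. It deliberately avoids Qt so that it compiles on its own. The names isValidUtf8, latin1ToUtf8 and decodeWhenUserChoseUtf8 are made up for illustration and are not Kopete or Qt functions. isValidUtf8 is only an assumption about what an isUTF8()-style check would do, and the output is kept as UTF-8 bytes rather than a QString.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for an isUTF8()-style check: one full pass over
// the bytes, returning true only if they form well-formed UTF-8.
bool isValidUtf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;    // continuation bytes expected after c
        unsigned long cp = 0;     // code point being assembled
        unsigned long minCp = 0;  // smallest code point legal for this length
        if (c < 0x80)                { extra = 0; cp = c;        minCp = 0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; minCp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; minCp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; minCp = 0x10000; }
        else return false;                        // stray continuation or bad lead byte
        if (n - i - 1 < extra) return false;      // sequence truncated at end of input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minCp) return false;                   // overlong encoding
        if (cp > 0x10FFFF) return false;                // beyond the Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate code point
        i += extra + 1;
    }
    return true;
}

// Latin-1 fallback: each byte is the code point U+0000..U+00FF,
// re-encoded here as UTF-8 so the result is always well-formed.
std::string latin1ToUtf8(const std::string& s) {
    std::string out;
    out.reserve(s.size() * 2);
    for (char ch : s) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// The user picked UTF-8: validate first, fall back to Latin-1 otherwise.
std::string decodeWhenUserChoseUtf8(const std::string& raw) {
    if (isValidUtf8(raw)) {
        return raw;  // already well-formed; a real decoder would scan it again
    }
    return latin1ToUtf8(raw);
}
```

Note that the valid path still costs one full validation pass before any actual decoding happens, which is the "scan over twice" cost. Also, Latin-1 assigns a character to every byte value, so in this sketch the Latin-1 fallback cannot fail as such; a separate clean-up step would only be reached if a stricter fallback codec were used instead.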
My new proposed ordering in pseudo code:

    if( userCodec == QTextCodec::codecForName("UTF-8") )
    {
        if( isUTF8( string ) )
            return userCodec->decode( string );
        else
        {
            try QTextCodec::codecForName("latin1")->decode( string )
            if( success )
                return;
            else
                return cleanString( string );
        }
    }
    else
    {
        if( userCodec && userCodec->decode( string ) )
            return;
        else
        {
            try QTextCodec::codecForName("latin1")->decode( string )
            if( success )
                return;
            else
                return cleanString( string );
        }
    }

.. where cleanString strips all non-UTF-8 decodable characters from the string somehow.

-- 
There's no place like 127.0.0.1
http://www.keirstead.org