List:       kfm-devel
Subject:    Re: using unicode in khtml
From:       Nicolas Brodu <brodu () iie ! cnam ! fr>
Date:       1999-05-17 13:46:34

On Mon, 17 May 1999, Waldo Bastian wrote :
>
>> The <body> tag can be completely missing.
>> I decided now to make a new (and small) class dealing with the input
>> stream khtml gets. The class will be called from the tokenizer, and do the
>> transformation to unicode. This seems to work already.
>
>How does it find out which charset should be used? 
>

Just an idea (it may already have been discussed; I'm jumping into this thread):

How about keeping a small library of the characters specific to (or most
frequently met in) the various charsets? Then we could check a few lines of the
document, or at least the title. This, plus a default choice based on the domain
extension (.fr, .de, ...) when several charsets match, should give the right
charset most of the time.
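
A rough sketch of that heuristic, in Python only for brevity (nothing here is
khtml code). The character tables, the TLD defaults, the fallback charset and
the name guess_charset are all made up for illustration; a real table would
need far more charsets and better statistics.

```python
# Hypothetical table: characters that show up often in text written in each charset.
TYPICAL_CHARS = {
    "iso-8859-1": set("éèàçùâêîôûäöüßñó"),
    "iso-8859-2": set("ąćęłńóśźżčřšžě"),
    "koi8-r": set("оеаинтсрвл"),
}

# Hypothetical defaults per domain extension, used only to break ties.
TLD_DEFAULT = {
    "fr": "iso-8859-1",
    "de": "iso-8859-1",
    "pl": "iso-8859-2",
    "ru": "koi8-r",
}

FALLBACK = "iso-8859-1"  # assumed last-resort choice


def guess_charset(sample: bytes, host: str) -> str:
    """Guess the charset of a few lines of a document (or its title)."""
    scores = {}
    for name, typical in TYPICAL_CHARS.items():
        try:
            text = sample.decode(name)
        except UnicodeDecodeError:
            continue  # the bytes are not valid in this charset at all
        # Score = number of decoded characters that are typical for the charset.
        scores[name] = sum(ch in typical for ch in text)

    best = max(scores.values(), default=0)
    winners = [name for name, score in scores.items() if score == best and best > 0]
    if len(winners) == 1:
        return winners[0]

    # No clear winner: fall back on the domain extension.
    tld = host.rsplit(".", 1)[-1].lower()
    default = TLD_DEFAULT.get(tld)
    if winners:
        return default if default in winners else winners[0]
    return default or FALLBACK
```

For instance, guess_charset("café à la crème".encode("iso-8859-1"),
"www.example.fr") picks iso-8859-1 from the characters alone, while a pure
ASCII sample gives no hint and is decided by the domain extension.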


Cheers,

--------------------------------
Nicolas Brodu, brodu@iie.cnam.fr
Eleve-Ingenieur 2eme annee (Institut d'Informatique d'Entreprise)
  http://www-eleves.iie.cnam.fr/brodu (smblib)
KDE developer, brodu@kde.org
  http://www.kde.org, Color outside the lines !
