List: kde-devel
Subject: Re: Mixing encodings with an HTML page
From: Lars Knoll <lars () trolltech ! com>
Date: 2001-03-07 10:31:53
On Wednesday 07 March 2001 11:18, Brunet Eric wrote:
> Hello all,
>
> I have already asked this question on this mailing list a couple of weeks
> ago and got no answer. Of course, this was just during the final freeze
> of kde 2.1, and everybody was busy fixing the few remaining bugs. Now I
> think that people have more time to discuss future improvements to
> konqueror...
>
> My problem is the following: suppose I have an HTML file which looks like
> this:
>
> ---------------------------------------------------------------------------
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-7" />
> </head>
>
> <body>
>
> Ù<p> <!-- character 0xd9; uppercase omega in latin-7 encoding -->
>
> é<p> <!-- this one is not in latin 7 -->
>
> Д <!-- U+0414; CYRILLIC CAPITAL LETTER DE. Not in latin-7 -->
>
> </body>
> ---------------------------------------------------------------------------
>
> This is, I believe, a perfectly valid HTML file, but as far as I can tell,
> there is no way to have konqueror display it properly. There should be
> three lines: an uppercase omega (Greek), a small e with acute (Western
> Europe) and an uppercase de (Russian). If I leave the encoding set to auto
> in konqueror, the omega is correct and I then get two question marks. If I
> choose a latin-1 encoding, then I get the small e with acute, but the
> omega looks like a capital u with grave and the de like a question mark.
> Finally, if I choose a utf-8 encoding, then both the small e with acute
> and the capital de are correct, but the omega is not there. (And it is
> even worse than that: while trying to interpret the 0xd9 as a multi-byte
> sequence, the parser ``ate'' the <, and the result looks like
> [weird character]p>é...)
>
> So it looks like konqueror is not able to display a page by using
> characters from different fonts with different encodings.
>
> Is there any chance that in the near future, the best browser in the world
> will be able to handle such pages?
Unfortunately, the X11 font concept makes this exceedingly difficult to
implement. There are a few ways to get this working. One is to use Unicode
fonts for display. I removed this in KDE-2/2.1 because it caused quite a few
problems for people with slower machines (and most people don't need the
mixing). I could re-add it as a config option in the HTML settings dialog in
2.2. It will work directly if you use the new antialiased fonts with Qt-2.3,
because these are always Unicode fonts, and Qt just pretends they are
something different.
The real solution, however, will only come with Qt-3, where we get a really
good abstraction of a font that hides all the ugliness (8-bitness) of the X11
font model.
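
To make the three cases in the quoted message concrete, here is a rough sketch in Python (not Konqueror or Qt code) of what happens to the byte 0xD9, the ISO-8859-7 code for capital omega, under each of the encodings mentioned. The last line assumes that the characters outside the declared charset are written as character references such as &eacute; and &#1044;; the sample as quoted does not show this, so treat it as an illustration only.

```python
import html

# The first body line of the sample as raw bytes: 0xD9 followed by "<p>".
raw = b"\xd9<p>"

# Declared charset: 0xD9 is GREEK CAPITAL LETTER OMEGA (U+03A9).
as_greek = raw.decode("iso-8859-7")

# Forced Latin-1: the same byte is LATIN CAPITAL LETTER U WITH GRAVE (U+00D9).
as_latin1 = raw.decode("latin-1")

# Forced UTF-8: 0xD9 announces a two-byte sequence, but "<" is not a valid
# continuation byte. Python substitutes U+FFFD and keeps the "<"; a less
# careful decoder may swallow the "<" as well, as described in the message.
as_utf8 = raw.decode("utf-8", errors="replace")

# Assumption: characters outside the charset arrive as character references,
# which resolve to Unicode code points regardless of the page encoding.
mixed = html.unescape("&eacute; &#1044;")

print(as_greek, as_latin1, as_utf8, mixed)
```

The decoding step is the easy part; the difficulty described above is in drawing the resulting code points, since each 8-bit X11 font covers only one of these character sets.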
Regards,
Lars