List: kde-devel
Subject: Re: Special chars in khtml
From: Bryce Nesbitt <bryce () obviously ! com>
Date: 2001-11-09 17:41:08
Shamus wrote:
>
> >On Mit, 07 Nov 2001, Shamus wrote:
> >
> >> (left/right single/double quote, em/en dash) to a question mark? Is the
> >> problem in khtml, Qt, or X?
> >
> >The problem is your setup, fonts are not having the character you've
> >requested.
>
> Hmm. I set up TrueType fonts on my XFree setup (4.1.0) that I *know* have
> those characters, and still no luck. It could be that the encoding of those
> fonts is screwing things up (after all, they came from my Win box). Can you
> point me to a font that has the correct encoding/characters to test with
> (the ones bundled with XFree obviously don't make the cut)? Would it be
> possible to distribute such fonts with KDE 3?
I've been working on this issue. There are a lot of players involved.
First try these test pages:
http://www.obviously.com/browsers/iso-8859-1_unicode.html
http://www.obviously.com/browsers/windows-1252.html
The basic problem is that you're working with characters that are usually
encoded illegally. They sit at bytes 0x80-0x9F, which ISO-8859-1 reserves
for control codes. They originate from Microsoft boxes, and if used at
all, should be labeled "charset=windows-1252".
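For reference, here is a rough sketch of what honoring "charset=windows-1252" amounts to for the bytes in question. The names cp1252ToUnicode and decodeCp1252 are made up for illustration and this is not khtml code; the table follows the published windows-1252 mapping, and treating the five undefined bytes as U+FFFD is my own assumption.

```cpp
#include <cassert>
#include <string>

// Unicode code points for windows-1252 bytes 0x80..0x9F.
// 0 marks the five bytes that windows-1252 leaves undefined.
static const char32_t kCp1252High[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

// Hypothetical helper: map one windows-1252 byte to a Unicode code point.
char32_t cp1252ToUnicode(unsigned char byte) {
    // Outside 0x80..0x9F the mapping is the same as ISO-8859-1.
    if (byte < 0x80 || byte > 0x9F)
        return byte;
    char32_t cp = kCp1252High[byte - 0x80];
    // Assumption: undefined bytes become U+FFFD (replacement character).
    return cp ? cp : 0xFFFD;
}

// Hypothetical helper: decode a whole windows-1252 byte string.
std::u32string decodeCp1252(const std::string& bytes) {
    std::u32string out;
    out.reserve(bytes.size());
    for (char c : bytes)
        out.push_back(cp1252ToUnicode(static_cast<unsigned char>(c)));
    return out;
}
```

With this mapping, byte 0x93 comes out as U+201C (left double quote) rather than a control code, which is what a correctly labeled page should produce.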
khtml has a kludge to notice some of these characters and substitute them
with ASCII:
case 0x82: (x) = ','; break; \
case 0x84: (x) = '"'; break; \
case 0x8b: (x) = '<'; break; \
case 0x9b: (x) = '>'; break; \
case 0x91: (x) = '\''; break; \
case 0x92: (x) = '\''; break; \
case 0x93: (x) = '"'; break; \
case 0x94: (x) = '"'; break; \
case 0x95: (x) = '*'; break; \
case 0x96: (x) = '-'; break; \
case 0x97: (x) = '-'; break; \
case 0x98: (x) = '~'; break; \
case 0xb7: (x) = '*'; break; \
khtml also messes with Unicode, in a way I'm sure is a bad idea. I
added the missing left quote, but I think the whole approach is broken:
case 0x2013: (x) = '-'; break; \
case 0x2014: (x) = '-'; break; \
case 0x2018: (x) = '\''; break; \
case 0x2019: (x) = '\''; break; \
case 0x201c: (x) = '"'; break; \
case 0x201d: (x) = '"'; break; \
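One possible alternative, purely as a sketch and not something khtml does: keep the Unicode character whenever the font can draw it, and only fall back to ASCII when it cannot. Here asciiFallback, displayChar and the hasGlyph predicate are all hypothetical names; in a Qt program something like QFontMetrics::inFont() could presumably play the role of hasGlyph, but I have not tried that.

```cpp
#include <cassert>
#include <functional>

// ASCII stand-in for a few typographic code points; 0 if none is known.
char asciiFallback(char32_t cp) {
    switch (cp) {
    case 0x2013:                 // en dash
    case 0x2014: return '-';     // em dash
    case 0x2018:                 // left single quote
    case 0x2019: return '\'';    // right single quote
    case 0x201C:                 // left double quote
    case 0x201D: return '"';     // right double quote
    default:     return 0;
    }
}

// Hypothetical: hasGlyph reports whether the current font can draw cp.
// The substitution is applied only when the real glyph is unavailable.
char32_t displayChar(char32_t cp,
                     const std::function<bool(char32_t)>& hasGlyph) {
    if (hasGlyph(cp))
        return cp;               // font has it: leave the character alone
    char sub = asciiFallback(cp);
    return sub ? static_cast<char32_t>(sub) : cp;
}
```

The point of the design is that people with a real Unicode font never lose their curly quotes, while everyone else still gets something readable instead of a question mark.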
If the characters end up getting converted to Unicode or start as Unicode,
Qt is supposed to fake up some symbols to match. I was able to fix €
quite easily. The others, especially TM, are trickier. Try kcharselect to
see what's up. All the stuff of interest is on page 32.
If you have a REAL Unicode font (ClearlyU, Microsoft Arial Unicode MS)
then you should see the characters directly. If you don't, it's your
X font server's fault.
Confused yet?
-Bryce