
List:       kde-devel
Subject:    Re: Special chars in khtml
From:       Bryce Nesbitt <bryce () obviously ! com>
Date:       2001-11-09 17:41:08

Shamus wrote:
> 
> >On Mit, 07 Nov 2001, Shamus wrote:
> >
> >> (left/right single/double quote, em/en dash) to a question mark? Is the
> >> problem in khtml, Qt, or X?
> >
> >The problem is your setup, fonts are not having the character you've
> >requested.
> 
> Hmm. I set up TrueType fonts on my XFree setup (4.1.0) that I *know* have
> those characters, and still no luck. It could be that the encoding of those
> fonts is screwing things up (after all, they came from my Win box). Can you
> point me to a font that has the correct encoding/characters to test with
> (the ones bundled with XFree obviously don't make the cut)? Would it be
> possible to distribute such fonts with KDE 3?

I've been working on this issue.  There are a lot of players.
First try a test page:
	http://www.obviously.com/browsers/iso-8859-1_unicode.html
	http://www.obviously.com/browsers/windows-1252.html

The basic problem is that you're working with characters that are usually
encoded illegally: they arrive as bytes in the 0x80-0x9F range, which are
not printable characters in ISO-8859-1.  They originate from Microsoft
boxes, and if used at all, should be labeled "charset=windows-1252".

khtml has a kludge to notice some of these characters and substitute them
with ASCII:
                case 0x82: (x) = ','; break; \
                case 0x84: (x) = '"'; break; \
                case 0x8b: (x) = '<'; break; \
                case 0x9b: (x) = '>'; break; \
                case 0x91: (x) = '\''; break; \
                case 0x92: (x) = '\''; break; \
                case 0x93: (x) = '"'; break; \
                case 0x94: (x) = '"'; break; \
                case 0x95: (x) = '*'; break; \
                case 0x96: (x) = '-'; break; \
                case 0x97: (x) = '-'; break; \
                case 0x98: (x) = '~'; break; \
                case 0xb7: (x) = '*'; break; \
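
For comparison, here is a minimal sketch of what decoding those bytes as
windows-1252 would look like, mapping each one to its Unicode code point
instead of an ASCII stand-in.  This is not khtml code: the names
kCp1252High and cp1252ToUnicode are made up for illustration, and sending
the five undefined bytes to U+FFFD is an assumption, not something the
charset requires.

```cpp
// Unicode code points for windows-1252 bytes 0x80..0x9F.
// 0xFFFD (REPLACEMENT CHARACTER) marks the bytes windows-1252 leaves
// undefined; that choice is an assumption of this sketch.
constexpr char32_t kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021, // 80-87
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD, // 88-8F
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014, // 90-97
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178  // 98-9F
};

// Hypothetical helper: decode one windows-1252 byte to a code point.
// Outside 0x80..0x9F the byte value equals the Latin-1 / Unicode value.
constexpr char32_t cp1252ToUnicode(unsigned char b) {
    return (b >= 0x80 && b <= 0x9F) ? kCp1252High[b - 0x80]
                                    : static_cast<char32_t>(b);
}
```

With a table like this, 0x92 becomes U+2019 (right single quote) and 0x99
becomes U+2122 (TM), and the font question is then the same one as for
pages that start out as Unicode.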

khtml also messes with Unicode, in a way I'm sure is a bad idea.  I
added the missing left quote, but I think the whole approach is broken:
                case 0x2013: (x) = '-'; break; \
                case 0x2014: (x) = '-'; break; \
                case 0x2018: (x) = '\''; break; \
                case 0x2019: (x) = '\''; break; \
                case 0x201c: (x) = '"'; break; \
                case 0x201d: (x) = '"'; break; \
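
One less destructive alternative, sketched here purely as an illustration,
is to fall back to ASCII only when the selected font lacks the glyph.
Everything in this block is hypothetical: displayChar, asciiFallback and
the two stand-in font predicates are invented names, and a real renderer
would ask the font itself whether it has the glyph.

```cpp
// ASCII stand-ins for the same six code points listed above.
// Anything else is returned unchanged.
constexpr char32_t asciiFallback(char32_t c) {
    return (c == 0x2013 || c == 0x2014) ? '-'
         : (c == 0x2018 || c == 0x2019) ? '\''
         : (c == 0x201C || c == 0x201D) ? '"'
         : c;
}

// Keep the real character when the font can draw it; substitute only
// when it cannot.  hasGlyph is a caller-supplied predicate (assumption).
constexpr char32_t displayChar(char32_t c, bool (*hasGlyph)(char32_t)) {
    return hasGlyph(c) ? c : asciiFallback(c);
}

// Stand-in predicates for the sketch: a font covering only ASCII, and a
// "real" Unicode font that covers everything.
constexpr bool asciiOnlyFont(char32_t c) { return c < 0x80; }
constexpr bool fullUnicodeFont(char32_t) { return true; }
```

Under these assumptions displayChar(0x2019, fullUnicodeFont) keeps U+2019,
while displayChar(0x2019, asciiOnlyFont) yields a plain apostrophe.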

If the characters end up getting converted to Unicode or start as Unicode,
Qt is supposed to fake up some symbols to match.  I was able to fix &euro;
quite easily.  The others, especially TM, are trickier.  Try kcharselect to
see what's up.  All the stuff of interest is on page 32.

If you have a REAL Unicode font (ClearlyU, Microsoft Arial Unicode MS)
then you should see the characters directly.  If you don't, it's your
X font server's fault.

Confused yet?

		-Bryce
 
