List:       kde-i18n
Subject:    Sites with Illegal character entities & khtmlw
From:       Dawit Alemayehu <adawit () earthlink ! net>
Date:       1999-03-27 6:02:15

Greetings,

There is one annoying problem with the way kfm handles invalid character
entities.  Currently, on my PC, it displays them as small black rectangular
boxes that distract me from reading the page.  To see an example of this, go to
http://www.msnbc.com/news/253207.asp.  If you view the source of the document,
they use &#146; &#147; &#148; all over the place.  However, none of these
character entities is valid under either the HTML 3.2 or the HTML 4.0 spec.  I
truly fail to see why they use these characters at all.  Does anyone know?
Perhaps these characters are allowed in M$ Active Server pages, but they sure
are not valid standard character entities.

My question, then: would it be wise to add a check in htmltoken.cpp that
detects character codes in the non-printable range and either converts them to
spaces or simply ignores them?  I particularly see &#146; on many, many sites
being used to incorrectly represent the apostrophe ( ' ) character.
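
To make the idea concrete, here is a rough sketch of such a check.  The
function names are made up for illustration and are not taken from
htmltoken.cpp.  It assumes the range in question is 128 to 159, which ISO 8859-1
does not assign printable characters to, and that &#147; and &#148; are meant
as double quotes, as they are in the Windows-1252 code page.

```cpp
// Hypothetical helpers for illustration only; these names are not part
// of khtmlw or htmltoken.cpp.

// True for codes 128..159, which ISO 8859-1 does not assign printable
// characters to.
bool isNonPrintableCode(unsigned int code)
{
    return code >= 128 && code <= 159;
}

// Returns the ASCII character to emit in place of a non-printable code.
// The quote mappings are an assumption based on Windows-1252; every
// other code in the range falls back to a space.
char substituteFor(unsigned int code)
{
    switch (code) {
    case 146: return '\'';  // commonly used as an apostrophe
    case 147:               // assumed: opening double quote
    case 148: return '"';   // assumed: closing double quote
    default:  return ' ';   // anything else becomes a space
    }
}
```

The tokenizer would call isNonPrintableCode() on the number parsed from a
&#NNN; sequence and, when it returns true, emit substituteFor() instead of the
raw character.  Ignoring the character instead would simply mean emitting
nothing in that case.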

Cheers,
Dawit A.
