
List:       kfm-devel
Subject:    Re: Sites with Illegal character entities & khtmlw
From:       Andreas Pour <pour () mieterra ! com>
Date:       1999-03-27 16:41:25

Dawit Alemayehu wrote:

> Greetings,
>
> There is one annoying problem with the way kfm handles incorrect character
> entities.  Currently on my pc it displays small black rectangular boxes that
> distract me from reading the page.  To see an example of this go to
> http://www.msnbc.com/news/253207.asp.  If you view the source for the document,
> they use &#146; &#147; &#148; all over the place.  However, none of these
> character entities is valid under either the HTML 3.2 or HTML 4.0 spec.  I truly
> fail to see why they use these characters at all.  Does anyone know?  Perhaps
> these characters are allowed in M$ Active Server pages, but they sure are not
> valid standard character entities.
>
> My question then is, would it be wise to add a check in htmltoken.cpp that will
> check for the range of non-printable characters and convert them to spaces or
> simply ignore them?  I particularly see &#146; on many, many sites being used
> to incorrectly represent the apostrophe ( ' ) character.

Whether or not it is "incorrect" depends quite heavily on the assumption that some
committee standard is more important than reality.  Reality is that both Netscape
and IE support the Netscape ISO extensions for quotes and dashes, and if 95% or
more of people can see them, why is that not a standard?  Especially when it is so
easy to support them, why would you convert them to spaces instead of rendering
what the web page author intended?
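
As a rough sketch of what supporting them could look like (this is not actual
khtmlw code; the function name remapNumericEntity and the table are assumptions
made up for illustration), numeric references in the 128-159 range can be mapped
to the characters that Windows code page 1252 assigns to those byte values.  That
turns &#146; into a right single quote and &#147; / &#148; into curly double
quotes, rather than a space or a black box:

```cpp
#include <cstdint>

// Hypothetical helper, not part of khtmlw: given the number from a numeric
// character reference such as &#146;, return the Unicode code point the page
// author most likely intended. Values 128-159 are looked up in the Windows
// code page 1252 layout; everything else is returned unchanged.
std::uint32_t remapNumericEntity(std::uint32_t code)
{
    // Code page 1252 assignments for bytes 0x80..0x9F.
    // A zero entry marks a position that code page 1252 leaves undefined.
    static const std::uint16_t cp1252[32] = {
        0x20AC, 0x0000, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x0000, 0x017D, 0x0000,
        0x0000, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x0000, 0x017E, 0x0178
    };

    if (code < 0x80 || code > 0x9F)
        return code;                      // outside the problem range

    const std::uint16_t mapped = cp1252[code - 0x80];
    return mapped != 0 ? mapped : code;   // leave undefined slots alone
}
```

A tokenizer could call something like this right after parsing the digits of a
numeric reference; whether the result can be displayed still depends on the
font and charset in use.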

Regards,

Andreas Pour
