'Re: patch for #776 - khtmlw bug fixed'

List:       kfm-devel
Subject:    Re: patch for #776 - khtmlw bug fixed
From:       Lars Knoll <knoll () mpi-hd ! mpg ! de>
Date:       1999-03-21 12:40:41

On Fri, 19 Mar 1999, David Faure wrote:
>Below is a bug report, and the patch I suggest for it.
>Please give your opinion on it.
>
>----- Forwarded message from Rich Wohlstadter <root@mvp.net> -----
>
>Subject: Bug#776: kfm parses url which contains cgi variables incorrectly
>Date: 	Sat, 20 Feb 1999 00:04:59 -0600 (CST)
>From: Rich Wohlstadter <root@mvp.net>
>To: submit@bugs.kde.org
>
>Package: kfm
>Version: 1.167
>
>I noticed that sometimes kfm will parse an url incorrectly when it
>contains a variable list to be passed on to a cgi script.  For example:
>
><A HREF=http://www.yahoo.com/index?q=kfm&start=10&num=10&sa=N>test</A>
>
>This will get parsed incorrectly as:
>
>http://www.yahoo.com/index?q=kfm&start=10n=10&sa=N
>                                         ^
>
>Another one I saw was this:
>
>http://chooser.mp3.com/cgi-bin/mp3.cgi?path=/mp3/cowboy.mp3&genre=Country&id=38
>
>gets munged into:
>
>http://chooser.mp3.com/cgi-bin/mp3.cgi?path=/mp3/cowboy.mp33=Country&id=38    
>                                                           ^
>                                                           |
>                                                actually a cubed symbol  
>
>Rich Wohlstadter
>A.G. Edwards & Sons
>
>----- End forwarded message -----
>
>
>Index: htmltoken.cpp
>===================================================================
>RCS file: /home/kde/kdelibs/khtmlw/Attic/htmltoken.cpp,v
>retrieving revision 1.47
>diff -u -p -b -r1.47 htmltoken.cpp
>--- htmltoken.cpp       1999/01/05 16:17:20     1.47
>+++ htmltoken.cpp       1999/03/18 22:17:41
>@@ -354,17 +354,19 @@ void HTMLTokenizer::write( const char *s
>            }
>            else
>            {
>-               // Check for &abc12 sequence
>+               // Check for &abc12; sequence
>                if (!isalnum(*src))
>                {
>                     int len;
>                    charEntity = false;
>+                  if (searchBuffer[searchCount+1] == ';') {
>                    searchBuffer[ searchCount+1] = '\0';
>                    res = charsets->convertTag(searchBuffer+1, len).copy();
>                    if (len <= 0)
>                    {
>                        res = 0;
>                    }
>+                  }
>                }
>            } 
>
>
>I added a check for the trailing ';' that one can expect in an entity.
>Before, the ';' was simply assumed, hence the problem with URLs such as the
>ones in the bug report: &nu and &ge are known entities.
>
>Are there HTML pages that don't provide the trailing ';' for entities? The
>patch would break the display of such entities, but as it's bad HTML anyway...

Hi David,

There are some pages that omit the trailing ';', so the patch will probably
"break some broken pages", but I think we should apply it anyway.
First, it's better to have one char entity displayed wrongly than a link
that doesn't work. Second, if web page authors use broken HTML, it's mostly
their own fault if the page isn't displayed correctly. We can't account for
every bad habit people have when composing web pages.
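
To make the trade-off concrete, here is a minimal standalone sketch of the two
behaviours. It is not the khtmlw tokenizer: decodeEntities, the
requireSemicolon flag and the tiny kEntities table are hypothetical names made
up for illustration (the real code looks names up via charsets->convertTag()),
and the lenient branch only approximates the old behaviour by matching the
longest known prefix after the '&'.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical entity table, UTF-8 encoded. Only a few names are listed.
static const std::map<std::string, std::string> kEntities = {
    {"amp", "&"}, {"lt", "<"}, {"gt", ">"},
    {"nu", "\xCE\xBD"},     // Greek small letter nu
    {"ge", "\xE2\x89\xA5"}  // greater-than-or-equal sign
};

// Replace character entities in `in`.
// requireSemicolon == true : "&name" is decoded only when followed by ';'.
// requireSemicolon == false: the longest known prefix after '&' is decoded,
//                            with or without ';' (approximates the old code).
std::string decodeEntities(const std::string &in, bool requireSemicolon)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != '&') {
            out += in[i++];
            continue;
        }
        // Collect the alphanumeric run that follows the '&'.
        std::size_t j = i + 1;
        while (j < in.size() && std::isalnum(static_cast<unsigned char>(in[j])))
            ++j;
        assert(j > i);
        const std::string run = in.substr(i + 1, j - i - 1);

        bool matched = false;
        if (requireSemicolon) {
            auto it = kEntities.find(run);
            if (it != kEntities.end() && j < in.size() && in[j] == ';') {
                out += it->second;
                i = j + 1;  // skip the name and the ';'
                matched = true;
            }
        } else {
            for (std::size_t len = run.size(); len > 0 && !matched; --len) {
                auto it = kEntities.find(run.substr(0, len));
                if (it == kEntities.end())
                    continue;
                out += it->second;
                i += 1 + len;  // skip '&' and the matched prefix
                if (len == run.size() && i < in.size() && in[i] == ';')
                    ++i;       // swallow an optional ';'
                matched = true;
            }
        }
        if (!matched)
            out += in[i++];  // not an entity: keep the '&' literally
    }
    return out;
}
```

With requireSemicolon set, the yahoo.com URL from the bug report passes
through unchanged, while "&lt;" and "&amp;" are still decoded. Without it,
the "&nu" inside "&num=10" is replaced and the link breaks, which is the
failure the patch is meant to prevent.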

Cheers,
Lars
