From kde-commits  Tue Oct 19 16:01:00 2004
From: Andrew Coles <andrew_coles () yahoo ! co ! uk>
Date: Tue, 19 Oct 2004 16:01:00 +0000
To: kde-commits
Subject: Re: kdelibs/khtml/xml
Message-Id: <200410191701.00615.andrew_coles () yahoo ! co ! uk>
X-MARC-Message: https://marc.info/?l=kde-commits&m=109820170232247

On Tuesday 19 Oct 2004 16:31, you wrote:
> On Tuesday 19 October 2004 17:13, Andrew Coles wrote:
> > - do not assume the size of a QChar is 1 byte
>
> it doesn't assume that. It assumes that the size is sizeof(QChar). its not
> a pointer to bytes, its a pointer to QChar's. You change is incorrect I
> think.

Whoops, a small test case shows you're right, my bad - I mis-took how the 
pointer arithmetic worked in that case.

> > - attempting to run a single '%' character through parseLength resulting
> >   in accessing invalid memory
>
> Did you verify that? We purposefully store an extra NUL character at the
> end of each string as a sentinel.

Yes, I've been running malformed HTML through Konqueror running under valgrind 
(see the recent article on slashdot).

Putting a single % into parseLength caused problems: as the last character is 
a % the first if was explored, which doesn't do a great deal in this case 
until last--, l--.  Once last-- has been done, last points to before the 
start of the string, so the later calls (such as 'if (*last == '*')') were 
accessing memory they shouldn't have been doing.  So if there was a NUL 
sentinel at the start of the string that might have fixed it, but one at the 
end wouldn't matter in this case.

I've corrected parseLength after my gaff now: it's as it was before I changed 
it, except with the extra two lines in the middle:

if (l == 0)
        return Length(0, Variable);

which deals with the single % character case.

> > - toLengthArray made more robust to erroneous input:
>
> more robust to which kind of input? You just made it like 2 times slower.
> This code is not there to be nice, its there to be fast as well.

The input was %n%n%n%n%n%n.  It seems, however, that using the old code means 
that the errors there go away when the single % character case is handled in 
parseLength.

So to summarise: code is as before, but now checks for the single % character 
special case in parseLength.

Sorry for taking up your time,

Andrew