List:       kde-bugs-dist
Subject:    [Bug 100107] backward searching a regexp does not work properly
From:       Richard Smith <kde () metafoo ! co ! uk>
Date:       2005-02-26 13:42:05
Message-ID: 20050226134205.16256.qmail () ktown ! kde ! org

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
         
http://bugs.kde.org/show_bug.cgi?id=100107         




------- Additional Comments From kde metafoo co uk  2005-02-26 14:41 -------
On Friday 25 February 2005 14:33, Philippe Rigault wrote:
> QRegExp::searchRev() does not _have_ a bug, the function _is_
> itself a bug and should be avoided altogether (explaination below, sorry if
> it is a bit lengthy).
[...]

I do agree that Kate's behaviour is incorrect here, but your objection to 
searchRev() doesn't seem right to me. Imagine I have the following regular 
expression:

abcdef|[b-e]+

And I'm matching against the string:

foobarabcdefg

Now, where does search() match this? Well, it matches at (in the order in 
which the matches are found):

start:end (matching letters in capitals)
3:4 (fooBarabcdefg)
6:12 (foobarABCDEFg)
7:11 (foobaraBCDEfg)
8:11 (foobarabCDEfg)
9:11 (foobarabcDEfg)
10:11 (foobarabcdEfg)

Right? So, what can a reverse search for this regexp possibly do? If you want 
it symmetric with search(), searchRev() *must* find 10:11 first, like it 
currently does. Then it must find 9:11. Then 8:11. And so on. These are the 
latest-starting greedy matches.
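
As a rough sketch of those two orderings, here is an emulation in Python. 
Python's re module is not QRegExp, and search_fwd() and search_rev() are 
hypothetical stand-ins for search() and searchRev(), written only to show the 
"first match starting at or after" and "first match starting at or before" 
behaviour described above:

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def match_at(pattern, text, start):
    """Return (start, end) of a non-empty match anchored at `start`, or None."""
    m = pattern.match(text, start)
    return m.span() if m and m.end() > m.start() else None

def search_fwd(pattern, text, start):
    """Earliest-starting match whose start is >= `start` (stand-in for search())."""
    for pos in range(start, len(text)):
        span = match_at(pattern, text, pos)
        if span:
            return span
    return None

def search_rev(pattern, text, start):
    """Latest-starting match whose start is <= `start` (stand-in for searchRev())."""
    for pos in range(start, -1, -1):
        span = match_at(pattern, text, pos)
        if span:
            return span
    return None

# Walk forwards, restarting one character after each match start.
forward, pos = [], 0
while pos < len(TEXT):
    span = search_fwd(PATTERN, TEXT, pos)
    if span is None:
        break
    forward.append(span)
    pos = span[0] + 1

# Walk backwards, restarting one character before each match start.
backward, pos = [], len(TEXT) - 1
while pos >= 0:
    span = search_rev(PATTERN, TEXT, pos)
    if span is None:
        break
    backward.append(span)
    pos = span[0] - 1

print(forward)   # [(3, 4), (6, 12), (7, 11), (8, 11), (9, 11), (10, 11)]
print(backward)  # the same spans in the opposite order
```

Under these assumptions the backward walk yields exactly the forward list 
reversed, which is the symmetry meant above.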

If you want the latest-finishing greedy matches, it'd have to match in this 
order:

6:12 (foobarABCDEFg)
7:11 (foobaraBCDEfg)
7:10 (foobaraBCDefg)
7:9 (foobaraBCdefg)
7:8 (foobaraBcdefg)
3:4 (fooBarabcdefg)

which is not a reverse search: the starting point of the match is sometimes 
moving *forwards*: it finds 'abcdef' then finds 'bcde'. And it's certainly 
not symmetric with search(): it finds different matches!
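
The latest-finishing order can be checked with a brute-force sketch, again in 
Python rather than with QRegExp. The helper latest_finishing() is a 
hypothetical name; for each end position, from the right, it takes the 
earliest start (that is, the longest match) that ends exactly there:

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def latest_finishing(pattern, text):
    """List the longest non-empty match for each end position, latest end first."""
    spans = []
    for end in range(len(text), 0, -1):
        for start in range(end):  # earliest start first, so longest match wins
            # fullmatch(text, start, end) requires the match to cover
            # text[start:end] exactly.
            if pattern.fullmatch(text, start, end):
                spans.append((start, end))
                break
    return spans

spans = latest_finishing(PATTERN, TEXT)
print(spans)  # [(6, 12), (7, 11), (7, 10), (7, 9), (7, 8), (3, 4)]
```

This tries a quadratic number of candidate spans, so it is only meant to 
confirm the list above, not to suggest an implementation.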

How would you like reverse search with regexps to work at the Qt API level?

Now, how Kate deals with regexp searching is another matter. Using the same 
regexp and search string, it matches:

3:4 (fooBarabcdefg)
6:12 (foobarABCDEFg)

and no other places. Reverse search should clearly find the same things in the 
opposite order, so should match:

6:12 (foobarABCDEFg)
3:4 (fooBarabcdefg)

but actually matches:

10:11 (foobarabcdEfg)
6:12 (foobarABCDEFg)

Interestingly, in this case, the latest-finishing greedy matches are exactly 
what's wanted here: if we look for nonintersecting matches backwards using 
latest-finishing matching, we match at 6:12 then at 3:4, like we should (and 
this works in general too).
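
A minimal sketch of that idea, assuming Python's re module in place of 
QRegExp and using re.finditer() as a stand-in for Kate's nonintersecting 
forward search. The names rfind_latest_finishing() and backward_matches() are 
hypothetical:

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def rfind_latest_finishing(pattern, text, limit):
    """Latest-finishing non-empty match that ends at or before `limit`."""
    for end in range(limit, 0, -1):
        for start in range(end):  # earliest start first, so longest match wins
            if pattern.fullmatch(text, start, end):
                return (start, end)
    return None

def backward_matches(pattern, text):
    """Collect nonintersecting matches from the end of `text` backwards."""
    spans, limit = [], len(text)
    while True:
        span = rfind_latest_finishing(pattern, text, limit)
        if span is None:
            return spans
        spans.append(span)
        limit = span[0]  # the next match may not extend past this start

forward = [m.span() for m in PATTERN.finditer(TEXT)]
backward = backward_matches(PATTERN, TEXT)
print(forward)   # [(3, 4), (6, 12)]
print(backward)  # [(6, 12), (3, 4)]
```

For this regexp and string the backward list is the forward list reversed, 
as wanted.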

Unfortunately, Qt doesn't provide a function to do this kind of searching, 
which makes it non-trivial for Kate to implement. There is a way, however 
(but it is nasty to implement): you can reverse the regexp (which is hard) 
and reverse the search string (which is much easier):

Regexp: [b-e]+|fedcba
String: gfedcbaraboof

With these, Kate now matches:
gFEDCBAraboof
gfedcbaraBoof

which, when reversed back, give us the correct matches in the correct order. 
This reversing is equivalent to doing a latest-finishing greedy match. Clearly, 
this isn't going to be implemented for KDE 3.4, but it does seem to be a valid 
Kate bug.
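
For illustration, the reversal trick looks like this in Python (not Qt). The 
regexp is reversed by hand here, since reversing one automatically is the 
hard part, and backward_via_reversal() is a hypothetical helper name:

```python
import re

TEXT = "foobarabcdefg"
# Hand-reversed form of abcdef|[b-e]+ (an assumption for this one regexp only).
REVERSED_PATTERN = re.compile(r"[b-e]+|fedcba")

def backward_via_reversal(reversed_pattern, text):
    """Search the reversed text forwards, then map spans back to `text`."""
    n = len(text)
    reversed_text = text[::-1]
    return [(n - m.end(), n - m.start())
            for m in reversed_pattern.finditer(reversed_text)]

spans = backward_via_reversal(REVERSED_PATTERN, TEXT)
print(TEXT[::-1])  # gfedcbaraboof
print(spans)       # [(6, 12), (3, 4)]
```

A span (s, e) in the reversed string corresponds to (n - e, n - s) in the 
original, where n is the string length.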