List: kde-bugs-dist
Subject: [Bug 100107] backward searching a regexp does not work properly
From: Richard Smith <kde () metafoo ! co ! uk>
Date: 2005-02-26 13:42:05
Message-ID: 20050226134205.16256.qmail () ktown ! kde ! org
http://bugs.kde.org/show_bug.cgi?id=100107
------- Additional Comments From kde metafoo co uk 2005-02-26 14:41 -------
On Friday 25 February 2005 14:33, Philippe Rigault wrote:
> QRegExp::searchRev() does not _have_ a bug, the function _is_
> itself a bug and should be avoided altogether (explanation below, sorry if
> it is a bit lengthy).
[...]
I do agree that Kate's behaviour is incorrect here, but your objection to
searchRev() doesn't seem right to me. Imagine I have the following regular
expression:
abcdef|[b-e]+
And I'm matching against the string:
foobarabcdefg
Now, where does search() match this? Well, it matches at the following
positions (in the order in which the matches are found), written as
start:end with the matching letters in capitals:
3:4 (fooBarabcdefg)
6:12 (foobarABCDEFg)
7:11 (foobaraBCDEfg)
8:11 (foobarabCDEfg)
9:11 (foobarabcDEfg)
10:11 (foobarabcdEfg)
Right? So, what can a reverse search for this regexp possibly do? If you want
it to be symmetric with search(), searchRev() *must* find 10:11 first, as it
currently does. Then it must find 9:11, then 8:11, and so on. These are the
latest-starting greedy matches.
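
For illustration, the forward list and its mirror image can be reproduced with a small sketch. It uses Python's re module rather than QRegExp, on the assumption that both engines agree for this particular pattern; matches_by_start is a hypothetical helper, not a Qt function.

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def matches_by_start(pattern, text):
    """Return the (start, end) span of the match anchored at each start position."""
    spans = []
    for start in range(len(text)):
        # match() is anchored at `start`; this mimics a search that steps
        # forward one character at a time.
        m = pattern.match(text, start)
        if m and m.end() > m.start():
            spans.append(m.span())
    return spans

forward = matches_by_start(PATTERN, TEXT)
# A reverse search that is symmetric with the forward one simply visits
# the same spans in the opposite order: latest-starting match first.
backward = list(reversed(forward))

print(forward)
print(backward)
```

The first span in backward is 10:11, which is the match the message says searchRev() must find first.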
If you want the latest-finishing greedy matches, it'd have to match in this
order:
6:12 (foobarABCDEFg)
7:11 (foobaraBCDEfg)
7:10 (foobaraBCDefg)
7:9 (foobaraBCdefg)
7:8 (foobaraBcdefg)
3:4 (fooBarabcdefg)
which is not a reverse search: the starting point of the match is sometimes
moving *forwards*: it finds 'abcdef' then finds 'bcde'. And it's certainly
not symmetric with search(): it finds different matches!
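
A rough sketch of what "latest-finishing greedy" means here, again in Python's re module and not in QRegExp: for each end position, latest first, take the longest match that finishes exactly there. The helper name matches_by_end is made up for this example, and the brute-force loop is only meant to make the ordering visible.

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def matches_by_end(pattern, text):
    """For each end position, latest first, return the longest match ending there."""
    spans = []
    for end in range(len(text), 0, -1):
        # Trying the smallest start first means the longest match wins.
        for start in range(end):
            if pattern.fullmatch(text, start, end):
                spans.append((start, end))
                break
    return spans

spans = matches_by_end(PATTERN, TEXT)
print(spans)
# The start offsets are not monotonically decreasing: the second match
# starts later than the first one.
print([start for start, _ in spans])
```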
How would you like reverse search with regexps to work at the Qt API level?
Now, how Kate deals with regexp searching is another matter. Using the same
regexp and search string, it matches:
3:4 (fooBarabcdefg)
6:12 (foobarABCDEFg)
and no other places. Reverse search should clearly find the same things in the
opposite order, so should match:
6:12 (foobarABCDEFg)
3:4 (fooBarabcdefg)
but actually matches:
10:11 (foobarabcdEfg)
6:12 (fooBarabcdefg)
Interestingly, the latest-finishing greedy matches are exactly what's wanted
here: if we look for non-intersecting matches backwards using
latest-finishing matching, we match at 6:12 and then at 3:4, as we should
(and this works in general too).
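
The non-intersecting backward search described here could be sketched as follows. This is a brute-force illustration in Python, not Kate's or Qt's code; search_backwards is a hypothetical name, and the example only checks this one pattern and string.

```python
import re

PATTERN = re.compile(r"abcdef|[b-e]+")
TEXT = "foobarabcdefg"

def latest_finishing_match(pattern, text, limit):
    """Longest match that finishes as late as possible, at or before `limit`."""
    for end in range(limit, 0, -1):
        for start in range(end):
            if pattern.fullmatch(text, start, end):
                return (start, end)
    return None

def search_backwards(pattern, text):
    """Collect non-intersecting matches from the end of the text to the start."""
    spans = []
    limit = len(text)
    while limit > 0:
        span = latest_finishing_match(pattern, text, limit)
        if span is None:
            break
        spans.append(span)
        limit = span[0]  # the next match must finish before this one starts
    return spans

forward = [m.span() for m in PATTERN.finditer(TEXT)]
backward = search_backwards(PATTERN, TEXT)
print(forward)
print(backward)
```

For this input the backward result is the forward non-overlapping result in reverse order. The nested loops are far too slow for an editor; they are only there to pin down the intended semantics.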
Unfortunately, Qt doesn't provide a function to do this kind of searching,
which makes it non-trivial for Kate to implement. There is a way, however
(but it is nasty to implement): you can reverse the regexp (which is hard)
and reverse the search string (which is much easier):
regexp: [b-e]+|fedcba
string: gfedcbaraboof
With these, Kate now matches:
gFEDCBAraboof
gfedcbaraBoof
which, when reversed back, give us the correct matches in the correct order.
This reversing is equivalent to doing a latest-finishing greedy match.
Clearly, this isn't going to be implemented for KDE3.4, but it does seem to
be a valid Kate bug.
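
A minimal sketch of the reversal trick, assuming the regexp has already been reversed by hand as in the example above (reversing an arbitrary regexp automatically is the hard part and is not attempted). It uses Python's re module, and search_backwards_by_reversal is a hypothetical helper name.

```python
import re

TEXT = "foobarabcdefg"
# Reversed by hand from abcdef|[b-e]+, as in the message.
REVERSED_PATTERN = re.compile(r"[b-e]+|fedcba")

def search_backwards_by_reversal(reversed_pattern, text):
    """Run a forward search on the reversed text and map the spans back."""
    n = len(text)
    reversed_text = text[::-1]
    spans = []
    for m in reversed_pattern.finditer(reversed_text):
        # A span [s, e) in the reversed text is [n - e, n - s) in the original.
        spans.append((n - m.end(), n - m.start()))
    return spans

backward = search_backwards_by_reversal(REVERSED_PATTERN, TEXT)
print(TEXT[::-1])
print(backward)
```

Mapped back to the original string, the spans come out as 6:12 followed by 3:4, which is the order a backward search should report.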