
List:       kde-bugs-dist
Subject:    [Bug 100107] backward searching a regexp does not work properly
From:       Philippe Rigault <prigault () oricom ! ca>
Date:       2005-02-25 14:33:57
Message-ID: 20050225143357.32371.qmail () ktown ! kde ! org

http://bugs.kde.org/show_bug.cgi?id=100107         




------- Additional Comments From prigault oricom ca  2005-02-25 15:33 -------
QRegExp::searchRev() does not _have_ a bug; the function _is_ itself a bug and
should be avoided altogether (explanation below, sorry if it is a bit lengthy).

QRegExp::searchRev() confuses a lot of people, who wrongly think that it
searches for a regexp backwards from the end of a string, and is therefore
symmetrical to QRegExp::search().

This is *NOT* what QRegExp::searchRev() does.

What it does is both complicated and counterintuitive, two reasons making it
worse than just useless.

Here is how I learned exactly how QRegExp::searchRev() works. I filed a bug
with Trolltech about this in December ([Issue N62377]) because of another bug:
http://bugs.kde.org/show_bug.cgi?id=94726

> Hello,
> 
> Qt version: qt-3.3.3
> Compiled from sources.
> Platform: Linux, i686
> Compiler: gcc-3.4.2 (Fedora Core 3) and gcc-3.3.3 (Fedora Core 1)
> 
> I think that QRegExp::searchRev() has a bug, demonstrated by the following
> code:
> 
> #include <iostream>
> #include <qregexp.h>
> #include <qstring.h>
> 
> using namespace std;
> 
> int main (int argc, char *argv[]) {
>     QString name = "foo123.png";
>     QRegExp rx("\\d+");
> 
>     // Forward search: i is the match index, j the matched length
>     int i = rx.search(name);
>     int j = rx.matchedLength();
> 
>     cout << "search i=" << i << " j=" << j << endl;
> 
>     // Same regexp and string, this time with searchRev()
>     i = rx.searchRev(name);
>     j = rx.matchedLength();
> 
>     cout << "searchRev i=" << i << " j=" << j << endl;
> 
>     return 0;
> }
> 
> $ ./test_regexp
> search i=3 j=3
> searchRev i=5 j=1
> 
> It shows that the QRegExp::searchRev() function matches only the last digit
> of the number ("3") instead of the whole thing ("123").
> QRegExp::search() behaves correctly (i.e. matches "123").
> 

And here is their answer:

> Hi Philippe,
> 
> On Sunday, 12. Dec 2004 00:06 Philippe Rigault wrote:
> 
> [snip]
> 
> > It shows that the QRegExp::searchRev() function matches only the last
> > digit of the number ("3") instead of the whole thing ("123").
> > QRegExp::search() behaves correctly (i.e. matches "123").
> 
> This is actually by design, QRegExp::searchRev() will still search
> forward, but it will just start at the end, and work its way
> backwards.
> 
> Hope this helps.
> 
> --
> Jan Erik Hanssen
> Trolltech AS, Waldemar Thranes gate 98, NO-0175 Oslo, Norway

You can now figure out what QRegExp::searchRev() does in plain English:
Given a string, it searches _forward_ for a regexp in successive substrings
consisting first of the last character of the string, then the last two
characters, and so on. Contrast it with QRegExp::search(), which also searches
forward for a regexp in successive substrings, but these substrings consist
first of the entire string, then the substring starting at the second
character, and so on.
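
The two behaviours can be sketched outside Qt. This is only an illustration,
assuming that std::regex with the match_continuous flag is a fair stand-in for
"try to match starting exactly at this position"; the helper names
forward_search() and reverse_start_search() are made up for this example and
are not part of Qt or of the standard library.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: (index, length) of the first match found by an
// ordinary forward search, or (-1, -1) if there is none.
std::pair<int, int> forward_search(const std::regex& rx, const std::string& s) {
    std::smatch m;
    if (std::regex_search(s, m, rx))
        return {static_cast<int>(m.position(0)), static_cast<int>(m.length(0))};
    return {-1, -1};
}

// Hypothetical helper emulating the behaviour described above: the start
// position moves from the last character towards the first, but at each
// start position the regexp is still matched forward.
std::pair<int, int> reverse_start_search(const std::regex& rx, const std::string& s) {
    for (int pos = static_cast<int>(s.size()) - 1; pos >= 0; --pos) {
        std::smatch m;
        // match_continuous: the match must begin exactly at s.begin() + pos
        if (std::regex_search(s.begin() + pos, s.end(), m, rx,
                              std::regex_constants::match_continuous))
            return {pos, static_cast<int>(m.length(0))};
    }
    return {-1, -1};
}
```

With "foo123.png" and the regexp \d+, forward_search() gives index 3 and
length 3, while reverse_start_search() stops at the first start position that
works, index 5, where only "3" is left to match. That is the same pair of
results as in the test program output quoted above.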

This design of QRegExp::searchRev(), in addition to being counterintuitive,
poses a major problem with regard to greediness, as I objected to Trolltech:

> Hi Jan,
> 
> Thank you for your rapid response.
> 
> This renders searchRev() directly incompatible with the very notion of
> greediness in regular expressions with quantifiers.
> What people expect is that:
> - with setMinimal(FALSE), the search is greedy
> - with setMinimal(TRUE), the search is not greedy
> 
> Your definition of QRegExp::searchRev() _guarantees_ that the search can
> never be greedy, and that setMinimal() has no effect on
> QRegExp::searchRev(), which is unlike any other function and absolutely
> counterintuitive.
> 
> If this is really the case (and then QRegExp::searchRev() would be a pretty
> useless function because it cannot use quantifiers properly), this needs to
> be _heavily_ documented.
> 
> Now I think that the design part is the bug.
> 
> Best regards,
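
To make the greediness point concrete, here is a small sketch under the same
assumptions as before: std::regex stands in for QRegExp, and the lazy
quantifier \d+? stands in for setMinimal(TRUE). The function name
reverse_start_match() is hypothetical. It is not a test of Qt itself, only of
the search strategy that Trolltech described.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: text matched when start positions are tried from the
// last character backwards and the regexp is matched forward at each one.
// Returns an empty string if nothing matches.
std::string reverse_start_match(const std::string& pattern, const std::string& s) {
    const std::regex rx(pattern);
    for (int pos = static_cast<int>(s.size()) - 1; pos >= 0; --pos) {
        std::smatch m;
        // The match must begin exactly at s.begin() + pos
        if (std::regex_search(s.begin() + pos, s.end(), m, rx,
                              std::regex_constants::match_continuous))
            return m.str(0);
    }
    return std::string();
}
```

On "foo123.png", both the greedy \d+ and the lazy \d+? yield "3" with this
strategy: the search stops at the first start position that matches, so the
quantifier never gets a chance to extend the match towards the left.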

My opinion is therefore that QRegExp::searchRev() should always be avoided:
  - because in most cases, a proper use of QRegExp::search() can do the job
    (for example, if one really wants the symmetrical action of
    QRegExp::search(), then use it on the reversed string)
  - because it is incompatible with greediness, as demonstrated above
  - because each inclusion of this function in the code would require _heavy_
    warning comments to remind people that this is not a straightforward
    function.
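
As an illustration of the first point, here is one way to find the last match
of a regexp using forward searches only, so that greediness is preserved. It
is a sketch using std::regex rather than QRegExp, and last_match() is a
made-up name. With QRegExp, a similar loop could presumably be written with
repeated QRegExp::search() calls that pass an increasing offset, but I have
not tested that variant.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: (index, length) of the last non-overlapping match
// found while scanning forward, or (-1, -1) if there is no match.
std::pair<int, int> last_match(const std::regex& rx, const std::string& s) {
    std::pair<int, int> result(-1, -1);
    const std::sregex_iterator end;
    for (std::sregex_iterator it(s.begin(), s.end(), rx); it != end; ++it) {
        // Keep overwriting, so that the final value describes the last match
        result.first = static_cast<int>((*it)[0].first - s.begin());
        result.second = static_cast<int>(it->length(0));
    }
    return result;
}
```

For "foo123.png" and \d+ this returns index 3 and length 3, i.e. the whole
"123", which is what most callers of searchRev() seem to expect.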

Best regards,

Philippe Rigault

