List:       kde-bugs-dist
Subject:    [Baloo] [Bug 339908] New: baloo_file_extractor ignoring files it should not ignore because of regexp
From:       Dominik Cermak <d.cermak () arcor ! de>
Date:       2014-10-12 18:10:13
Message-ID: bug-339908-17878 () http ! bugs ! kde ! org/

https://bugs.kde.org/show_bug.cgi?id=339908

            Bug ID: 339908
           Summary: baloo_file_extractor ignoring files it should not
                    ignore because of regexp
           Product: Baloo
           Version: 5.0.1
          Platform: Compiled Sources
                OS: Linux
            Status: UNCONFIRMED
          Severity: major
          Priority: NOR
         Component: Files
          Assignee: me@vhanda.in
          Reporter: d.cermak@arcor.de

I discovered that Baloo couldn't find some of my videos because they had never
been indexed.
Running "baloo_file_extractor" on one of the files in Konsole printed
"<file> should not be indexed. Ignoring". Looking at the source code, I found
the cause:

commit 282c8dff201d19fd6dbaf42a07cb561b644c5b18
Author: Vishesh Handa <me@vhanda.in>
Date:   Tue Jun 17 16:04:49 2014 +0200

    RegExpCache: Use 'QRegularExpression' instead of "QRegExp"

    This results in a performance increase of almost 10x. This is especially
    important because with this we will now consume less cpu when checking
    which files should be indexed, and we will be faster.

The problem is that QRegularExpression doesn't support wildcard matching (see
http://qt-project.org/doc/qt-5/qregularexpression.html#wildcard-matching), so
the exclude filters now match far too much.
Example: the exclude filters contain "*.o". This was fine with QRegExp, where
it meant "every file/folder whose name ends with .o". As a regular expression,
however, it means "any character (except newline), zero or more times,
followed by o". So every file/folder whose name ends with "o" is ignored
(that's the case for my videos).
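
To make the difference concrete, here is a minimal sketch of the two readings
of "*.o". It uses std::regex from the C++ standard library rather than
QRegularExpression, and it is not Baloo's code: the function names and the
pattern ".*o" (the regex reading described above) are assumptions made only
for illustration.

```cpp
#include <regex>
#include <string>

// Regex reading described above: any characters, then a literal 'o'.
// Illustration with std::regex only; hypothetical helper, not Baloo code.
bool matchesAsRegex(const std::string& name) {
    static const std::regex loose(".*o");
    return std::regex_match(name, loose);  // the whole name must match
}

// Wildcard reading of "*.o": the name must end with the two characters ".o".
bool matchesAsWildcard(const std::string& name) {
    const std::string suffix = ".o";
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Under these assumptions a name such as "main.o" is excluded by both readings,
while a name such as "demo" is excluded only by the regex reading.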

So we could either revert to QRegExp or convert the exclude filters into
correct regular expressions. What's your opinion, Vishesh?
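
As a rough sketch of the second option, the wildcards could be translated
into regular expressions before they are compiled. The code below again uses
std::regex instead of QRegularExpression so that it is self-contained;
wildcardToRegex and isExcluded are hypothetical names, and character classes
such as "[abc]" are escaped literally rather than supported.

```cpp
#include <regex>
#include <string>

// Translate a shell-style wildcard into a regular expression string.
// '*' becomes ".*", '?' becomes ".", regex metacharacters are escaped.
std::string wildcardToRegex(const std::string& wildcard) {
    static const std::string meta = "\\^$.|+()[]{}";
    std::string out;
    for (char c : wildcard) {
        if (c == '*') {
            out += ".*";
        } else if (c == '?') {
            out += '.';
        } else if (meta.find(c) != std::string::npos) {
            out += '\\';  // escape so the character matches itself
            out += c;
        } else {
            out += c;
        }
    }
    return out;
}

// True if the whole name matches the wildcard (regex_match anchors both ends).
bool isExcluded(const std::string& name, const std::string& wildcard) {
    return std::regex_match(name, std::regex(wildcardToRegex(wildcard)));
}
```

With this translation "*.o" becomes the pattern `.*\.o`, so "main.o" is still
excluded but a name that merely ends with "o" is not.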

Reproducible: Always

Steps to Reproduce:
1. Create a file or a folder whose name ends with "o"
2. Try to index the file with baloo_file_extractor

Actual Results:  
Because the last character is "o", the name matches the exclude filter "*.o"
and the file is ignored.

Expected Results:  
It should get indexed.

-- 
You are receiving this mail because:
You are watching all bug changes.