List: kwrite-devel
Subject: Re: fuzzy-matching in quickopen...
From: Alexander Neundorf <neundorf () kde ! org>
Date: 2022-09-25 20:49:37
Message-ID: 4382518.Wku2Vz74k6 () unknownc4d9870202f1
Hi,
On Saturday, 24 September 2022 00:06:34 CEST Waqar Ahmed wrote:
> I am against adding the old way, but if it's optional, ok sure as long as
> it is disabled by default.
>
> Your approach is completely incorrect though and the only reason I will say
> ok to the patch is because Christoph already said ok. We can and should
> improve the algorithm instead rather than just bringing back the old way on
> the first complaint.
Here are 3 examples (in the kate source tree) where the calculated score is
IMO not good:
I want to switch to "KateSearchCommand.cpp", which is already open.
filter "ese":
KateSearchCommand.cpp gets a score of 113
MultilineStartEndOfLineMatch.txt gets a higher score of 116, even though it
does not contain the string "ese", only "eS" and "E" with 4 characters in
between.
I think a string which contains the filter exactly should get a higher score
than a string which "just" contains the characters.
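A minimal sketch of this idea, assuming a fuzzy score has already been computed elsewhere. The helpers `toLower`, `containsExactly` and `adjustedScore` and the bonus value of 50 are hypothetical and chosen only for illustration; they are not part of kfts_fuzzy_match.h.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: ASCII lower-casing, enough for this illustration.
static std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// True if `pattern` occurs in `str` as one contiguous run, ignoring case.
bool containsExactly(const std::string &pattern, const std::string &str)
{
    return toLower(str).find(toLower(pattern)) != std::string::npos;
}

// Adds an (assumed, arbitrary) bonus on top of the fuzzy score when the
// filter appears verbatim in the candidate string.
int adjustedScore(int fuzzyScore, const std::string &pattern, const std::string &str)
{
    const int exactMatchBonus = 50; // illustrative value only
    return fuzzyScore + (containsExactly(pattern, str) ? exactMatchBonus : 0);
}
```

With the scores quoted above for the filter "ese", such a bonus would lift KateSearchCommand.cpp (113) above MultilineStartEndOfLineMatch.txt (116), because only the former contains "ese" as a contiguous substring.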
filter "tes":
KateSearchCommand.cpp gets a score of 118 and comes in at place 23, i.e.
not visible without scrolling.
tests.qrc gets a higher score of 159, probably because it starts with
"tes", but it is not open yet. There are about 20 files which start with
"test", and none of them are open.
I often leave out the start of the filename, because often this is the same for
many files in a project (e.g. "kate" in kate, or "q" in Qt, or "algo" in some
other project), so I start typing with something in the middle of the filename.
So I'd suggest that the "is open" bonus should be bigger than the "starts
with" bonus.
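A rough sketch of how the size of the "is open" bonus changes the ordering, using the two scores quoted above for the filter "tes". The `Candidate` struct, the `rankCandidates` function and the bonus value passed in are hypothetical; the real quickopen code is structured differently.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical candidate record for illustration.
struct Candidate {
    std::string name;
    int fuzzyScore; // score from the matcher, including any "starts with" bonus
    bool isOpen;    // whether the file is already open in the editor
};

// Sorts candidates by fuzzy score plus an "is open" bonus, best first.
std::vector<Candidate> rankCandidates(std::vector<Candidate> items, int openBonus)
{
    auto total = [openBonus](const Candidate &c) {
        return c.fuzzyScore + (c.isOpen ? openBonus : 0);
    };
    std::stable_sort(items.begin(), items.end(),
                     [&total](const Candidate &a, const Candidate &b) {
                         return total(a) > total(b);
                     });
    return items;
}
```

For example, with `{"tests.qrc", 159, false}` and `{"KateSearchCommand.cpp", 118, true}`, an open bonus of 0 ranks tests.qrc first, while any open bonus above 41 ranks the already open KateSearchCommand.cpp first.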
Different example: I want to switch to "kfts_fuzzy_match.h"
filter "fts":
kfts_fuzzy_match.h gets a score of 100
filetree_model_test.cpp gets a higher score of 120. Again, I'd suggest that a
string which contains the filter string exactly should get a higher score than
a string which "just" contains the characters.
The following gives IMO better results:
bonus for "already open" = 15
if (matched) {
int sequentialBonus = 25;
int separatorBonus = 10; // bonus if match occurs after a separator
int camelBonus = 10; // bonus if match is uppercase and prev is lower
int firstLetterBonus = 10; // bonus if the first letter is matched
int leadingLetterPenalty = 0; // penalty applied for every letter in str before the first match
int maxLeadingLetterPenalty = 0; // maximum penalty for leading letters
int unmatchedLetterPenalty = -1; // penalty for every letter that doesn't matter
int nonBeginSequenceBonus = 20;
I'm not sure I understand the following code. Doesn't it mean that a long
filename gets a big bonus?
// extra points if file exists in project root
// This gives priority to the files at the root
// of the project over others. This is important
// because otherwise getting to root files may
// not be that easy
if (!matchPath) {
score += (sm->idxToFilePath(sourceRow) == name) * name.size();
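To make the question concrete, here is a small stand-alone sketch of the quoted expression. It assumes that the path returned by `idxToFilePath` is relative to the project root, so that it equals the bare file name only for files in the root; `rootFileBonus` is a hypothetical name and plain `std::string` stands in for the real types.

```cpp
#include <string>

// Stand-in for the quoted expression: the comparison yields 0 or 1 and is
// multiplied by the length of the file name. Assumes `filePath` is relative
// to the project root (an assumption, not confirmed by the quoted code).
int rootFileBonus(const std::string &filePath, const std::string &name)
{
    return static_cast<int>((filePath == name) * name.size());
}
```

Under that assumption, a root file named a.h would get a bonus of 3 and CMakeLists.txt a bonus of 14, while src/a.h would get 0, so the bonus grows with the length of the file name.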
Alex