List:       kwrite-devel
Subject:    Re: fuzzy-matching in quickopen...
From:       Alexander Neundorf <neundorf () kde ! org>
Date:       2022-09-29 21:24:33
Message-ID: 1779137.8hzESeGDPO () unknown0090f5ef9f13

Hi,

On Tuesday, 27 September 2022 23:02:48 CEST Alexander Neundorf wrote:
> Hi Waqar,
>
> On Tuesday, 27 September 2022 17:05:00 CEST Waqar Ahmed wrote:
> > For the second screenshot, results are great for the query.
> > "search.ui" is a near perfect match. Instead if you had typed "sc",
> > which is much shorter, it would be most likely the first result. **For
> > the Nth time, You need to leverage the fuzzy stuff**.
> >
> > "Prefer Open Files" and then giving a score of 1000 is just completely
> > wrong. Sorry. It is not "preferring" anymore, it is brutally bringing
> > up an open file even though it might be the worst possible match.
> > Thats not how it is supposed to work.
> >
> > Btw, did you even try the MR where I tagged you?
>
> sorry, no, I missed that. When was it ?
> Can you please post the link again ?

I guess it's this one?

https://invent.kde.org/utilities/kate/-/merge_requests/905/diffs?commit_id=98c8356244421c2beff09fde261920175da6c524 [1]

I gave it a try, in the kate source tree.
Unfortunately it changes the behaviour only very little.
I tried the following: a few files open, including KateSearchCommand.cpp, and I want to switch to KateSearchCommand.cpp.
I don't start by typing "kate" (since this is very non-unique in the kate sources), but I want to switch to the file by typing as few characters as possible, starting at "search".
Attached are the screenshots.
quickopen-1.jpg is without filter, KateSearchCommand.cpp is the second file.

So I start typing. After the first character, "s", KateSearchCommand.cpp has disappeared from the list, see quickopen-2.jpg. As a user, I'm confused.
I think this shouldn't care about the casing. But as a user I would just consider this a strange glitch and continue typing.

So I continue typing. From "se" to "search" not much changes: "KateSearchCommand.cpp" is still not in the visible list; instead there are other files I don't intend to open which also contain (start with) "search". See quickopen-3.jpg.
As a user, I might give up at this point.

When I add the "c", giving "searchc", KateSearchCommand.cpp jumps to the top, see quickopen-4.jpg.

At "search", the score for KateSearchCommand.cpp is 290 (326 with the 12.5% increased bonuses), while the score for e.g. SearchDiskFiles.h is 458.
So it seems to me that the bonus for a match that starts at the beginning is too big.

I played around with the bonuses a bit (see attached patch), and got results which I prefer.
I set nonBeginSequenceBonus to be the same as sequentialBonus; otherwise a match in the middle of the word gets an increasingly bad score the more characters match, compared to a match right at the start.
To still give a match at the start an advantage over a match in the middle of a word, I increased firstLetterBonus to 35, so it's bigger than the camelBonus.
Since "KateSearchCommand.cpp" has already been "punished" for "Search" not starting at the beginning of the string (by camelBonus being smaller than firstLetterBonus), I removed the leading letter penalty again if the camelBonus or separatorBonus is applied.

With this, "KateSearchCommand.cpp" appears at the bottom of the list again at "se", and at "sear" it is already the top match.

What do you think?

Alex

--------
[1] https://invent.kde.org/utilities/kate/-/merge_requests/905/diffs?commit_id=98c8356244421c2beff09fde261920175da6c524

["tweaked_bonuses.patch" (tweaked_bonuses.patch)]

diff --git a/apps/lib/kfts_fuzzy_match.h b/apps/lib/kfts_fuzzy_match.h
index d9c638098..ba6ba4074 100644
--- a/apps/lib/kfts_fuzzy_match.h
+++ b/apps/lib/kfts_fuzzy_match.h
@@ -231,14 +231,14 @@ static bool fuzzy_internal::fuzzy_match_recursive(QStringView::const_iterator pa
     static constexpr int sequentialBonus = 25;
     static constexpr int separatorBonus = 25; // bonus if match occurs after a separator
     static constexpr int camelBonus = 25; // bonus if match is uppercase and prev is lower
-    static constexpr int firstLetterBonus = 15; // bonus if the first letter is matched
+    static constexpr int firstLetterBonus = 35; // bonus if the first letter is matched
     static constexpr int leadingLetterPenalty = -5; // penalty applied for every letter in str before the first match
     static constexpr int maxLeadingLetterPenalty = -15; // maximum penalty for leading letters
     static constexpr int unmatchedLetterPenalty = -1; // penalty for every letter that doesn't matter
-    static constexpr int nonBeginSequenceBonus = 10;
-
+    static constexpr int nonBeginSequenceBonus = 25;
+
     // Initialize score
     outScore = 100;
@@ -287,12 +287,14 @@ static bool fuzzy_internal::fuzzy_match_recursive(QStringView::const_iterator pa
             const bool neighborSeparator = neighbor == QLatin1Char('_') || neighbor == QLatin1Char(' ');
             if (!neighborSeparator && neighbor.isLower() && curr.isUpper()) {
                 outScore += camelBonus;
+                outScore -= penalty;
                 continue;
             }
             // Separator
             if (neighborSeparator) {
                 outScore += separatorBonus;
+                outScore -= penalty;
             }
         }

["quickopen-1.jpg" (quickopen-1.jpg)]