Kakoune: Fuzzy finder accuracy

Created on 14 Oct 2020 · 5 Comments · Source: mawww/kakoune

The behavior of the builtin fuzzy finder puzzles me. Consider this example:

[image: screenshot of the completion list]

As you can see, the file I actually want, common_startup.rb, is listed dead last, even though I have entered its filename almost verbatim. As a user coming from Sublime Text, and having written my own fuzzy finder implementations in the past, I don't understand the sorting of results. I've tried to read through the source code of the fuzzy search in Kakoune, but couldn't immediately make sense of its sorting algorithm. Could anyone provide more information on how it prioritizes results, and what I should have entered here instead to get the file I wanted?

All 5 comments

I believe it is prioritizing matching after path separators / (or word boundaries in general), if I remember correctly from @mawww describing it on IRC recently (someone more knowledgeable or remembering better can correct me). Then that ordering makes sense because it mostly matched .../common/.../startup/... patterns.

FWIW the matching logic is here. If I am interpreting it correctly, it seems that matches that satisfy certain flags (as determined here) are always prioritized, then if the flags are the same another heuristic comes in that includes the word boundary counting logic.

If you used an exact substring like common_startup or startup.rb I imagine the correct file would take precedence over others.

What completions is your find command using? After some digging it seems to be -shell-script-candidates.
I can reproduce with

define-command -override x -params 1.. %{
} -shell-script-candidates %{
    echo flow/common/lib/startup/CStartFlowingParser.rb
    echo lib/ruby/cosim/helpers/common_startup.rb
}

and typing :x commonstartup.rb.

The responsible part of the ranking is here: the CStartFlowingParser.rb option hits more word boundaries than the common_startup.rb one (4 vs. 2), and is therefore sorted earlier.

Side note: typing a character that occurs earlier(?) in the second option makes them switch order: :x l.
This completion is generic, not specific to filenames, but it still special-cases /.
At least -buffer-completion prioritizes matches to the basename (which would work for your example); I'm not sure whether that would also make sense here.
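
To check the 4 vs. 2 count by hand, here is a minimal Python sketch of one way to count word-boundary hits. It is an approximation of the behaviour described in this thread, not a copy of the code in ranked_match.cc: the function name count_word_boundary_hits and the exact boundary rules (start of string, non-alphanumeric followed by alphanumeric, lowercase followed by uppercase) are assumptions made for illustration.

```python
def count_word_boundary_hits(candidate: str, query: str) -> int:
    """Count candidate word-boundary characters that match the query in order.

    Hypothetical helper: the boundary rules below are an assumption,
    not taken verbatim from Kakoune's source.
    """
    count = 0
    qpos = 0   # first query index that is still unmatched
    prev = ""  # previous candidate character ("" at the start)
    for ch in candidate:
        is_boundary = (
            prev == ""
            or (not prev.isalnum() and ch.isalnum())
            or (prev.islower() and ch.isupper())
        )
        prev = ch
        if not is_boundary:
            continue
        # Look for this boundary character in the rest of the query.
        # Lowercase query characters match case-insensitively.
        for i in range(qpos, len(query)):
            q = query[i]
            target = ch.lower() if q.islower() else ch
            if q == target:
                count += 1
                qpos = i + 1  # everything before i is now consumed
                break
        if qpos == len(query):
            break
    return count


query = "commonstartup.rb"
print(count_word_boundary_hits("flow/common/lib/startup/CStartFlowingParser.rb", query))  # 4
print(count_word_boundary_hits("lib/ruby/cosim/helpers/common_startup.rb", query))  # 2
```

Under these rules the first candidate scores on the c of common, the s of startup, the P of Parser and the r of rb. The second only scores on the r of ruby and the r of rb, because matching the r of ruby consumes the query up to the r in startup and leaves nothing for common or startup to match.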

Yes, I'm using -shell-script-candidates for completion. Thanks for the explanation of how it works currently, I see now why it would favor the path where the words appear after word boundaries.

I wish that the fuzzy matching placed more weight on the number of splits in the needle, and the size of the splits, rather than just focusing on word boundaries. Matching flow/common/lib/startup/CStartFlowArgsParser.rb from commonstartup.rb requires splitting the needle into three terms: common startup .rb, rather than only two to match common_startup.rb: common startup.rb. The number of intervening characters that must be added between the splits is lower for common_startup.rb as well, just a single underscore vs /lib/ and /CStartFlowArgsParser.
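
As a rough illustration of that idea, the Python sketch below scores a candidate by the smallest number of contiguous chunks the needle has to be split into, and breaks ties by the number of intervening characters between chunks. The function name split_cost, the case-sensitive comparison and the exact cost definition are assumptions for this example; this is not Kakoune's algorithm, only one possible reading of the proposal.

```python
from typing import Optional, Tuple


def split_cost(needle: str, haystack: str) -> Optional[Tuple[int, int]]:
    """Return (chunks, gap) for the cheapest in-order match, or None.

    chunks: number of contiguous pieces the needle is split into.
    gap:    haystack characters skipped between consecutive pieces.
    Hypothetical scoring, compared lexicographically (chunks first).
    """
    if not needle:
        return (0, 0)
    m = len(haystack)
    # best[j]: cheapest cost with the current needle char placed at haystack[j]
    best = [(1, 0) if h == needle[0] else None for h in haystack]
    for ch in needle[1:]:
        nxt = [None] * m
        for j in range(1, m):
            if haystack[j] != ch:
                continue
            options = []
            if best[j - 1] is not None:
                options.append(best[j - 1])  # extend the current chunk
            for k in range(j - 1):  # start a new chunk after a gap
                if best[k] is not None:
                    chunks, gap = best[k]
                    options.append((chunks + 1, gap + (j - k - 1)))
            if options:
                nxt[j] = min(options)
        best = nxt
    reachable = [cost for cost in best if cost is not None]
    return min(reachable) if reachable else None


needle = "commonstartup.rb"
print(split_cost(needle, "lib/ruby/cosim/helpers/common_startup.rb"))  # (2, 1)
print(split_cost(needle, "flow/common/lib/startup/CStartFlowArgsParser.rb"))  # (3, 26)
```

Sorting candidates by this tuple in ascending order would list common_startup.rb first: two chunks with one skipped character, against three chunks with 26 skipped characters (5 for /lib/ and 21 for /CStartFlowArgsParser).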

Is the fuzzy search algorithm pretty set, or would a pull request with different behavior be welcome?

The fuzzy matching behaviour is not set at all, always happy to improve it, I agree the ordering seems wrong here.

This case should be added to test_ranked_match in ranked_match.cc, and then the ordering logic should be changed so that the tests pass.

Hi! Just to second @greneholt, I would really like to see a different matching algorithm for the fuzzy finder. I usually find that with fzf I need fewer keystrokes to reach an item than with Kakoune on the same candidate list.

Also, I find Kakoune's algorithm less stable, in the sense that many times I reach a situation where the item I want is, say, in the 4th position of the list and, to avoid pressing tab 4 times, I just enter another letter to filter the list a bit further and, suddenly, the desired item goes down in the list, say, to the 7th position. It's counterintuitive.

For that reason, I can understand why many people prefer to use an external tool like fzf for things they could easily do from within Kakoune using its built-in fuzzy finder, like switching buffers or opening files.
