Rust: On identifier not found, detect swapped words

Created on 2 Dec 2019 · 6 Comments · Source: rust-lang/rust

On identifiers that have more than one word, it is relatively common to write them in the wrong order (foo_bar_baz → foo_baz_bar). These are normally not found by Levenshtein distance checks, but we could do a basic "split on _, sort and join before comparison" so that we could suggest the right identifier.
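A minimal sketch of the "split on _, sort and join before comparison" idea, for illustration only. The function names `sort_by_words` and `find_swapped_words` are hypothetical and are not taken from the compiler source; a real implementation would work on the compiler's own symbol types rather than plain string slices.

```rust
/// Splits an identifier on `_`, sorts the words, and joins them back with `_`.
/// Two identifiers made of the same words in a different order normalize
/// to the same string.
fn sort_by_words(name: &str) -> String {
    let mut words: Vec<&str> = name.split('_').collect();
    words.sort_unstable();
    words.join("_")
}

/// Hypothetical helper: returns the first candidate whose words are the same
/// as those of `lookup`, regardless of their order.
fn find_swapped_words<'a>(lookup: &str, candidates: &[&'a str]) -> Option<&'a str> {
    let sorted_lookup = sort_by_words(lookup);
    candidates
        .iter()
        .copied()
        .find(|candidate| sort_by_words(candidate) == sorted_lookup)
}
```

With this sketch, `find_swapped_words("foo_baz_bar", &["foo_bar_baz"])` yields `Some("foo_bar_baz")`, because both names normalize to `bar_baz_foo`.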


This issue has been assigned to @cjkenn via this comment.


A-diagnostics A-lint A-suggestion-diagnostics C-enhancement D-papercut E-easy P-low T-compiler

All 6 comments

I'd like to try to work on this.

As foo_bar_baz is short enough to be caught by the Levenshtein distance check, I wrote this example for testing:

fn main() {
    let long_variable_name = true;
    println!("{}", long_name_variable);
}

@rustbot claim

@estebank I'm thinking about the priority of the different possible matches. I tend to think that a match found by Levenshtein distance should have a higher priority, but what do you think?

@LeSeulArtichaut it's tricky. I feel that your instinct might be right, but we'd only know by looking at real-world usage. The nice thing is that the lev distance check is limited by its very nature, so doing the swapped-words check as a fallback should work out OK in practice.
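The ordering discussed here (Levenshtein match first, swapped-words check as a fallback) could be sketched roughly as follows. This is a standalone illustration, not the code in lev_distance.rs: `suggest` and `sort_by_words` are hypothetical names, candidates are plain string slices, and the cutoff of one third of the lookup length is an assumption made for the example.

```rust
/// Levenshtein distance over chars, using the usual two-row dynamic program.
fn lev_distance(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b_chars.iter().enumerate() {
            let cost = if ca == *cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b_chars.len()]
}

/// Splits on `_`, sorts the words, and joins them back (hypothetical helper).
fn sort_by_words(name: &str) -> String {
    let mut words: Vec<&str> = name.split('_').collect();
    words.sort_unstable();
    words.join("_")
}

/// Hypothetical suggestion routine: prefer the closest Levenshtein match,
/// and only fall back to the swapped-words comparison if none is close enough.
fn suggest<'a>(lookup: &str, candidates: &[&'a str]) -> Option<&'a str> {
    // Assumed cutoff for this sketch: one third of the lookup length.
    let max_dist = lookup.chars().count() / 3;
    let by_distance = candidates
        .iter()
        .copied()
        .map(|c| (lev_distance(lookup, c), c))
        .filter(|&(d, _)| d <= max_dist)
        .min_by_key(|&(d, _)| d)
        .map(|(_, c)| c);

    by_distance.or_else(|| {
        let sorted_lookup = sort_by_words(lookup);
        candidates
            .iter()
            .copied()
            .find(|c| sort_by_words(c) == sorted_lookup)
    })
}
```

Under these assumptions, `long_name_variable` is too far from `long_variable_name` for the distance check, so it is only found by the fallback, while a near miss such as `foo_bar_bat` for `foo_bar_baz` is still picked up by the distance check first.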

Releasing my assignment, as I am not sure I will be able to commit to this.
In case someone wants to take this on, here is the function that should be modified.
https://github.com/rust-lang/rust/blob/ae1b871cca56613b1af1a5121dd24ac810ff4b89/src/libsyntax/util/lev_distance.rs#L46-L82
@rustbot release-assignment

(I think this is appropriate, please tell me if it isn't)
@rustbot modify labels to +E-Easy

I can take a stab at this; it seems reasonable given the previous comments.

@rustbot claim
