Hopefully I haven't missed anything here, but it would be really nice if there was an option to add identifier characters based on filetype. The lisp family of languages, for example, allows a very broad set of characters to be used in identifiers. *param*, foo#bar, foo-bar, and foo! are all valid identifiers.
Although I'm more than happy to hack in support myself at this point, it would be so nice if this could be specified from the vimrc, say with something like:
au BufNewFile,BufRead *.clj
\ let g:ycm_extended_completion_chars = ["*", "-"]
or
au BufNewFile,BufRead *.clj
\ call YcmExtendCompletionChars(["*", "-"])
I'm not exactly a Vim wizard, so forgive me if the above proof of concept is syntactically incorrect.
This would not be a trivial change, but I've known all along I'll have to make it at some point.
This feature will probably be surfaced as allowing the user to set a per-filetype identifier regex. Currently it's hard-coded in the C++ code: https://github.com/Valloric/YouCompleteMe/blob/master/cpp/ycm/IdentifierUtils.cpp#L37
You could go change this regex yourself temporarily and recompile ycm_core until this feature is implemented. The regex syntax is Boost.Regex.
:+1:
This could be a very helpful addition.
Is there any progress on the issue? This would be very helpful for people who write a lot of HTML and CSS.
At the moment Valloric is busy changing the architecture of the entire YCM plugin so I think everything that is not the client/server architecture will come after.
For now, this Gist will patch YouCompleteMe to support -'s and _'s as identifiers.
What is the current status of this? From looking at ycmd, it looks like it would just be basically modifying IsIdentifierChar (plus support for the new option) correct? If so I may try to take a stab at it.
It seems like it would be most useful if YCM just obeyed Vim's iskeyword setting by default instead of requiring an additional setting.
Yes, add me to the list of CSS/Sass folk that would love to see words include dashes. Not only for things like background-color, but I also use hyphens in variable names, e.g. $main-color. At present, these aren't getting picked up, and whilst YCM is immense it would be fantastic if this was also possible. But great work – this, alongside ultisnips, is making my Vimming far more productive :+1:
Just wanted to give a heads-up to people that I'm currently implementing this. This requires lots of changes throughout YCM so your patience is appreciated.
You won't be able to customize the identifier chars directly, but YCM will internally use a custom regex for each language with a fallback to [_a-zA-Z]\w*.
Given that, I'd appreciate it if people could provide identifier regexes for various languages. Please include a link to a spec of some sort (if possible; if not, whatever you think describes it well) and various example identifiers for test cases. Note that the regex doesn't have to perfectly reflect the language spec, it just needs to be good enough ~99.9% of the time. Here's the relevant part of the code in my custom-ident branch: https://github.com/Valloric/ycmd/blob/custom-ident/ycmd/identifier_utils.py#L49. Tests are here: https://github.com/Valloric/ycmd/blob/custom-ident/ycmd/tests/identifier_utils_test.py.
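To make the "custom regex per language with a fallback" idea concrete, here is a minimal sketch. It is not the code from the custom-ident branch: the table name, the helper functions, and the Lisp-style pattern for `clojure` are all hypothetical and only illustrate the lookup. The one piece taken from the thread is the fallback `[_a-zA-Z]\w*`.

```python
import re

# Fallback pattern quoted in the comment above.
DEFAULT_IDENTIFIER_REGEX = re.compile(r"[_a-zA-Z]\w*")

# Hypothetical per-filetype table. The "clojure" pattern is an illustrative
# guess at Lisp-style identifiers (*param*, foo#bar, foo-bar, foo!), not a
# pattern taken from any spec or from ycmd.
FILETYPE_TO_IDENTIFIER_REGEX = {
    "clojure": re.compile(r"[a-zA-Z_*!?+-][\w*!?#+-]*"),
}


def identifier_regex_for(filetype):
    """Return the regex for a filetype, or the fallback if none is registered."""
    return FILETYPE_TO_IDENTIFIER_REGEX.get(filetype, DEFAULT_IDENTIFIER_REGEX)


def extract_identifiers(text, filetype=None):
    """Return every identifier-like token found in text for the given filetype."""
    return identifier_regex_for(filetype).findall(text)


if __name__ == "__main__":
    print(extract_identifiers("(def *param* (foo-bar foo!))", "clojure"))
    print(extract_identifiers("foo-bar baz_1", "some-unknown-filetype"))
```

With the fallback, `foo-bar` is split into `foo` and `bar`, which is exactly the behaviour people in this thread are asking to change for languages where the hyphen is an identifier character.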
For R it is here:
http://cran.r-project.org/doc/manuals/r-release/R-lang.pdf
Section 10.3.2 has the spec:
Identifiers consist of a sequence of letters, digits, the period (‘.’) and the underscore. They must not start with a digit or an underscore, or with a period followed by a digit.
The definition of a letter depends on the current locale: the precise set of characters allowed is given by the C expression (isalnum(c) || c == '.' || c == '_') and will include accented letters in many Western European locales.
I don't know how to do isalnum correctly to handle all the cases. If just going with the standard characters, it would be
r = re.compile(r"\.[._a-zA-Z][._a-zA-Z0-9]*|[a-zA-Z][._a-zA-Z0-9]*", re.UNICODE)
Some test cases:
Good:
a
a.b
a.b.c
a_b
a1
a_1
.a
.a_b
.a1
Bad:
.1a
1a
Let me know if more is needed
CSS 3: http://www.w3.org/TR/css-syntax-3/
CSS 2.1 (superseded by 3, linked above): http://www.w3.org/TR/CSS21/grammar.html#scanner
This might get you most of the way there: http://stackoverflow.com/a/2812097/18986
See also: http://stackoverflow.com/questions/448981/what-characters-are-valid-in-css-class-selectors
I've pushed out the code. I haven't yet added the regex for R (thanks @caneff). @lencioni I'll take a look at your links too, thanks.
@caneff Added support for R in https://github.com/Valloric/ycmd/commit/d758333b14f82bc914d38c434c3a94ee75aaa552 (hasn't yet been pulled into YCM though). I went with a slightly different regex: (?!(?:\.\d|\d|_))[\._\w\d]+. Thanks for a link to the spec.
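As a quick sanity check, the regex quoted above can be run against the good and bad examples listed earlier in the thread. This is only a sketch: the helper name `is_r_identifier` is made up for illustration, and using `fullmatch` to validate a whole string is an assumption about how to test the pattern, not necessarily how ycmd applies it.

```python
import re

# The R pattern quoted in the comment above, copied verbatim.
R_IDENTIFIER_REGEX = re.compile(r"(?!(?:\.\d|\d|_))[\._\w\d]+")

# Example identifiers listed earlier in the thread.
GOOD = ["a", "a.b", "a.b.c", "a_b", "a1", "a_1", ".a", ".a_b", ".a1"]
BAD = [".1a", "1a"]


def is_r_identifier(candidate):
    """Hypothetical helper: True if the whole string matches the R pattern."""
    return R_IDENTIFIER_REGEX.fullmatch(candidate) is not None


if __name__ == "__main__":
    for name in GOOD:
        print(name, is_r_identifier(name))
    for name in BAD:
        print(name, is_r_identifier(name))
```

Note that the negative lookahead only rejects a match starting at a given position. When scanning free text with `search` or `findall`, an input such as `.1a` still yields the trailing `a` as a match, which fits the "good enough most of the time" goal stated earlier.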
@lencioni Thanks for the links, I updated my initial regex with info you provided. I mostly went with what's listed at: http://stackoverflow.com/questions/448981/what-characters-are-valid-in-css-class-selectors
If we need something better, we can change it later.
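For readers wondering what a hyphen-aware pattern changes in practice, the sketch below compares the fallback regex with a hypothetical CSS-style one. The pattern `-?[_a-zA-Z][\w-]*` is an assumption loosely based on the Stack Overflow answers linked above; it is not claimed to be the regex that was actually committed.

```python
import re

# Fallback pattern mentioned earlier in the thread.
DEFAULT_IDENTIFIER_REGEX = re.compile(r"[_a-zA-Z]\w*")

# Hypothetical hyphen-aware pattern: optional leading hyphen, then a letter
# or underscore, then word characters or hyphens. Illustrative only.
CSS_LIKE_IDENTIFIER_REGEX = re.compile(r"-?[_a-zA-Z][\w-]*")

SAMPLE = "background-color: $main-color;"

# With the fallback, hyphenated names are split at each hyphen.
default_tokens = DEFAULT_IDENTIFIER_REGEX.findall(SAMPLE)

# With the hyphen-aware pattern, each name stays a single token.
css_tokens = CSS_LIKE_IDENTIFIER_REGEX.findall(SAMPLE)

if __name__ == "__main__":
    print(default_tokens)
    print(css_tokens)
```

The fallback produces `background`, `color`, `main`, `color` as separate candidates, while the hyphen-aware pattern keeps `background-color` and `main-color` whole, which is what the CSS/Sass requests in this thread are after.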
I think that's reasonable.
I would like to request adding the hyphen (some people have been calling this the dash, that is not technically correct) for _Javascript_, because even though it is invalid for javascript variable names themselves, a lot of the time (Browser oriented) javascript performs a lot of string manipulation related to CSS and HTML variables, which do have a convention that uses hyphens. This will just make YCM surface more results for completion.
In fact, what would be even more optimal is if for javascript, it will use hyphen _only when the current syntax is a string or comment_. But it's understandable if this is too difficult to implement....
I did not find it in the docs, but is there an easy way to inject an identifier regex for my own esoteric filetype? I basically just want the identifier characters to include the : character for all *.foo files. Do I need to patch ycmd/identifier_utils.py for this, or is there a setting?
@SirVer There's no setting to my knowledge. I think you have to patch it.
IIRC (I'll see if I can check later), there are multiple places you need to change: I seem to recall there is one in the Vim client as well as one in the server.
@Valloric why was this issue closed if there's no setting for this in vim? or is this issue only about ycmd and you prefer to open a new issue for vim?
Someone already mentioned obeying Vim's iskeyword setting would be the way to go with this.
I think it is worth mentioning again.
@Valloric can you please reopen this issue?
Thanks!
No. If there is a new request then we need a new issue. This was closed 5 years ago.