Vscode: Can I get scope / scopeRange at a position?

Created on 24 Nov 2015 · 23 Comments · Source: microsoft/vscode

_From @billti on November 1, 2015 6:10_

The API call document.getWordRangeAtPosition(position) appears to use its own definition of a word. For example, my tmLanguage defines attrib-name as a token/scope, yet getWordRangeAtPosition appears to break this into 2 words on the - character.

How can I get token ranges at a position based on my custom syntax? (And it would be really useful if I could get the scope name that goes along with it too).

_Copied from original issue: Microsoft/vscode-extensionbuilders#76_

api feature-request tokenization

seanmcbreen

👍48

All 23 comments

_From @vilic on November 1, 2015 15:19_

:+1:

seanmcbreen on 24 Nov 2015

_From @egamma on November 2, 2015 8:16_

Exposing the scope names in the API is on the backlog, but will not make it into the November update.

seanmcbreen on 24 Nov 2015

_From @jrieken on November 2, 2015 18:16_

@billti despite the lack of access to scopes, you can define a custom word definition such that it will be picked up by document.getWordRangeAtPosition. You can register an ITokenTypeClassificationSupport, which can contribute a regex to classify words.
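
A minimal sketch of what a custom word pattern changes, written as a standalone function so it runs outside the editor. `wordRangeAt` is a hypothetical helper, not part of the VS Code API; it only mimics finding the regex match that covers a column. Both patterns are assumptions chosen to show the `attrib-name` case from the question.

```typescript
// Hypothetical helper: finds the regex match that covers `column` on one line.
interface WordRange { start: number; end: number; text: string; }

function wordRangeAt(line: string, column: number, pattern: RegExp): WordRange | undefined {
    // Force the global flag so exec() walks every match on the line.
    const flags = pattern.flags.indexOf('g') >= 0 ? pattern.flags : pattern.flags + 'g';
    const re = new RegExp(pattern.source, flags);
    let m: RegExpExecArray | null;
    while ((m = re.exec(line)) !== null) {
        if (m[0].length === 0) { re.lastIndex++; continue; } // skip empty matches
        const start = m.index;
        const end = start + m[0].length;
        if (column >= start && column <= end) {
            return { start, end, text: m[0] };
        }
    }
    return undefined;
}

// Assumed patterns: the first splits on '-', the second keeps '-' inside a word.
const splitsOnHyphen = /[A-Za-z_][A-Za-z0-9_]*/;
const keepsHyphen = /[A-Za-z_][-A-Za-z0-9_]*/;

const line = '<div attrib-name="x">';
const a = wordRangeAt(line, 8, splitsOnHyphen); // a.text is 'attrib'
const b = wordRangeAt(line, 8, keepsHyphen);    // b.text is 'attrib-name'
```

In an extension, a regex like `keepsHyphen` could be passed as the second argument to document.getWordRangeAtPosition(position, regex) instead of re-implementing the lookup.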

seanmcbreen on 24 Nov 2015

_From @billti on November 2, 2015 19:3_

Thanks @jrieken, I spotted that, and it may be a useful interim solution. But generally, for now, if I want to know the classification accurately for a position in a CFG, it seems I'll need to call document.getText() and run my own parser over it - is that right?

seanmcbreen on 24 Nov 2015

_From @jrieken on November 3, 2015 9:59_

unfortunately yes

seanmcbreen on 24 Nov 2015

@egamma on November 2, 2015 8:16
Exposing the scope names in the API is on the backlog, but will not make it into the November update.

Is there any update on if/when we can expect a way to get the scopes at a position or offset?

hoovercj on 14 May 2016

👍3

@hoovercj all I can currently say is that this is still on the backlog, sorry.

egamma on 16 May 2016

@egamma Any progress on this? Is there any way I can contribute? :)

TimonVS on 13 Oct 2016

Would it be trivial to provide a command that returns a url to the TextMate grammar file being used for a particular document/languageId (or return the contents of the file to keep them read-only)? Then we could use vscode-textmate ourselves to get the token info at a particular location.

siegebell on 14 Oct 2016

@siegebell -- As a short-term solution, I have successfully included a TextMate grammar with my extension, referenced that, and referenced the built-in vscode-textmate package to get token scopes in an extension.

It's a pain and it really should be part of the API, but it's definitely possible to do today.

I was given the advice to use: var tm = require(path.join(require.main.filename, '../../node_modules/vscode-textmate/release/main.js')); to access vscode-textmate, but since I have a language server I had to evaluate require.main.filename in the language client and pass it over with the initializationOptions to get the right value in my server.

hoovercj on 14 Oct 2016

@TimonVS exposing the scopes in the API requires that we revisit the internal representation of scopes. This requires major re-architecting, which makes it challenging to open up for contributions.

egamma on 14 Oct 2016

In the meantime, I've published an extension, scope-info, that provides an API to query the scope at a particular position. It works by querying the installed extensions for language definitions and grammars, and then maintains a parse-state for each open document using vscode-textmate. Only one instance will exist per vscode instance, regardless of how many other extensions depend on it.

Example usage:

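import * as vscode from 'vscode';
import * as api from 'scope-info';

async function example(doc: vscode.TextDocument, pos: vscode.Position) {
    const siExt = vscode.extensions.getExtension<api.ScopeInfoAPI>('siegebell.scope-info');
    if (!siExt) { return; } // scope-info is not installed
    // activate() resolves to the API object exported by scope-info
    const si = await siExt.activate();
    const t1: api.Token = si.getScopeAt(doc, pos);
}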

Notes:

For typings, refer to scope-info.d.ts.
You can also query the vscode-textmate-IGrammar and scope name of a language.
Your extension should list 'siegebell.scope-info' as an extensionDependency.
If multiple extensions contribute to the same language, scope-info may pick the wrong one.
Scope-info might return a scope corresponding to a slightly newer or older document version than what your extension thinks is current.
Pull requests are welcome.

siegebell on 14 Dec 2016

👍5 🎉3 ❤1

exposing the scopes in the API requires that we revisit the internal representation of scopes. This requires major re-architecting, which makes it challenging to open up for contributions

@alexandrudima I believe the above was done as part of #18317

@aeschli Will #18068 cover the feature asked for in this issue, or are we suggesting that extension authors use https://marketplace.visualstudio.com/items?itemName=siegebell.scope-info?

ramya-rao-a on 16 Jan 2017

Alex added a developer tool that lets you see the tokens at a location. See https://github.com/Microsoft/vscode/pull/17933#issuecomment-271515251

There's still no extension API that returns TextMate scopes. There are several reasons for that; one of them is that we don't want extensions to start depending on a particular tokenizer grammar.

aeschli on 19 Jan 2017

I think it is enough to get the color at a position, then associate it with an applied style: string, number, keyword...

APerricone on 13 Jan 2018

This would also be very useful for me. I'm writing an extension for a custom ebnf syntax. The textmate grammar has all the information needed to provide linting, even for renaming symbols and basic syntax validation. (For this just filter the tokens by not having any scope attached -> unexpected token & syntax error)
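
The linting idea above can be sketched roughly as follows. The token shape (`startIndex`, `endIndex`, `scopes`) follows what vscode-textmate's tokenizeLine returns; `findUnscopedTokens`, the scope names, and the sample tokens are hand-written assumptions for illustration, not output from a real grammar. The sketch also assumes that "no scope attached" means a token carries only the grammar's root scope.

```typescript
interface LintToken { startIndex: number; endIndex: number; scopes: string[]; }

// Returns non-whitespace tokens that carry nothing beyond the root scope.
function findUnscopedTokens(text: string, tokens: LintToken[]): LintToken[] {
    return tokens.filter(t => {
        const piece = text.substring(t.startIndex, t.endIndex);
        return t.scopes.length <= 1 && piece.trim().length > 0;
    });
}

// Hand-written sample for the line below; a real grammar would produce these.
const sampleLine = "rule = 'a' ??";
const sampleTokens: LintToken[] = [
    { startIndex: 0, endIndex: 4, scopes: ['source.ebnf', 'entity.name.rule'] },
    { startIndex: 4, endIndex: 5, scopes: ['source.ebnf'] },
    { startIndex: 5, endIndex: 6, scopes: ['source.ebnf', 'keyword.operator'] },
    { startIndex: 6, endIndex: 7, scopes: ['source.ebnf'] },
    { startIndex: 7, endIndex: 10, scopes: ['source.ebnf', 'string.quoted.single'] },
    { startIndex: 10, endIndex: 11, scopes: ['source.ebnf'] },
    { startIndex: 11, endIndex: 13, scopes: ['source.ebnf'] },
];

// Only the trailing "??" (columns 11 to 13) is flagged as unexpected.
const unexpected = findUnscopedTokens(sampleLine, sampleTokens);
```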

I currently load the 'vscode-textmate' module that comes with vscode using some dirty workaround and use that to reparse open files. It's a lot of wasted CPU time and I can't easily do incremental changes. (I assume vscode already does this internally to speed up syntax highlighting)

Here are a few functions I could use:

Get token at position
Get a list of tokens for the entire file, or in a Range
Get token text, scopes & Range
Get a list of tokens filtered by scope (this can be achieved by using the above two, but could be optimized separately)
Open additional files and get them tokenized in the background (for #include directive)
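
As a rough sketch of the first few items in this list, the helpers below work on one line's tokens. `tokenAt`, `tokensInRange`, and `tokensWithScope` are hypothetical names, not an existing VS Code API; the token shape matches vscode-textmate's tokens, and the sample data is made up.

```typescript
interface ScopedToken { startIndex: number; endIndex: number; scopes: string[]; }

// Token covering `column`, or undefined when no token covers it.
function tokenAt(tokens: ScopedToken[], column: number): ScopedToken | undefined {
    for (const t of tokens) {
        if (column >= t.startIndex && column < t.endIndex) { return t; }
    }
    return undefined;
}

// Tokens overlapping the half-open column range [start, end).
function tokensInRange(tokens: ScopedToken[], start: number, end: number): ScopedToken[] {
    return tokens.filter(t => t.startIndex < end && t.endIndex > start);
}

// Tokens with a scope equal to `prefix` or nested under it (e.g. 'comment').
function tokensWithScope(tokens: ScopedToken[], prefix: string): ScopedToken[] {
    return tokens.filter(t =>
        t.scopes.some(s => s === prefix || s.indexOf(prefix + '.') === 0));
}

// Made-up tokens for the line: x = "hi" // note
const lineTokens: ScopedToken[] = [
    { startIndex: 0, endIndex: 1, scopes: ['source.demo', 'variable.other'] },
    { startIndex: 1, endIndex: 2, scopes: ['source.demo'] },
    { startIndex: 2, endIndex: 3, scopes: ['source.demo', 'keyword.operator.assignment'] },
    { startIndex: 3, endIndex: 4, scopes: ['source.demo'] },
    { startIndex: 4, endIndex: 8, scopes: ['source.demo', 'string.quoted.double'] },
    { startIndex: 8, endIndex: 9, scopes: ['source.demo'] },
    { startIndex: 9, endIndex: 16, scopes: ['source.demo', 'comment.line.double-slash'] },
];

const atColumn5 = tokenAt(lineTokens, 5);               // the string token
const comments = tokensWithScope(lineTokens, 'comment'); // one token, starting at 9
const overlapping = tokensInRange(lineTokens, 2, 6);     // three tokens
```

The same prefix check also answers whether a position is inside a comment or a string, without a second parser.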

Here's my extension for some reference on how this information can be used:
https://github.com/Victorious3/vscode-TatSu/blob/635d3c1351b55048feac44f09203a95f1fc0c275/server/src/parse.ts

Victorious3 on 26 Sep 2018

👍5

I don't understand why, in my language extension, I need to re-parse the whole file to know whether a character is inside a comment or a string.
Other ideas:

grammar correction only for string and comments
separate editor for escaped strings where \r and \n are converted (like language injection in IntelliJ)
regex visualizer for regex token
etc etc

APerricone on 9 Jan 2019

👍4

aeschli commented on Jan 19, 2017

There's still no extension API that returns TextMate scopes. There are several reasons for that; one of them is that we don't want extensions to start depending on a particular tokenizer grammar.

That's actually very sad. The grammar has already done most of the work needed for making outlines, and now we have to start all over, type all the same regexes into a TypeScript module, and repeat the work to get the same data?

Extensions covering languages that don't have servers usually bring their own grammar files too, so why can't they rely on the same grammar file for both needs?

Actually I think VS Code should build the outline from the grammar scopes (for languages that don't already have a symbol provider), as it would increase the number of languages that would benefit from the outline feature. The textmate grammar system is severely underutilized.

This is in addition to the common language extensions needs, such as knowing comments and strings.

msftrncs on 23 Jul 2019

👍6

Write like this?

document.getText(document.getWordRangeAtPosition(position, /[a-zA-Z_][\-a-zA-Z0-9_]*/));

nagq on 10 Sep 2019

I just ran into this issue when trying to extend auto-correct to behave in a smart way depending on the current cursor environment. Hence, I would love to see this functionality as well!

jchtt on 22 Aug 2020

👍1

Are there any possible workarounds to get this working with the extension test host? I'd love to be able to write an end-to-end test to validate semantic highlighting is working, but couldn't find a way.

anthony-c-martin on 26 Oct 2020

I'd love to be able to write an end-to-end test to validate semantic highlighting is working, but couldn't find a way.

An example I've seen is here:
https://github.com/styled-components/vscode-styled-components/blob/master/src/tests/suite/colorization.test.js

The idea is that it has a fixture file, then calls captureSyntaxTokens and validates the result against a predefined results file. I'm not sure if there are more efficient ways, but it works as an end-to-end test for syntax highlighting. I don't know if this changes for semantic highlighting.
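
The comparison step of such a fixture test might look roughly like this. `diffTokens` and the `{ text, scopes }` token shape are assumptions made for this sketch; they are not the format used by the linked test or by captureSyntaxTokens.

```typescript
interface CapturedToken { text: string; scopes: string; }

// Returns human-readable mismatches between captured and expected tokens.
function diffTokens(actual: CapturedToken[], expected: CapturedToken[]): string[] {
    const problems: string[] = [];
    if (actual.length !== expected.length) {
        problems.push(`token count: expected ${expected.length}, got ${actual.length}`);
    }
    const n = Math.min(actual.length, expected.length);
    for (let i = 0; i < n; i++) {
        if (actual[i].text !== expected[i].text || actual[i].scopes !== expected[i].scopes) {
            problems.push(`token ${i}: expected "${expected[i].text}" [${expected[i].scopes}], ` +
                `got "${actual[i].text}" [${actual[i].scopes}]`);
        }
    }
    return problems;
}

// Made-up fixture: the expected file says `color` is a property name.
const expectedTokens: CapturedToken[] = [
    { text: 'color', scopes: 'source.css support.type.property-name' },
    { text: ':', scopes: 'source.css punctuation.separator' },
];
// Made-up capture in which the grammar failed to scope `color`.
const capturedTokens: CapturedToken[] = [
    { text: 'color', scopes: 'source.css' },
    { text: ':', scopes: 'source.css punctuation.separator' },
];

const mismatches = diffTokens(capturedTokens, expectedTokens); // one mismatch, at token 0
```

A test would then fail whenever `mismatches` is non-empty and print the list.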

jasonwilliams on 26 Oct 2020

An example I've seen is here:
https://github.com/styled-components/vscode-styled-components/blob/master/src/tests/suite/colorization.test.js

Thanks for the suggestion! It looks like this works well for testing TextMate grammars, but unfortunately not for semantic tokenization, as it invokes the grammar directly.

anthony-c-martin on 26 Oct 2020
