Currently, there appears to be a discrepancy with scope names for keywords that indicate a function or class body (or similar), where some get a storage.type scope name and some a keyword scope.
Quoting myself from #550:
In Python, def and class _are_ keywords as well, but they not only indicate a different type but also have different effects on the following block (and what goes into the parens following the identifier). To me, this is a pretty sure indicator that scoping both def and class as storage.type is wrong since they actually have keyword-like effects on code. storage.type should most likely only be used for statically typed languages like C and Java.
The exception to this would be the async "keyword", which certainly should get a storage scope, although storage.modifier. Furthermore, the global and nonlocal keywords are semantically tied to referring to storage, so it makes sense for those to get a storage scope too. storage.scope or similar?
In JavaScript we have var, let and const. I would most likely mark those as storage.type and give class and function a keyword scope for the same reason as for Python.
Carrying this over from #550, which is where this was raised initially. I highly suggest reading that issue in full.
My current implementations and the official Sublime Text docs are currently based on the TextMate docs, which explicitly state that keywords such as class and function should be storage.type.
It does seem in popular usage that scoping has split where some use storage.type and others use keyword. In one sense, keyword is sort of broad, and can be argued for almost anything. Additionally, storage.type does not strictly mean a "type" as in an integer. Instead, I think about it more as "classification", or something like that. Considering storage.modifier, most of those are "keywords" also, but there is value in distinguishing them.
I think one of the major downsides of changing storage.type to keyword for class, function, trait, enum, impl, etc is the color shift among users. This will affect a broad number of users, so I don't want to take the decision too lightly.
I think it's definitely odd the way it's done currently. The biggest confusion for me is that things like def, class and such almost always color identically to primitives like int, void and such. This identical coloring stems from the fact that the major _and_ minor scopes are identical (storage.type), with the only distinction coming in the third scoping (storage.type.primitive vs storage.type.class or storage.type.function). That's super-weird and it definitely doesn't align with how non-C languages would group these keywords. It's made even worse by the fact that most color schemes just highlight storage and ignore all of the subscoping, so even moving to storage.modifier (which I agree would be an improvement from the current situation) wouldn't address the issue in practice.
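To make the coloring problem concrete, here is a minimal Python sketch of prefix-based selector matching. It assumes a simplified model in which a selector matches a scope when it equals the scope or is a dot-delimited prefix of it; the names selector_matches, color_for and the sample scheme are hypothetical, and real color scheme engines also deal with scope stacks, operators and scoring.

```python
def selector_matches(selector: str, scope: str) -> bool:
    """Return True if `selector` equals `scope` or is a dot-delimited prefix of it."""
    return scope == selector or scope.startswith(selector + ".")


# Hypothetical scheme that, like many real ones, only targets top-level scopes.
scheme = {"storage": "blue", "keyword": "red"}


def color_for(scope: str) -> str:
    """Pick the color of the longest matching selector, or 'default' if none match."""
    matches = [sel for sel in scheme if selector_matches(sel, scope)]
    return scheme[max(matches, key=len)] if matches else "default"


for scope in (
    "storage.type.function.python",  # def
    "storage.type.primitive.c",      # int
    "storage.modifier.c",            # static
    "keyword.declaration.function",  # proposed scope for def
):
    print(f"{scope:30} -> {color_for(scope)}")
```

Under this model, def, int and even a storage.modifier keyword all resolve to the same color, because the scheme never looks past storage. Only a keyword-derived scope separates def from int without requiring the scheme itself to change.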
I would be in favor of def, class and so on being scoped as keyword.declaration, and then we make it clear that keyword.declaration is meant for these sorts of things. Right now, the keyword.declaration scope is sort of under-utilized and misappropriated in cases where it _is_ utilized (e.g. Scala's case keyword is, oddly, scoped as keyword.declaration.scala). Things such as void, int and so on should remain as storage.type.primitive. Similarly, var in C# (and similar constructs) should probably remain as storage.type, since it is semantically replacing a type with a "special magic type" that automatically infers. var/val in Scala (and similar languages) would be keyword.declaration, since they're not syntactically replacing types but rather just representing declaration syntax.
I think that a keyword-derived scoping would match user intuition that these are very primitive keywords. An even better example here might be def in Clojure, which makes _absolutely_ no sense scoped as storage.type, but makes all the sense in the world as some sort of keyword. The problem of course is exactly what @wbond pointed out: nearly everything _could_ be a keyword. And as with storage.type, most color schemes don't provide special highlighting for keyword.declaration, even though they probably should.
I'm obviously in favor of making the change, but it's definitely a change which will have immediate and obvious impacts on a large set of users, with the primary benefits to the change being delayed by the lead time on color scheme adjustments.
So we had a discussion about this issue today with @wbond, @FichteFoll and @mitranim
Here is what's happening for major languages.
C++:
ReturnType method() {}
// ^^^^^^^^^^ no scope
Python:
def name() -> ReturnType
# ^^^ storage.type.function.python
# ^^^^^^^^^^ only meta scopes
Java, C#:
ReturnType method() {}
// ^^^^^^^^^^ support.class.java
Go:
func name() ReturnType
// ^^^^ storage.type.keyword.function.go
// ^^^^^^^^^^ storage.type.go
The C++ and Python approaches are very lazy, and I think we can do better.
The problem with the Java approach is that there is then no longer any difference between user-defined types and language-level ones. It can make sense for some users but not for others (#1795, #1803). On the other hand, it provides a visual distinction in most schemes.
The Go approach seems tedious for someone like me who "just wants" a distinct color for "func" and types. My color scheme would look like:
storage.type,
storage.type.keyword:
color: $keyword_color
storage.type.go,
storage.type.other_language_which_respect_the_new_convention,
... :
color: $type_color
Alternative (A): mark "fn" as keyword.
fn name() -> ReturnType
// ^^ keyword.storage.xx
// ^^^^^^^^^^ storage.type.xx
This will change the color of "fn" for all themes where "keyword" and "storage.type" are different.
Alternative (B): use a double scope for types
fn name() -> ReturnType
// ^^ storage.type.xx
// ^^^^^^^^^^ storage.type.xx new_scope_for_type.xx
The nice thing is that if we introduce a new scope, it won't impact existing color schemes.
Color schemes will need to add a new key to specifically target this scope for users who want it to be different than now.
Alternative (C): use a nested scope for types
fn name() -> ReturnType
// ^^ storage.type.xx
// ^^^^^^^^^^ storage.type.return
Note that using a scope like storage.return_type would break all schemes that don't directly target storage.
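The compatibility difference between storage.type.return and storage.return_type can be checked with a small Python sketch. It assumes the same simplified rule as most schemes rely on, namely that a selector matches a scope when it is equal to it or a dot-delimited prefix of it; the helper name matches is hypothetical.

```python
def matches(selector: str, scope: str) -> bool:
    """Simplified prefix match: equal, or a prefix ending at a dot boundary."""
    return scope == selector or scope.startswith(selector + ".")


selectors = ["storage", "storage.type"]
candidates = ["storage.type.return", "storage.return_type"]

# Record which existing selectors would still style each candidate scope.
results = {
    (selector, scope): matches(selector, scope)
    for selector in selectors
    for scope in candidates
}

for (selector, scope), ok in results.items():
    print(f"{selector:13} vs {scope:20} -> {'match' if ok else 'no match'}")
```

A scheme that only has a storage.type rule keeps styling storage.type.return, but stops styling storage.return_type; only schemes with a bare storage rule cover both.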
So WDYT, which alternative do you prefer ?
I think that there are several intertwined questions here.
The first is what to do with fn. This is a keyword, and in a perfect world it should be scoped something like keyword.declaration.function. Given the need for backward compatibility, it might be best to use storage.type.function and agree that storage.type really means keyword.declaration or somesuch.
The second is how to mark a token that syntactically represents a type. In a perfect world, I think that variable.type would usually be appropriate. (In some cases, a constant scope would be appropriate, and in many cases a support scope should be added.) In general, variable really means "identifier or special identifier-like thing", whether or not the thing is actually a variable. In the example, ReturnType seems to be the name of a user-defined type, so a generic variable.type.python would be appropriate.
(Alternatively, we could omit type-specific highlighting and just use a generic expression. In Python, the contents of a function return type annotation are simply an expression following the same rules as any other expression. In order to apply type-specific highlighting to such an expression, we'd have to duplicate much of the existing expression code.)
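As a small illustration of the point that Python annotations are ordinary expressions, the following sketch uses a function call and a subscript expression as annotations. The helper pick_type and the function describe are hypothetical names made up for this example.

```python
def pick_type():
    """Hypothetical helper: any callable can appear inside an annotation."""
    return int


# Both annotations are arbitrary expressions, not dedicated type syntax.
def describe(value: pick_type()) -> [str, bytes][0]:
    return str(value)


# The evaluated results of those expressions are stored on the function object.
print(describe.__annotations__)
print(describe(3))
```

Since the grammar accepts any expression here, a syntax definition that wants type-specific highlighting inside annotations cannot rely on a separate type grammar and has to decide how much of the expression rules to mirror.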
The third is how to mark one or more tokens representing the return type of a function. In this example, the only token is ReturnType, but it could easily be a complex expression. Not all languages allow complex type expressions, but enough do that we should consider it a core use case. This sounds like a job for a meta scope; marking a whole expression with storage or another non-meta scope seems wrong.
It's fair to say that my opinion boils down to "give up storage as a bad idea (but keep it anyway for compatibility)." The docs say that storage is for "[t]ypes and definition/declaration keywords". This is pretty clearly written for a) languages with var-like keywords and no explicit types and b) languages like C and Java where a type name effectively substitutes for a var keyword. In that context, it kind of makes sense, but when we consider a language with declarations like var foo: int; it's obvious that var and int are very different kinds of things.
What I suggest is:
For keywords such as var or function, use storage.type.* as a backward-compatible substitute for the hypothetical keyword.declaration.
For type names, use a variable scope (except where a constant scope or something would be more appropriate, and adding support as needed). Do not use storage.
Use storage.modifier as usual.
This should be reasonably unintrusive. The biggest change is that in languages like C, the int in int x; would be recolored or even lose color. In deference to the original intent of the storage scope, and to avoid breaking color schemes, we could keep scoping int as storage.type in C/Java-like languages. (I'd rather do this while acknowledging that it doesn't seem like the best scope than try to find an interpretation in which it is the best scope.) In languages where type names do not stand in for var-like keywords, we would still remove storage from type names.
fn name() -> ReturnType
// ^^ storage.type.function
// ^^^^^^^^^^^^^ meta.annotation.return-type (or something)
// ^^^^^^^^^^ variable.type
@Thom1729 I'm inclined towards something like storage.type.function keyword.declaration.function. This gives us backwards compatibility via storage.type, but includes storage.type.function that is occasionally used, but also a path forward for these special types of keywords via keyword.declaration.*. Modern color schemes can use keyword.declaration to highlight var, def, func, etc, and existing color schemes keep working.
That does sound like the best of both worlds. What do you think about removing storage from type names (optionally leaving them in for C-like declarations)?
Let me try to formalize and enumerate the concepts involved. I apologize for the long post, and hope this will put us on a more solid ground.
_type identifier_: identifier referring to type: int, SomeType
_type declaration_: keyword followed by type identifier: type (Go, Rust), data, newtype (Haskell)
_type expression_: keyword followed by type body: func, struct (Go)
_value identifier_: identifier referring to anything
_value declaration_: keyword followed by value identifier: var, let, func
_value expression_: pretty much anything
_function expression_, subset of _value expression_: function keyword followed by function definition
Note: I include function and method declaration under _value declaration_. Many statically typed languages allow functions to be assigned to variables and used as values (C, Go, Rust). Many languages don't differentiate a function declaration from a value declaration where the right-hand side is a function literal (JS, Python, Clojure, Lua).
Go is a rare language where all these concepts are syntactically distinct. Most have dedicated keywords:
type A struct {Field Type}
// ------ type declaration
// ------------------- type expression
var A struct {Field Type}
// ----- value declaration
// ------------------- type expression
var A Type
// ----- value declaration
// ---- type identifier
func A(arg Type) Type {}
// ------ value declaration
func(arg Type) Type {}
// ------------------- type expression
// ---------------------- function expression
Some languages conflate _type expression_ with _type declaration_ by not allowing the former without the latter. Example from Rust:
let A: struct{field: Type}; // invalid syntax
// ----- value declaration
// ------------------- type expression
let A: Type;
// ----- value declaration
// ---- type identifier
struct A {field: Type}
// -------- type declaration
// ------------- type expression
type A = Type;
// ------ type declaration
// ---- type identifier
In C and derivatives, _type declarations_ tend to use special keywords, while _value declarations_ tend to be preceded by a _type identifier_. Note that in C, type identifiers may be composite due to "namespaces". Examples from C:
struct A {};
// -------- type declaration
struct A some_func() {return (struct A){};}
// -------- type identifier
// ------------------ value declaration
struct A some_value;
// -------- type identifier
// ------------------- value declaration
As shown above, well-designed static languages make it possible to syntactically differentiate _type identifiers_ from _value identifiers_. Example from Go:
var ValueIdent TypeIdent = ValueExpr
// ---------- value identifier
// --------- type identifier
// ---------- value expression
Dynamic languages, where types are first-class values, naturally conflate _type_ concepts with _value_ concepts.
Example of JS conflating _type declaration_, _value declaration_, _type expression_, and _value expression_:
class {method() {}} // type expression OR value expression
class A {method() {}} // type declaration followed by type expression
let A = class {method() {}} // value declaration followed by value expression
Example of JS conflating _type identifier_ and _value identifier_:
class Type {}
// ---------- type declaration
// ---- type identifier
new Type()
// ---------- value expression
// ---- type identifier?
let ValueIdent = Type // both are "types"!
// -------------- value declaration
// ---------- value identifier
// ---- value identifier
One could say that purely dynamic languages like JS or Erlang simply don't have the concept of a type. However, Python and TypeScript complicate matters by having both _runtime types_ and _static types_. Example from Python:
def is_blank(value: any) -> bool:
# --- static type
# ---- static type
value = str(value)
# --- runtime type
return value == '' or value.isspace()
Languages with a function keyword tend to heavily overload it. The function keyword typically has three distinct purposes:
func A(arg Type) Type {}
// ------ value declaration
// ------------------ function definition
var A = func(arg Type) Type {}
// ---------------------- function expression
var A func(arg Type) Type
// ------------------- type expression
Putting this together, I would lean towards:
have a special "type" scope, i.e. storage
_type identifiers_ receive this "type" scope
Type some_func() {}
^^^^
Type some_value;
^^^^
type A = B // keyword: type declaration
^^^^
var A = B // keyword: value declaration
^^^
func A() {} // keyword: value declaration (of a function)
^^^^
var A struct {Field Type} // keyword: type
^^^^^^
var A func(arg Type) ReturnType // keyword: type
^^^^
Note that these points cover 2 out of 3 known uses of a function keyword, leaving _function expression_ unspecified. Right now, I'm not ready to suggest a scope for it.
Addendum to the previous post: added the missing concept of a function expression; edited the definition, examples, and suggestions accordingly.
This issue should be labeled "RFC".
so a generic variable.type.python would be appropriate.
Why variable? Please don't! storage is what types are meant to be scoped as, and it feels very correct.
Agree with @wbond about storage.type.function keyword.declaration.function. def, fun, fn are not types. They are just special keywords to indicate a function definition/declaration. This is different to what C/C++ looks like.
The third is how to mark one or more tokens representing the return type of a function. In this example, the only token is ReturnType, but it could easily be a complex expression. Not all languages allow complex type expressions, but enough do that we should consider it a core use case. This sounds like a job for a meta scope; marking a whole expression with storage or another non-meta scope seems wrong.
This is what I actually see with Erlang's Typing Language. Return types can easily be sophisticated expressions. But this is what we have meta.function.return-type for. There is no need for new scopes for return types. I scoped the single types with meta.type-call the same way as we do with meta.function-call because of the analogy of defining and using types in Erlang.
-spec function(Arg1 :: int(), Arg2 :: int() | atom()) -> {int(), int() | string() | list(int())}
% ^^^^^^^^ meta.function.erlang
% ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^meta.function.parameters.erlang
% ^^^ meta.function.erlang
% ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ meta.function.return-type.erlang
% ^^^^ meta.type-call.erlang
% ^^^^^^^ meta.type-call.arguments.erlang
% ^^^^ variable.parameter.erlang
% ^^^ storage.type.erlang
% ^^^ storage.type.erlang
% ^^^^ storage.type.erlang
% ^^^ storage.type.erlang
% ^^^ storage.type.erlang
% ^^^^^^ storage.type.erlang
% ^^^^ storage.type.erlang
A big 👍 for @wbond's proposal. As I wrote a few years ago above, I'm a huge fan of getting keyword.declaration into the game, as it really is the ideal scope for these sorts of things. storage.type remains for backwards compatibility, and to allow for things like C#'s var (which I argue should remain storage.type and not convert over to keyword.declaration, since it is a type placeholder).
C#'s var which I argue should remain storage.type and not convert over to keyword.declaration
X( Please keep var with the same scope as func. Just because var is written in the same position as int doesn't mean it should have the same scope.
@wbond storage.type.function keyword.declaration.function seems the most compatible with existing color schemes, because IIUC it will be styled like a storage.type.
@wbond what are the next steps ? Update the official scope naming guidelines and start applying the new scope in the existing syntaxes ?
Yes, I am working on updating the scope naming docs with various RFCs now. Unfortunately most of these discussions don't lead to easy additions to the docs. :-)
I do think that var in C# and auto in C++ should probably be storage.type and not storage.type keyword.declaration. This is because they are implicit/polymorphic type names, and not a keyword defining a new type. They really should get the same highlighting as int and not func.
Closing as these updated guidelines will be part of the docs with the next release.
Nice to see this change finally coming through. However, the current scope guidelines have an ambiguity I'd like to clear up.
Keywords for classes, structs, interfaces, etc should use the following scopes – this list is not exhaustive.
...
storage.type.struct keyword.declaration.struct
storage.type.interface keyword.declaration.interface
The guideline implies that these keywords are _always_ declarations. Example from C:
struct A {};
^^^^^^ keyword.declaration
^ entity.name
But this isn't always true. Example from C:
struct A some_var = {};
^^^^^^ <namespace?>
^ storage.type
^^^^^^^^ <variable declaration>
In C, when struct doesn't declare a type, it acts as a _namespace_ of sorts. Perhaps it could be storage.modifier, but certainly not keyword.declaration.
Go never uses struct, interface and so on, for declarations. They _always_ denote an anonymous type, which _may_ be typedef-ed or aliased via the type keyword:
type A = struct {}
^^^^ keyword.declaration
^^^^^^ <???>
type B = interface {}
^^^^ keyword.declaration
^^^^^^^^^ <???>
Would appreciate thoughts on how we should scope these.
struct, class, ... are definitely keyword.declaration in all use cases. It's totally weird to think about it as a namespace. Even in variable declarations it is used to make clear that A is a struct type. We may argue about whether A should be storage.type or entity here, maybe.
Sounds like we have different ideas about "declarations". From my perspective, there's a big difference between: (A) a keyword/operator that _adds something to the scope_, and (B): a keyword/operator denoting some type or structure _without_ adding it to the scope. I expected declaration to be reserved for A. Scope-modifying declarations are arguably special, because they're used for the symbol index. Example from Go:
var globalVar = someValue
func globalFunc() {}
type GlobalType struct{field Type}
The last example is particularly interesting, because type is the keyword that adds something to the scope, while struct isn't. In this case, struct with its contents merely denotes an _anonymous type_. Are you sure they should be scoped the same?
Yes, we have different ideas: You are thinking too much like a compiler, while I prefer a sane compromise between semantic meaning and consistent highlighting of keywords.
(The following will sound pretentious. Sorry, couldn't find better phrasing!) Interesting point about thinking like a compiler. Writing and reading software requires a mental interpreter, or several. I've put a lot of effort into making mine as accurate as possible. You've provided an external observation, unprompted, that my syntax suggestions are geared towards helping this mental interpreter, making it easier to see code by its role in the compiler. Very interesting. I hope this isn't seen as the grounds to dismiss my arguments; while we aren't machines, we do have mental compilers which should be aided.
I'm also all for simplicity, and don't have a particularly strong opinion on this declaration stuff. Just want to make sure that the choice of declaration for keywords which don't always _declare_ new types or variables wasn't made with an erroneous assumption that they always declare. If that's a conscious compromise for simplicity's sake, that's perfectly fine. Thanks for helping clear this up.