Godot-proposals: Allow Unicode characters in GDScript identifiers

Created on 27 May 2020  ·  21 Comments  ·  Source: godotengine/godot-proposals

Describe the project you are working on:
2d space game

Describe the problem or limitation you are having in your project:
Can't use scientific symbols or accented letters in variable names. My native language has accented letters, which often form minimal pairs with unaccented ones, and scientific symbols would massively shorten some of the variable names I use.

Another example use case: https://github.com/godotengine/godot/issues/24785#issuecomment-495978331

Describe the feature / enhancement and how it helps to overcome the problem or limitation:
Allow Unicode characters in GDScript identifiers

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

If this enhancement will not be used often, can it be worked around with a few lines of script?:
Nope, requires core changes (parser)

Is there a reason why this should be core and not an add-on in the asset library?:
Not possible to do via add-on due to parser changes.

Original issue: https://github.com/godotengine/godot/issues/24785

IIRC this is not covered in @vnen's GDScript rework.

All 21 comments

I think @vnen was working on adding this a few days ago.

Well, this works. Question is if we really want it:

[Image: gdscript-emoji]

BTW, I did this very naively in this example, accepting anything beyond the basic ASCII range. This would accept symbols and characters that look like spaces as part of identifiers.

Doing this properly would require following the Unicode Standard Annex 31: https://unicode.org/reports/tr31

Or maybe we can expect users to use this responsibly and any report of those characters being allowed will be closed as not-a-bug.
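To make the difference concrete, here is a minimal Python sketch (not the actual GDScript tokenizer code). `naive_is_identifier` is a hypothetical helper that mimics the "anything beyond ASCII is fine" rule described above, and Python's built-in `str.isidentifier()`, whose identifier definition is based on UAX #31, stands in for a stricter check.

```python
def naive_is_identifier(name: str) -> bool:
    """Hypothetical naive rule: ASCII letters, digits and underscore,
    plus ANY code point above 127. Digits may not come first."""
    if not name:
        return False

    def allowed(ch: str, first: bool) -> bool:
        if ord(ch) > 127:
            return True  # the naive part: no further checks at all
        if ch == "_" or ch.isalpha():
            return True
        return ch.isdigit() and not first

    return all(allowed(ch, i == 0) for i, ch in enumerate(name))


samples = [
    "speed",         # plain ASCII
    "ρ",             # Greek small letter rho
    "foo\u00a0bar",  # contains a no-break space
    "x\u2019",       # ends with a right single quotation mark
]

for s in samples:
    # repr() makes the invisible characters visible in the output
    print(ascii(s), naive_is_identifier(s), s.isidentifier())
```

Both checks agree on the first two samples. The naive rule also accepts the last two, which look like two tokens or like the start of a string, while the UAX #31 based check rejects them.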

@vnen We could probably disallow having irregular whitespace characters anywhere else than in strings and comments.

I would like to support the use of Unicode for identifiers.
What is the outcome of this discussion: will it be supported or not?

The popular C# and Python both support Unicode, and if Godot wants to have more users in the non-English-speaking world, it must support Unicode.

I would like to support the use of Unicode for identifiers.
What is the outcome of this discussion: will it be supported or not?

This is an open discussion, there's no conclusion yet.

There are a few challenges to overcome:

  1. The code editor font has no fallback, so you need a font with all glyphs; otherwise you might type a character that doesn't show in the editor. This needs to be fixed.
  2. As I mentioned, we should probably follow UAX#31 (like Python does) instead of allowing any character in the identifier. Otherwise space characters could be inserted in an identifier and be tricky to see. Maybe it's not a problem and we let users misuse it, but I want some conclusion in this regard.

We could probably disallow having irregular whitespace characters anywhere else than in strings and comments.

That's already the case, I think. But to forbid those inside identifiers would need a blacklist of sorts.
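A blacklist of this kind could be driven by Unicode general categories instead of a hand-written list of code points. The Python sketch below is only an illustration of that idea: `find_suspicious_chars` and the chosen category set are assumptions, not anything the GDScript parser does.

```python
import unicodedata

# Assumption: separators (Zs, Zl, Zp), control characters (Cc) and
# format characters (Cf) are the ones worth flagging in identifiers.
SUSPICIOUS_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}


def find_suspicious_chars(identifier: str):
    """Return (index, code point, name) for each character whose
    general category is in the blacklist."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(identifier)
        if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    ]


print(find_suspicious_chars("player_speed"))
print(find_suspicious_chars("foo\u00a0bar"))  # no-break space
print(find_suspicious_chars("foo\u200bbar"))  # zero width space
```

One caveat: the `Cf` category also contains joiner characters that some scripts legitimately need inside words, so a blanket category blacklist may be too strict for those languages. That is part of why following UAX #31 is more work than a simple blacklist.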

Yes, I very much want it. But perhaps it might cause trouble to implement something like that in declarations. For strings & comments, however, I am fully on board.

Yes, I very much want it. But perhaps it might cause trouble to implement something like that in declarations. For strings & comments, however, I am fully on board.

It already works in strings and comments. The problem, as I mentioned, is that the code editor font doesn't have full Unicode coverage and doesn't allow fallback fonts. So if you want emoji or something, you have to change to a font which supports those (like I did in the example image), and I couldn't find a monospace font that worked.

So what we would need to do is find a monospace font that would look good in Godot's code editor, has full unicode support and the proper licensing... then we can make it a proposal so that emoji support could be implemented, correct?

That's probably it, I don't know.

@agameraaron I don't know of any open source monospace font that includes good emoji support.

Hack and its parent DejaVu Sans Mono have a very extensive character set, but they don't support colored emoji. (Monochrome emoji can be tough to understand, so I wouldn't recommend settling for them.)

Also, why wouldn't the code editor font allow fallbacks? It uses a DynamicFont just like everything else in the Godot editor.

@Calinou

Also, why wouldn't the code editor font allow fallbacks? It uses a DynamicFont just like everything else in the Godot editor.

The problem is that the settings only ask for a font path, not a DynamicFont. It doesn't have any fallback option. That's probably easy to solve but right now it's an issue.

@agameraaron

So what we would need to do is find a monospace font that would look good in Godot's code editor, has full unicode support and the proper licensing... then we can make it a proposal so that emoji support could be implemented, correct?

No, it is already supported (in comments and strings that is). It's just that the editor default font doesn't have emojis. So if you have emojis in there they won't be shown, which can be confusing (but nothing is really stopping one from doing it). If you use an external editor you can see those characters.

What we need is a fallback font setting to show all characters by default. It doesn't matter much if emojis are monospaced IMO. But the regular characters should be.

The proposal here is for identifiers, which currently don't allow anything other than basic ASCII letters and numbers (and underscore). But that requires following the standard, at least in my view, which is not trivial.

@vnen Right, that makes sense. We should probably find a way to load the system emoji font as a fallback, as emoji fonts are notoriously large in terms of file size (bundling them in the binary would likely enlarge it significantly). However, the exact paths for these fonts are OS-specific and often require guesswork.

Even if this was supported, isn't it usually considered best practice to write code in English?

Also, I have cross-language portability concerns with this proposal, since 💩 is not a valid identifier in C#.

Also, I have cross-language portability concerns with this proposal, since 💩 is not a valid identifier in C#.

It is already possible to name a GDScript function "if", "new", or a myriad of other things. Likewise, you can name a C# function "or". But, since both languages can access the call/get/set API, they can always call such functions through it.

(And GDNative can already assign arbitrary character arrays for function names, likely including empty strings. Restricting those so that all languages are "happy" is not going to be elegant.)

since 💩 is not a valid identifier in C#.

It's not valid in Python either. Hence my concern with following the proper UAX#31 standard, which would correctly disallow some weird stuff in identifiers in general, while still allowing multi-language support.

What I'm interested in is that variable names can use Chinese characters or French characters. I'm not interested in using emoticons for variable names.

Unless 'weird stuff' covers scientific symbols like https://en.wikipedia.org/wiki/Astronomical_symbols (mostly interested in Sun and Earth) or ρ (the lower case Greek letter rho) which is used for density, or other such stuff, I don't care. Emoji aren't the reason I posted the proposal, after all.

@Zireael07 BTW those don't seem to be allowed in Python. Not sure about other languages.

For me, weird means things that get confusing, like other types of space or a quote character that makes an identifier look like a string (my main concern is editors/keyboards that might insert them, or a copy-paste with formatting). But again, maybe we can just not care at all and let users use whatever they want in identifiers.
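For reference, the examples raised in this thread can be checked against Python's own rules, which are based on UAX #31. This is only a sketch of how Python behaves; GDScript could pick a different profile. `describe` is a hypothetical helper that reports whether Python accepts a name plus the Unicode general category of each character in it.

```python
import unicodedata


def describe(name: str):
    """Return (accepted by Python as an identifier, general categories)."""
    categories = [unicodedata.category(ch) for ch in name]
    return name.isidentifier(), categories


examples = {
    "rho (density)": "ρ",
    "Chinese": "变量",
    "French": "café",
    "Sun symbol": "☉",
    "Earth symbol": "♁",
    "emoji": "💩",
}

for label, name in examples.items():
    accepted, categories = describe(name)
    print(f"{label}: accepted={accepted}, categories={categories}")
```

Letters such as ρ, Chinese characters and accented Latin letters fall in letter categories and are accepted. The astronomical symbols and the emoji are in category `So` (Symbol, other) and are rejected, so a rule like Python's would cover the multi-language case but not the scientific symbols asked for above.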
