Godot: Converting unicode to character

Created on 16 Aug 2016  ·  12 Comments  ·  Source: godotengine/godot

Operating system or device - Godot version:
Ubuntu 14.04 - Godot HEAD

Issue description (what happened, and what was expected):
Unable to convert a Unicode value into a string, although printing a key event apparently shows the Unicode value as a string.

enhancement junior job gdscript

All 12 comments

Found a solution:

RawArray([unicode]).get_string_from_utf8()

amazing, i didn't even think this was possible, hehe..

maybe we could do some function like String().plus_char(c) or something
like this

On Thu, Aug 18, 2016 at 2:04 PM, Leonard Meagher wrote:

Closed #6166 https://github.com/godotengine/godot/issues/6166.

@reduz what about a char function in @GDScript?

Just reopening till maybe the char function comes in (which would be preferred).

Found a solution:

RawArray([unicode]).get_string_from_utf8()

A few notes on this:
1 - RawArray requires every element to be between 0 and 255 and will truncate (not split!) any larger value you pass to it.
2 - Unicode and UTF-8 are not the same thing: Unicode assigns each character a number (its code point), while UTF-8 is one way of encoding that number as bytes.

The RawArray trick will not work for characters that need more than one byte (e.g. ŧ, see https://en.wikipedia.org/wiki/T_with_stroke ).
You need something like this to split a multi-byte value into bytes in a RawArray:

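var code = event.unicode  # Unicode code point of the key event
var arr = RawArray()
# Peel off the low byte on each pass (least significant byte first).
while code > 0:
    arr.append(code & 0xFF)
    code = code >> 8
# Reverse so the most significant byte comes first.
arr.invert()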

This will give you the full Unicode value in the RawArray, most significant byte first.

Sadly, even if the byte data is correct, get_string_from_utf8 will still not work, because the bytes hold the raw Unicode code point rather than its UTF-8 encoding.
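
To make the difference concrete, here is a minimal sketch in Python (not GDScript; it is used only because it is easy to run outside the engine). The helper names `raw_code_bytes` and `utf8_encode` are hypothetical and not part of any Godot API: the first mirrors the byte-splitting loop from this comment, the second is a hand-written UTF-8 encoder that assumes a valid code point and does not check for surrogates.

```python
def raw_code_bytes(code):
    """Split a code point into big-endian bytes, like the RawArray loop."""
    out = []
    while code > 0:
        out.append(code & 0xFF)
        code >>= 8
    out.reverse()
    return bytes(out)


def utf8_encode(code):
    """Encode one Unicode code point as UTF-8 (surrogates are not checked)."""
    if code < 0x80:
        return bytes([code])
    if code < 0x800:
        return bytes([0xC0 | (code >> 6), 0x80 | (code & 0x3F)])
    if code < 0x10000:
        return bytes([0xE0 | (code >> 12),
                      0x80 | ((code >> 6) & 0x3F),
                      0x80 | (code & 0x3F)])
    if code <= 0x10FFFF:
        return bytes([0xF0 | (code >> 18),
                      0x80 | ((code >> 12) & 0x3F),
                      0x80 | ((code >> 6) & 0x3F),
                      0x80 | (code & 0x3F)])
    raise ValueError("code point out of range")


code = 359                         # U+0167, the "t with stroke" example
truncated = bytes([code & 0xFF])   # what a 0-255 byte array keeps of 359
raw = raw_code_bytes(code)         # the number itself, split into bytes
encoded = utf8_encode(code)        # the actual UTF-8 byte sequence

print(truncated)                              # b'g'
print(raw)                                    # b'\x01g'
print(encoded)                                # b'\xc5\xa7'
print(encoded == chr(code).encode("utf-8"))   # True
```

Decoding the truncated or raw bytes as UTF-8 yields `g` or a control character followed by `g`; only the last byte sequence decodes back to the intended character.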

Just to confirm, in my implementation of char, this:

print(char(358),char(358).to_utf8().size())

Results in:


So, it is correct... probably.

Yes, it's correct!

The only scary thing is that size is 2, which is correct (2 bytes in that string), but it might be useful to have a char_size() function in String returning the actual number of characters in the string? Just a suggestion

@Faless char(358).length() is 1, I just wanted to get those 2 bytes counted... :smile:

In Godot C++ world, String is a kind of Vector, so size() returns the length of the vector, which includes the NULL that ends the string. String.length() returns size() - 1 to take this into account.

@Faless char(358).length() is 1, I just wanted to get those 2 bytes counted... :smile:

Right, my bad :).
All seems good then, I just tested this:

var string = char(358) + char(358)
printt(string, string.to_utf8().size(), string.length())
# Output:
# ŦŦ  4   2
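
The same character count versus byte count distinction can be reproduced outside Godot. This is a small Python analogue, not the engine's implementation: it assumes `len()` on a string plays the role of `length()` in the thread, and `len()` on the encoded bytes plays the role of `to_utf8().size()`.

```python
text = chr(358) + chr(358)               # two copies of U+0166
char_count = len(text)                   # counts characters (code points)
byte_count = len(text.encode("utf-8"))   # counts UTF-8 encoded bytes

print(byte_count, char_count)            # 4 2
```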

@vnen That's incorrect AFAIK... to_utf8 would simply convert the string to utf8 bytes. Since char(358) is represented by two bytes, though, it would return a to_utf8().size() of 2 _bytes_, while length() returns just one _character_

Just a note: the char() function was added. This code snippet correctly prints UTF-8 characters entered by the user:

func _unhandled_input(event):
    if event is InputEventKey:
        print(char(event.unicode))