Sentences using specific characters like Á, Ü, ¿, Ñ, etc. aren't displayed.
Client: MattDuskson triggered JS error: Error: The URI to be decoded is not a valid encoding | url: http://127.0.0.1:51122/browserOutput.js | line: 164 | column: 2 | error: URIError: The URI to be decoded is not a valid encoding | user agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
This is probably messing with any non-English server. Would be great if we could get a fix.
/vg/ has some Rust or Go or something-or-another based fix for this that converts UTF-8 shit over to HTML-encoded entities
@PJB3005
Almost correct. I made a Rust library called libvg that converts the input from the client to UTF-8, then sends that.
The relevant PR is this: https://github.com/vgstation-coders/vgstation13/pull/13537 It's not all that descriptive, but this is what it does and how it works. There were also some later changes (adding detection for gb2312 and _autodetect, a glitch), so use the latest versions of the touched lines.
It makes goonchat detect the locale-set encoding of the Dream Seeker client by using the non-standard IE variable document.defaultCharset. This is the Windows code page that players can enter text with in BYOND input boxes. Most people have this at windows-1252, which is Western, but windows-1251 has Cyrillic, etc.
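To illustrate why the client's code page matters, here is a minimal Python sketch (not libvg's actual code, which is Rust). The CHARSET_TO_CODEC table, the decode_client_bytes helper, and the windows-1252 fallback are all assumptions made up for this example; only the charset names come from the discussion above.

```python
# Hypothetical mapping from a charset name reported by the client
# (for example via document.defaultCharset) to a Python codec name.
CHARSET_TO_CODEC = {
    "windows-1252": "cp1252",  # Western European
    "windows-1251": "cp1251",  # Cyrillic
    "gb2312": "gb2312",        # Simplified Chinese
}


def decode_client_bytes(raw: bytes, charset: str) -> str:
    """Decode raw client input using the code page the client reported."""
    # Falling back to cp1252 for unknown names is an assumption of this sketch.
    codec = CHARSET_TO_CODEC.get(charset.lower(), "cp1252")
    return raw.decode(codec, errors="replace")


# The same byte means different characters under different code pages.
print(decode_client_bytes(b"\xf1", "windows-1252"))  # Latin small n with tilde
print(decode_client_bytes(b"\xf1", "windows-1251"))  # Cyrillic small es
```

Without knowing the code page, the server has no way to tell which of the two characters the player actually typed.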
With this information, input from the client is run through libvg, which converts it to UTF-8. It also provides a set of UTF-8-aware replacements for things like length() and findtext(). Oh yeah, it's also like the most tested piece of code on /vg/.
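As a rough idea of what "UTF-8 aware" means for length() and findtext(), here is a Python sketch operating on raw byte strings. The names utf8_length and utf8_findtext are hypothetical and are not libvg's API; the 1-based result with 0 on a miss is assumed to mirror BYOND's findtext().

```python
def utf8_length(data: bytes) -> int:
    """Count code points in UTF-8 data by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def utf8_findtext(haystack: bytes, needle: bytes) -> int:
    """Return the 1-based character position of needle, or 0 if it is absent."""
    byte_index = haystack.find(needle)
    if byte_index < 0:
        return 0
    # Convert the byte offset into a character offset.
    return utf8_length(haystack[:byte_index]) + 1


text = "Привет, мир".encode("utf-8")
print(len(text))                                   # byte count
print(utf8_length(text))                           # character count
print(utf8_findtext(text, "мир".encode("utf-8")))  # character position
```

A byte-oriented length() would report 20 for this string while the character count is 11, which is the kind of mismatch the replacements are there to avoid.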
The UTF-8 can be handled fine by the DD server because BYOND strings are just raw byte strings without any sort of verification whatsoever. The UTF-8 is sent to the client's goonchat using a double URL encode hack. The first URL encode is to make it valid for output() for the JS call client side, and BYOND decodes this, but as \
The trick is that we encode twice, so the client-side JS actually receives URL-encoded UTF-8, which it decodes fine and now Russians can cyka blyat. This last part is already in on here (it came BEFORE the libvg PR) and it's what causes the exceptions JS side for this bug report: the failed UTF-8 decoding because the 127+ bytes aren't valid UTF-8.
So yeah, port the PR, and some later PRs like static linking of the CRT for the binary provided in the repo and things like Travis checking it automatically.
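The double URL encode and the failure from the bug report can be sketched in Python like this. This is only a simulation under assumptions: double_encode and simulate_delivery are made-up names, the first decode stands in for whatever BYOND does on the way through output(), and the strict UTF-8 decode stands in for the client-side JS decode that throws the URIError.

```python
from urllib.parse import quote, unquote, unquote_to_bytes


def double_encode(payload: bytes) -> str:
    """Percent-encode the raw bytes, then percent-encode the result again."""
    return quote(quote(payload, safe=""), safe="")


def simulate_delivery(wire: str) -> str:
    """Stand-in for the two decode steps described above."""
    # First decode: stand-in for the decode that happens via output().
    after_first = unquote_to_bytes(wire).decode("ascii")
    # Second decode: stand-in for the client-side JS, which expects UTF-8.
    return unquote(after_first, encoding="utf-8", errors="strict")


# Input that was converted to UTF-8 first survives the round trip.
good = double_encode("Ñandú".encode("utf-8"))
print(simulate_delivery(good))

# Unconverted code page bytes above 127 are not valid UTF-8, so the
# second decode fails, much like the URIError in the report.
bad = double_encode("Ñandú".encode("cp1252"))
try:
    simulate_delivery(bad)
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)
```

This is why the double encode alone isn't enough: it only works once the input has been converted to UTF-8 before it gets sent.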