toLocaleUpperCase() does not work properly for the Georgian locale.
_v10.7.0 and above_
> 'იანვარი'.toLocaleUpperCase();
'ᲘᲐᲜᲕᲐᲠᲘ'
Expected behaviour
_Till node v10.6.0_
> 'იანვარი'.toLocaleUpperCase();
'იანვარი'
FWIW:
> [...'იანვარი'.toLocaleUpperCase()].map(ch => ch.codePointAt().toString(16))
[
'1c98', // GEORGIAN MTAVRULI CAPITAL LETTER IN (U+1C98)
'1c90', // GEORGIAN MTAVRULI CAPITAL LETTER AN (U+1C90)
'1c9c', // GEORGIAN MTAVRULI CAPITAL LETTER NAR (U+1C9C)
'1c95', // GEORGIAN MTAVRULI CAPITAL LETTER VIN (U+1C95)
'1c90', // GEORGIAN MTAVRULI CAPITAL LETTER AN (U+1C90)
'1ca0', // GEORGIAN MTAVRULI CAPITAL LETTER RAE (U+1CA0)
'1c98' // GEORGIAN MTAVRULI CAPITAL LETTER IN (U+1C98)
]
cc @nodejs/v8 @nodejs/intl Is this a regression or a fix?
Was there an ICU update between 10.6 and 10.7?
Edit: yes: https://github.com/nodejs/node/commit/122ae24f62de6f848eadcf72b75dff6114cf0079
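A quick way to confirm which ICU and Unicode versions a given Node.js binary bundles is to read `process.versions`. This is only a sketch: the helper name `intlVersions` is made up for illustration, and the `icu` and `unicode` fields are present only when Node.js was built with Intl support.

```javascript
// Hypothetical helper: collect the version fields relevant to case mapping.
// `icu` and `unicode` are undefined on builds without Intl support.
function intlVersions() {
  const { node, v8, icu, unicode } = process.versions;
  return { node, v8, icu, unicode };
}

console.log(intlVersions());
```

Comparing the output on 10.6.0 and 10.7.0 should show the ICU and Unicode bump directly.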
The ICU 62 changelog says:
The Unicode 11.0 changes may also require some code/tests to be fixed. Notably:
And the Unicode 11.0.0 changelog says:
Casing Issues
Casing behavior for the Georgian script has changed significantly. There is a new set of Mtavruli capital letters (U+1C90..U+1CBA, U+1CBD..U+1CBF) in Unicode 11.0, with case mappings to the existing Mkhedruli letters (U+10D0..U+10FA, U+10FD..U+10FF). In prior versions of the Unicode Standard, Mkhedruli Georgian was considered a monocameral (non-casing) script, and the Mkhedruli Georgian letters were gc=Lo. Starting with Version 11.0, those Mkhedruli Georgian letters are now gc=Ll, and have uppercase mappings to Mtavruli Georgian capital letters. This change will have major implications for Georgian implementations, including changes for input methods, fonts, casing, and string matching. Existing implementations have treated Mtavruli headlines and other uses for textual emphasis as a text style, so there will also be significant issues for document conversion and upgrade.
Another complication for Georgian is that the primary orthography does not use titlecasing, and the Mkhedruli Georgian letters do not have titlecase mappings to Mtavruli letters. This is unique among bicameral systems in the Unicode Standard, so casing implementations should be prepared for this exception.
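To make the new mapping concrete, here is a minimal sketch that uppercases Mkhedruli letters by hand, independent of the engine's ICU version. It assumes the two ranges quoted above sit at a constant offset from each other (U+10D0 maps to U+1C90, and so on); the helper names are hypothetical and not part of any library.

```javascript
// Assumed constant offset between a Mkhedruli letter and its Mtavruli capital.
const MTAVRULI_OFFSET = 0x1c90 - 0x10d0;

// True for code points in the Mkhedruli ranges that gained uppercase mappings.
function isCasedMkhedruli(cp) {
  return (cp >= 0x10d0 && cp <= 0x10fa) || (cp >= 0x10fd && cp <= 0x10ff);
}

// Hypothetical helper: uppercase only Georgian Mkhedruli letters, by offset.
function toMtavruli(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return isCasedMkhedruli(cp) ? String.fromCodePoint(cp + MTAVRULI_OFFSET) : ch;
  }).join('');
}

const codePoints = [...toMtavruli('იანვარი')].map((ch) => ch.codePointAt(0).toString(16));
console.log(codePoints.join(' ')); // 1c98 1c90 1c9c 1c95 1c90 1ca0 1c98
```

The result matches the code points that toLocaleUpperCase() reports on v10.7.0 and above.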
Should we treat such changes as semver-major?
I don't know. Is it something that should be fixed upstream in V8?
/cc @nodejs/intl
Hi, I maintain https://github.com/moment/moment , and our builds are breaking because of this issue. Any idea if/when it will be fixed?
@marwahaha what is the breakage?
@targos @marwahaha it's not clear what you mean by 'fixed'. What do you see is the bug here?

But, this is after downloading a font that supports the new characters: https://app.box.com/s/psnogufec39aq486uny1o300j7c6hk76/file/317278048647
https://www.unicode.org/mail-arch/unicode-ml/y2018-m07/0063.html discusses this some.
Here's the problem. The vast majority of Georgian fonts do not yet have the new uppercase characters. So when any system uses case mapping to uppercase text (e.g. browsers interpreting CSS’s text-transform: capitalize), then the users of Georgian will see boxes (“tofu”) if the font they are using does not have the glyphs.
See a site for example https://bpgfonts.wordpress.com/download/
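For applications that uppercase mixed-script text but cannot rely on fonts having Mtavruli glyphs, one possible workaround is to map any Mtavruli capitals back to Mkhedruli after uppercasing. This is only a sketch under the assumption that the two blocks sit at a constant offset; `upperCaseKeepingMkhedruli` is a hypothetical name, and the function also lowercases Mtavruli letters that were already in the input.

```javascript
// Assumed constant offset between Mtavruli capitals and Mkhedruli letters.
const GEORGIAN_CASE_OFFSET = 0x1c90 - 0x10d0;

// True for code points in the Mtavruli ranges added in Unicode 11.0.
function isMtavruli(cp) {
  return (cp >= 0x1c90 && cp <= 0x1cba) || (cp >= 0x1cbd && cp <= 0x1cbf);
}

// Hypothetical helper: uppercase the text, then undo the Georgian mapping.
function upperCaseKeepingMkhedruli(text) {
  return Array.from(text.toUpperCase(), (ch) => {
    const cp = ch.codePointAt(0);
    return isMtavruli(cp) ? String.fromCodePoint(cp - GEORGIAN_CASE_OFFSET) : ch;
  }).join('');
}

console.log(upperCaseKeepingMkhedruli('node იანვარი')); // NODE იანვარი
```

The output is the same whether or not the engine's toUpperCase() knows about Unicode 11, which is the point of the workaround.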
@vsemozhetbyt
Should we treat such changes as semver-major?
Traditionally we haven't. Functionality isn't removed.
As an example, /\p{Emoji}/u.test('🛹') (is Skateboard an emoji?) will also return true on 10.7.0 and false on 10.6.0. (I tried to find a similar case for Node 8.9.0 / 8.10.0, which added Unicode 10, such as T-Rex 🦖, but \p{Emoji} was not turned on then. There is probably a similar example I could construct with a regex.)
(Edit: I don't have a font with the skateboard yet, either.)
I actually think the following may be what is breaking moment.js. Looks like a bug somewhere.
new RegExp('ოქტ', 'i').test('ოქტ'.toUpperCase()) // == 'ᲝᲥᲢ'
… returns false, but should be true. (It breaks in Safari 12.0 also, hm.)
Update: filed a v8 bug https://bugs.chromium.org/p/v8/issues/detail?id=8348
Update update: the bug is in moment.
Needs to be new RegExp('ოქტ', 'ui').test('ოქტ'.toUpperCase()) // == 'ᲝᲥᲢ' (need that 'u' flag)
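A minimal sketch of the suggested change, wrapped in a helper so the flags live in one place. The name `matchesIgnoringCase` is hypothetical, the pattern is used as raw regex source (it is not escaped), and 'ოქტ' is just a sample string.

```javascript
// Hypothetical helper: case-insensitive match with the 'u' flag set, so that
// matching uses Unicode case folding rather than the legacy comparison.
function matchesIgnoringCase(pattern, text) {
  return new RegExp(pattern, 'ui').test(text);
}

const month = 'ოქტ';
// On engines without Unicode 11 data, toUpperCase() leaves the string as-is,
// so the comparison still succeeds there.
console.log(matchesIgnoringCase(month, month.toUpperCase()));
```

With the 'u' flag the match does not depend on the V8 fix for the plain 'i' case.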
I'd like to close this as working as designed. toUpperCase() is working properly for Unicode 11.
@nodejs/v8 if we want to float the eventual v8 fix, will we want a new issue? or repurpose this one? The original issue is not a bug, but correct behavior for toLocaleUpperCase(). However, there's a regex bug.
@srl295 you'd have to eventually make one at https://bugs.chromium.org/p/v8/ so that it could be properly referenced.
@ryzokuken it's https://bugs.chromium.org/p/v8/issues/detail?id=8348
Oh, you mean in Node? I guess you could just backport the commit to master once it's in V8 LKGR. Let me know if you need help with that.
This still seems to fail on Node 10.20.1, although not on 8.17.0.
https://travis-ci.org/github/moment/moment/builds/688016559
@marwahaha The V8 fix (v8/v8@bb24140cb3eef5452e7a74f96a8261f6c049dd02) hasn't been back-ported to V8 6.8, the version that ships with Node.js v10.x. Node.js v12.x contains the fix though, it bundles V8 7.8. (The fix was merged in 7.4 or 7.5.)
v10.x enters maintenance mode tomorrow and it's not a trivial fix (risk of regressions) so I suggest taking no action.
@bnoordhuis I think that sounds right after a short re-read of the bugs.
Thanks, closing then.