The reason is documented in a source comment: https://github.com/angular/angular.js/blob/e5d1d6587d1f38898bca673a829a0bc0ffaaf2c8/src/Angular.js#L161-L163
It's also documented on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/toLocaleLowerCase
Lastly, I think the StackOverflow answers were pretty accurate.
Neither the comment nor the SO answers are informative.
The question is:
"In exactly which JS engines are toLowerCase() & toUpperCase() locale-sensitive?"
How do those answers answer it? How does the source comment answer it? Moreover, the comment and the answers contradict each other. In theory, toLowerCase() & toUpperCase() shouldn't be locale-sensitive. That's what one of the answers says. And your MDN link says the same thing. But the comment states the exact opposite:
String#toLowerCase and String#toUpperCase don't produce correct results in browsers with Turkish locale
Again, in exactly which browsers does this happen?
Searching on Google, this seems like a general problem due to the Turkish language itself - this article seems to explain it: http://www.i18nguy.com/unicode/turkish-i18n.html
If I had to guess, this is a problem with every browser.
The general problem exists, that's true. But who said that it (still) exists in JavaScript as well? Who has run into it during, say, the last 10 years?
I doubt it was pulled from thin air - that is not how the Angular team functions.
The hacks and workarounds for IE8 are being removed from the code of Angular now. But this thing looks like a workaround for a bug in some even older browser.
@thorn0 why not test on your end (you seem to be from Turkey) and send a PR to remove those workarounds if they are no longer necessary? I'm pretty sure that everyone would be more than happy to remove unnecessary code.
I wanted to know why this code is there. If the issue still exists (if it ever existed) in some browser, it'd be nice to know about it. I'm going to check it in different browsers, but of course I don't have access to all the needed OS+device+browser combinations.
@thorn0 AFAIK the Turkish locale was the only / main reason. So if you can confirm that this bug doesn't affect users of modern browsers and submit a PR to see whether all the tests pass on CI, this could potentially be merged. But as you say, no one will be able to re-test this on all the possible devices in the wild, so we might potentially introduce a breaking change here...
BTW, another library containing code like that is Google Closure Library. See these lines. Supposedly, it came to Angular from there.
I think this can be split into two issues
toUpperCase and toLowerCase
The former point should be easy to decide if someone can verify that this is no longer an issue with the supported browsers.
The latter would be hard to remove, but I would be OK with deprecating it (even if it still works).
In case it helps, the code was added to the angular.js file by @mhevery in Oct 2010 as part of this commit "create HTML sanitizer to allow inclusion of untrusted HTML in safe manner".
Everything that I could find on the topic only mentioned this being a problem in Java. The ECMAScript standards (3 and 5) state that the toUpperCase and toLowerCase functions are _not_ locale sensitive and the toLocaleUpperCase and toLocaleLowerCase functions _are_ locale sensitive.
So it doesn't seem to be required, but I don't know how to test on multiple browsers in the Turkish locale.
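One way to check this in a given browser is to paste something like the following into its console while the OS/browser is set to the Turkish locale. This is only a sketch: `checkCaseMapping` is a made-up helper name, not part of AngularJS, and the `'tr'` argument to `toLocaleLowerCase` only has an effect in engines that ship Turkish locale data.

```javascript
// Hypothetical helper (not part of AngularJS): reports how the case-mapping
// methods behave in the current engine and locale.
function checkCaseMapping() {
  return {
    // Per the ECMAScript spec these should not depend on the locale.
    lowerOfCapitalI: 'I'.toLowerCase(),
    upperOfSmallI: 'i'.toUpperCase(),
    scriptRoundTrip: 'SCRIPT'.toLowerCase() === 'script',
    // Locale-sensitive variant, shown for contrast only; its result depends
    // on whether the engine has Turkish locale data available.
    turkishLower: 'I'.toLocaleLowerCase('tr')
  };
}

console.log(checkCaseMapping());
```

If `lowerOfCapitalI` ever comes back as anything other than `'i'`, that browser/locale combination would be one where the workaround is still needed.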
I think the right thing to do for now would be just undocumenting angular.uppercase and angular.lowercase without removing them.
The basic issue is that when malicious code writes SCRIPT we need to detect it and take it out. The way we do it is to do toLowerCase on it, but that will produce scrıpt (notice no dot on ı) in some locales and then 'scrıpt' === 'script' fails which means that malicious code gets by the sanitizer.
If you can prove that in JS "I".toLowerCase() === "i" is true in all locales, then the code is safe to remove. This code was added at the request of the Google security team.
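For readers who haven't looked at the code, the workaround being discussed boils down to roughly the following. This is a sketch of the idea rather than a copy of the AngularJS source, and `asciiLowercase` and `pickLowercase` are names invented here for illustration.

```javascript
// Lowercases only A-Z and leaves every other character untouched, so the
// result cannot depend on the runtime's locale.
function asciiLowercase(s) {
  if (typeof s !== 'string') return s;
  return s.replace(/[A-Z]/g, function (ch) {
    // In ASCII, setting bit 0x20 maps 'A'..'Z' to 'a'..'z'.
    return String.fromCharCode(ch.charCodeAt(0) | 32);
  });
}

// Uses the native method unless it fails the 'I' -> 'i' check described above.
function pickLowercase() {
  if ('I'.toLowerCase() === 'i') {
    return function (s) {
      return typeof s === 'string' ? s.toLowerCase() : s;
    };
  }
  return asciiLowercase;
}

var lowercase = pickLowercase();
console.log(lowercase('SCRIPT') === 'script');
```

Because the fallback only touches A-Z, it is enough for matching tag names like SCRIPT, but it is not a general-purpose replacement for toLowerCase on non-ASCII text.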
I notice that there are various places in the sanitizer that don't use our custom lowercase. E.g. https://github.com/angular/angular.js/blob/master/src/ngSanitize/sanitize.js#L378
Does this need to be fixed?
As far as I can tell, AngularJS and the Closure Library are the only libraries that include this workaround, and nowhere else on the Internet is it mentioned that this problem ever existed in the JS world, only in Java.