_From @CodingDinosaur on October 12, 2018 3:30_
When building or running under .NET Core on a Unix-based environment, certain cultures cannot be utilized for resource localization, such as getting localized strings. This impacts both the build process (e.g., identifying and processing resource files) and the lookup of resources at runtime. These cultures can be used as expected when building or running under Windows.
The affected cultures are those which are aliased by ICU -- that is, to save on DB space for certain cases, ICU defines some locales as an "alias" of another. There are 42 locale aliases in ICU 57; two of the most common are zh-TW and zh-CN. For a full list of affected locales, see: ICU Aliased Locale List - CultureIssueDemonstration Readme.
This platform-inconsistent behavior when localizing certain resources necessitates special code and workarounds at both build and deploy time when developing cross-platform applications.
See a demo of this issue in CodingDinosaur/CultureIssueDemonstration
Most of the above symptoms boil down to uloc_getAvailable in ICU's C API.
For example, the zh-TW resource files do not get copied during build, because during the task SplitResourcesByCulture, the culture is validated against a cache based ultimately on CultureInfo.GetCultures, which in turn, on Unix, ultimately relies on ICU. A diagnostic MSBuild log shows why the file is missing:
Removed Item(s):
_MixedResourceWithNoCulture=
Resources/MyNetCoreProject.MyResources.zh-TW.resx
OriginalItemSpec=Resources/MyNetCoreProject.MyResources.zh-TW.resx
TargetPath=Resources/MyNetCoreProject.MyResources.zh-TW.resx
WithCulture=false
From this we can trace back to the offending path:
Microsoft/msbuild/src/Tasks/Microsoft.Common.CurrentVersion.targets - SplitResourceByCulture ->
Microsoft.Build.Tasks.AssignCulture.Execute ->
Microsoft.Build.Tasks.Culture.GetItemCultureInfo ->
Microsoft.Build.Tasks.CultureInfoCache.IsValidCultureString ->
Microsoft.Build.Shared.AssemblyUtilities.GetAllCultures ->
CultureInfo.GetCultures ->
CultureData.GetCultures ->
CultureData(Unix).EnumCultures ->
System.Globalization.Native/locale.cpp:GlobalizationNative_GetLocales
https://github.com/dotnet/coreclr/blob/8ba838fb54d6c07271d026b2d77bedcb9e2a786a/src/corefx/System.Globalization.Native/locale.cpp#L162-L171
ICU does not return aliases when getting a list of locales -- whether with uloc_getAvailable or Locale::getAvailableLocales (and uloc_countAvailable does not include them in its count).
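The difference between enumerating cultures and creating one by name can be checked from managed code. Below is a minimal sketch, assuming a plain console app; `AliasProbe`, `IsEnumerated`, and `CanCreate` are hypothetical names used only for illustration. The results depend on the platform and the ICU version, so no expected output is shown.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical probe for illustration; not part of .NET or MSBuild.
public static class AliasProbe
{
    // True if the platform's enumerated culture list contains the given name.
    public static bool IsEnumerated(string name) =>
        CultureInfo.GetCultures(CultureTypes.AllCultures)
            .Any(c => string.Equals(c.Name, name, StringComparison.OrdinalIgnoreCase));

    // True if a CultureInfo can be constructed for the given name.
    public static bool CanCreate(string name)
    {
        try
        {
            _ = new CultureInfo(name);
            return true;
        }
        catch (CultureNotFoundException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // zh-TW and zh-CN are two of the aliased locales named in this issue.
        foreach (string name in new[] { "zh-TW", "zh-CN", "en-US" })
        {
            Console.WriteLine($"{name}: enumerated={IsEnumerated(name)}, creatable={CanCreate(name)}");
        }
    }
}
```

Running this on both Windows and a Unix environment is one way to compare the two code paths for a given culture name.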
That ICU does not return the aliases in this manner appears to be intentional, both based on the numerous references to a lack of alias mapping in the uloc documentation, and the following bug:
https://unicode-org.atlassian.net/browse/ICU-4309
uloc_getAvailable returns sr_YU, even though it is an %%ALIAS locale. None of the other %%ALIAS locales are returned.
TracBot made changes - 01/Jul/18 1:59 PM
Resolution Fixed [ 10004 ]
Status Done [ 10002 ] Done [ 10002 ]
ICU-4309 was fixed via: https://github.com/unicode-org/icu/commit/ab68bb319601bc467784dcbdcc6d52131a2863d2
This seems to further indicate that ICU not returning aliases when calling uloc_getAvailable is intentional.
A full analysis can be seen in the test repo README: CodingDinosaur/CultureIssueDemonstration
I have two test repos that help demonstrate this issue:
_Copied from original issue: dotnet/coreclr#20388_
Thanks @CodingDinosaur for reporting the issue and listing the details. This is very helpful. We'll take a look.
@cdmihai it is wrong to have msbuild depend only on the list returned from CultureInfo.GetCultures. This was OK before Windows 10 and Linux support, but it is no longer a valid approach.
For example, in Windows 10, if you try to create any culture which Windows doesn't have data for, the operation still succeeds and the culture can be created as long as the culture name conforms to the BCP-47 spec.
I understand msbuild may be doing that for perf reasons, which can be kept, but it will need an extra case: when a culture is not found in the list, try calling CultureInfo.GetCultureInfo to find out whether the culture can be created.
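A rough sketch of that fallback, for illustration only: `CultureValidator` and its members are hypothetical and are not MSBuild's actual `CultureInfoCache`. It assumes the enumerated list is kept as a fast path and `CultureInfo.GetCultureInfo` is consulted only on a miss.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not MSBuild's real implementation.
public static class CultureValidator
{
    // Fast path: names enumerated once by the platform.
    // On ICU-based platforms this list may omit aliased cultures such as zh-TW.
    private static readonly HashSet<string> EnumeratedNames = BuildEnumeratedNames();

    private static HashSet<string> BuildEnumeratedNames()
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (CultureInfo culture in CultureInfo.GetCultures(CultureTypes.AllCultures))
        {
            names.Add(culture.Name);
        }
        return names;
    }

    public static bool IsValidCultureString(string name)
    {
        // A null or empty name never identifies a resource culture here.
        if (string.IsNullOrEmpty(name))
        {
            return false;
        }

        // Cached lookup first, preserving the perf benefit of the list.
        if (EnumeratedNames.Contains(name))
        {
            return true;
        }

        // Slow path: ask the runtime whether it can create the culture.
        try
        {
            CultureInfo.GetCultureInfo(name);
            return true;
        }
        catch (CultureNotFoundException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Results for the slow path depend on the OS and globalization data.
        Console.WriteLine(IsValidCultureString("zh-TW"));
    }
}
```

Note that with this approach, any name the runtime is willing to create is treated as a culture, which on Windows 10 includes names that merely conform to BCP-47.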
We'll look at how we can enhance the support for aliased cultures as this issue suggests, but whatever we do here, msbuild will need to do something more. Do you want me to open a new issue in msbuild to track that?
_From @cdmihai on October 15, 2018 18:23_
@tarekgh would this be a suitable issue? https://github.com/Microsoft/msbuild/issues/1454
Regarding removing valid locale checks, the biggest issue with this is that it would be a breaking change for MSBuild. The msbuild repo build logic itself has the assumption that non-existing locales are rejected, so a lot of strings are put in Microsoft.shared.resx. If we remove the locale check in the SplitResourceByCulture, then the repo fails building with some invalid locale error (fuzzy memory from ~2 years ago). I have no data on this, but this could also break a lot of other existing repos.
FYI @rainersigwald
> Regarding removing valid locale checks, the biggest issue with this is that it would be a breaking change for MSBuild. The msbuild repo build logic itself has the assumption that non-existing locales are rejected, so a lot of strings are put in Microsoft.shared.resx. If we remove the locale check in the SplitResourceByCulture, then the repo fails building with some invalid locale error (fuzzy memory from ~2 years ago). I have no data on this, but this could also break a lot of other existing repos.
I am not sure I understand the breaking scenario here. The scenario you are describing can occur today: for example, build a project with a culture introduced in Windows 10 and then run it on a down-level platform that doesn't have this culture.
_From @cdmihai on October 15, 2018 20:40_
True, but that's the class of valid locales introduced in different windows versions. The new breaking scenario is for the class of always invalid locales, that were never meant to be locales, and users expect msbuild to not treat them as locales.
> The new breaking scenario is for the class of always invalid locales, that were never meant to be locales, and users expect msbuild to not treat them as locales.
I am not sure I agree with that. If the OS/.NET can create a culture for these, then they should be valid locales to use. Why do you think msbuild should reject such cultures?
_From @cdmihai on October 15, 2018 23:16_
Personally I agree that msbuild should not care and just do what the OS does, but I fear there are actual customers who depend on this behaviour, and changing this might break them. But this is just gut feeling based on the fact that the msbuild repo itself is doing it, and I don't have actual data on it. Alternatively we can only enable it in .net core msbuild, and then customers will have to opt-in to the break by transitioning to .net core. But it's not nice to diverge behaviour.
The only breaking scenario I can think of is when we allow creating resources with a culture not returned by CultureInfo.GetCultures and then move these resources to another machine which cannot understand the culture used. This scenario can happen today anyway. Is there any breaking scenario you can think of?
Microsoft/msbuild#1454 is specific to custom cultures. I would suggest updating it to include the other supported system cultures which are not enumerated by CultureInfo.GetCultures.
After looking at this issue, it looks like ICU is not enumerating the aliased cultures for good reasons. I believe the framework should follow that too and not enumerate such aliased cultures. The framework can still create such an aliased culture if anyone wants to use it, e.g.:
new CultureInfo("zh-TW");
will work fine.
Considering that, I believe the resource issue should be fixed on the msbuild side.
msbuild should not depend only on the enumerated list, not just because of aliased cultures but also to support the behavior of Windows 10, which can create any culture as long as the name conforms to the BCP-47 spec.
I am going to move this issue to the msbuild repo.
@CodingDinosaur thanks again for reporting this issue.
Is there any update to this issue, or at least a workaround?
My understanding is that it's currently not possible to have an ASP.NET Core application localized with zh- cultures on Linux, which seems like a pretty common use case.