The language resolution rules are a bit unclear to me. The spec currently says:
Pages can be written in multiple languages. If a client has access to environment variables, several standard ones exist to specify the language in which a client should operate. If not, then clients MUST make reasonable assumptions based on the information provided by the environment in which they operate (e.g. consulting navigator.languages in a browser, etc.). If possible, it is RECOMMENDED to also make language configurable, as to not only rely on the environment. Clients SHOULD therefore offer options to configure or override the language using configuration files or command line options (like -L, --language as suggested in the arguments section above).
The LANG environment variable, if present, MUST be used to determine the language of pages to display.
From this, it's ambiguous (to me) how clients should behave when LANG is set to a language in which a page does not exist (or in which no pages exist at all). Under a strict interpretation, if LANG=cz for example, no pages would be shown unless the user configured the client in some other way (e.g. through the --language flag). However, a looser interpretation (which the node client takes) is to always check whether the page exists in English after checking the other languages, even when the --language flag is used explicitly. For example, the following two commands return the same English page:
LANG=fr node bin/tldr ansible galaxy
node bin/tldr --language fr ansible galaxy
It would be good to clarify whether or not clients should always fall back to English when a page does not exist in the requested locale. As it stands, the spec leaves room for clients to behave differently, imo.
Personally, I think clients should always fall back to English given how sparse the translations are, and I would suggest adding the following text to the section:
Regardless of selected language through environment variables, tldr should always attempt to fallback to English if the page does not exist in the requested languages. It is SUGGESTED that the client tell the user that the page does not exist in their requested language however if it was not English. If the client supports a command line argument for language, the client MUST only attempt to show the page in that language.
Hey there! That's a great observation. I kind of assumed that people would have English at least somewhere in their LANG setting - but in hindsight that's a silly assumption. That wording sounds ok. What about this though?
Regardless of selected language through environment variables, clients SHOULD always attempt to fallback to English if the page does not exist in the requested languages. It is SUGGESTED that the client tell the user that the page does not exist in their requested language however if it was not English. If the client supports a command line argument for language, the client MUST only attempt to show the page in that language (though it is SUGGESTED that clients notify the user that a page is available in other languages if present).
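To make the proposed wording concrete, here is a minimal Python sketch of the lookup it describes. It is only an illustration under stated assumptions: the function name `select_page`, its parameters, and the notice strings are hypothetical and are not taken from any existing client.

```python
def select_page(available, env_languages, cli_language=None):
    """Pick the language in which to show a page.

    available: set of language codes in which the page exists (hypothetical input).
    env_languages: ordered languages derived from environment variables.
    cli_language: language given explicitly on the command line, if any.
    Returns a (language, notice) tuple; either element may be None.
    """
    if cli_language is not None:
        # An explicit command line language is never replaced by a fallback.
        if cli_language in available:
            return cli_language, None
        others = sorted(available)
        notice = "Page is available in: " + ", ".join(others) if others else None
        return None, notice

    # Languages from the environment are tried in order.
    for language in env_languages:
        if language in available:
            return language, None

    # English is the last resort, with a notice for the user.
    if "en" in available:
        return "en", "Page not found in the requested languages; showing English."
    return None, None
```

With this sketch, a page that only exists in English is shown (with a notice) when the environment asks for `fr`, but is not shown when `fr` is passed explicitly on the command line.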
@sbrl I edited your comment to make your suggestion more readable. The text seems alright.
Opened #4101
My intention with the first sentence was to mandate the fallback to English, and I would advocate heavily for upgrading "SHOULD" to "MUST" (I realize I used should in my original draft, apologies).
What about using the environment variables LANG and LANGUAGE from https://www.gnu.org/software/gettext/manual/html_node/Locale-Environment-Variables.html#Locale-Environment-Variables with English as a fallback and LANGUAGE as the priority list, since LANG should only hold one language? Additionally, TLDR_LANGUAGE could be used to specify a language explicitly, like the --language flag.
Example
| TLDR_LANGUAGE | LANGUAGE | LANG | Result |
| --- | --- | --- | --- |
| fr | it,de,fr | cz | fr |
| ! | it,cz,fr | cz | it,cz,fr,en |
| ! | it,cz,fr | de | it,cz,fr,de,en |
| ! | ! | it | it,en |
| ! | it,cz | ! or 'C' | en |
| ! | ! | ! | en |
To be fully compliant, LC_ALL could also be recognized, overriding LANG and LANGUAGE.
I would consider the TLDR_LANGUAGE environment variable (or a value in a config file in the case of the node client) to be equivalent to LANG, not to the command line option --language. Looking at the table, I agree that it would be good to include one in the PR itself.
So TLDR_LANGUAGE first, then LANGUAGE and LANG as described at gnu.org, and English as the last fallback?
Example
| TLDR_LANGUAGE | LANGUAGE | LANG | Result |
| --- | --- | --- | --- |
| fr | it,de,fr | cz | fr,it,de,fr,cz,en |
| fr | ! | it | fr,it,en |
| fr | it,cz | ! or 'C' | fr,en |
| ! | it,cz,fr | cz | it,cz,fr,en |
| ! | it,cz,fr | de | it,cz,fr,de,en |
| ! | ! | it | it,en |
| ! | it,cz | ! or 'C' | en |
| ! | ! | ! | en |
where ! means not set.
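As a rough illustration of the ordering in the table above, here is a minimal Python sketch. The function name `resolve_languages` is hypothetical, and three details are assumptions rather than anything agreed in this thread: LANGUAGE is split on colons as GNU gettext does (the table uses commas as shorthand), a codeset suffix such as `.UTF-8` is dropped, and duplicates are removed, so the first row would come out as fr,it,de,cz,en instead of repeating fr.

```python
def resolve_languages(env):
    """Return the ordered list of languages to try, always ending with English.

    env: a mapping of environment variables, e.g. os.environ.
    """
    tldr_language = env.get("TLDR_LANGUAGE", "")
    language = env.get("LANGUAGE", "")
    lang = env.get("LANG", "")

    candidates = []
    if tldr_language:
        candidates.append(tldr_language)
    # Per the table, LANGUAGE only counts when LANG is set to something other than "C".
    if lang and lang != "C":
        candidates.extend(part for part in language.split(":") if part)
        candidates.append(lang)
    candidates.append("en")  # English is always the final fallback

    result = []
    for item in candidates:
        code = item.split(".")[0]  # assumption: drop a codeset such as ".UTF-8"
        if code and code not in result:
            result.append(code)
    return result
```

For example, calling `resolve_languages(os.environ)` after `import os` would apply the same ordering to the real environment.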
@columbarius We already specify that clients should use the LANG and LANGUAGE environment variables though?
@MasterOdin: I'm not sure about the idea of a custom environment variable - I'd prefer to stick to well-established standards.
Agreed. Given the above changes to always fall back to English, I would be happy to remove it from the python client. It existed solely for the case where LANG and LANGUAGE were set such that they did not include English at all or, worse, were only set to languages that were totally unsupported by tldr, which would render the whole client unusable unless TLDR_LANGUAGE was set.
> @columbarius We already specify that clients should use the LANG and LANGUAGE environment variables though?
Ok, no problem. I just tried to use what was implemented in the python client, without looking at other clients, to avoid breaking any established option. I also created the table to clarify the priority of LANG and LANGUAGE, since I saw behaviour diverging from the GNU gettext documentation. I don't know if there are different ways of handling those? I just found this documentation and made the table to clarify it, in case it could be useful.
Ah, I see! I was a bit confused by the table at first @columbarius. The intention is to abide by the existing GNU spec, and avoid diverging from it too much.
Closing this as #4101 was merged! 🎉