Ideally, this would support users who use Dataverse in a different language, or who enter metadata in a different language and would like that language to be tracked independently of the language of the data or software.
From Julian: From what I can tell so far, the DataCite 3.1 schema lets you specify the language of Title, Subject and Description with the xml:lang attribute (4.1 adds the xml:lang attribute to Rights) - https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf. The schema says it accepts only IETF BCP 47 and ISO 639-1 language codes. But I don't think Dataverse knows the ISO language codes for the languages it displays in the Citation block (I vaguely remember a comment about this in a GitHub issue or maybe a Google Group post, but I can't find it). The Consorcio Madroño Dataverse does this with the DataCite metadata they publish for each dataset. Here's an example: https://edatos.consorciomadrono.es/api/datasets/export?exporter=oai_datacite&persistentId=doi%3A10.21950/O53TLR
And most or all of the DDI elements that Dataverse uses can include a lang attribute (http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/field_level_documentation_files/schemas/xml_xsd/attributes/lang.html). Looks like it accepts any value for now.
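To make the xml:lang idea concrete, here is a minimal Python sketch of how language-tagged titles could be serialized. It is not Dataverse's exporter: the `build_titles` helper and the sample titles are made up for illustration, and the fragment leaves out the DataCite namespace and the other elements a real record requires.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_titles(titles):
    """Serialize (text, lang) pairs as a <titles> fragment.

    `lang` is a language code such as "es" or "en"; pass None to emit
    a title without an xml:lang attribute. Hypothetical helper, not
    part of Dataverse or of any DataCite library.
    """
    root = ET.Element("titles")
    for text, lang in titles:
        title = ET.SubElement(root, "title")
        title.text = text
        if lang:
            # ElementTree writes this attribute out as xml:lang="...".
            title.set(XML_LANG, lang)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(build_titles([
        ("Encuesta de hogares", "es"),
        ("Household survey", "en"),
    ]))
```

The same approach would presumably carry over to the DDI lang attribute, with the caveat that DDI currently accepts any value while DataCite restricts the codes.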
See the related ticket https://github.com/IQSS/dataverse/issues/4633 about adding additional languages/translations for the title, subject, and abstract fields in the citation block.
Thanks Amber! In our emails I was very focused on how to include this new language information in the metadata standards Dataverse uses now. A few other things to consider:
Maybe a "default language for input" variable could be added to the account information for the user.
When specified in the account profile, this language could be pre-selected, when entering metadata within the Citation metadata block, from drop-down menus displayed alongside of those target fields only (Title, Description, Subject, Keyword, Notes).
The user could either change the value, if necessary, or add additional metadata in other languages.
No value would be pre-selected (default choice for drop-down) if default language for input wasn't specified in the user profile.
(I guess this means a value_lang column should be added to the datasetfieldvalue table with ISO 639-1/2/3(?) value)
(example from DSpace interface: https://jira.duraspace.org/secure/attachment/17500/language-tag.png )
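The pre-selection rule proposed above could be sketched roughly as follows. This is only an illustration in Python, not Dataverse code: `FieldValue`, `preselected_language`, and `new_field_value` are hypothetical names, `value_lang` mirrors the column suggested above, and the field names are the display names from the comment rather than actual database identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Fields that would get a language drop-down, per the proposal above.
LANG_TAGGED_FIELDS = {"Title", "Description", "Subject", "Keyword", "Notes"}


@dataclass
class FieldValue:
    field: str
    value: str
    # Hypothetical counterpart of a value_lang column on datasetfieldvalue.
    value_lang: Optional[str] = None


def preselected_language(field: str, user_default: Optional[str]) -> Optional[str]:
    """Return the language to pre-select in the drop-down, or None.

    None means "no value pre-selected": either the field is not one of
    the target fields, or the user profile has no default input language.
    """
    if field not in LANG_TAGGED_FIELDS:
        return None
    return user_default


def new_field_value(field: str, value: str,
                    user_default: Optional[str] = None,
                    chosen: Optional[str] = None) -> FieldValue:
    """Build a value, letting an explicit user choice override the default."""
    if field not in LANG_TAGGED_FIELDS:
        return FieldValue(field, value, None)
    lang = chosen if chosen is not None else preselected_language(field, user_default)
    return FieldValue(field, value, lang)
```

Adding metadata in other languages would then amount to creating further `FieldValue` entries for the same field with a different `value_lang`.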
I second this request, as it is now even more relevant than it used to be for a great number of DDI producers, namely members of the Consortium of European Social Science Data Archives (CESSDA). The CESSDA Metadata Management (CMM) working group produced guidelines for harmonizing metadata produced by CESSDA members, the Core Metadata Model (20191115_Core_Metadata_Model_v1_0.pdf), and specifying the language of the content of various metadata fields is mandatory in this DTD.