The Data Curation Tool adds new metadata elements to the DDI XML that Dataverse produces. The variable-level elements universe and qstn (and qstn's sub-elements, such as the literal question, interviewer instructions, and post-question text) appear in this example:
<var ID="v12560" name="var1" intrvl="discrete">
  <location fileid="f9842"/>
  <labl level="variable">var1</labl>
  <universe>This is the universe.</universe>
  <sumStat type="mean">5.5</sumStat>
  <varFormat type="numeric"/>
  <notes subject="Universal Numeric Fingerprint" level="variable" type="Dataverse:UNF">UNF:6:8OuUohmPouAgl2L3IrfN2A==</notes>
  <notes>
    <![CDATA[ These are the notes for var1, v12560. ]]>
  </notes>
  <qstn>
    <qstnLit>Is this a literal question?</qstnLit>
    <ivuInstr>These are instructions.</ivuInstr>
    <postQTxt>Is this the post question?</postQTxt>
  </qstn>
</var>
The DDI schema definition (https://ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd) says that the <qstn> element and its sub-elements are out of order here: <qstn> should come before <universe>.

So it should be:
<var ID="v12560" name="var1" intrvl="discrete">
  <location fileid="f9842"/>
  <labl level="variable">var1</labl>
  <qstn>
    <qstnLit>Is this a literal question?</qstnLit>
    <ivuInstr>These are instructions.</ivuInstr>
    <postQTxt>Is this the post question?</postQTxt>
  </qstn>
  <universe>This is the universe.</universe>
  <sumStat type="mean">5.5</sumStat>
  <varFormat type="numeric"/>
  <notes subject="Universal Numeric Fingerprint" level="variable" type="Dataverse:UNF">UNF:6:8OuUohmPouAgl2L3IrfN2A==</notes>
  <notes>
    <![CDATA[ These are the notes for var1, v12560. ]]>
  </notes>
</var>
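If it helps while testing, the expected ordering can be checked programmatically. Here's a minimal sketch (not part of Dataverse; it ignores XML namespaces and covers only a simplified subset of the schema's sequence) that flags children of `<var>` appearing out of schema order:

```python
import xml.etree.ElementTree as ET

# Simplified subset of the sequence the DDI 2.5 schema defines for <var>:
# <qstn> must precede <universe>.
DDI_VAR_ORDER = ["location", "labl", "qstn", "universe", "sumStat",
                 "varFormat", "notes"]

def check_var_child_order(var_xml: str) -> list:
    """Return the child tags of <var> that appear out of schema order."""
    var = ET.fromstring(var_xml)
    last_index = -1
    out_of_order = []
    for child in var:
        if child.tag not in DDI_VAR_ORDER:
            continue  # skip elements this simplified list doesn't cover
        idx = DDI_VAR_ORDER.index(child.tag)
        if idx < last_index:
            out_of_order.append(child.tag)
        last_index = max(last_index, idx)
    return out_of_order

bad = '<var><labl>v</labl><universe>U</universe><qstn><qstnLit>Q?</qstnLit></qstn></var>'
good = '<var><labl>v</labl><qstn><qstnLit>Q?</qstnLit></qstn><universe>U</universe></var>'
print(check_var_child_order(bad))   # <qstn> comes after <universe>, so it's flagged
print(check_var_child_order(good))  # schema order, so nothing is flagged
```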
@lubitchv, do you think this validation problem could create interoperability issues with other systems that expect variable metadata to follow the DDI schema (for example, problems importing or exporting metadata with tools like Colectica)? As with other DDI export validation issues, I'm not sure how severe this one is; it's the first I've seen for the variable-level metadata. Please feel free to close this GitHub issue if it's minor. :)
@jggautier Thanks for spotting this validation problem. I do not know whether it will create issues with other tools; I have not tested it with them. But this issue (the order of fields in the XML) should not be difficult to fix, and I will definitely look into it.
Thanks @lubitchv !
@jggautier What tools do you use for validating XML? I want to fix this issue.
Hi @lubitchv. I use a website at https://www.freeformatter.com/xml-validator-xsd.html for validating XML against schemas. Not sure if it's the best tool, but it's free. Please let me know if you have any other questions.
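A local alternative, if you prefer the command line: libxml2's xmllint can validate a file against an XSD. This sketch uses a toy schema and document just to show the mechanism; for the real check, point `--schema` at codebook.xsd and the file argument at the exported DDI XML:

```shell
# Toy schema: a <var> whose children must appear in the order qstn, then universe.
cat > demo.xsd <<'EOF'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="var">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="qstn" type="xs:string" minOccurs="0"/>
        <xs:element name="universe" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
EOF
# Toy document in schema order; swap the two elements to see a validation error.
cat > demo.xml <<'EOF'
<var><qstn>Is this a literal question?</qstn><universe>This is the universe.</universe></var>
EOF
xmllint --noout --schema demo.xsd demo.xml && echo "valid"
```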
Thanks @jggautier. I tried to see how it works on Dataverse DDI XML, so I created a dataset on the demo server: https://demo.dataverse.org/dataset.xhtml?persistentId=doi%3A10.70122%2FFK2%2F0QMKVB
I downloaded the DDI metadata and tried to validate it using that tool, and I get the error "White Spaces Are Required Between PublicId And SystemId., Line '1', Column '50'." I am probably doing something wrong.
That error is because the exported XML is still pointing to the wrong DDI schema URL, so you could change:
<codeBook xmlns="ddi:codebook:2_5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" version="2.5">
to
<codeBook xmlns="ddi:codebook:2_5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:codebook:2_5 https://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" version="2.5">
(This is fixed in https://github.com/IQSS/dataverse/issues/6553 for Dataverse 4.20.)
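Until you're running a build with that fix, one workaround is to rewrite the schemaLocation in the already-exported file before validating. A sketch (shown on an inline string for brevity; for a real export, read the file, do the replacement, and write it back):

```python
# The codeBook header as produced by pre-4.20 Dataverse (element truncated here).
header = ('<codeBook xmlns="ddi:codebook:2_5" '
          'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
          'xsi:schemaLocation="ddi:codebook:2_5 '
          'http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" '
          'version="2.5">')

# Point the schemaLocation at the https URL so a validator can fetch the XSD.
# Only the ddialliance.org URL matches; the w3.org namespace URI is untouched.
fixed = header.replace("http://www.ddialliance.org", "https://www.ddialliance.org")
print(fixed)
```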
Once the right schema location is being referenced, you'll see a bunch of other errors as well, many of which are also fixed in 4.20 (https://github.com/IQSS/dataverse/issues/6650).
Would it be easier to use the unreleased 4.20 version of Dataverse to export the DDI and test with that, so that the validation errors related to the variable metadata stand out?
You are right, I should use the current develop branch that contains all these fixes. So I tried the new develop branch and got a bunch of errors, such as:
Cvc-enumeration-valid: Value 'DVN' Is Not Facet-valid With Respect To Enumeration '[archive, Producer]'. It Must Be A Value From The Enumeration., Line '1', Column '446'.
Cvc-attribute.3: The Value 'DVN' Of Attribute 'source' On Element 'verStmt' Is Not Valid With Respect To Its Type, '#AnonType_sourceGLOBALS'., Line '1', Column '446'.
I guess I should work through them to see what is going on.
Yes, that's one of the errors that wasn't fixed in https://github.com/IQSS/dataverse/issues/6650 because it didn't affect that issue's main goal of improving DDI XML importing and exporting. I called those errors "noise" for this issue because they don't relate to the variable-level metadata, so I hoped you could ignore them, unless you'd really like to fix them, which would be great but, I think, difficult.
In another GitHub issue I found and suggested fixes for all of these errors: https://github.com/IQSS/dataverse/issues/3648#issuecomment-592746241. There's a Google Doc describing the trickier issues and example valid DDI XML files.
Edit: I forgot to mention that the Google doc doesn't include the fixes made in Dataverse 4.20. I planned to update it sometime after 4.20 is released on Demo Dataverse. But hopefully it and the example DDI XML files are helpful places to start...
Related DDI XML improvements are in PR #7094: 3648 Change "DVN" to default value "producer" and only write the firs…