The following problem does not occur in Pandoc 2.2.1 but occurs in every version from Pandoc 2.2.2 onward.
test.yml contains:
---
reason: 'Was geht?
'
---
This is how libyaml, as embedded in Ruby, emits strings with a trailing newline ("\n"). The file can be produced with this Ruby command:
ruby -r yaml -e 'puts Hash({"reason" => "Was geht?\n"}).to_yaml + "---"'
In my actual setup, I generate metadata and use it in a document. For this minimal example, I just output the metadata in JSON AST format.
pandoc test.yml -t json
The output with pandoc 2.2.1 is:
pandoc-2.2.1/bin/pandoc test.yml -t json
{"blocks":[],"pandoc-api-version":[1,17,4,2],"meta":{"reason":{"t":"MetaBlocks","c":[{"t":"Plain","c":[{"t":"Str","c":"Was"},{"t":"Space"},{"t":"Str","c":"geht?"}]}]}}}
Pandoc 2.2.2 and higher gives a different output:
pandoc-2.2.2/bin/pandoc test.yml -t json
[WARNING] Could not parse YAML metadata at line 1 column 1: :2:18: Unexpected '
'
{"blocks":[],"pandoc-api-version":[1,17,5,1],"meta":{}}
As you can see, the metadata is empty.
The cause is almost certainly the switch to the HsYAML dependency by @hvr, whom I kindly ask to help determine whether the syntax in test.yml is actually supported.
HsYAML claims to comply strictly with YAML 1.2, so the
first thing you should do is check whether your sample
conforms to that spec. It's possible that it does
not. If it does, then you should report a bug to HsYAML.
If not, then I don't consider this a bug at all.
Testing directly with HsYAML:
Data.YAML> decodeNode' failsafeSchemaResolver False False (fromStringLazy "foo: 'hi\n'")
Left ":1:8: Unexpected '\n'"
Data.YAML> decodeNode' failsafeSchemaResolver False False (fromStringLazy "foo: 'hi\n '")
Right [Doc (Mapping Nothing (fromList [(Scalar (SUnknown Nothing "foo"),Scalar (SStr "hi "))]))]
Data.YAML> decodeNode' failsafeSchemaResolver False False (fromStringLazy "'hi\n'")
Right [Doc (Scalar (SStr "hi "))]
So Ruby's YAML library is built on the C library libyaml, which does not support YAML 1.2 yet.
Upstream Bug report: https://github.com/yaml/libyaml/issues/20
I could not find out whether my test file is YAML 1.2 compliant.
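To see why the Ruby side never notices the problem, here is a minimal sketch that feeds the two inputs from the HsYAML test above to Ruby's bundled Psych (libyaml) loader. It assumes a stock Ruby with the standard `yaml` library; the variable names are only for illustration.

```ruby
require "yaml"

# Same shape as the HsYAML test: the closing quote sits at column 0,
# so the continuation line is not indented relative to the "foo" key.
unindented = "foo: 'hi\n'"

# The variant HsYAML accepts: continuation line indented by one space.
indented = "foo: 'hi\n '"

# libyaml is lenient here and loads both inputs. A lone line break inside
# a single-quoted scalar is folded into a space in either case.
p YAML.load(unindented)
p YAML.load(indented)
```

Because libyaml reads back what it writes, a round trip through Ruby alone will not reveal that a stricter YAML 1.2 parser rejects the unindented form.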
I don't think there's much more we can do about this on the pandoc side. If you find there's a bug in HsYAML, you should report there.
I'm pretty confident that
- 'Was geht?
'
or
reason: 'Was geht?
'
are in fact not valid YAML 1.2
If you look at section 7.3.2. Single-Quoted Style, you'll notice that the rules
[123] nb-ns-single-in-line ::= ( s-white* ns-single-char )*
[124] s-single-next-line(n) ::= s-flow-folded(n) ( ns-single-char nb-ns-single-in-line ( s-single-next-line(n) | s-white* ) )?
[125] nb-single-multi-line(n) ::= nb-ns-single-in-line ( s-single-next-line(n) | s-white* )
The multi-line rules ([124] and [125]) take an n parameter, which tracks the relative indentation level and encodes the general rule that a node must be indented at least one level deeper than the block node that contains it. In particular, the s-flow-folded(n) production requires n spaces of leading indentation before the non-space content of each continuation line.
So if, for example, the - (YAML sequence indicator) is at n = 0, the single-quoted scalar inside that block collection must be at level n = 1 or deeper.
PS: As it turns out, there's a negative test in the YAML testsuite at http://matrix.yaml.io/sheet/invalid.html#QB6E which expects a compliant YAML parser to fail on
---
quoted: "a
b
c"
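The indentation rule described above can be approximated with a small check. The following is a rough sketch, not a YAML parser: it only looks at quoted scalars that open on the same line as a mapping key or a `-` indicator, and it flags continuation lines that are not indented deeper than that line. The helper names `closes?` and `underindented_quoted_lines` are hypothetical.

```ruby
# True if `rest` contains an unescaped closing quote of the given kind.
# Single-quoted scalars escape a quote as '', double-quoted ones use a backslash.
def closes?(rest, quote)
  stripped = quote == "'" ? rest.gsub("''", "") : rest.gsub(/\\./, "")
  stripped.include?(quote)
end

# Returns the 1-based numbers of continuation lines inside a quoted scalar
# that are not indented deeper than the line that opened the scalar.
# Heuristic only; real YAML has many more cases than this handles.
def underindented_quoted_lines(yaml)
  problems = []
  open_quote = nil   # quote character of a scalar that is still open, or nil
  parent_indent = 0  # indentation of the key or "-" that opened the scalar

  yaml.each_line.with_index(1) do |line, number|
    text = line.chomp
    if open_quote
      indent = text[/\A */].size
      # Blank lines are fine; anything else must be indented past the parent.
      problems << number if !text.strip.empty? && indent <= parent_indent
      open_quote = nil if closes?(text, open_quote)
    elsif (m = text.match(/\A( *)(?:- |[^\s'"#][^'"]*: )(['"])(.*)\z/))
      parent_indent = m[1].size
      open_quote = m[2] unless closes?(m[3], m[2])
    end
  end
  problems
end

p underindented_quoted_lines("---\nquoted: \"a\nb\nc\"\n")
p underindented_quoted_lines("---\nreason: 'Was geht?\n'\n---\n")
```

A non-empty result only hints that a strict parser such as HsYAML may reject the file; the parser itself remains the authority.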
Thanks for telling us @hvr.
I am reporting back here for anyone running into similar issues. I used the YAML 1.2 compliant library ruamel.yaml to find a YAML 1.2 compliant fix for the example metadata file. One solution (there may be others) is:
---
reason: "Was geht?\n"
---
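What is different? The double-quoted form keeps the scalar on a single line and writes the trailing newline as the escape sequence \n, so there is no unindented continuation line left for the parser to reject.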
My solution is to produce the file with Ruby and then patch this one problem afterwards with a regular expression. Of course, if the YAML file uses other features, other problems may occur that need the same kind of treatment. So I hope that in the long run a YAML 1.2 compliant Ruby library becomes available.
# fix YAML 1.2 compatibility for pandoc > 2.2.1, see https://stackoverflow.com/a/30049447/1407622
sed -r -z -i "s/: '([^']+)\n\n'/: \"\1\\\n\"/g" test.yml
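As an alternative to the regex post-processing, the same effect can probably be had inside Ruby by re-emitting the document through Psych's node tree and forcing double-quoted style on scalars that contain a line break. This is a sketch under assumptions: it relies on the Psych shipped with Ruby, the helper name `double_quote_multiline` is hypothetical, and it has not been checked against pandoc itself.

```ruby
require "yaml"

# Re-emit a YAML document so that every scalar containing a line break is
# written in double-quoted style, where the break becomes the escape \n.
def double_quote_multiline(yaml)
  ast = Psych.parse_stream(yaml)
  ast.grep(Psych::Nodes::Scalar).each do |node|
    next unless node.value.include?("\n")
    node.plain  = false
    node.quoted = true
    node.style  = Psych::Nodes::Scalar::DOUBLE_QUOTED
  end
  ast.yaml
end

data  = { "reason" => "Was geht?\n" }
fixed = double_quote_multiline(data.to_yaml)
puts fixed
```

Unlike the sed expression, this does not depend on the exact text layout libyaml happens to produce, but it only addresses this one incompatibility between libyaml output and YAML 1.2.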