For quotes that span multiple paragraphs in English, it's conventional to put an opening quotation mark at the start of each paragraph, and only put a closing quotation mark at the end of the final paragraph.
Pandoc's smart extension only converts the opening and closing quotation marks of the final paragraph; the opening mark of each earlier paragraph is left straight.
echo "\"This is a quote. \n\n \"It spans paragraphs.\"" | pandoc --from=markdown+smart --ascii
--ascii isn't necessary, but it makes the difference in output more obvious.
Expected output:
<p>“This is a quote.</p>
<p>“It spans paragraphs.”</p>
Actual output:
<p>"This is a quote.</p>
<p>“It spans paragraphs.”</p>
pandoc 2.10.1
Compiled with pandoc-types 1.21, texmath 0.12.0.2, skylighting 0.8.5
The multi-paragraph thing is kind of a red herring: you can replicate this in a single line with an unbalanced pairing. Single quotes get smart treatment without being matched, but double quote marks go all screwy if not matched.
In addition to the matching problem, there is also a weird edge case with spaces. Note how this (arguably wrong but technically valid) input gets changed to drop a space:
$ echo "It \"spans \"paragraphs." | pandoc --from=markdown+smart --ascii
<p>It “spans”paragraphs.</p>
For what it's worth, here is a small Lua filter to get the expected result.
-- Replace a straight double quote at the start of a paragraph with an opening curly quote.
function Para (p)
  local first = p.content[1]
  if first and first.t == 'Str' and first.text:sub(1, 1) == '"' then
    p.content[1] = pandoc.Str('“' .. first.text:sub(2))
    return p
  end
end
@tarleb Thanks! That script seems to work perfectly for my purposes.
It would still be nice if pandoc were able to support this out of the box. Maybe an optional flag that causes all " and ' to be converted, regardless of whether they have a corresponding opening/closing quotation mark in the same paragraph? Whether they turn into ” or “ could depend on whether whitespace comes before or after them.
If it’s not feasible to add, perhaps the documentation should at least explain this limitation and its workarounds. I suspect that my situation isn’t a unique edge case.
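To make the whitespace idea above concrete, here is a rough sketch of that heuristic as a plain-text preprocessing step. This is not how pandoc works and not an existing pandoc flag; the function name smarten is made up for illustration, and the rule (a quote at the start of a line or after whitespace opens, any other quote closes) is only the assumption described in this comment. It will get cases like a quote right after an opening parenthesis wrong.

```shell
# Hypothetical helper: curl every straight quote based only on what precedes it.
# Assumption: a quote at line start or after whitespace is an opening quote;
# every remaining quote is a closing quote (or an apostrophe).
smarten() {
  sed -E \
    -e 's/(^|[[:space:]])"/\1“/g' \
    -e 's/"/”/g' \
    -e "s/(^|[[:space:]])'/\1‘/g" \
    -e "s/'/’/g"
}
```

For example, printf '%s\n' '"This is a quote.' | smarten prints “This is a quote. even though the line has no closing mark.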
The commonmark reader is smarter about this:
% echo "\"This is a quote. \n\n \"It spans paragraphs.\"" | pandoc -f commonmark+smart --ascii
<p>“This is a quote.</p>
<p>“It spans paragraphs.”</p>
Since we'll eventually be transitioning to that for the default markdown reader, I'm going to close this.