Pandoc: With --listings, use verbatim for code blocks without a language attribute

Created on 1 Feb 2016 · 8 Comments · Source: jgm/pandoc

Currently, when --listings is used, all code blocks are translated to LaTeX as lstlisting environments, even if no language attribute is specified. In my opinion, it would make more sense, in the latter case, to use verbatim instead. In other words, the following:

```
some text
```

should be translated into

\begin{verbatim}
some text
\end{verbatim}

and not

\begin{lstlisting}
some text
\end{lstlisting}

Btw, this is how MultiMarkdown does it.

lifepillar

All 8 comments

Just to clarify why I think that it would be better: with the current translation, if I want some text to be typeset as if verbatim were used, I have to define a language along these lines:

\lstdefinelanguage{text}{%
  basicstyle=\ttfamily,%
  columns=fullflexible,%
  extendedchars=true,%
  inputencoding=utf8,%
  showstringspaces=false%
}

then write something like this in my Markdown document:

\lstset{language=text}

```
some text
```

I need to pollute my document with \lstset because text is not a recognized language (this is another pain point I have, but I have seen that it has been discussed several times in the past).

lifepillar on 1 Feb 2016

👍1

I'd be interested in hearing from other users of --listings on this.

jgm on 10 Feb 2016

I often write papers using code blocks in different languages.
The method I settled on is to set a default style like this:

\lstset
    { language=C
    , basicstyle=\fontsize{8}{10}\fontencoding{T1}\ttfamily
    , keepspaces=true
    , showspaces=false
    , showstringspaces=false
    , breaklines=true
    , frame=tb
    }

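This style applies to all code blocks that do not supply a language attribute. For any other language, I supply a language attribute.

By setting the default style, there is no need for \lstset before each code block. For code blocks with an explicit language attribute the correct style will be applied and then it will switch back to the default style.

By using listings for all code blocks / verbatim code, I get consistent styles and syntax highlighting, even in inline code blocks.

If you are not satisfied with a default style, it is possible to use a filter to apply an "implicit" style to all code blocks that do not have an explicit attribute.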

cc @emilaxelsson, @josefs

emwap on 10 Feb 2016

❤3

For example, here is a small filter that will turn code blocks without language attributes into verbatim environments.

#!/usr/bin/env runhaskell
import Text.Pandoc
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter verbatim

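-- Apply the inline and block rewrites throughout the document.
verbatim :: Block -> Block
verbatim = bottomUp verbatimInline . bottomUp verbatimBlock

-- Wrap the code in a LaTeX verbatim environment.
mkVerbatim :: String -> String
mkVerbatim s = unlines ["\\begin{verbatim}", s, "\\end{verbatim}"]

-- Only code blocks with no classes (and hence no language) are rewritten.
verbatimBlock :: Block -> Block
verbatimBlock (CodeBlock attr@(_,[],_) code) = Div attr [RawBlock (Format "latex") $ mkVerbatim code]
verbatimBlock x = x

-- \verb takes its delimiter from the next character, so use a matching pair
-- such as |...| rather than {...}; the code must not contain the delimiter.
verbatimInline :: Inline -> Inline
verbatimInline (Code attr@(_,[],_) code) = Span attr [RawInline (Format "latex") $ concat ["\\verb|", code, "|"]]
verbatimInline x = x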

Save it to a file verbatim.hs and add --filter verbatim.hs to your pandoc command line.

The filter keeps the original attributes by wrapping the raw LaTeX in Divs and Spans, so that labels and similar things still work.
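
The "implicit style" variant mentioned earlier can be sketched as a filter too. This is a minimal sketch, not code from the thread: it adds a style=plain key-value attribute to every code block that has no classes, on the assumption that pandoc's LaTeX writer forwards key-value attributes to lstlisting as options when --listings is used (worth checking against your pandoc version). The names addStyle and plain are hypothetical, and a matching \lstdefinestyle{plain}{...} would have to exist in the preamble.

```haskell
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch: tag code blocks that carry no language class with a listings style.
-- Assumption: with --listings, key-value attributes end up as lstlisting options.
import Text.Pandoc.JSON

-- | Add style=plain (a hypothetical style name) to class-less code blocks,
-- unless the block already sets a style itself.
addStyle :: Block -> Block
addStyle (CodeBlock (ident, [], kvs) code)
  | "style" `notElem` map fst kvs =
      CodeBlock (ident, [], ("style", "plain") : kvs) code
addStyle x = x

main :: IO ()
main = toJSONFilter addStyle
```

Saved under a name of your choice, it is passed to pandoc with --filter in the same way as the filter above. Blocks that already carry a language class or their own style attribute are left untouched.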

emwap on 10 Feb 2016

On 10 February 2016 at 06:03, John MacFarlane wrote:

I'd be interested in hearing from other users of --listings on this.

Using the lstlisting environment everywhere has the advantage that
everything looks the same (line numbers, line wrapping, etc.). Would
a different environment look the same? Would it support everything
that the listings package supports?

wilx on 10 Feb 2016

This seems an excellent solution, more flexible
than using verbatim environments for code without a
language. So I propose that this issue be closed.

+++ Anders Persson [Feb 10 16 05:37 ]:

I often write papers using code blocks in different languages.
The method I settled on is to set a default style like this:

\lstset
    { language=C
    , basicstyle=\fontsize{8}{10}\fontencoding{T1}\ttfamily
    , keepspaces=true
    , showspaces=false
    , showstringspaces=false
    , breaklines=true
    , frame=tb
    }

This style applies to all code blocks that do not supply a language
attribute. For any other languages I supply a language attribute.

By setting the default style, there is no need for \lstset before each
code block. For code blocks with an explicit language attribute the
correct style will be applied and then it will switch back to the
default style.

By using listings for all code blocks / verbatim code, I get consistent
styles and syntax highlighting, even in inline code blocks.

If you are not satisfied with a default style, it is possible to use a
filter to apply an "implicit" style to all code blocks that do not
have an explicit attribute.

jgm on 10 Feb 2016

Ok for me. I'll use one of the methods described by @emwap.

lifepillar on 10 Feb 2016

Hopefully this will help someone who comes across this issue, but @emwap's code doesn't work with at least Pandoc 2.9.2, since Pandoc types now seem to require Data.Text types instead of Strings.

The following modifications work for me:

#!/usr/bin/env runhaskell

import qualified Data.Text as Text
import Text.Pandoc
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter verbatim

verbatim :: Block -> Block
verbatim = bottomUp verbatimInline . bottomUp verbatimBlock

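-- Verbatim (capital V) is provided by the fancyvrb package, which must be loaded in the LaTeX preamble.
mkVerbatim :: Text.Text -> Text.Text
mkVerbatim s = Text.unlines [Text.pack "\\begin{Verbatim}[fontsize=\\small]", s, Text.pack "\\end{Verbatim}"]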

verbatimBlock :: Block -> Block
verbatimBlock (CodeBlock attr@(_,[],_) code) = Div attr [RawBlock (Format . Text.pack $ "latex") $ mkVerbatim code]
verbatimBlock x = x

verbatimInline :: Inline -> Inline
verbatimInline (Code attr@(_,[],_) code) = Span attr [RawInline (Format . Text.pack $ "latex") $ Text.concat [Text.pack("\\verb|"), code, Text.pack("|")]]
verbatimInline x = x
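
One caveat with the \verb|...| form used above: \verb stops at the first repeated delimiter, so inline code that itself contains | would be cut short. Below is a minimal sketch of one way around this, assuming a small fixed set of candidate delimiters. The names verbDelimiters, pickDelimiter, and mkVerb are hypothetical, and the sketch uses plain String to stay dependency-free, so a filter against the Text-based Pandoc types would need Text.unpack and Text.pack around it.

```haskell
import Data.List (find)

-- | Candidate delimiters for \verb, tried in order.
-- '*' is deliberately left out because \verb* is a separate command.
verbDelimiters :: String
verbDelimiters = "|!+=/@"

-- | Pick the first candidate that does not occur in the code.
pickDelimiter :: String -> Maybe Char
pickDelimiter code = find (`notElem` code) verbDelimiters

-- | Render inline code as \verb<d>code<d>.
-- Returns Nothing when every candidate delimiter occurs in the code,
-- so the caller can fall back to leaving the Code element unchanged.
mkVerb :: String -> Maybe String
mkVerb code = do
  d <- pickDelimiter code
  pure (concat ["\\verb", [d], code, [d]])

main :: IO ()
main = mapM_ (print . mkVerb) ["x", "a|b"]
```

Returning Maybe rather than picking an arbitrary delimiter keeps the failure case explicit: if no safe delimiter exists, the inline code can simply be left for pandoc to render as usual.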