Sphinx: Parsed-literals don't wrap very long lines with pdf builder

Created on 6 Jan 2017 · 8 Comments · Source: sphinx-doc/sphinx

Problem

I'm using the parsed-literal directive to display code blocks, and very long lines don't wrap the way they would in a code directive. [For context: I have to use parsed-literal because there are quite a few substitutions in each listing, e.g. |apiurl|.]

It seems this problem may be related to #3306, #2167, and #2343.
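
For readers unfamiliar with this setup, a substitution such as |apiurl| could be defined project-wide in conf.py through Sphinx's rst_epilog setting. This is only a sketch: the URL is a made-up placeholder, and the reporter's actual definitions are not shown in the issue.

```python
# conf.py (sketch): define a substitution that is available in every source file.
# The URL below is a placeholder, not taken from the issue.
rst_epilog = """
.. |apiurl| replace:: https://api.example.com/v1
"""
```

Substitutions are expanded inside parsed-literal but not inside a literal code block, which is why switching directives is not an option here.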

Environment info

  • OS: macOS 10.12.1
  • Python version: 2.7.10
  • Sphinx version: 1.5.1
  • pdflatex version: pdfTeX 3.14159265-2.6-1.40.17 (TeX Live 2016)
Labels: bug, latex

All 8 comments

I am not sure we should call this a Sphinx bug. It is rather in the nature of TeX to make line wrapping very difficult (except at spaces): TeX was designed for wrapping lines of natural language. It takes a lot of effort on the user's part to get the kind of line wrapping that Sphinx currently achieves in code-blocks (for example, the listings package needs to retokenize the TeX source entirely, character by character).

@spencer-rig, can you try adding this to your conf.py?

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
    'preamble': r'''
\makeatletter
\newcommand*\allowwrappedlinesinparsedliteral{%
  \def\do##1##2%
     {\def##1{\discretionary{}{\sphinxafterbreak\char`##2}{\char`##2}}}%
  \do\{\{\do\textless\<\do\#\#\do\%\%\do\$\$% {, <, #, %, $
  \def\do##1##2%
     {\def##1{\discretionary{\char`##2}{\sphinxafterbreak}{\char`##2}}}%
  \do\_\_\do\}\}\do\textasciicircum\^\do\&\&% _, }, ^, &,
  \do\textgreater\>\do\textasciitilde\~% >, ~
  \def\do##1{\lccode`\~`##1%
    \lowercase{\def~}{\discretionary{\char`##1}{\sphinxafterbreak}{\char`##1}}%
    \catcode`##1\active}%
  \do\-\do\.\do\,\do\;\do\?\do\!\do\/%
  \lccode`~32 \lowercase{\def~}{%
       \nobreak\hskip\z@ plus\fontdimen3\font minus\fontdimen4\font
       \discretionary{\copy\sphinxvisiblespacebox}{\sphinxafterbreak}
                     {\kern\fontdimen2\font}%
       }\lccode`\~`\~%
}%
\let\originalalltt\alltt
\let\originalendalltt\endalltt
\renewenvironment{alltt}
{\let\originalurl\url\def\url##1{\originalurl{\detokenize{##1}}}%
 \originalalltt\sbox\sphinxcontinuationbox {\spx@opt@verbatimcontinued}%
               \sbox\sphinxvisiblespacebox {\spx@opt@verbatimvisiblespace}%
 \allowwrappedlinesinparsedliteral}
{\originalendalltt}
\makeatother
'''
}

edit: I removed \let\url\relax which was a debugging left-over in the \renewenvironment{alltt} code.

I needed to patch \url, which Sphinx inserts when it finds a URL in the parsed-literal contents. It already suffers from issue #3317, which would only get worse in the present context, where more characters are made active: - . , ; ? ! /

While working on this I realized that Sphinx's parsed-literal transforms " (the double quote) into two single quotes ''. This is unfortunate for literal HTML code.

The \def\url##1{\originalurl{\detokenize{##1}}} trick does not work for characters like _, which is output as \_ in the TeX source: with the \detokenize we end up with \_ in the PDF.

The trick is needed for active characters. I need to think about it, but currently the best option I see is this: when Sphinx inserts \url (inside the contents of a parsed-literal directive), it should turn off the LaTeX escapes, because \url does not need them, but still wrap the argument in \detokenize to work around issues with active characters.

Edit: PR #3340 has a better approach.
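
The escaping problem described in this comment can be illustrated with a toy model in Python. The functions below are hypothetical and are not Sphinx's LaTeX writer. escape_tex mimics only the single fact stated above, that _ is written as \_ in the TeX source.

```python
# Toy model, not Sphinx's actual LaTeX writer: all names here are hypothetical.

def escape_tex(text):
    """Mimic the writer's escaping of '_' as '\\_' (other characters ignored)."""
    return text.replace('_', r'\_')


def url_with_escapes(uri):
    """Wrap already-escaped text in \\detokenize: the backslash becomes literal."""
    return r'\url{\detokenize{' + escape_tex(uri) + '}}'


def url_without_escapes(uri):
    """The idea from the comment: skip the escapes, keep \\detokenize."""
    return r'\url{\detokenize{' + uri + '}}'


uri = 'http://example.com/some_page'
print(url_with_escapes(uri))     # \url{\detokenize{http://example.com/some\_page}}
print(url_without_escapes(uri))  # \url{\detokenize{http://example.com/some_page}}
```

In the first form the backslash would show up in the PDF, as the comment describes. The second form is the idea floated here. It is shown only to illustrate the problem, since PR #3340 is said to take a better approach.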

@tk0miya I don't have a clear view of the kind of LaTeX mark-up that can end up in the LaTeX writer's output for a parsed-literal directive (the docutils description of the parsed-literal directive does not make this clear to me). I see we can have styling mark-up (italic, boldface, ...), and I also know that the \url macro can show up there. Apart from \url, I am not sure whether other LaTeX macros can end up there that would be broken or altered by the active characters - . , ; ? ! /. I can concentrate on the \url macro and make a PR sometime this week. Do you feel an option should be added to make line wrapping customizable?

I have thought a bit more about the \url issue and put together PR #3340.

@spencer-rig: if you have a big project with lots of uses of parsed-literal, could you try out this PR and check that it doesn't create problems?

Indeed, parsed-literal is a very complex element because users can customize its contents.
Any inline element can be packed into a parsed-literal through mark-up and substitutions.

.. parsed-literal::

   We can use *emphasis*, **strong** and ``literal`` notation.
   Also able to use urls, reference_, :ref:`roles`, footnote refs [1]_ and so on

   With substitution, we can also use |images|, |raws|, |maths| and etc.

I feel it's almost programmable. So I think it is hard to convert it to LaTeX code perfectly.

@tk0miya thanks! Based on this, while investigating at https://github.com/sphinx-doc/sphinx/pull/3340#issuecomment-272658383, I found that the current (1.5.1) stable branch has issues with footnotes and with inline math in parsed-literal. There is also #3317.

It would be possible to extract the fixes for these problems from PR #3340 into an independent PR, which would be easier to evaluate because it would not introduce new things like active characters.

I have now merged PR #3340 for the 1.5.2 release. I will close this, but please re-open in case of problems. I don't have a big enough database of parsed-literal use cases.

Thanks for reporting!
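
Since the fix is reported here as merged for 1.5.2, a project that still has to build with older Sphinx versions could apply the preamble workaround from the earlier comment conditionally. The sketch below is an assumption about how that might look. The helper name and the placeholder string are hypothetical, while sphinx.version_info is the version tuple that Sphinx exposes.

```python
# conf.py sketch; `needs_parsed_literal_workaround` is a hypothetical helper.

def needs_parsed_literal_workaround(version_info):
    """Return True for Sphinx versions before 1.5.2 (the release named in this thread)."""
    return tuple(version_info[:3]) < (1, 5, 2)


# Placeholder: paste the \makeatletter ... \makeatother block from the
# earlier comment here.
WORKAROUND_PREAMBLE = r'''% workaround preamble goes here
'''

try:
    import sphinx
    sphinx_version = sphinx.version_info
except ImportError:  # lets the sketch run outside a Sphinx environment
    sphinx_version = (1, 5, 2)

latex_elements = {}
if needs_parsed_literal_workaround(sphinx_version):
    # Only old versions get the custom alltt redefinition.
    latex_elements['preamble'] = WORKAROUND_PREAMBLE
```

Keeping the check in a small function makes it easy to drop the whole block once the minimum supported Sphinx version is 1.5.2 or later.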

Thanks a bunch! That PR fixes it for me.
