A seemingly simple problem: how do I insert page breaks at arbitrary points in the rst document that is passed to rinoh through Sphinx as the front end? My search-fu comes up dry, since Sphinx is apparently a document processor that doesn't understand the need for a common way to express page breaks. I find that hilarious and saddening at once: this is pretty much trivial in any other text "processing" system, even ancient ones.
There are two ways to insert page breaks:
page_break style attribute. Since v0.4.3.dev1, page_break can be set on any flowable, not just sections. To insert a page break at an arbitrary point, add a class to a directive by setting the :class: attribute, or by using the (rst-)class directive. The page break will be inserted before the corresponding element. Assuming v0.4.3.dev1:

**your reStructuredText file**:
```rst
.. image:: images/screenshot.png
   :class: page-break

A regular paragraph.

.. rst-class:: page-break

This paragraph will trigger a page break.
```
[**your custom style sheet**](http://www.mos6581.org/rinohtype/basicstyling.html#extending-an-existing-style-sheet):
```ini
[page-break-paragraph : Paragraph(has_class="page-break")]
base = default
page_break = any
[page-break-image : Image(has_class="page-break")]
base = image
page_break = any
```
Note that the newly defined styles will also determine the styling of the page-breaking element. To style them like other elements in the document, you need to set their base style to the default style. Refer to the [style log](http://www.mos6581.org/rinohtype/elementstyling.html#style-logs) to figure out which styles these are.
This is undocumented, but rinohtype supports rst2pdf's _PageBreak_. Note that only _PageBreak_ (without argument) is supported (so no _EvenPageBreak_ or _OddPageBreak_). To use it, insert the following into the reStructuredText where you want the break:
```rst
.. raw:: pdf

    PageBreak
```
You should avoid inserting page breaks at arbitrary locations to improve page layout, e.g. to avoid widows or orphans. As soon as your document contents change, you'll have to reposition the page breaks. Widow and orphan handling should eventually be handled automatically by rinohtype.
A better use of page breaks is to systematically apply them. For example, before every new section. In this case, the first method is the recommended way of doing this, but I understand that this is not trivial to figure out given the current documentation. A tutorial on document styling should be able to help with that (#168).
That's exactly what I was missing. Thank you!
The problem with breaks on sections is that the section level alone is not enough context in my document: there is a chapter where sections at a certain level should start on a new page, but only in that chapter and not elsewhere. I imagine that this is not an uncommon problem, especially in larger technical reference manuals that rinohtype would otherwise be a good match for.
> there is a chapter where sections at a certain level should start on a new page, but only in that chapter and not elsewhere
In this case you can set an ID on the chapter and create a new style with a selector that matches all subsections of that chapter.
```ini
[page-break-sections : Section(id='chapter-x') / Section]
page_break = any
```
While the raw pdf PageBreak directive works stand-alone, it doesn't work as a substitution, i.e.:
```rst
.. |newpage| raw:: pdf

    PageBreak
```
Is this a restructured text "spec bug", or a Sphinx bug, or something that rinohtype needs to add special handling for, or is there a workaround?
The alternative form using explicit replacements doesn't work either:
```rst
.. role:: raw-pdf(raw)
    :format: pdf

.. |newpage| replace:: :raw-pdf:`PageBreak`
```
Summary:
```rst
line1

.. raw:: pdf

    PageBreak

line2 - on new page

:raw-pdf:`PageBreak`

line3 - same page

|newpage|

line4 - same page
```
There's something fundamentally broken here, although I imagine it's Sphinx's fault, as such things are notoriously underdocumented and there seem to be no test cases for them.
> While the raw pdf PageBreak directive works stand-alone, it doesn't work as a substitution.
As far as I know, substitutions can only be used for inline content. And the only supported substitution directives are _replace_, _image_, _unicode_ and _date_. So I don't think reStructuredText supports what you want to achieve.
Note that inserting a page break through a raw role is not supported, and rst2pdf doesn't support it either. It simply wouldn't make much sense to insert a page break in the middle of a paragraph. Sure, you could use the |newpage| substitution in an otherwise empty paragraph, but that feels more like a hack.
You could create a custom directive similar to rst2pdf's page directive. That would just output a raw directive with PageBreak.
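A minimal sketch of such a directive, assuming docutils is installed. The directive name `page-break` and the class name are hypothetical choices for this example. The directive simply returns the same kind of raw node that `.. raw:: pdf` with a `PageBreak` body would produce:

```python
from docutils import nodes
from docutils.core import publish_doctree
from docutils.parsers.rst import Directive, directives


class PageBreakDirective(Directive):
    """Emit a raw 'pdf' node containing the text 'PageBreak'."""

    has_content = False
    required_arguments = 0
    optional_arguments = 0

    def run(self):
        # Same node type and format that the raw directive creates
        # for ".. raw:: pdf" with "PageBreak" as its content.
        return [nodes.raw('', 'PageBreak', format='pdf')]


# 'page-break' is a hypothetical name picked for this sketch.
directives.register_directive('page-break', PageBreakDirective)

SOURCE = """\
line1

.. page-break::

line2
"""

# Parse the sample and collect the top-level raw nodes for inspection.
doctree = publish_doctree(SOURCE)
raw_nodes = [node for node in doctree.children if isinstance(node, nodes.raw)]
```

In a Sphinx project, the class could instead be registered from an extension's `setup(app)` function with `app.add_directive('page-break', PageBreakDirective)`.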
> There's something fundamentally broken here, although I imagine it's Sphinx's fault, as such things are notoriously underdocumented and there seem to be no test cases for them.
I don't think it's broken. It's just not an included feature. But it could be useful to have a directive-equivalent of substitutions. Something along these lines:
```rst
.. |page-break| directive::

    .. raw:: pdf

        PageBreak
```

Use it like this:

```rst
.. |page-break|::
```
That would require changes to the docutils core, however. The following could perhaps be implemented as a third-party directive:
```rst
.. define-alias:: page-break

    .. raw:: pdf

        PageBreak
```

Use it like this:

```rst
.. alias:: page-break
```
I do have to agree that reStructuredText is often hard to grasp and sometimes very confusing though! And the docutils homepage that's straight from the 90s makes finding answers unnecessarily painful. But I guess it's the best that's available, with perhaps the exception of asciidoc (which I haven't used).
The thing is: all those "unsupported" features should be producing errors or at least warnings. Of course that's a problem with Sphinx and/or docutils, but I find it amusing and saddening at once how unfinished those are. The hoops necessary to make this stuff work for the simplest things seem like something from the mainframe era :( Things like page breaks and macros really should be a zero-friction user experience in anything that's not a toy project... I'm sorry that you have to indirectly deal with the sad state of Sphinx.
> The thing is: all those "unsupported" features should be producing errors or at least warnings.
I didn't try this before, but now I see that .. |newpage| raw:: pdf is indeed not producing any errors. It seems it is (almost) equivalent to a custom raw-pdf role, like you suggested:
```rst
.. |newpage| raw:: pdf

    PageBreak

.. role:: raw-pdf(raw)
    :format: pdf

|newpage| is almost equivalent to :raw-pdf:`PageBreak`
```

produces this document tree (rst2pseudoxml.py output):

```
<document source="test.rst">
    <substitution_definition names="newpage">
        <raw format="pdf" xml:space="preserve">
            PageBreak
    <paragraph>
        <raw format="pdf" xml:space="preserve">
            PageBreak
         is almost equivalent to 
        <raw classes="raw-pdf" format="pdf" xml:space="preserve">
            PageBreak
```
So it seems there is support for a _raw_ substitution directive, but it isn't documented. You can open an issue with docutils to address this, if you like. Of course, this will not solve your original problem since these produce inline content.
> The hoops necessary to make this stuff work for the simplest things seem like something from the mainframe era :(
I have briefly fantasized about writing a new reStructuredText parser in the past, but I believe I am unlikely to produce anything that performs better than docutils in any realistic timeframe. I'm afraid there is just a lot of complexity to parsing a structured text syntax, simply because it is not as clearly defined as, say, XML.
I think the best way to deal with this problem is to help improve docutils and work on improved documentation available through an 'official' domain like restructuredtext.org or restructuredtext.net. With respect to the latter, perhaps you are interested in helping set up something like that? I'm sure we could easily find a handful of people that would be interested in working on that, for example the folks over at ReadTheDocs.