Jupyter Book currently lets you split sub-sections of files with headers above each using the following structure:
- header: my header
- file: myfile
- file: second file
- header: my second header
- file: third file
- file: fourth file
This will then make the assumption that these should be split into two groups of files w/ two headers.
Under the hood, Sphinx treats these groups of files as chapters. So, I wonder if we should just be explicit about this in _toc.yml, and instead accomplish the above with:
- chapter: My header
  sections:
  - file: myfile
  - file: second file
- chapter: My second header
  sections:
  - file: third file
  - file: fourth file
This is slightly more verbose and in some ways maybe less-flexible. But, it's also more explicit, makes fewer assumptions, and I think is easier to understand. It also opens the path for any toctree-based functionality in Sphinx, like so:
- chapter: My header
  numbered: true
  sections:
  - file: myfile
  - file: second file
or even wacky stuff like
- chapter: My header
  glob: true
  sections:
  - file: folder/*
Another approach we could take is to simply generalize our current structure so that people can explicitly provide groups of files. Like so:
- section: My header
  files:
  - myfile
  - second file
- section: My second header
  files:
  - file: third file
  - file: fourth file
and then this could be arbitrarily nested, like so:
- section: My header
  files:
  - file: myfile
  - file: second file
  sections:
  - section:
    - file: third file
    - file: fourth file
What do folks think? I imagine that this would make @jstac and @mmcky happier? What about @chrisjsewell? Would also be curious whether @bmcfee would consider this an improvement
Yes, I definitely agree with this. It makes the mapping from content to book clearer.
Yeah, sounds great to me!
I am partial to the second option that you suggest @choldgraf. I feel like the nested syntax could open up to other things in the future, like using a section as a link that goes to another book/html page.
I was just learning about ToC structure today, and the "header" idea was a bit mysterious to me. I was looking for a way to specify the standard chapter/section distinction, and wondered...is header what I should use? So I love the idea of removing the mystery by getting explicit in one of the ways you suggest. I prefer the first, because it's more like an actual book, which has chapters containing sections, etc., but I don't really care, as long as the docs are clear.
I also prefer the first option because I like rules that are explicit and easy to understand.
@choldgraf I would prefer option 1 as it seems clear for users and maps more directly to a toctree concept for books. I think this would be a good default option. A nice aspect of that structure is that it (as you say) allows mapping to sphinx toctree options.
In the future I also wouldn't mind if we introduce a type setting to the _toc.yml that allows for different syntax based on end user use case. If someone just wants to use the platform to build a docs site they may want a simple linear file listing (without using chapter) or we could implement option 2.
For example a header option
type: website
Alternatively we could offer a different cmd tool name for building other types that use different toc structures more suited to that use case (i.e. jupyter-doc for example)
I think a book style nomenclature for toc.yml would be a good default for jupyter-book.
so what I'd imagine is actually not deprecating the current behavior, but instead allowing for chapter: as an extension of current behavior (though we would deprecate header:).
So either of these structures:
- file: index
- file: page1
- file: page2

or

- file: index
- sections:
  - file: page1
  - file: page2
would result in the following toctree in index:
````
```{toctree}
page1
page2
```
````
While either of these *new* structures are how you would specify chapters:
and they would map on to the following toctrees in `index`:
````
```{toctree}
:caption: My chapter name
page1
page2
```

```{toctree}
:caption: My other chapter name
page3
page4
```
````
furthermore we'd say that only the top-level file is one that can take a chapter: entry in its sections:. And after the first level, all section nesting etc behaves the same way it does now.
So really the only difference between the current setup and the new one would be that if you wish to specify chapters, you do so by explicitly adding - chapter: entries under the first page's sections: key with each file in that chapter explicitly listed inside, instead of by adding - header: entries to the sections: key in a flat hierarchy w/ the other files as it is now.
does that make sense?
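For anyone who wants to poke at this mapping, here is a minimal Python sketch of it. It is not Jupyter Book's actual implementation: the function name `render_toctrees` and the plain-dict input (roughly what you'd get after parsing the entries under the first page in `_toc.yml`) are assumptions made for illustration. Each `chapter` entry becomes a captioned toctree, and any loose `file` entries are collected into an uncaptioned one.

```python
FENCE = "`" * 3  # the three-backtick fence used by MyST directives


def render_toctrees(entries):
    """Render parsed TOC entries as MyST toctree directives (illustrative only)."""
    blocks = []
    loose_files = []  # files that are not wrapped in a `chapter:` entry

    def emit(files, caption=None):
        lines = [FENCE + "{toctree}"]
        if caption:
            lines.append(f":caption: {caption}")
        lines.extend(files)
        lines.append(FENCE)
        blocks.append("\n".join(lines))

    for entry in entries:
        if "chapter" in entry:
            # Flush any loose files seen so far into their own toctree.
            if loose_files:
                emit(loose_files)
                loose_files = []
            files = [section["file"] for section in entry.get("sections", [])]
            emit(files, entry["chapter"])
        elif "file" in entry:
            loose_files.append(entry["file"])

    if loose_files:
        emit(loose_files)
    return "\n\n".join(blocks)


if __name__ == "__main__":
    toc = [
        {"chapter": "My chapter name",
         "sections": [{"file": "page1"}, {"file": "page2"}]},
        {"chapter": "My other chapter name",
         "sections": [{"file": "page3"}, {"file": "page4"}]},
    ]
    print(render_toctrees(toc))
```

Passing a flat list of `file` entries yields a single toctree with no caption, which is the "nothing changes for existing books" case.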
@choldgraf can I clarify one point as I am a bit confused about the chapter header.
For the majority of quantecon projects the structure (i.e. chapter titles) is already defined within each file as markup text. Thus the primary header for the main index toc is largely already defined just by a linear file listing.
In the proposed structure above would you need to define chapter titles in the _toc.yml as toc captions rather than markup in each page?
The main benefit I see to keeping titles in the markup (rather than in a configuration file) is that it reduces duplication in markup vs. configuration. The exception to this is using :caption: for local contents elements (i.e. at the beginning of a chapter if you want a named contents -- which provides a local toc within a chapter). Such as on this page: https://python.quantecon.org/complex_and_trig.html
well in the above example, you can do it either way. In either of these cases:
- file: index
- file: page1
- file: page2

or

- file: index
- sections:
  - file: page1
  - file: page2
the titles of the top-level pages would become the chapter headers
while in either of these cases
- file: index
- chapter: My chapter name
  sections:
  - file: page1
  - file: page2
- chapter: My other chapter name
  sections:
  - file: page3
  - file: page4

or

- file: index
  sections:
  - chapter: My chapter name
    sections:
    - file: page1
    - file: page2
  - chapter: My other chapter name
    sections:
    - file: page3
    - file: page4
the chapter titles would be explicitly defined (and the titles of each top-level page would become sections within the chapter).
This should map on to however Sphinx handles the file hierarchy as well - at the end we are going to be doing the same thing, just turning these into toctree objects on each page.
(another option could be to use the word chapters: instead of sections: only for the first level of the _toc.yml but that would be purely a difference in name rather than a functional one.)
So in the case of quantecon since you don't specify chapter groupings with :caption: anyway, you'd just leave the same flat hierarchy structure and nothing should change. Does that make sense?
Let me know if I'm not understanding where your confusion comes from.
thanks for the clarification @choldgraf -- I see -- so this is extended syntax for those that would like to split a book up into sections etc. but retain the ability to organise chapters that may spread across multiple files. For those authors that want to write each chapter in a single file then the simpler linear listing is still available.
I am in favour of Option 1 syntax as I think that it involves less cognitive overhead to put together for a book.
OK I took a shot at implementing this here: https://github.com/executablebooks/jupyter-book/pull/817
would love comments there as to whether this is a good solution here. If folks are +1 then I'll write tests and we can merge.
Hey all - I went ahead and merged this so that folks can have a chance to use these patterns and give feedback. Please do so!
Cool, new toys! Here are my testing results. Some show successes, some do not.
I tried the following _toc.yml structure:
- file: intro.ipynb
- chapter: Week 1
  sections:
  - file: chapter-1-intro-to-data-science.ipynb
  - file: chapter-2-mathematical-foundations.ipynb
- chapter: Week 2
  sections:
  - file: chapter-3-jupyter.ipynb
  - file: chapter-4-review-of-python-and-pandas.ipynb
- chapter: Week 3
  sections:
  - file: chapter-5-before-and-after.ipynb
  - file: chapter-6-single-table-verbs.ipynb
# etc.
Results
Since this relates to #809, I'll make some comments about how numbering works with this new setup.
- numbered: true inside the first - file of _toc.yml has no effect in either HTML or PDF. (The docs say it should number everything.)
- If I put numbered: true inside every chapter instead, it does number every chapter, but it numbers them independently. That is, every chapter starts over with each of its files numbered 1, 2, etc., as if there had been no previous chapters. That's a step in the right direction! 😄

Note: My public repo does not have the above _toc.yml structure; I'm just experimenting with it locally to see what works.
@nathancarter think we could track this one in a different issue? The TOC numbering problem might be a Sphinx problem, as opposed to something we can easily fix in Jupyter Book. I think figuring out the issues around section ordering / numbering is something we'll need to track ongoing
Sure, split the issue however you like. I see two things that are unresolved at this point: One is getting the PDF chapter/section structure to mirror that of the HTML. The other is getting the numbering correct. But feel free to organize them into issues however you like!
I think we should re-open this issue because I just realized that the semantics of what we're using in _toc.yml doesn't map on to what Sphinx expects.
We are currently allowing this:
- file: intro
- chapter: My chapter
  sections:
  - file: section1
  - file: section2
However, Sphinx will not treat this as a single chapter. Instead it treats it in the following way:
My chapter <-- toctree caption, has no bearing on actual chapter structure
- section1 <-- chapter 1
- section2 <-- chapter 2
This becomes obvious when you add section numbers to things (the "My chapter" won't get a number) as well as in building the PDF (the "title" of the first - chapter: entry will be included, but none others will be)
so I think we're misleading people when we use "chapter" as the top-level name. Perhaps we could change this to be chapters instead of chapter, then make it clear that the files themselves are still the chapters, not the "title" for the group. What do folks think?
@choldgraf based on discussion in #809 -- I think what is currently implemented maps more cleanly to part.
However I think the key issue here might be that we are overloading on a caption attribute of the toc. Effectively the css in the theme is converting the toc caption and styling it to look like a chapter heading. Is this right? In projects I have worked on -- we usually have the chapter title data in the file and a toctree with no caption.
******************
Chapter Heading
******************
.. toctree::
I think progress will be hard without clear answers to these three questions:
- chapter: and - sections: lines in the _toc.yml? #, ##, and other headers in the .md/.ipynb sources?

I'm suggesting that those three questions be answered in the sense of what the goals are (not what the current behavior is), and then those goals can become both documentation and testing for the software itself. Until we have clear goals, it's hard to even define progress.
So here's how the _toc.yml structures map on to Sphinx syntax, and how Sphinx syntax maps on to "parts of a book". I think those are the mappings we need to figure out (and this is where I made an incorrect assumption that led me to re-open this issue).
The following TOC structures:
- file: intro
- file: page1
- file: page2

and

- file: intro
- chapter:
  sections:
  - file: page1
  - file: page2
Will both lead to the following "toctree" structure in Sphinx:
````
```{toctree}
page1
page2
```
````
If instead we added a Chapter title to the `_toc.yml` file, like so:
```yaml
- file: intro
- chapter: My title
  sections:
  - file: page1
  - file: page2
```
it would become a caption in the Sphinx toctree:
````
```{toctree}
:caption: My title
page1
page2
```
````
However, in Sphinx, page1 and page2 will map onto chapters, not sections within a chapter. This is where my assumption was wrong. I thought that if :caption: exists, then all subsections are treated as parts of a chapter named :caption:. But this isn't true, :caption: is just a "pre-amble" to the table of contents.
Moreover, Sphinx treats each toctree as independent from any other toctree. This means that if you have two toctrees they are numbered independently.
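To make the numbering consequence concrete, here is a small Python simulation of the behavior described above. It is only a toy model, not Sphinx code: the function `number_chapters` and its `continuous` flag are hypothetical names, and the `continuous=True` branch shows a numbering scheme that one might want, not one that is claimed to exist.

```python
def number_chapters(toctrees, continuous=False):
    """Assign chapter numbers to the files listed in each toctree.

    toctrees: a list of lists of file names, one inner list per toctree.
    continuous=False models the behavior described in this thread, where
    each toctree is numbered independently and so restarts at 1.
    continuous=True is a hypothetical alternative that keeps counting.
    """
    numbered = []
    counter = 0
    for files in toctrees:
        if not continuous:
            counter = 0  # each toctree starts over
        for name in files:
            counter += 1
            numbered.append((counter, name))
    return numbered


if __name__ == "__main__":
    book = [["page1", "page2"], ["page3", "page4"]]
    print(number_chapters(book))
    print(number_chapters(book, continuous=True))
```

In the independent case the second toctree's files are numbered 1 and 2 again, which is why a caption on a toctree cannot act as a numbered chapter containing those files.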
So, for this reason, I think that we should rename the - chapter: field to instead be either
- `- chapters:`, because they are actually collections of chapters (each file being one chapter)
- `- parts:`, for the same reason but maybe this is easier to remember
- something else?

@nathancarter I tried to explain the other part of your question (about how sections within a page relate to book structure) in this documentation I added yesterday: https://jupyterbook.org/customize/toc.html#how-toc-yml-structure-maps-to-book-structure let me know how that looks to you
Yes, that documentation (which I hadn't seen until now) answers all my three questions--great, thanks!
And regarding your Bottom line section above, I'd say a few things.
- `- part:` makes more sense than `- chapters:`, since we don't give a title to a clump of chapters together, or if we do, we call it a part. (Note: `- part:` singular, not `- parts:` plural.)
- `- chapter:` a synonym for `- file:`? Then the _toc.yml can look like so.

- part: The Big Bust
  - chapter: Lenny and the Sleeze
  - chapter: Giving the Slip
  - chapter: One Dangerous Dame
- part: Suspicion
  - chapter: Not This Guy Again
  - chapter: etc.
OK so first of all I want to know what kind of Noire Crime jupyter book you are writing...
second - yeah that kinda structure makes sense to me. I think the first thing to do would be to rename - chapter: to - part:. I think the list underneath gets a bit trickier, because we're dealing with YAML so we can't use structures like the ones you just showed unfortunately. We'd need a chapters: key or something like that. That's why we were using - file: and - url: since they're more explicit what they're pointing to
I gave a shot at supporting - part: in #834 , what do folks think? @mmcky @nathancarter does this make sense to you?
I suspect @choldgraf is getting tired of me popping up and offering highly correlated opinions again and again, but I really like the structure that @nathancarter suggests. It's super clear in terms of the structure of the book and the map to latex.
If that was the only structure available, without any further nesting, then level 2 headers in chapter files become sections in latex, level 3 headings become subsections, etc.
That reduces the cognitive burden on users down pretty close to zero.
I realize it's a breaking change but that's why it should be done now :grimacing: :wink:
So again we can't use that structure because it's not valid YAML. What we can do is something like:
- part: The Big Bust
  chapters:
  - chapter: Lenny and the Sleeze
  - chapter: Giving the Slip
  - chapter: One Dangerous Dame
- part: Suspicion
  chapters:
  - chapter: Not This Guy Again
  - chapter: etc.
That'd require changes in the wording for things. The way the above structure would currently work in #834 would be:
- part: The Big Bust
  sections:
  - file: Lenny and the Sleeze
  - file: Giving the Slip
  - file: One Dangerous Dame
- part: Suspicion
  sections:
  - file: Not This Guy Again
  - file: etc.
I guess we could rename "sections" to "chapters" and/or rename "file" to "chapter", but I think the most important thing for now is just that the names don't give incorrect intuition about the underlying book structure, which is what #834 is meant to clear up.
Regarding having a single flat list, I think we should definitely support it, but I'll again say that I think a lot of people find value in having nested pages. Those folks don't want to have one gigantic jupyter notebook that has all of the content for an entire chapter, they want to be able to have little self-contained notebooks that people can run on their own. This is how almost every single textbook at Berkeley (that uses Jupyter Book, anyway) has been written (and these were all written independently by different authors, but all used nested pages). I think there is value to both use-cases and I don't think they are mutually exclusive.
I'd like some quick feedback on whether we can merge https://github.com/executablebooks/jupyter-book/pull/834. It is a minimal change to (I think) correct the mismatch that I describe above (basically, it just renames - chapter: to instead be - part:).
I think we should do it sooner than later, and make a patch release, because right now we're encouraging users to follow a pattern that isn't semantically correct, and will be deprecated soon.
We can keep iterating in an issue about other ways we want to extend toc.yml, but I think #834 is the crucial one right now
That PR enables the following TOC structure:
- file: intro
- part:
  chapters:
  - file: chapter1
  - file: chapter2
- part: A named part
  chapters:
  - file: chapter1
  - file: chapter2
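For books already written against the earlier `chapter:` / `sections:` syntax, the rename can be sketched as a small transformation on the parsed TOC. This is an illustration only, assuming the `_toc.yml` has already been loaded into plain Python lists and dicts; `migrate_entry` and `migrate_toc` are hypothetical helpers, not part of Jupyter Book, and they only touch top-level entries.

```python
def migrate_entry(entry):
    """Rename a top-level `chapter:` entry to `part:` with `chapters:` inside.

    Entries without a `chapter` key (for example `- file: intro`) are
    copied through unchanged.
    """
    if "chapter" not in entry:
        return dict(entry)
    return {
        "part": entry["chapter"],  # the title, or None for an untitled group
        "chapters": [dict(section) for section in entry.get("sections", [])],
    }


def migrate_toc(entries):
    """Apply migrate_entry to every top-level entry of a parsed TOC."""
    return [migrate_entry(entry) for entry in entries]


if __name__ == "__main__":
    old_toc = [
        {"file": "intro"},
        {"chapter": None,
         "sections": [{"file": "chapter1"}, {"file": "chapter2"}]},
        {"chapter": "A named part",
         "sections": [{"file": "chapter1"}, {"file": "chapter2"}]},
    ]
    for entry in migrate_toc(old_toc):
        print(entry)
```

Nested `sections:` under individual files are left alone here, since the discussion above only renames the first level.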
thanks @choldgraf I think PR #834 is a good change. I am in favour. I will update quantecon-example once the new release is issued.
Ah, I didn't realize I was suggesting invalid YAML. Sorry about that. I am fine with using file and url for the lowest-level objects, if that's easier. I also see that it has the benefit of clarity on what kind of lowest-level thing is being imported, which I hadn't realized was a design factor. In short: I'm cool with #834. Thanks!!
OK - I've merged in that PR, give it a shot everybody and give feedback! I'd like to cut another mini-release soon :-)
Is this just a straight up pip install git+git://github.com/executablebooks/jupyter-book.git? I just did that but running a build complains about WARNING: Unknown key in `_toc.yml`: part...perhaps I didn't install the latest version correctly?
maybe try adding a -U in there?
Although that made the output of the pip command look slightly different at the end, it did not change the results of jupyter-book build ..
I'm having the same problem:
WARNING: Unknown key in _toc.yml: part
WARNING: Unknown key in _toc.yml: chapters.
I've reverted to files/sections for the time being and that works fine.
If you want to do a sanity check starting with a working conda environment -- the book in the repo below is building fine for me on macos catalina:
https://github.com/phaustin/Problem-Solving-with-Python-37-Edition/blob/apache/environment_jb.yml
yeah same here -
conda create -n tmp python
conda activate tmp
pip install git+git://github.com/executablebooks/jupyter-book.git
and then building a book with - part: in the _toc.yml didn't result in any errors.
Can you try uninstalling and re-installing?
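Before reinstalling, it can help to confirm which versions Python actually sees in the active environment. A minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for this example, and the thread does not say which version number first understands `part:`, so this only reports what is installed.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names as they appear in this thread.
    for name in ("jupyter-book", "sphinx-book-theme"):
        print(name, installed_version(name))
```

If this prints an older version than expected after a pip install, the build is probably still picking up a stale copy.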
OK uninstall and reinstall helped.
Here's what works:
part and chapters components.

Here's what doesn't yet work:
My _toc.yml looks like this.
- file: intro.ipynb
  numbered: true
- part: Week 1
  chapters:
  - file: etc etc
  - file: etc etc
- part: Week 2
  chapters:
  - file: etc etc
re: the numbering, can you try the same uninstall and reinstall on sphinx-book-theme? I'm building your book locally and it seems to work just fine (from a numbering standpoint). This must be very frustrating for you :-/
For the "first part" point, I think we'll have to live with this for now. That's just how Sphinx does things under the hood. I'll open up a separate issue about it so that we can track it: https://github.com/executablebooks/jupyter-book/issues/847
No, I'm cool with all these github issues...I understand how the bleeding edge works. I'm glad this project exists, and that you all are patient with a million bug reports. :)
Uninstalling sphinx-book-theme and reinstalling fixed my problem on #823 but did not solve either of the two concerns I mentioned above (one in HTML output, one in PDF output).
@nathancarter could you open another issue about your HTML output numbering problems so we can track it separately from this thread? I opened up #847 to track the PDF output issue you described.
@choldgraf I can also build locally now, but get the same part/chapter error with github actions. Not sure if this should be a separate issue, or if I'm missing something obvious...
@samteplitzky I bet that this will be resolved when we make a new release. Lemme see if there are any quick bugs to fix and I'll cut a new release today if not
ok just made a new release, try upgrading and see if it works?
Yup, that did the trick. Thanks so much!
wohoo!
ok I'm gonna close this one once again as I _think_ it's resolved in a way that is consistent with our (and Sphinx's) mental model of things
This now works for me, too. I no longer use nested parts, because that restarts numbering, which wasn't what I wanted. So I'm all set here!
good to know that worked out for you! send a link to your book! It'd be great to see how it is shaping up :-)
Not to zombie this thread (it's also working great for me, btw!), but is there any plan for explicit chapter numbering? Having them reset in each part really plays havoc with things like equation cross-referencing. (Also, I'd like appendices to be lettered instead of numbered, and I want a pony. :grin: )
@bmcfee let's open up a new issue about that in particular. I believe this will require some changes in Sphinx itself, unless we want to hack together some javascript to manually change things. E.g.: