Building off discussions in #113, I've been thinking about what a single-page document could look like in the context of Jupyter (Note)Book. Specifically, I'm trying to take a step back and think a little on the life cycle of such a document, from authoring and collaborating to rendering.
It'd be great to get feedback on what folks think of this workflow so far, whether there's anything I've missed, and any ideas you have on next steps!
Jupytext allows researchers to create unexecuted Jupyter Notebooks in Markdown, RMarkdown, and several other formats. This may be useful in improving the authoring experience with Jupyter Notebooks, since it allows for readable version control under Git or other Version Control Systems and interacting with the content in a standard text editor.
The distinction between authoring a Jupyter Notebook in RMarkdown and authoring a standard RMarkdown report, then, is that Jupyter Notebooks enable the option to version control generated outputs -- if the author does not require a legible diff. This can be especially useful in situations where the outputs take substantial time and computational resources to generate, as in many science-oriented use cases.
For scientific publishing, several features are still missing from the Jupyter authoring experience. Among these are in-text formatted citations and auto-generated reference lists _à la_ BibTeX, numbered theorems, as well as options to include footnotes and informational sidebars. All of these features are available either natively in Pandoc or as pandoc-filters. Pandoc recently announced it would natively support the Jupyter Notebook format (#94), enabling the possibility of a smooth experience for converting between .ipynb files and other file types.
Open questions about authoring:
When authors want to share a Jupyter Notebook publication with their co-author, it is unclear what object is best to share. Ideally, collaborators would have access to both the inputs and outputs, meaning that Jupytext Markdown is unlikely to be sufficient unless all authors are working in the same environment (e.g., a shared JupyterLab).
Commenting features such as those available in Microsoft Word are often highly desirable for scientists; excitingly, there is ongoing work to render Jupyter Notebooks as Word documents. Another option is to use ReviewNB, a proprietary tool developed to enable cell-level notebook comments. Nonetheless, it would be preferable to avoid tying this workflow to proprietary solutions, so other collaborative workflows should be established. At this point, however, such a workflow seems to require reverting to the Jupytext Markdown and collaborating online with tools such as HackMD, hackpad, or equivalent.
Open questions about collaborating:
Generating PDFs would provide traditional, static representations of the Jupyter Notebook that are familiar to most academic readers. This would be in addition to the HTML outputs currently created with Jupyter Book.
Radix, developed by the RStudio team and since renamed to Distill, uses pandoc to convert RMarkdown into a Distill-style HTML template. Using pandoc as a conversion engine enables it to capture many of the desired features outlined above in Authoring. A similar template could be adopted for .ipynb files. We might like to add additional features, such as “show/hide” code cells.
For both PDF and HTML, we should continue to link directly to the underlying notebook, both as an optional download (#231) and as an executable object in an associated Binder environment.
Open questions about rendering:
This is great! Here are some thoughts from me on a few of the questions here:
Is there a way to track outputs with legible diffs?
Not that I know of - you can use something like nbdime for diffing notebook outputs (at least, I think it diffs the outputs), but I doubt that'll be a solution for most authors because they want something that works on GitHub. I wonder how far we could push the "dual markdown + ipynb" workflow...then you could use the github diff-er for outputs.
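To make the "dual markdown + ipynb" idea a bit more concrete, here is a minimal sketch (standard library only) that pulls the text outputs of each code cell out of the notebook JSON and into a plain-text sidecar that a line-based diff could handle. The function name `outputs_as_text` and the `## cell N` layout are made up for illustration; only the field names (`cells`, `cell_type`, `outputs`, `output_type`, `text`, `data`, `text/plain`) come from the nbformat v4 notebook format. Images and other rich outputs are simply skipped here, so this only covers part of the problem.

```python
import json


def _as_str(value):
    # nbformat stores multi-line strings either as a str or as a list of str.
    return "".join(value) if isinstance(value, list) else value


def outputs_as_text(notebook_json):
    """Return the text outputs of every code cell as one diff-friendly string."""
    nb = json.loads(notebook_json)
    chunks = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        texts = []
        for out in cell.get("outputs", []):
            kind = out.get("output_type")
            if kind == "stream":
                texts.append(_as_str(out.get("text", "")))
            elif kind in ("execute_result", "display_data"):
                plain = out.get("data", {}).get("text/plain")
                if plain is not None:
                    texts.append(_as_str(plain))
        if texts:
            body = "\n".join(t.rstrip("\n") for t in texts)
            chunks.append(f"## cell {index}\n{body}")
    return "\n\n".join(chunks) + "\n" if chunks else ""


# A tiny hand-written notebook, used only to exercise the function.
example = json.dumps({
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": "print('hi')",
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["hi\n"]}]},
    ],
})
print(outputs_as_text(example))
```

Committing a sidecar like this next to the .md file would let the regular GitHub diff show changed text outputs, while the .ipynb itself could stay out of review.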
Should bibliography style (e.g., APA) be added as a metadata field for Jupyter, or at least Jupytext?
That seems reasonable to me - we don't necessarily need this to be a change to the notebook format or anything, we could just choose what kind of metadata to treat as "special". What about something like a field called "bibliography" that would contain either:
?
What would the ideal UI look like for this authoring process? A GUI where users can switch between the Markdown and Jupyter Notebook views?
Yeah, something like Jupytext would be great for this if we could get the notebook to automatically re-render whenever the markdown file is updated (and vice-versa). I bet that'd be possible. Then you could just open them side by side if you wished...maybe even create some UI elements that gave 'hackmd'-style buttons (that automatically open them side by side).
Another cool thing would be if you could run code on the markdown file and it'd execute in the corresponding notebook file...hmmmm
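As a rough illustration of the "automatically re-render whenever the markdown file is updated (and vice-versa)" idea, the core decision is just which half of the pair is out of date. The sketch below makes that decision from file modification times. `stale_side` is a hypothetical helper, not part of Jupytext, and this is not meant to describe how Jupytext itself keeps paired files in sync; the actual conversion step is left out on purpose.

```python
import os


def stale_side(md_path, ipynb_path):
    """Return "md" or "ipynb" for the file that needs regenerating, else None."""
    md_exists = os.path.exists(md_path)
    nb_exists = os.path.exists(ipynb_path)
    if not md_exists and not nb_exists:
        raise FileNotFoundError("neither file of the pair exists")
    if not nb_exists:
        return "ipynb"  # only the Markdown exists, so build the notebook
    if not md_exists:
        return "md"  # only the notebook exists, so build the Markdown
    md_time = os.path.getmtime(md_path)
    nb_time = os.path.getmtime(ipynb_path)
    if md_time > nb_time:
        return "ipynb"  # Markdown was edited more recently
    if nb_time > md_time:
        return "md"  # notebook was edited more recently
    return None  # same timestamp: treat the pair as in sync
```

A file watcher or an editor "save" hook could call this and then run whatever conversion is appropriate for the returned side.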
Should co-authors review the Jupytext paired .md / .ipynb or the executed Jupyter Notebook?
IMO it'll require less machinery if we want people to review .md files, since we've already got a common pattern of using GitHub for this.
If the latter, how would we recommend co-authors leave comments directly on Jupyter Notebooks?
That's a good question - I agree this might add a (non-trivial) amount of code and customization.
What barriers are associated with moving to pandoc rendering of Jupyter Books ?
The main things that I can think of:
Lemme know what you think!
There's a ton of great ideas going on here. I'm glad you all are thinking hard about this.
Here's a go at @emdupre's idea surrounding Distill.pub: https://github.com/roualdes/nbdistill. I've only spent a few hours on it. It's definitely incomplete, but thought you all might be interested anyway.
holy cow that is beautiful 🤩🤩🤩
I'd love to find a way to build this into jupyter-book, maybe as a separate theme? I'd love to be able to standardize the design elements across themes as much as possible (e.g. a right sidebar, figures+captions, etc).
Related to that: how much of your distill example depends on modifications to the HTML template? I see there are a bunch of classes etc added in there, are those specific things that distill looks for? I'm wondering if we can achieve the same effects, but using the same basic HTML structure that nbconvert uses...
Thanks. Building this into jupyter-book would be great and I'm happy to help.
Re modifications to the HTML template: While I didn't add Jinja2 blocks or modify any of the blocks that nbconvert defines, I did use many of the HTML tags defined by Distill.
nbconvert side: I started with nbconvert's template html/full.tpl, and then used some of the Jinja2 blocks from the templates in the folder skeleton as well.
distill side: any HTML tag that is prefixed with d- comes from distill's javascript. Distill also has a number of HTML classes to use for specifying the layout. These classes are all prefixed with l-. For instance, in my report.tpl file you can find <div class="l-gutter">. There are other layout options I'd like to use, but haven't gotten to this yet.
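To get a quick feel for how much of a template depends on Distill, one could list every `d-` tag and `l-` class it uses. The sketch below does that with Python's standard `html.parser`. The class and function names are made up for illustration, and the sample string is a hand-written fragment, not the contents of report.tpl. Jinja2 blocks in a real template are treated as plain text by the parser, which should be fine for a rough inventory but is not a proper template analysis.

```python
from html.parser import HTMLParser


class DistillUsage(HTMLParser):
    """Collect Distill custom tags (d-*) and layout classes (l-*)."""

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.layout_classes = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag.startswith("d-"):
            self.tags.add(tag)
        for name, value in attrs:
            if name == "class" and value:
                self.layout_classes.update(
                    c for c in value.split() if c.startswith("l-")
                )


def distill_usage(template_text):
    """Return (sorted d-* tags, sorted l-* classes) found in the text."""
    parser = DistillUsage()
    parser.feed(template_text)
    parser.close()
    return sorted(parser.tags), sorted(parser.layout_classes)


# Hand-written sample fragment, for illustration only.
sample = (
    '<d-article><p>Text <d-cite key="x"></d-cite></p>'
    '<div class="l-gutter">side note</div></d-article>'
)
print(distill_usage(sample))
```

Running this over both the Distill-flavoured template and nbconvert's stock template would show which elements are Distill-specific and which structure is shared.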
OK cool - maybe I don't have a super clear understanding of how Distill works. It sounds like it inserts HTML into the page itself using the distill javascript library? Is that right?
I think for some elements that'd be fine. Do you think that we'd need to have a distill-specific jinja template in that case? Or could we get away with just CSS differences + the updates from the JS library?
My order of preference here is:
Yep, that's it. Distill inserts HTML into the DOM itself.
What we'd need to have depends on where it is we're trying to go, and I'm not super clear on the specifics of the goal yet. Nonetheless, here's an attempt at a well thought out answer with only HTML output in mind.
I think we could use a single HTML template for Jupyter Book if we selectively picked structural elements from Distill to use, for instance if we only wanted to borrow Distill's citation support. Selecting the right elements would just take some care.
Having two HTML templates, a base and a more specific one, offers a lot of flexibility at the expense of some maintenance. I feel like this option is a reasonable path towards themes, re #262. I think both Tufte CSS and Distill would somewhat naturally allow for theme specific templates.
I worry that option 3 would be too much of a maintenance burden.
After #278 lands, I think that the template should be relatively stable for a bit (potentially...:-) ). I think that PR will also lay the foundation for lots of distill-like elements (e.g. sidebar, etc). Perhaps we can prototype what would be needed to get a distill theme for Jupyter Book? I'm definitely happy to review a PR! :-)
Closing as this should be superseded by beta.jupyterbook.org, specifically with the jupyter-book page command.