Jupyter-book: make a github action that can build jupyter books

Created on 23 Mar 2020  ·  36 Comments  ·  Source: executablebooks/jupyter-book

I was speaking with @trallard today, and she mentioned that it might be possible to automatically build-and-deploy Jupyter Books using GitHub actions.

Right now I have a demo repository to show how to build a book with a github workflow here:

https://github.com/choldgraf/deploy_configurations/blob/master/.github/workflows/main.yml

In that repository:

  • the content/ folder has a bunch of notebooks + markdown files, no TOC and no _config.yml file
  • the action first installs jupyterbook and ghp-import
  • it then runs jupyter-book toc content/ to auto-generate a TOC for the book (though we could use a manually-created one if we wish)
  • it then runs jupyter-book build content/ to build the book's HTML
  • finally, it runs ghp-import content/_build/html to push to gh-pages
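The steps above can be sketched as a small Python script. This is only an illustration, assuming the commands named in the list are available on the PATH and that the packages are installed under their PyPI names `jupyter-book` and `ghp-import`. The helper names `build_commands` and `run_pipeline` are hypothetical and are not part of the demo repository.

```python
import subprocess
from typing import List


def build_commands(content_dir: str = "content") -> List[List[str]]:
    """Return the commands from the list above, in the order they run."""
    content = content_dir.rstrip("/")
    return [
        # Install the build tool and the gh-pages helper.
        ["pip", "install", "jupyter-book", "ghp-import"],
        # Auto-generate a table of contents (skip if you maintain one by hand).
        ["jupyter-book", "toc", f"{content}/"],
        # Build the book's HTML into <content>/_build/html.
        ["jupyter-book", "build", f"{content}/"],
        # Commit the built HTML to the gh-pages branch
        # (ghp-import's -p flag would also push it to the remote).
        ["ghp-import", f"{content}/_build/html"],
    ]


def run_pipeline(content_dir: str = "content", dry_run: bool = True) -> None:
    """Print each command, and execute it unless dry_run is set."""
    for cmd in build_commands(content_dir):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` only prints the four commands; `run_pipeline(dry_run=False)` would actually execute them, stopping at the first failure.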

That seems fairly straightforward to run on anybody's repository...I wonder if the GitHub Actions marketplace could be used for that.

Would love to hear @trallard's thoughts!

enhancement

All 36 comments

I will have a go at this over the week and report back :) It seems doable (just had a quick look) but will come back if I encounter any issues

Would love to hear how this goes! I was just starting to think about this but am wondering how it would interact with e.g. re-executing large notebooks. Would be great to know what ends up working for you!

I think that a reasonable first step would be to assume that the notebooks are all already populated with outputs. However, in the near future JB will have the ability to execute notebooks and cache their outputs, so if GitHub Actions can persist data from one run to another, this could make things much faster as people update their book content.

Hey, I am working on this, so I started by making some improvements to the workflow on my own fork. However, I have run into some issues; for example, the data cannot be found:

https://github.com/trallard/deploy_configurations/runs/545776015?check_suite_focus=true#step:6:113

I know this is contained in the data8 repo and therefore the data is accessed through:

from datascience import *
path_data = '../../../data/'
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

cones = Table.read_table(path_data + 'cones.csv')
nba = Table.read_table(path_data + 'nba_salaries.csv').relabeled(3, 'SALARY')
movies = Table.read_table(path_data + 'movies_by_year.csv')

I am happy to just replace this with the repo/dataset URL unless @choldgraf has a better suggestion (I know this is a "trivial" question but thought I would ask nonetheless)
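One way to make the data location swappable between the relative folder and a dataset URL is sketched below. It is an assumption-laden illustration: the environment variable `BOOK_DATA_BASE` and the helper `data_location` are hypothetical, and it assumes `Table.read_table` accepts a URL the same way it accepts a local path.

```python
import os

# Relative path used in the notebook above; kept as the fallback.
DEFAULT_DATA_BASE = "../../../data/"


def data_location(filename: str, base: str = "") -> str:
    """Join a data file name onto a base that may be a local folder or a URL."""
    # BOOK_DATA_BASE is a hypothetical variable name chosen for this sketch.
    base = base or os.environ.get("BOOK_DATA_BASE", DEFAULT_DATA_BASE)
    return base.rstrip("/") + "/" + filename
```

In the notebook this would read `cones = Table.read_table(data_location('cones.csv'))`, with the base set once in the workflow's environment rather than in every cell.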

Also, is it possible to have a release made (even if it is a pre-release) to use in the action? This can then be updated later / kept up to date.

I am happy to just replace this with the repo/dataset URL unless @choldgraf has a better suggestion (I know this is a "trivial" question but thought I would ask nonetheless)

this works for me - at least as a stop-gap, Data 8 probably needs to update its own textbook as well. I've been mostly just working from the jupyter book docs as the "demo" book to test out on, since it's a bit more predictable and less-complex.

For the release/pre-release, yep that'll happen soon. I think the cli will probably become a beta release on jupyter/jupyter-book soon, and then we'll start making proper releases

Cool! Will finish cleaning my workflow and then building the action

Some thoughts:

  • I will make the content_path configurable so that folks can define their own content path
  • Some assets can be cached (but I know there are certain limitations) - how these would affect this special case I do not know but can run some cases
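A configurable content path could be read as sketched below. GitHub exposes the inputs of Docker and JavaScript actions as `INPUT_<NAME>` environment variables, so an input named `content_path` would arrive as `INPUT_CONTENT_PATH`. The input name, the default of `content`, and the helper `get_content_path` are assumptions made for this example, not the eventual action's interface.

```python
import os
from typing import Mapping, Optional


def get_content_path(
    env: Optional[Mapping[str, str]] = None, default: str = "content"
) -> str:
    """Return the user-supplied content path, or the default if none was given."""
    env = os.environ if env is None else env
    # An unset or blank input falls back to the default folder.
    value = env.get("INPUT_CONTENT_PATH", "").strip()
    return value or default
```

Passing the environment in as a parameter keeps the function easy to check without running inside an action.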

Is this the one you have been using for testing purposes @choldgraf ?

https://github.com/jupyter/jupyter-book/tree/gh-pages

@trallard some thoughts below:

I will make the content_path configurable so that folks can define their own content path

makes sense to me

Some assets can be cached (but I know there are certain limitations)

I think this will be more valuable the more complex the content. In the most fancy case, we could set execute_notebooks to cache, and then notebooks would have their outputs stored in a little local database that could persist between runs. Then, only notebooks that were updated since the last run would need to be re-executed
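The "only re-execute what changed" idea can be illustrated with a content-hash manifest that persists between runs. This is a stand-in written for this thread, not how Jupyter Book's own cache works; the manifest file and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def notebook_hashes(content_dir: Path) -> Dict[str, str]:
    """Map each notebook's relative path to a SHA-256 digest of its bytes."""
    return {
        str(nb.relative_to(content_dir)): hashlib.sha256(nb.read_bytes()).hexdigest()
        for nb in sorted(content_dir.rglob("*.ipynb"))
    }


def notebooks_to_execute(content_dir: Path, manifest: Path) -> List[str]:
    """Return notebooks that are new or changed since the last run, then update the manifest."""
    previous = json.loads(manifest.read_text()) if manifest.exists() else {}
    current = notebook_hashes(content_dir)
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    # Persist the new state so the next run compares against it.
    manifest.write_text(json.dumps(current, indent=2))
    return changed
```

For this to help in CI, the manifest (and the stored outputs) would have to survive between workflow runs, which is exactly the persistence question raised earlier in the thread.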

Is this the one you have been using for testing purposes

Actually I've been using the docs for this tool (which is itself a jupyter book): https://github.com/ExecutableBookProject/cli/tree/master/docs

Just to share, I adapted @choldgraf's sample GitHub Action for publishing my book to GitHub Pages; the only real difference is that I'm using conda to set up the environment (due to a nonpipable dependency): https://github.com/kyleniemeyer/computational-thermo/blob/master/.github/workflows/deploy-book.yml

thanks for updating this one - just for reference here's the github-actions demo @kyleniemeyer mentioned: https://github.com/executablebooks/github-action-demo/blob/master/.github/workflows/book.yml

also in case it's useful @trallard jupyter-book's rewrite is now available via pre-release on pypi! pip install jupyter-book --pre :-)

Thanks for the ping! I have had extremely busy weeks recently but will have a go at this again this week 😬😬😬

No worries - I am actually pretty happy with how straightforward it is to build books manually https://jupyterbook.org/publish/gh-pages.html

Hey @hamelsmu - wanted to ping you on this one since we'd discussed GHA and Jupyter Book a bit :-)

Thanks I'll take a look soon

@choldgraf I really don't think it's worth packaging anything up further; what you already have is really straightforward and allows folks to customize things a little if they wish. I wouldn't change anything about this. I can think of minor enhancements, but they all come at the cost of making it more complicated to understand.

# This job installs dependencies, builds the book, and pushes it to `gh-pages`
jobs:
  deploy-book:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2

    # Install dependencies
    - name: Set up Python 3.7
      uses: actions/setup-python@v1
      with:
        python-version: 3.7

    - name: Install dependencies
      run: |
        pip install -r requirements.txt

    # Build the book
    - name: Build the book
      run: |
        jupyter-book build .

    # Push the book's HTML to github-pages
    - name: GitHub Pages action
      uses: peaceiris/[email protected]
      with:
        github_token: ${{ secrets.GITHUB_TOKEN }}
        publish_dir: ./_build/html

Maybe another way to frame it, then, is to ask what's the way that people tend to "share" these workflows? Do you think it's enough to say "copy / paste this boilerplate into your own repository" and most users will know what to do from there?

Yeah, I think "copy / paste this boilerplate" is reasonable, in case folks want to customize the boilerplate (there is always some minimal amount of boilerplate for Actions). We could reduce the amount of boilerplate by ~4 lines, but that doesn't seem to buy us much in this case.

I don't assume most users will know what to do or know how Actions work, as that is a system that someone has to learn, but I'm not sure how to abstract Actions as a whole away, as you ultimately do have to look at the YAML files one way or the other.

Let me know if that makes sense or if there is another idea I am missing.

Hey folks, I completely agree: with all the hard work @choldgraf has already put into this, it is really straightforward to set up.

Maybe we could just add a small section to the docs with the boilerplate workflow for those interested in using Actions themselves.

It would be useful to include the miniconda case as well -- here's what we're using to test a build of Allen Downey's "How to think like a computer scientist". This builds for both Windows and Ubuntu on pull requests, and deploys the Ubuntu version on a push:

https://github.com/eoas-ubc/think_jupyter/blob/master/.github/workflows/build_or_deploy.yml

@phaustin I think we could add a short mention of using a conda environment to the gh-pages docs. Want to make a quick PR with the config?

@trallard @hamelsmu thanks for the feedback! So it sounds like we are actually quite close to closing this issue already 🎉

@choldgraf - just as an FYI, I've been auto-building/deploying my books in a few projects now and ended up making a cookiecutter to initiate new projects that includes the above GH Actions workflow .yml file.

I just made PR #814 updating the gh-pages documentation - we may want to add some of the info from this thread into that PR as well?

I really like the cookiecutter! If other folks think it would be useful, want to add a PR to document it as another way people could create a jupyter book? Also, I wonder what you think about donating it to the executablebooks/ organization (and obviously you'd be an admin on that repository as well)?

Sounds good - I didn't want to step on the toes of the jupyter-book create mybookname command which does almost everything you need at the moment. The utility of the CC is that it can pre-populate some of the _config.yml entries, can include GH actions workflows, and comes with some standard open-source project files (e.g., LICENSE, CONTRIBUTING, CONDUCT, etc.). I'd be more than happy to add the CC option to the docs and transfer the repo to the executablebooks org. I'm going to clean it up a little bit first, and can then initiate transferring ownership and making a docs PR :cookie:

@TomasBeuzen hmm - good point about the jupyter-book create thing. Do you think another path forward would be to try to improve the functionality in jupyter-book create so that it can handle the other sorts of use-cases you describe? I think that could all be in-scope for that function (and anyway right now it's basically acting like a cookiecutter but without a dependency on the cookiecutter package) 🤷

@choldgraf - hmm it depends what functionality you want.

If you just want the option to add a GH actions workflow, I think it would be possible to add a .github/workflows/deploy-book.yml file to the book_template directory and then add an optional argument to the CLI that will include this file if a user wants it: e.g., jupyter-book create mybookname --include_gha.
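A rough sketch of what such an opt-in flag could look like follows. It is not Jupyter Book's actual implementation: `argparse` is used here only to keep the example self-contained, the file list is a stand-in for the real book_template contents, and `create_book` and the placeholder workflow text are hypothetical.

```python
import argparse
from pathlib import Path
from typing import List, Optional

WORKFLOW_RELPATH = Path(".github") / "workflows" / "deploy-book.yml"
# Placeholder body; a real template would ship the full workflow file.
WORKFLOW_PLACEHOLDER = "# placeholder: deploy workflow from the template goes here\n"


def create_book(path: Path, include_gha: bool = False) -> List[Path]:
    """Create a minimal book folder and, if requested, add the workflow file."""
    path.mkdir(parents=True, exist_ok=True)
    created = []
    for name in ("_config.yml", "_toc.yml", "intro.md"):
        target = path / name
        target.touch()
        created.append(target)
    if include_gha:
        workflow = path / WORKFLOW_RELPATH
        workflow.parent.mkdir(parents=True, exist_ok=True)
        workflow.write_text(WORKFLOW_PLACEHOLDER)
        created.append(workflow)
    return created


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `<path> [--include_gha]`, mirroring the command suggested above."""
    parser = argparse.ArgumentParser(prog="create-book-sketch")
    parser.add_argument("path", type=Path)
    parser.add_argument("--include_gha", action="store_true")
    return parser.parse_args(argv)
```

Because the flag defaults to off, the existing behavior of creating a book without any workflow file stays unchanged for users who do not ask for it.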

However, if you want to give users the option of creating a full "open-source ready repository", with LICENSE, CONTRIBUTING, CONDUCT, README, _config.yml, workflows, etc files populated with user/book information, then I think it's better to stick with cookiecutter.

Overall, I think it's worth retaining the cookiecutter as a more comprehensive templating option, so I guess the question more becomes, do you want to add some of its functionality to jupyter-book create?

This looks interesting, and may be (indirectly) relevant https://github.com/treebeardtech/treebeard; it uses repo2docker to execute notebooks in a GH Action

@choldgraf - I just updated the cookiecutter. It:

  • Provides the same starter files as jupyter-book create
  • Optionally includes a GH Actions workflow file to auto-deploy the book with ghp-import (the user can choose yes/no to include the files when going through the cookiecutter prompts)
  • Additionally provides standard repo documentation (readme, contributing, license, etc), auto-populated with user-info

Here's the generated structure from cookiecutter git@github.com:UBC-MDS/cookiecutter-jupyter-book.git:

├── .github
│   └── workflows
│       └── deploy.yml
├── CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── my_book
│   ├── _config.yml
│   ├── _toc.yml
│   ├── content.md
│   ├── intro.md
│   ├── logo.png
│   ├── markdown.md
│   ├── notebooks.ipynb
│   └── references.bib
├── README.md
└── requirements.txt

Happy to transfer this over to the Executable Books org if you still want it (and I can document it in the Jupyter Book documentation through a PR). Otherwise I'll leave it as is.

awesome! what does @executablebooks/ebpteam think about bringing in a cookiecutter-style repository for Jupyter Book into the executablebooks/ org, and documenting its use a bit more in the jupyterbook docs?

I am all for it -- particularly with github actions integration.

I guess my only question is should we consider adding some of the features of the cookiecutter repo to the quickstart with a few setup questions:

  1. Do you intend to use Github Actions to build your project (y/n)?

I guess my thinking here is that cookiecutters (i.e., templates in general) can be difficult to set up for new users, and a guided quickstart approach may be a better user interface?

Another option that I just thought of:

  • Move the cookiecutter into this org (and keep @TomasBeuzen as an admin assuming he is willing to keep maintaining it!)
  • Keep the current jb create mybook behavior with one exception:
  • We add a jb create mybook --cookiecutter option that uses an optional cookiecutter dependency and lets you create a book following cookiecutter instead of the "default" copy/paste job that jb create does right now
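The "optional dependency" part of this proposal could be handled roughly as follows. This is a sketch of the pattern only: `pick_creator` and its return values are hypothetical, and the only real API it relies on is `importlib.util.find_spec` from the standard library.

```python
import importlib.util
from typing import Callable


def pick_creator(
    use_cookiecutter: bool,
    find_spec: Callable[[str], object] = importlib.util.find_spec,
) -> str:
    """Choose between the default template copy and the cookiecutter path."""
    if not use_cookiecutter:
        # Default behavior: copy the built-in template, no extra dependency.
        return "template"
    if find_spec("cookiecutter") is None:
        # The optional package is missing, so fail with an actionable message.
        raise RuntimeError(
            "The --cookiecutter option needs the optional 'cookiecutter' "
            "package; install it with: pip install cookiecutter"
        )
    return "cookiecutter"
```

Checking for the package only when the flag is passed means users of the plain `jb create mybook` path never need cookiecutter installed.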

@choldgraf I like that idea -- that leaves a simple jb create for new users and a more advanced option to use a cookiecutter pattern

Okay sounds good to me (and yes happy to maintain!). I'll initiate the transfer in a few days (in case others want to comment here) and I can work on a PR to update the docs and extend the jb create functionality to include a --cookiecutter argument.

@choldgraf - I'm ready to transfer and make a PR to Jupyter-Book. As I don't have permissions to create/transfer repos to executablebooks, are you okay with me transferring to your GH user account so that you can then transfer to executablebooks? (thumbs up if yes)

@TomasBeuzen I've just invited you as a member of this organization, so you should be able to initiate the transfer and accept it yourself! 👍

really cool project !

i'm wondering if there is a repo which we can

  • fork and start editing an ipynb file
  • when we push it to github it creates a beautiful book online which we can share via github pages

that would be really helpful for getting started with basic stuff and get going quickly using JB

reading this issue i got to know that such beauty involves some github actions and stuff which i'm not familiar

any help/pointers much appreciated

thanks! cheers

@helonayala have you tried jupyter-book create mybookpath/ --cookiecutter?

This creates a repo from https://github.com/executablebooks/cookiecutter-jupyter-book, which includes such a github action
