Binderhub: Allow for disabling of nbviewer

Created on 1 Apr 2019  路  12Comments  路  Source: jupyterhub/binderhub

Currently, all notebooks from Github show a preview using nbviewer.
In some situations this causes a "404: Not Found" message, even though everythign is correct (e.g. when trying to launch a file that is wrapped inside a Docker image).

For example: https://mybinder.org/v2/gh/crestdsl/crestdsl-demo-binder/master?filepath=GettingStarted.ipynb launches a notebook that is not in the repo, but in the Docker image that is launched by the repo.

I believe that providing an url-argument e.g. &preview=false could be useful in these situations to disable the preview (so users aren't confused).

The concerned code places (as far as I see) are the ParameterizedMainHandler in main.py, where it would be necessary to extract the argument and pass it to the loading.html template. Code to add should be something like preview=self.get_argument('preview', True).

The other part is in the template itself (e.g. near Line 27 ) where a {% if preview %}
can be added. (Line 30 already contains a condition that could also be easily extended...)

This issue is to allow feedback / opposition before starting implementing. Thus:

  1. What do you think?
  2. Is my current though process correct? or is my "very simple" solution not as simple after all?

PS: lacking time right now, so PR will come in a few weeks earliest.

help wanted

All 12 comments

I think there are two ways this could go:

  • we add a URL parameter like this (see #712 for naming)
  • or we make our preview smarter so that it hides itself if it can't show something useful

I don't know how much harder the second one is but it would be cool as it doesn't rely on the creator of the link knowing what they are doing and keeps the URLs shorter.

Is it feasible for BinderHub to make a HEAD request to nbviewer to find out if we would show a 404 or other error page and then decide on hiding the preview?

I think there are two ways this could go:

  • we add a URL parameter like this (see #712 for naming)
  • or we make our preview smarter so that it hides itself if it can't show something useful

Does one exclude the other? An URL param would certainly give power to the people to control their own fate (i.e. even if there was something to useful to show?)

It appears to me that the former is the MVP solution. The more sophisticated one is the "nice-to-have".

As for

as it doesn't rely on the creator of the link knowing what they are doing

I guess the default creator won't be using/needing it (default works fine), while "power-users" get more control.

Adding a URL parameter is probably less coding work than making a request to nbviewer.

However once we introduce the URL parameter we are stuck with it "for life". We can't take it away again or change what preview=.. means/does. So as with all these things coding isn't the bottleneck but getting us to some kind of consensus is :)

What about using ..&nopreview without the need of =False as URL parameter?

I like the simplicity of the URL parameter approach. We can always add the "let's try and guess from the nbviewer response" approach later.

What about using ..&nopreview without the need of =False as URL parameter?

My reasoning behind preview=False was that it would prepare the URL for further (heavy) customization of the preview.
Notebook-branding (preview=http://mybusiness.com/myLogo.jpg), and the needs of the felinophile population (preview=funnyCats.gif) could then be implemented reusing the same, already existing parameter. (How about being shown cat-videos while waiting for the notebook to launch?)

PS: This is not a strong opposition to &nopreview, just asking whether a &preview=XYZ could make sense at some point in the future.

Notebook-branding (preview=http://mybusiness.com/myLogo.jpg), and the needs of the felinophile population (preview=funnyCats.gif)

My initial reaction to that is "what? no!". Maybe someone else can weigh in as well on what they think of the current and future use cases.

Yeah, I don't think allowing users to specify a different preview is something we should do at any point. That should be up to the operator, not the user. Disabling is fine, though. If this is really just preventing the display of 404s on nbviewer, it does seem to me that gating the preview on a successful fetch is the 'right' solution, though user opt-out is a simple band-aid.

Being able to gate the preview is a nice feature. The 404 may cause a client to quit too soon.

I would like to see the nbviewer service URL to be parametrizable. The advantage is that installations inside firewalls that are retrieving notebooks in GHE would have a mechanism for rendering. The current renderer can't reach the notebook, even though the internal binderhub can. This could be fixed if would could configure our own nbviewer service - something that already exists in GHE. At the moment, I don't yet know how GHE does it.

What is GHE?

Making the URL used for the iframe configurable sounds like a "no brainer" aka we should definitely do that. A good first PR? It might take a bit more Python/Jinja templating foo to determine if HEAD <nbviewerURL/path/to/notebook> returns a status 200 and only display the preview if so.

What is GHE?

Sorry, GitHub Enterprise. GitHub, but set up by enterprises, ordinarily not available from the internet.

The same case as the OP happens in scikit-learn:

  • the notebook is in the docker image not in the repo
  • we use filepath in the binder URL to link to a specific notebook

It would be great indeed to be able to disable the preview to prevent user confusion.

More details about the scikit-learn context: https://github.com/scikit-learn/scikit-learn/pull/11221#issuecomment-522472749

This commit in my branch (or see diff below) seem to do what I want, i.e. disable the preview if the nbviewer URL can not be accessed. I tested it locally following https://github.com/jupyterhub/binderhub/blob/master/CONTRIBUTING.md#user-interface-changes.

Diff:

diff --git a/binderhub/main.py b/binderhub/main.py
index 3333108..add51da 100644
--- a/binderhub/main.py
+++ b/binderhub/main.py
@@ -1,6 +1,9 @@
 """
 Main handler classes for requests
 """
+import urllib.request
+import urllib.error
+
 from tornado.web import HTTPError, authenticated
 from tornado.httputil import url_concat
 from tornado.log import app_log
@@ -69,6 +72,12 @@ class ParameterizedMainHandler(BaseHandler):

             blob_or_tree = 'blob' if filepath else 'tree'
             nbviewer_url = f'{nbviewer_url}/{org}/{repo_name}/{blob_or_tree}/{ref}/{filepath}'
+
+            try:
+                urllib.request.urlopen(nbviewer_url)
+            except urllib.error.HTTPError:
+                nbviewer_url = None
+
         self.render_template(
             "loading.html",
             base_url=self.settings['base_url'],

This is a first naive attempt (full disclosure: I don't know that much about Tornado), let me know if the approach seems reasonable enough to move to a PR. In particular I am wondering about:

  • is it fine to do a blocking call like this in .get?
  • is there a better way to do this with Tornado rather than urllib?

With #934 we now don't show the preview if it would contain an error so I'll close this issue.

Was this page helpful?
0 / 5 - 0 ratings