Gitea: Client side rendering of .ipynb has not been ported from gogs

Created on 1 Jan 2018  路  13Comments  路  Source: go-gitea/gitea

Description

The client-side notebook preview functionality has not been backported from the original gogs repo.

https://try.gogs.io/niklas/ipynb-test/src/master/README.ipynb
https://try.gitea.io/memetb/test-ipynb/src/branch/master/Untitled.ipynb

Issue #3025 mentions using RENDER_COMMAND = jupyter nbconvert --to html --stdin --stdout however, there are multiple reasons why this can be problematic:

  • server side-rendering of jupyter notebooks requires the installation of anaconda. This defeats the purpose of having a lightweight executable that can be deployed on a raspberry pi.
  • executing jupyter on large notebooks is slow.
  • having to execute a python interpreter on user data is not safe and would require chrooting the whole thing if a sys-admin needs to be thorough.
kinproposal

Most helpful comment

It would be relatively easy to add ipynb support through an external markup renderer (e.g. using jupyter nbconvert) when #3025 is resolved.
A vanilla version would call the command every time the notebook has to be rendered. Next (and very reasonable) step would be to cache the rendered file until the source file is changed (e.g. scoping the cache to the source file hash). I think the approach is valid for all custom markup rendering, not only jupyter.

I think the right place for the caching layer is inside the custom renderer (which could be in its own docker container). To enable this, we need gitea to pass not only the file content and URLs but also the commit ID to the renderer.

All 13 comments

Gogs uses notebookjs for displaying jupyter notebooks. I still need to research whether notebookjs allows the execution of arbitrary code (like JS code), and in the case it does, it's a no-go and we should instead consider making (or finding) a parser that only displays markup and does not run any user code.

Ok, I've uploaded a version of the notebook with javascript and style sheet injection. The quick answer is that, as configured, it does allow for execution of code. However, this doesn't constitute an audit of the code, perhaps there's a way to disable it.

https://try.gitea.io/memetb/test-ipynb/src/branch/master/Untitled.ipynb
https://try.gogs.io/testbbar/test-ipynb/src/master/Untitled.ipynb

Looking at nbpreview which makes use of notebookjs, I can see that it's possible to remove js. Although I'm not yet sure how it's being done.

Yeah, it seems like code generated from notebookjs should be run through an HTML cleaner/purifier/sanitizer first.

nbpreview is doing something to this effect. The thing works as a standalone html file, so there may be client side library parameters which allow for this.

Either that or it's buggy and not behaving the way it wants to.

Further info:

This is the feature being proposed on gogs.

It fails if I put javascript in the dom events. You're right, a server side javascript filter will likely be necessary.

nbviewer seems to do what we're wanting. (Other link leads to a broken preview).

To test it, just download the ipynb from the above repos and drag and drop.

Fyi, I've attached what the original file looks like (fwiw, none of the three previews so far have rendered the style sheet - which isn't necessarily a bad thing)
screenshot-2018-1-1 untitled

It would seem jupyter themselves (c.f. page 34) advocate for the use of Google Caja. So far, from my searching, I haven't found any other library which does what Caja claims to do.

It would also appear caja has a standalone js sanitizer.

Let me know if you require further research.

@memetb Could you share your config for rendering jupyter notebook files?

@lstolcman I'm not sure what you're asking for. Do you mean standalone? I'm using a standard out of the box anaconda installation, if that's what you mean.

is it planned yet ? It was working on gogs. I had migrated to gitea and ipynb rendering stopped working.

It would be relatively easy to add ipynb support through an external markup renderer (e.g. using jupyter nbconvert) when #3025 is resolved.
A vanilla version would call the command every time the notebook has to be rendered. Next (and very reasonable) step would be to cache the rendered file until the source file is changed (e.g. scoping the cache to the source file hash). I think the approach is valid for all custom markup rendering, not only jupyter.

I think the right place for the caching layer is inside the custom renderer (which could be in its own docker container). To enable this, we need gitea to pass not only the file content and URLs but also the commit ID to the renderer.

"Rendering of jupyter notebook 'rich output' is still missing some capabilities"
https://gitlab.com/gitlab-org/gitlab-ce/issues/32784#note_97778959

...

You may find https://github.com/go-gitea/gitea/pull/12261 and this draft tutorial helpful.

@HarvsG using nbconvert is not a satisfiable solution, due to all the reasons that were mentioned in the first post (security, performance, dependencies). A solution built around a JS library like nbviewer.js would be preferable.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

lunny picture lunny  路  3Comments

jorise7 picture jorise7  路  3Comments

thehowl picture thehowl  路  3Comments

jonasfranz picture jonasfranz  路  3Comments

jakimfett picture jakimfett  路  3Comments