Following this discussion https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/jupyter/8hyGKoBY6O0/RyEDDyOZAQAJ, I'd like to open an issue to track progress on this feature.
The idea would be to add the capability to restore the output of a running kernel after a client reconnects.
Feel free to ask me to move this issue to a more appropriate Jupyter project if needed.
Hey Carreau. I noticed on the Google forum you said the code would need a significant refactor, and MinRK said it involved storing outputs on the server. In my case I'm OK with losing the outputs during the disconnect. I just need to be able to reconnect the browser session to the kernel, to get new outputs and to get at the data and code I've loaded. Kernel/Reconnect doesn't do it, and I've also tried closing the tab and opening a new one from the Jupyter base page.
This is fairly important for me. I live in a rural area with a poor internet connection and my notebook takes about 8 hours to load. I'm pretty well guaranteed to have an interruption during that time.
Still looking into this. It's currently just a problem for me on Jupyter, but I'm more interested in getting it solved for JupyterHub. I'm planning to use it to teach a class at the university, and students losing access to their assignment work because of a dropped connection is not an option.
mosh has solved this problem for terminals, and it seems to me that adapting mosh's SSP (State Synchronization Protocol) with speculation to the Jupyter Notebook's requirements would be the best solution to this problem (see the mosh paper).
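For illustration only, here's a toy sketch of the SSP idea in Python (nothing like mosh's actual implementation; all names are made up): the server versions its state, the client remembers the last version it applied, and after any disconnection the client converges simply by applying the newest version it hasn't seen yet.

class StateSyncServer:
    def __init__(self):
        self.version = 0
        self.snapshots = {0: ""}  # version -> state snapshot

    def update(self, new_state):
        self.version += 1
        self.snapshots[self.version] = new_state

    def latest(self):
        # Real SSP sends a compact diff against the client's last acked
        # version; sending the full latest state is a valid (if wasteful) diff.
        return self.version, self.snapshots[self.version]

class StateSyncClient:
    def __init__(self):
        self.acked = 0
        self.state = ""

    def receive(self, version, state):
        if version > self.acked:  # stale or duplicate packets are ignored
            self.state = state
            self.acked = version  # idempotent: replaying this packet is harmless

Because applying a state is idempotent and old packets are dropped, the client can miss arbitrarily many updates and still end up consistent after reconnecting.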
If you want to hand this problem off to JupyterHub that's ok with me, but others may like to see a solution for Jupyter as well.
This isn't really a JupyterHub problem. It should be fixed in the notebook (specifically JupyterLab), hopefully before too long.
This feature would be great!
Losing the output of a Jupyter notebook when you are not connected to the hosting computer is a problem; you can work around it with some hacks, but having the capability to recover it would be great.
Just a side note: the Apache Zeppelin notebook with the Python interpreter doesn't have this problem, as it handles disconnects or multiple connections during task execution transparently. But it has its own problems: it loses the interactive output of a running cell after a disconnect, although once the task is done it eventually shows all of its output.
Is there any update on this issue?
Not a huge amount beyond the discussion and issues linked above, that I know of.
There is (slow, contributors welcome!) work going on in JupyterLab around real-time collaboration, which involves moving notebook state to the server and would solve this issue. @rgbkrk, @captainsafia and others have also experimented with moving state to the server in the nteract project. And @williamstein's cocalc implementation of a notebook frontend does have this feature: it saves state on the server, so you can reconnect whenever you want and get the full state.
So there have been some initial experiments (jlab, jupyter_server, nteract), and at least one full implementation tied to a platform (cocalc).
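To give a concrete picture of what "moving state to the server" means here, a minimal sketch (hypothetical names, not the actual code of any of those projects): the server keeps a bounded per-kernel buffer of output messages and replays it to any client that (re)connects, before resuming the live stream.

import collections

class OutputStore:
    """Bounded per-kernel buffer of output messages (sketch only)."""
    def __init__(self, maxlen=10000):
        # kernel_id -> deque of messages; the oldest messages fall off the end
        self._buffers = collections.defaultdict(
            lambda: collections.deque(maxlen=maxlen))

    def record(self, kernel_id, msg):
        # called for every output message the server relays from a kernel
        self._buffers[kernel_id].append(msg)

    def replay(self, kernel_id):
        # called when a client (re)connects, before streaming live messages
        return list(self._buffers[kernel_id])

The design question the projects above are working through is essentially where this buffer lives and how it's shared between multiple clients.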
Any news on this? Is there an issue / PR that one can check to track progress on this?
Follow this repository for the related effort: https://github.com/jupyterlab/rtc
Hi friends,
I run into this problem often: the output stream stops (I use Atom/Hydrogen) and I lose visibility into long-running processes.
I have found a bit of a workaround that may help others, but it has to be set up upfront; I haven't found a way to resume output from an already-running process.
The solution requires SSH access, and involves redirecting logging to a file as follows:
import sys
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

log_file = 'results.log'
orig_stdout = sys.stdout          # keep a handle so we can restore it afterwards
sys.stdout = open(log_file, "w")  # everything printed now goes to the log file
clf = RandomizedSearchCV(RandomForestRegressor(random_state=0), param_distributions=param_grid, n_iter=200, cv=3, random_state=42, verbose=51, n_jobs=8)
clf.fit(X, y)  # param_grid, X and y are defined earlier in the notebook; the verbose output lands in results.log
sys.stdout.close()
sys.stdout = orig_stdout          # restore normal output
My example happens to use a RandomizedSearchCV.
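(If you only need the redirection around one block of code, the standard library's contextlib.redirect_stdout does the same thing and restores sys.stdout automatically when the block exits:)

import contextlib

with open(log_file, "w") as f, contextlib.redirect_stdout(f):
    clf.fit(X, y)  # everything printed inside the with-block goes to results.log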
However, buffering of sys.stdout proved to be an issue, so Magnus Lycka's referenced answer was helpful for overriding it:
class Unbuffered(object):
    def __init__(self, stream):
        self.stream = stream
    def write(self, data):
        self.stream.write(data)
        self.stream.flush()
    def writelines(self, datas):
        self.stream.writelines(datas)
        self.stream.flush()
    def __getattr__(self, attr):
        return getattr(self.stream, attr)
and replace the sys.stdout = open(log_file, "w") line above with:
sys.stdout = Unbuffered(open(log_file, "w"))
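(For text-mode files, a similar effect is available from the open call itself: sys.stdout = open(log_file, "w", buffering=1) enables line buffering, flushing on every newline.)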
Now you can SSH into the machine, run docker exec -it container_name bash, and tail -f the log file.
Hope this helps; it seems to be the best solution for now.