Notebook: kernel dies and restarts repeatedly in a notebook on a remote server

Created on 20 Jul 2020 · 3 comments · Source: jupyter/notebook

Hi.

I have a notebook with a virtualenv environment that runs both remotely and locally. For the remote run, I forward the remote port to a local port and start the server with the --no-browser option. The kernel then dies quite frequently (after only a few minutes of running). I get the following message in the remote terminal when the "dead kernel" happens.

KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel xxxxxx restarted

The same notebook runs under the same virtual environment on the local machine without any problem. What else should I check on the remote machine?
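For reference, the setup described above is roughly the following (the port number, user, and host are placeholders):

```bash
# On the remote machine, inside the virtualenv:
jupyter notebook --no-browser --port=8888

# On the local machine, forward the remote port to a local one,
# then open http://localhost:8888 in the local browser:
ssh -N -L 8888:localhost:8888 user@remote-host
```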
Thanks!


All 3 comments

The auto-restart logic gets triggered when the notebook server can no longer determine that the kernel _process_ is alive. It could be that the process terminated for whatever reason, or that the process is so busy doing work that the poll method gets starved and the server _thinks_ it's no longer alive. So I would check the process (commands like ps -ef | grep jupyter can help isolate the kernel process) and try to determine if it's stable (is the PID consistent?).
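For example, a quick way to check PID stability on the remote machine (assuming the kernel shows up as ipykernel_launcher, which is the usual process name for IPython kernels started by the notebook server):

```bash
# List the kernel process(es); the PID is the second column:
ps -ef | grep ipykernel_launcher | grep -v grep

# Re-check every few seconds; if the PID keeps changing, the kernel
# process is really being killed and restarted, not just busy:
watch -n 5 'ps -ef | grep ipykernel_launcher | grep -v grep'
```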

Given that the kernel runs for a few minutes, I would also try to determine if anything could be associated with the cell being executed at the time. Is it always the same cell in which the restart occurs?

Then, compare environments. Does the remote machine have the same libraries/packages (compare pip freeze outputs), where a deferred reference to a missing package might crash the kernel process? Does the remote machine have at least the same kinds of resources, etc.?
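One way to do that comparison (the file names here are just examples):

```bash
# Inside the virtualenv on each machine:
pip freeze > local-freeze.txt     # on the local machine
pip freeze > remote-freeze.txt    # on the remote machine

# Copy one file next to the other (scp, etc.) and diff them;
# any version mismatch is a candidate cause:
diff local-freeze.txt remote-freeze.txt
```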

> the process (commands like ps -ef | grep jupyter can help isolate the kernel process) and try to determine if it's stable

Indeed, the PID is not consistent for the ipykernel process, but the PID of the Jupyter notebook server is stable.

> could be associated with the cell being executed at the time

Not the exact same cells, but similar types of cells, such as importing large datasets (~GB in size) and operating on millions of samples.

> compare environments

Both machines are set up from the same requirements.txt.

To sum up, the cause appears to be the remote server killing my ipykernel for some unknown reason (it seems to be a memory limitation). I'll come back with an update after some tinkering.
As a reference, these are the versions of the directly Jupyter-related packages:
ipykernel 5.3.3
ipython 7.16.1
ipython-genutils 0.2.0
ipywidgets 7.5.1
jupyter 1.0.0
jupyter-client 6.1.6
jupyter-console 6.1.0
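If the suspicion above is right and memory is the limit, one way to confirm it on the remote machine is to check whether the Linux OOM killer is terminating the kernel (a rough sketch; ipykernel_launcher is the usual kernel process name):

```bash
# Look for OOM-killer entries in the kernel log (may need root):
dmesg | grep -i -E 'out of memory|killed process' | tail

# Watch the kernel's memory usage while the heavy cells run
# (assumes a single running kernel):
top -p "$(pgrep -f ipykernel_launcher | head -n 1)"
```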

Providing feedback after two months of working on the project :)
My problem turned out to be a memory limitation on the remote computer, combined with insufficient optimization in my code.
It's always a problem to assume that the claimed memory size of a remote machine is what you actually get :( When I requested a remote machine with more memory, the dead-kernel problem went away. Conversely, when I push memory usage to the limit on my local machine, I get dead kernels there as well.
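As a quick sanity check, the memory a remote machine actually exposes can be compared against what was requested; a minimal sketch (the cgroup path applies only to cgroup v1 containers):

```bash
# Physical memory actually visible on the machine:
free -h

# Per-process limits imposed on the shell, if any:
ulimit -a

# If the "remote machine" is really a container, the cgroup limit may be
# lower than what free reports:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null
```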

With that, this issue can be closed. Thanks for the help!
