I'd like to request manual kernel management, following two principles: no kernel is ever started without the user's explicit intent, and no kernel is ever shut down without the user's intent.
I just switched to using vscode about a month ago, and I find myself missing a large number of kernel management features that I'm used to from Atom+Hydrogen.
Here are some undesirable behaviors that I've observed:
All of these behaviors are driving me crazy. I just want a simple mode of operation where no kernel will be started without my consent, and no kernel will ever be shut down without my consent. I understand how the current behavior may be useful as a default, but there's not even an option for an alternative. For example in Atom+Hydrogen, using automatic kernel detection will trigger automatic kernel management just like vscode, but the moment you manually specify a Jupyter server URL it will switch to manual management.
Why is all of this a big issue for my workflow? Well, I work on machine learning, which seems to be well in scope of "data science" tools. My ML workloads have two key characteristics: (a) training can take hours or days and is not robust to unanticipated restarts, and (b) a GPU must be allocated for all python processes. Automatic shutdowns can cause me to lose up to days of work if I haven't saved the model recently. Automatic startup and shutdown is also an issue because it creates a messy interaction with GPU scheduling/allocation. With Atom+Hydrogen things are so much simpler: I can start a kernel manually, give it a GPU allocation, and never have to worry about the tools creating extraneous kernels or deleting all my work when I didn't intend for that to happen.
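Concretely, the manual workflow looks something like this (the GPU index, port, and flags are just placeholders for my setup; adjust for yours):

```shell
# Pin a manually started Jupyter server -- and every kernel it spawns --
# to a single GPU via the environment. Nothing here is started or stopped
# by the editor; the server outlives any client that connects to it.
CUDA_VISIBLE_DEVICES=2 jupyter notebook --no-browser --port=8888
```

The editor then only ever *attaches* to this server, so the GPU allocation and the kernel's lifetime stay entirely under my control.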
Microsoft Data Science for VS Code Engineering Team: @rchiodo, @IanMatthewHuff, @DavidKutu, @DonJayamanne, @greazer, @joyceerhl
Thanks for filing this issue. We'll discuss these feature requests in our triage meeting.
I'd like to voice agreement, as I have the same frustrations, as do others I have introduced vscode notebooks to. Perhaps the key issue for me is a subset of @nikitakit's concerns: it's extremely difficult to have a persistent kernel across ssh disconnects, vscode restarts, closing the jupyter tab, etc.
Ideally, I would like to use VSCode's jupyter notebooks the way I use Jupyter notebooks in the browser: When I open a notebook, a kernel is started which is associated with that notebook. Regardless of whether I close the tab, close my browser, or restart my laptop (assuming the Jupyter server is running remotely), when I re-open the notebook, the same kernel will be used. This is incredibly helpful, even in simple settings where code may take ~15 mins to run and I want to put my laptop to sleep and get coffee. (Please let me know if there is an existing way to do this, or if this is the wrong place to post!)
Theoretically (I haven't tested it), if you start the notebook server yourself and then pick it as a remote server, the notebook should reconnect to the same kernel on reopening (we save the live kernel id for remote sessions).
The problem with implementing this the way Jupyter does is that there's nothing that would shut down the server (if we start it). In your example of using the browser, you (or somebody else) started the Jupyter server. You're responsible for shutting it down. In the case where we start the jupyter server (or just the kernel, as we do now), we need to close it down at some point.
Would it work if we didn't close kernels on notebook close, but rather only when shutting down VS Code? I guess I'm asking: when you take a break from VS Code, do you leave it running? I believe if we didn't shut down kernels on notebook close, putting the laptop to sleep wouldn't shut down the kernel (could be wrong though; it depends upon what VS Code does to the extension host process on sleep).
Thanks for the quick reply! Maybe this is a bug or user error, because I tried the following before posting:
1) Started a Jupyter server remotely and pointed VSCode at it as a remote server
2) Opened a notebook and ran some cells to set some variables
3) Closed the notebook tab
4) Re-opened the notebook
My understanding is that (2) spawned the kernel, (3) closed the kernel, and (4) spawned a new kernel. Concretely, (4) did not open with the same python state (e.g. variables) as (2) was in before I closed it.
Usually, I don't explicitly close VS code when I take a break. However, when my laptop goes to sleep, my ssh disconnects, and reloading VSCode is a quick solution to reconnecting (as suggested by this dialog)

Ideally, maybe VSCode should only shut down a kernel if it is also going to shut down the server. Thus, a notebook->kernel mapping is created per server, and is never changed unless the user explicitly requests a new kernel or shuts down the server (manually, or automatically because the server was started by VSCode). Does this make sense? I'm not super familiar with Jupyter servers and kernels except as a user, so I am likely to be missing key issues, but hopefully this provides an understanding of the behavior I think would be useful.
Edit: Note, this was with VSCode-Insiders using the "preview notebook." I can retry this with the standard notebook in a bit.
I believe preview-notebook doesn't work with remote so well. Can you try with the original notebook editor?
What should have happened is:
1) Server is created
2) We create a kernel for an empty notebook if we can't find a match. Make sure you 'save' the notebook.
3) We don't touch the session (kernel should still be running)
4) Open the ipynb and it should load the kernel using the live session id.
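You can verify the flow above from outside VS Code by asking the server itself what's live (the port and token below are placeholders for your own server):

```shell
# List the server's live sessions; each entry pairs a notebook path with
# its kernel id. After step 4, the notebook should still be attached to
# the same kernel id it had before you closed the tab.
curl -s "http://localhost:8888/api/sessions?token=<your-token>"
```

If the kernel id changes across a close/reopen, the client spawned a fresh kernel rather than reconnecting.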
> Thus, a notebook->kernel mapping is created per server, and is never changed unless the user explicitly requests a new kernel or shuts down the server (manually, or automatically because the server was started by VSCode).
This should already be happening for remote servers. We only kill kernels for local (owned by us) sessions. For local, we currently close the kernels when closing the window associated with a notebook. We could delay this until the whole of VS code shuts down.
Huh, you're totally right, my bad. I just tried steps 1-4 again with the ~~preview notebook~~ standard notebook, and even tried quitting vscode entirely, reloading vscode, and toggling my wifi. In all cases the variables I instantiated in step (2) were maintained when I re-opened the notebook in step (4). I'm not really sure what happened the last time I tried this, so I'll keep trying this and report back if it happens again. Thank you for the help!
Edit: I originally said it was the preview notebook, but I was actually using the standard one. This does not work with the preview notebook.
Quick update: I realized I was using the standard notebook, not the preview notebook when I posted above. The preview notebook doesn't seem to use the same kernel after a reload; is there an existing bug I can follow for this? I tried searching but couldn't find it.
This one I believe
https://github.com/microsoft/vscode-python/issues/13265
Sorry, that's the overall 'kernel' fixup for native notebooks. This one is specifically about remembering remote kernel ids:
https://github.com/microsoft/vscode-python/issues/13249
> In your example of using the browser, you (or somebody else) started the Jupyter server. You're responsible for shutting it down.
I would like to point out that I'm using vscode with a jupyter notebook server I started myself (I just provide the extension with the URL including a token), and I still lose all my work whenever the SSH extension forces an editor reload. I'm using the Python interactive window.
> I would like to point out that I'm using vscode with a jupyter notebook server I started myself (I just provide the extension with the URL including a token), and I still lose all my work whenever the SSH extension forces an editor reload. I'm using the Python interactive window.
Yes, the interactive window does not reuse kernels. It would be weird if it did. Perhaps we can detect the editor reload case, but I'm not sure.
> Yes the interactive window does not reuse kernels. It would be weird if it did.
I'm sad to hear this. 😢
As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.
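For anyone unfamiliar with the format, a cell-delimited Python file is just a plain script with `# %%` comment markers, so it works both as a normal script and cell-by-cell in editors that understand the markers. A trivial example:

```python
# A plain .py file: runs top-to-bottom with "python script.py", and any
# "# %%" marker starts a new cell for editors that execute cells
# against a kernel.

# %% Load some data
values = [1, 2, 3]

# %% Compute a summary
total = sum(values)
print(total)  # prints 6
```

Because it's an ordinary text file, it diffs cleanly in version control and needs no Jupyter tooling to run.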
> As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.
Did Hydrogen leave the kernel around even after closing a python file? I think we'd have to do some other way of closing/creating kernels then. Some sort of kernel management story outside of what we do today.
> As someone who used Hydrogen before, I've seen firsthand the benefits of working with plain python files (that have cells delimited by # %%) over the ipynb format (which works poorly for version control, running as a standalone script, and collaborating with people who don't use Jupyter). At this point I could never go back to using notebooks except for simple solo projects that never outgrow a single file.
> Did Hydrogen leave the kernel around even after closing a python file? I think we'd have to do some other way of closing/creating kernels then. Some sort of kernel management story outside of what we do today.
When connected to an outside notebook server, kernels would remain open until you shut them down manually (or stopped the notebook server itself).
How did it know which kernel to use for a file? Did it associate one per python file?
> How did it know which kernel to use for a file? Did it associate one per python file?
I used hydrogen in a kernel-per-file mode. When you first open a file, it starts out with no associated kernel. Then you can run a command to open up a kernel switcher that lets you select what Jupyter server you want and whether you want to connect to an existing kernel or spawn a new one.
I believe hydrogen also had a global setting for having a single active Python kernel per editor window. With this setting enabled, the Python kernel would be reused automatically, but you wouldn't be able to get a second one without opening a new editor window. (The "one kernel" restriction in this mode is actually per programming language, so you could still have a python kernel for python files and then a Julia kernel in addition to that)
> Huh, you're totally right, my bad. I just tried steps 1-4 again with the ~~preview notebook~~ standard notebook, and even tried quitting vscode entirely, reloading vscode, and toggling my wifi. In all cases the variables I instantiated in step (2) were maintained when I re-opened the notebook in step (4). I'm not really sure what happened the last time I tried this, so I'll keep trying this and report back if it happens again. Thank you for the help!
For completeness (apologies for deviating from the original issue), I'd like to update and say this seems to only work if the _kernel_ was started outside VSCode (e.g., by opening the notebook in the browser once and then attaching to that kernel manually). Even if the jupyter server is started outside of VSCode, but the kernel is started by VSCode when opening a file, VSCode kills the kernel when the tab is closed.
> I used hydrogen in a kernel-per-file mode. When you first open a file, it starts out with no associated kernel. Then you can run a command to open up a kernel switcher that lets you select what Jupyter server you want and whether you want to connect to an existing kernel or spawn a new one.
This sounds like an incredibly useful feature. The main reason I'm trying to use notebooks instead of the interactive window is because I need persistence, and I figured notebooks would be easier to make persistent than interactive windows.
However, as far as I can tell, the hacky fix for both notebooks and interactive windows is to start the python _kernel_ (not just the server) outside of VSCode and manually attach to that kernel. For notebooks, this setting will be remembered across sessions. For interactive windows, you'll need to re-attach to that kernel every time the tab or window is closed, but VSCode won't kill the kernel, at least on my version (1.50.0-insider, python extension 2020.9.114305). Not sure if there's an easier way.
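For reference, one way to start a kernel outside VSCode without round-tripping through the browser UI is to ask the server directly over its REST API (port, token, and kernel name below are placeholders for your setup):

```shell
# Start a new kernel on an already-running Jupyter server. The response
# includes the new kernel's id, which you can then attach to from the
# editor's kernel picker. Since the editor didn't create this kernel,
# it won't kill it when the tab closes.
curl -s -X POST "http://localhost:8888/api/kernels?token=<your-token>" \
     -H "Content-Type: application/json" \
     -d '{"name": "python3"}'
```

This gets the "editor never owns the kernel" behavior without opening the notebook in a browser first.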