Zero-to-jupyterhub-k8s: 503: service unavailable

Created on 20 Dec 2019 · 8 Comments · Source: jupyterhub/zero-to-jupyterhub-k8s

JupyterHub is configured and working nicely for all users. Occasionally, though, one of the users hits the following error:

503-service-unavailable

His pod is not running, and there are no errors in the pods. It is unclear how we can clear this error and restart his notebook.

status

cslovell

👍1

All 8 comments

Try to capture the logs of the hub when you notice this happening. Also, a question right away: did this happen after a long period of inactivity, or while doing something very compute-intensive that may have caused an out-of-memory eviction of the pod?
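One way to gather this information is sketched below with kubectl. The namespace `jhub` is a placeholder, and `deploy/hub` assumes the chart's usual hub deployment name; adjust both for your cluster.

```shell
# Placeholder namespace (an assumption); replace with your release's namespace.
NAMESPACE="${NAMESPACE:-jhub}"

# Print the hub's logs from the last two hours.
# Assumes the hub Deployment is named "hub".
hub_logs() {
  kubectl logs deploy/hub --namespace "$NAMESPACE" --since=2h
}

# List recent events in the namespace, oldest first.
# Pod evictions are typically reported here, but events expire after a while.
recent_events() {
  kubectl get events --namespace "$NAMESPACE" --sort-by=.lastTimestamp
}
```

For example, `hub_logs > hub.log` saves the output to a file that can be attached to the issue.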

consideRatio on 20 Dec 2019

Thanks for the fast reply. Here are the logs for this user over the two hours in which things seemed to be working and then stopped working.

I've reached out to the user to see what s/he was doing when this happened--will update. Thanks again.

james-bond-logdump.txt

cslovell on 20 Dec 2019

If I wanted to get this started for him again, would it simply be enough to restart the "hub" pod?

cslovell on 20 Dec 2019

@cslovell I don't know which version you are using, but yes, that is a good measure in 0.8.2, for example, while in the latest development release you may not need to do anything.
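A minimal sketch of restarting the hub pod with kubectl follows. The namespace `jhub` is a placeholder, and the `component=hub` label selector is an assumption about how the chart labels the hub pod; check with `kubectl get pods --show-labels` first.

```shell
# Placeholder namespace (an assumption); replace with your release's namespace.
NAMESPACE="${NAMESPACE:-jhub}"

# Delete the hub pod; its Deployment then creates a replacement.
# Assumes the hub pod carries the label component=hub.
restart_hub() {
  kubectl delete pod --selector component=hub --namespace "$NAMESPACE"
}
```

The hub is briefly unavailable while the replacement pod starts.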

From the logs, it seems the hub culls the user's server for being inactive after about 24 minutes. This is configurable, and I'd use a longer duration for most JupyterHub deployments I can imagine, perhaps 1 hour. Aside from that, there seems to have been an issue shutting the pod down. I don't know why, and I'm not 100% sure about this, but the hub appears to have entered a problematic state.

If you are using an old version of the chart, though, I think several of these problems have since been resolved. Please use the latest development version of the chart (https://jupyterhub.github.io/helm-chart/) and report back if issues persist. I don't think there should be an issue upgrading from 0.8.2; we test for that to some degree at least.
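The upgrade and a longer cull timeout could be applied together, roughly as sketched here. The release name and namespace `jhub` and the values file `config.yaml` are placeholders, the chart version is passed in rather than guessed, and `cull.timeout` (in seconds) is the chart setting assumed to control the idle timeout.

```shell
# Placeholder names (assumptions); replace with your own release and namespace.
NAMESPACE="${NAMESPACE:-jhub}"
RELEASE="${RELEASE:-jhub}"

# Upgrade the release to the chart version given as the first argument
# and set the idle cull timeout to one hour (3600 seconds).
upgrade_hub() {
  chart_version="$1"
  # Register the chart repository (can be skipped if already added).
  helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
  helm repo update
  helm upgrade "$RELEASE" jupyterhub/jupyterhub \
    --namespace "$NAMESPACE" \
    --version "$chart_version" \
    --values config.yaml \
    --set cull.timeout=3600
}
```

Call it as `upgrade_hub <version>` with a version listed in the chart repository. Setting `cull.timeout` in `config.yaml` instead of using `--set` keeps the value under version control.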

consideRatio on 20 Dec 2019

👍1

Excellent, I'll give that a shot. I'm on 0.9-03215dd for the moment. I'll look into 0.9.0-alpha.1.095.3e95dc3 as it's the latest version. Definitely will look into configuring the culling to at least one hour.

cslovell on 20 Dec 2019

👍1

Got it working - upgrading seemed to have worked. I've upped the culling to be a bit longer than an hour. I'll try to work with the analyst to see what happened. And by the way, great job with this project--we're all very grateful for it.

cslovell on 20 Dec 2019

❤1

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/weve-been-seeing-503s-occasionally-for-user-servers/3824/2

meeseeksmachine on 12 Jul 2020

👍1

Yes, in particular, the attached log.

jamesbond-inactive-to-503-hub-interface.txt