I installed JupyterHub using the version 0.9.0 chart. After installation I am unable to launch a user pod and see the following error:
[W 2020-05-05 01:02:36.919 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`. Did you mean `browser`?
[I 2020-05-05 01:02:37.506 SingleUserNotebookApp extension:158] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 2020-05-05 01:02:37.507 SingleUserNotebookApp extension:159] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2020-05-05 01:02:37.838 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
[E 2020-05-05 01:02:57.847 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://10.43.183.4:8081/hub/api (attempt 1/5). Is it running?
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
resp = await client.fetch(self.hub_api_url)
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
[E 2020-05-05 01:03:19.872 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://10.43.183.4:8081/hub/api (attempt 2/5). Is it running?
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
resp = await client.fetch(self.hub_api_url)
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
[C 2020-05-05 01:03:20.939 SingleUserNotebookApp notebookapp:1615] received signal 15, stopping
Traceback (most recent call last):
File "/opt/conda/bin/jupyterhub-singleuser", line 10, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 660, in main
return SingleUserNotebookApp.launch_instance(argv)
File "/opt/conda/lib/python3.7/site-packages/jupyter_core/application.py", line 268, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 563, in start
ioloop.IOLoop.current().run_sync(self.check_hub_version)
File "/opt/conda/lib/python3.7/site-packages/tornado/ioloop.py", line 526, in run_sync
self.start()
File "/opt/conda/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 148, in start
self.asyncio_loop.run_forever()
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 534, in run_forever
self._run_once()
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 1735, in _run_once
event_list = self._selector.select(timeout)
File "/opt/conda/lib/python3.7/selectors.py", line 468, in select
fd_event_list = self._selector.poll(timeout, max_ev)
File "/opt/conda/lib/python3.7/site-packages/notebook/notebookapp.py", line 1616, in _signal_stop
self.io_loop.add_callback_from_signal(self.io_loop.stop)
AttributeError: 'SingleUserNotebookApp' object has no attribute 'io_loop'
rpc error: code = Unknown desc = Error: No such container: ecf88dc09c1474cc189fe644624a72e1270e03b2c1325ee7ce18f3cbdf944dfa
Expected behaviour: the user pod launches and displays JupyterHub.
Actual behaviour: the user pod does not launch, and the spawn page displays "Spawn failed: Server at http://10.42.16.54:8888/user/ilhaan/ didn't respond in 30 seconds"
helm install jhub jupyterhub/jupyterhub --version=0.9.0 --values=config.yaml
config.yaml contents:

proxy:
  secretToken: "<token>"
  service:
    type: ClusterIP

ingress:
  enabled: true
  hosts:
    - jupyter.ilhaan
  tls:
    - secretName: ilhaan-cert
      hosts:
        - jupyter.ilhaan

singleuser:
  storage:
    # Following will disable persistent storage for users
    type: none
I can reach https://jupyter.ilhaan and log in. I have verified that the hub service is running at 10.43.183.4, as shown in the log output above.
Please let me know how I can fix this.
I don't understand this yet, but I think you have shown logs from the user pod, is that correct?
Looking at the errors, my guess is that the hub pod is crashing after you start a user pod spawn process. Can you provide logs from the hub pod as well?
Also, by the way: if a pod has restarted, you can use kubectl logs .... --previous to get the logs from before the restart, which the restart would otherwise have cleared from the fresh logs.
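For example (the namespace and pod name below are placeholders, adjust both to your deployment):

```shell
# list pods and their restart counts in the JupyterHub namespace
kubectl get pods -n <namespace>
# fetch the logs from before the most recent restart of a pod
kubectl logs <hub-pod-name> -n <namespace> --previous
```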
Here are logs from the hub pod:
Loading /etc/jupyterhub/config/values.yaml
Loading /etc/jupyterhub/secret/values.yaml
[I 2020-05-05 02:00:40.641 JupyterHub app:2240] Running JupyterHub version 1.1.0
[I 2020-05-05 02:00:40.641 JupyterHub app:2271] Using Authenticator: ldapauthenticator.ldapauthenticator.LDAPAuthenticator-1.3.0
[I 2020-05-05 02:00:40.641 JupyterHub app:2271] Using Spawner: kubespawner.spawner.KubeSpawner
[I 2020-05-05 02:00:40.641 JupyterHub app:2271] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.1.0
[I 2020-05-05 02:00:40.642 JupyterHub app:1349] Loading cookie_secret from /srv/jupyterhub/jupyterhub_cookie_secret
[W 2020-05-05 02:00:40.661 JupyterHub app:1579] JupyterHub.hub_connect_port is deprecated as of 0.9. Use JupyterHub.hub_connect_url to fully specify the URL for connecting to the Hub.
[I 2020-05-05 02:00:40.681 JupyterHub app:1655] Not using whitelist. Any authenticated user will be allowed.
10
[I 2020-05-05 02:00:40.936 JupyterHub app:2311] Initialized 0 spawners in 0.003 seconds
[I 2020-05-05 02:00:40.937 JupyterHub app:2520] Not starting proxy
[I 2020-05-05 02:00:40.943 JupyterHub app:2556] Hub API listening on http://0.0.0.0:8081/hub/
[I 2020-05-05 02:00:40.944 JupyterHub app:2558] Private Hub API connect url http://10.43.96.151:8081/hub/
[I 2020-05-05 02:00:40.945 JupyterHub proxy:320] Checking routes
[I 2020-05-05 02:00:40.945 JupyterHub proxy:400] Adding default route for Hub: / => http://10.43.96.151:8081
[I 2020-05-05 02:00:40.948 JupyterHub app:2631] JupyterHub is now running at http://10.43.106.55:443/
[I 2020-05-05 02:00:47.139 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.97ms
[I 2020-05-05 02:00:57.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.74ms
[I 2020-05-05 02:01:07.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.71ms
[I 2020-05-05 02:01:17.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.72ms
[I 2020-05-05 02:01:23.972 JupyterHub log:174] 302 GET / -> /hub/ (@10.8.65.242) 0.83ms
[I 2020-05-05 02:01:24.136 JupyterHub reflector:199] watching for pods with label selector='component=singleuser-server' in namespace jupyterhub
[I 2020-05-05 02:01:24.213 JupyterHub reflector:199] watching for events with field selector='involvedObject.kind=Pod' in namespace jupyterhub
[I 2020-05-05 02:01:24.215 JupyterHub log:174] 302 GET /hub/ -> /hub/spawn ([email protected]) 181.81ms
[W 2020-05-05 02:01:24.939 JupyterHub base:950] User ilhaan is slow to start (timeout=0)
[I 2020-05-05 02:01:24.940 JupyterHub log:174] 302 GET /hub/spawn -> /hub/spawn-pending/ilhaan ([email protected]) 701.55ms
[I 2020-05-05 02:01:25.059 JupyterHub pages:347] ilhaan is pending spawn
[I 2020-05-05 02:01:25.081 JupyterHub log:174] 200 GET /hub/spawn-pending/ilhaan ([email protected]) 26.30ms
[I 2020-05-05 02:01:27.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.71ms
[I 2020-05-05 02:01:37.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.77ms
[I 2020-05-05 02:01:40.952 JupyterHub proxy:320] Checking routes
[I 2020-05-05 02:01:47.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.69ms
[I 2020-05-05 02:01:57.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.75ms
[I 2020-05-05 02:02:07.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.72ms
[W 2020-05-05 02:02:11.938 JupyterHub user:692] ilhaan's server never showed up at http://10.42.16.57:8888/user/ilhaan/ after 30 seconds. Giving up
[I 2020-05-05 02:02:11.938 JupyterHub spawner:1866] Deleting pod jupyter-ilhaan
[I 2020-05-05 02:02:17.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.73ms
[E 2020-05-05 02:02:18.348 JupyterHub gen:599] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py:845> exception=TimeoutError("Server at http://10.42.16.57:8888/user/ilhaan/ didn't respond in 30 seconds",)> after timeout
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 593, in error_callback
future.result()
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 852, in finish_user_spawn
await spawn_future
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 668, in spawn
await self._wait_up(spawner)
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 715, in _wait_up
raise e
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 683, in _wait_up
http=True, timeout=spawner.http_timeout, ssl_context=ssl_context
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 234, in wait_for_http_server
timeout=timeout,
File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 177, in exponential_backoff
raise TimeoutError(fail_message)
TimeoutError: Server at http://10.42.16.57:8888/user/ilhaan/ didn't respond in 30 seconds
[I 2020-05-05 02:02:18.350 JupyterHub log:174] 200 GET /hub/api/users/ilhaan/server/progress ([email protected]) 52268.94ms
[I 2020-05-05 02:02:27.138 JupyterHub log:174] 200 GET /hub/health (@10.192.8.138) 0.91ms
And here are logs from the user pod for the same session shown above:
[W 2020-05-05 02:01:29.948 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`. Did you mean `browser`?
[I 2020-05-05 02:01:30.552 SingleUserNotebookApp extension:158] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 2020-05-05 02:01:30.552 SingleUserNotebookApp extension:159] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2020-05-05 02:01:30.878 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
[E 2020-05-05 02:01:50.885 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://10.43.96.151:8081/hub/api (attempt 1/5). Is it running?
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
resp = await client.fetch(self.hub_api_url)
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
[C 2020-05-05 02:02:12.233 SingleUserNotebookApp notebookapp:1615] received signal 15, stopping
Traceback (most recent call last):
File "/opt/conda/bin/jupyterhub-singleuser", line 10, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 660, in main
return SingleUserNotebookApp.launch_instance(argv)
File "/opt/conda/lib/python3.7/site-packages/jupyter_core/application.py", line 268, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 563, in start
ioloop.IOLoop.current().run_sync(self.check_hub_version)
File "/opt/conda/lib/python3.7/site-packages/tornado/ioloop.py", line 526, in run_sync
self.start()
File "/opt/conda/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 148, in start
self.asyncio_loop.run_forever()
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 534, in run_forever
self._run_once()
File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 1735, in _run_once
event_list = self._selector.select(timeout)
File "/opt/conda/lib/python3.7/selectors.py", line 468, in select
fd_event_list = self._selector.poll(timeout, max_ev)
File "/opt/conda/lib/python3.7/site-packages/notebook/notebookapp.py", line 1616, in _signal_stop
self.io_loop.add_callback_from_signal(self.io_loop.stop)
AttributeError: 'SingleUserNotebookApp' object has no attribute 'io_loop'
These are the only relevant log lines from the user pod; the rest, starting with signal 15 (SIGTERM), relate to the shutdown sequence that followed the failure to start up correctly.
[W 2020-05-05 02:01:29.948 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`. Did you mean `browser`?
[I 2020-05-05 02:01:30.552 SingleUserNotebookApp extension:158] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 2020-05-05 02:01:30.552 SingleUserNotebookApp extension:159] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2020-05-05 02:01:30.878 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
[E 2020-05-05 02:01:50.885 SingleUserNotebookApp singleuser:438] Failed to connect to my Hub at http://10.43.96.151:8081/hub/api (attempt 1/5). Is it running?
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/jupyterhub/singleuser.py", line 432, in check_hub_version
resp = await client.fetch(self.hub_api_url)
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
So, why wasn't the pod able to communicate with the hub? I don't know, but I suspect it relates to your Kubernetes setup. Perhaps there is some policy in your Kubernetes cluster blocking network traffic, or something similar?
To debug this further, I would recommend doing the following instead of spawning a user server.
# create a pod from where you can use curl
kubectl run -it --rm --image curlimages/curl --restart=Never -n my-jhub-namespace my-debug-pod -- sh
# try manually curling the hub pod
curl http://$HUB_SERVICE_HOST:$HUB_SERVICE_PORT/hub/api
Do you get a response of {"version": "1.1.0"} back? If not, that is the core issue: why isn't the hub pod reachable? I would then debug further by curling the hub from within the hub pod itself.
# get a process started in the hub pod and run a shell
kubectl exec -it -n my-jhub-namespace deploy/hub -- bash
# use it to curl the jupyterhub server from within the container
curl http://localhost:8081/hub/api
If a curl fails, add --verbose to get more information about why.
curl test:
➜ ~ kubectl run -it --rm --image curlimages/curl --restart=Never -n jupyterhub my-debug-pod -- sh
/ $ curl http://$HUB_SERVICE_HOST:$HUB_SERVICE_PORT/hub/api
{"version": "1.1.0"}
➜ ~ kubectl exec -it -n jupyterhub deploy/hub -- bash
jovyan@hub-7dff48fccb-zrvhx:/srv/jupyterhub$ curl http://localhost:8081/hub/api
{"version": "1.1.0"}
With --verbose:
jovyan@hub-7dff48fccb-zrvhx:/srv/jupyterhub$ curl --verbose http://localhost:8081/hub/api
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8081 (#0)
> GET /hub/api HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: TornadoServer/6.0.4
< Content-Type: application/json
< Date: Tue, 05 May 2020 02:51:33 GMT
< X-Jupyterhub-Version: 1.1.0
< Access-Control-Allow-Headers: accept, content-type, authorization
< Content-Security-Policy: frame-ancestors 'self'; report-uri /hub/security/csp-report; default-src 'none'
< Etag: "cc4735fefe4e49d11f9faa6bd5681e7de3c1f00f"
< Content-Length: 20
<
* Connection #0 to host localhost left intact
{"version": "1.1.0"}
Hmmm... Eh... I don't get it =/ I would check whether there are any NetworkPolicy objects, plus a network policy controller like Calico that could be enforcing them.
If so, I would perhaps also try to mimic all the labels that the jupyter user pod had, and curl the hub from a pod carrying those labels, hm... How the k8s cluster is set up is likely relevant in general. I'm very confident this won't reproduce for me; I've also run 0.9.0 without issues for a while.
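If you want to try the label-mimicking idea, a rough sketch (the label values and namespace below are assumptions; verify the real labels on an actual user pod first, and note that HUB_SERVICE_HOST/HUB_SERVICE_PORT are the service environment variables Kubernetes injects, as used in the earlier curl test):

```shell
# check what labels an actual user pod carries (while one is spawning)
kubectl get pods -n jupyterhub --show-labels

# hypothetical: start a curl pod carrying the user-pod labels, so any
# label-matched NetworkPolicy treats it like a singleuser-server pod;
# single quotes keep the env vars for expansion inside the pod, not locally
kubectl run -it --rm --restart=Never -n jupyterhub \
  --image=curlimages/curl \
  --labels="component=singleuser-server,app=jupyterhub" \
  labeled-debug-pod -- \
  sh -c 'curl --max-time 10 "http://$HUB_SERVICE_HOST:$HUB_SERVICE_PORT/hub/api"'
```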
Ok. I had 0.9.0-beta.3 running for two months before this, so 0.9.0 not working is strange.
I’ll dig around more and keep you posted. Thanks!
(A small nudge: this discussion looks like a support question; it is unlikely this functionality is broken in Z2JH, and more likely it depends on the user setup. We are trying to move all discussions that aren't about changes to the repository contents to the forum, which is more accessible, has more eyes watching, and is better indexed by search engines. I don't know if it makes sense to move this conversation now; because this change to our comms strategy is new, I thought it was worth nudging you for the future. :))
Ok so this did turn out to be a networking issue on my end and has been resolved.
Nice! Thanks for reporting back and closing the issue @ilhaan!
Quite often someone else finds their way back to a discussion like this and asks what the problem actually was. Perhaps you could briefly describe what you learned, @ilhaan? I'm also curious to learn what could have caused this in a k8s cluster setup.
@consideRatio Sure!
I use Rancher to manage and deploy k8s clusters. I have JupyterHub configured to deploy user pods on a specific node in my cluster. This node had to be removed from the cluster and re-added a few days ago (for reasons outside the scope of our discussion here). When this was done, my node clean-up process did not remove the network interfaces and iptables entries from the first time the node was in the cluster. This caused all new pods started on that node to have various networking issues.
If anyone is curious, Rancher has a detailed node clean-up process doc here, and its section on cleaning up network components is what helped resolve my networking issues.
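For reference, a hedged sketch of how you might spot such stale state on an affected node (run on the node itself; interface and chain names vary by CNI plugin and kube-proxy mode, so treat these patterns as illustrative, not exact):

```shell
# leftover virtual interfaces from the node's previous cluster membership
ip link show | grep -E 'flannel|cni|cali|veth'

# leftover NAT/filter rules installed by the old CNI plugin or kube-proxy
sudo iptables-save | grep -E 'CNI|KUBE|cali' | head

# cached CNI configuration and IP allocations from the old cluster
ls /etc/cni/net.d /var/lib/cni 2>/dev/null
```

If these show entries from the old cluster after the node has been removed, they need to be cleaned up (per the Rancher doc) before re-adding the node.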
Ahh!! Thanks for the follow up @ilhaan!