Our JupyterHub deployment frequently has issues with Google OAuth.
After initialising the hub pod, authentication works for a period (maybe 12 hours or so) and then the following messages appear in the logs of the hub pod (some details redacted, e.g. `__FOO__`), after which _all_ authentication fails (returning a 500) until the pod is restarted:
```
[E 2019-01-01 21:36:41.446 JupyterHub web:1670] Uncaught exception GET /hub/oauth_callback?state=__STATE__%3D%3D&code=__CODE__&scope=openid+email+https://www.googleapis.com/auth/plus.me+https://www.googleapis.com/auth/userinfo.email&authuser=0&hd=example.com&session_state=__SESSION_STATE__..3e5c&prompt=none (10.142.0.2)
    HTTPServerRequest(protocol='https', host='datascience.example.com', method='GET', uri='/hub/oauth_callback?state=__STATE__%3D%3D&code=__CODE__&scope=openid+email+https://www.googleapis.com/auth/plus.me+https://www.googleapis.com/auth/userinfo.email&authuser=0&hd=example.com&session_state=__SESSION_STATE__..3e5c&prompt=none', version='HTTP/1.1', remote_ip='10.142.0.2')
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/web.py", line 1592, in _execute
        result = yield result
      File "/usr/local/lib/python3.6/dist-packages/oauthenticator/oauth2.py", line 182, in get
        user = yield self.login_user()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 473, in login_user
        authenticated = await self.authenticate(data)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/auth.py", line 257, in get_authenticated_user
        authenticated = await maybe_future(self.authenticate(handler, data))
      File "/usr/local/lib/python3.6/dist-packages/oauthenticator/google.py", line 64, in authenticate
        code=code)
    tornado.auth.AuthError: Google auth error: HTTP 599: gnutls_handshake() failed: An unexpected TLS packet was received.
[E 2019-01-01 21:36:41.449 JupyterHub log:150] {
      "Cookie": "oauthenticator-state=\"2|1:0|10:1546378600|20:oauthenticator-state|120:XXXXXXXX==|YYYYYYYY\"",
      "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
      "Accept-Encoding": "gzip, deflate, br",
      "Referer": "https://datascience.example.com/hub/login",
      "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
      "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
      "Upgrade-Insecure-Requests": "1",
      "X-Scheme": "https",
      "X-Original-Uri": "/hub/oauth_callback?state=__STATE__%3D%3D&code=__CODE__&scope=openid+email+https://www.googleapis.com/auth/plus.me+https://www.googleapis.com/auth/userinfo.email&authuser=0&hd=example.com&session_state=__SESSION_STATE__..3e5c&prompt=none",
      "X-Forwarded-Proto": "https,http",
      "X-Forwarded-Port": "443,80",
      "X-Forwarded-Host": "datascience.example.com",
      "X-Forwarded-For": "10.142.0.2,10.48.8.26",
      "X-Real-Ip": "10.142.0.2",
      "X-Request-Id": "__REQUEST_ID__",
      "Connection": "close",
      "Host": "datascience.example.com"
    }
[E 2019-01-01 21:36:41.449 JupyterHub log:158] 500 GET /hub/oauth_callback?state=[secret]&code=[secret]&scope=openid+email+https://www.googleapis.com/auth/plus.me+https://www.googleapis.com/auth/userinfo.email&authuser=[secret]&hd=example.com&session_state=[secret]&prompt=none (@10.142.0.2) 70.37ms
```
Restarting the hub pod fixes the issue, but it recurs. Increasing the instance size doesn't seem to help, and memory/CPU usage on the node running the hub pod is reasonable.
We're deployed on GCP with the v0.8-229848a helm chart (though we've seen this issue for a number of versions).
After reading this post (http://blog.thisisfeifan.com/2015/01/fix-tornado-http-599-issue.html) I'm currently running an experiment to see whether this is a GnuTLS issue, by replacing `curl` and `pycurl` with OpenSSL-backed versions using the following Dockerfile:
```dockerfile
FROM jupyterhub/k8s-hub:80a76ac

USER root

# Remove the GnuTLS-linked pycurl and install the OpenSSL headers
# needed to rebuild it from source
RUN \
    apt-get update && \
    apt-get remove -y \
        python3-pycurl && \
    apt-get install -y --no-install-recommends \
        libssl-dev \
        libcurl4-openssl-dev \
        python3-wheel && \
    apt-get purge && \
    apt-get clean

# Rebuild pycurl against OpenSSL instead of GnuTLS
RUN \
    PYCURL_SSL_LIBRARY=openssl pip3 install pycurl

USER ${NB_USER}
```
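To confirm which TLS backend pycurl is actually linked against (before and after the rebuild), the `pycurl.version` string reports the libcurl feature tokens. Below is a small hypothetical helper (pure stdlib, not part of the image above) that extracts the TLS library name from such a string; the same parsing works on the first line of `curl --version`:

```python
def tls_backend(version_string):
    """Return the TLS library name (e.g. 'OpenSSL', 'GnuTLS') found in a
    pycurl.version or `curl --version` style string, or None if absent."""
    known = {"OpenSSL", "GnuTLS", "NSS", "LibreSSL", "BoringSSL"}
    for token in version_string.split():
        # Tokens look like "GnuTLS/3.5.18" or "OpenSSL/1.1.0g"
        name = token.split("/", 1)[0]
        if name in known:
            return name
    return None

# Example strings in the format pycurl.version returns:
print(tls_backend("PycURL/7.43.0 libcurl/7.58.0 GnuTLS/3.5.18 zlib/1.2.11"))   # GnuTLS
print(tls_backend("PycURL/7.43.0 libcurl/7.58.0 OpenSSL/1.1.0g zlib/1.2.11"))  # OpenSSL
```

On the hub pod itself, `python3 -c 'import pycurl; print(pycurl.version)'` should report OpenSSL rather than GnuTLS once the custom image is in use.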
The hub pod running this custom image has been up for 18 hours now with no issues (🤞)
I will report back after some more time :)
OK - the hub pod with the above changes to the image has been running for a few days now and the error hasn't recurred, which is a record for our deployment, I think :)
I had to install libssl-dev in order for the Docker build to work.
You're quite right - I've added it above
Also, we haven't had the original issue since starting using this image :)
I will make a PR to update the original Dockerfile
We have been seeing this as well when switching to gsuite auth. Is the resolution to build our own image for now? It doesn't seem to be in master nor in any PR.
I've just raised a PR so hopefully this will be addressed soon.
We've been running gsuite auth for months now, building our own image via the Dockerfile above, and it is very stable (no restarts required!) - we simply build the image and reference it via the config.yaml for the chart, e.g.:

```yaml
...
hub:
  image:
    name: us.gcr.io/your-fun-project/k8s-hub
    tag: latest
...
```
Wow thank you for investigating this so thoroughly @joshbode! I can imagine this was quite challenging to figure out!
Thanks! It was quite a mystery to solve :)
Hi, this PR reverts the changes made in #1043. The original change was made because `npm install` conflicted with node-gyp, and npm is required if we want to build the image with JupyterHub from git.
Hello, I seem to have this exact same issue when using the latest Helm chart (v0.8.2). Is it possible that this issue still persists somehow? Do we need to follow this path to make the deployment work with Google OAuth?
I don't think v0.8.2 includes the fix:
see #1321
some dev release after f56b92b5 (2019-03-11) would be sufficient.
I'm using v0.9-ec48133
The latest version is 0.9-2d435d6; it would be useful to know whether it works where 0.8.2 didn't.
Okay, I've deployed 0.9-2d435d6 (I also had to change the 'hostedDomain' field to a list in the config - I suppose accepting a list there is a new feature). I'll check back on it in a few hours to see if it's still running fine. Thank you very much for the quick response!
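For anyone else making the same change, the config adjustment was along these lines - a sketch only, with placeholder values, against the chart's `auth.google` block:

```yaml
auth:
  type: google
  google:
    clientId: "<client-id>.apps.googleusercontent.com"
    clientSecret: "<client-secret>"
    hostedDomain:
      - example.com   # previously a single string, now a list
    loginService: Google
```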
@consideRatio I recently installed version 0.8.2 and I am getting the exact same error with Generic OAuth as described in #1321, so the issue is still present in that release.
Try 0.9.0-beta.4 @fjferdiez