Hi all,
With the help of a colleague in DevOps, I've recently started working to move our DS team's infra from JupyterHub and RStudio Server on a large shared AWS EC2 instance to JupyterHub on k8s (an AWS EKS deployment). While the doc has substantially improved since I looked at it a year ago, running through it over the last week revealed a couple of inconsistencies between the Z2JH doc and the current state of both EKS and Helm.
(following https://zero-to-jupyterhub.readthedocs.io/en/latest/amazon/step-zero-aws-eks.html as of 2020-04-27)
EKS:
I think the introduction of eksctl as a management tool, and the default to managed node groups, has substantially changed the structure of the EKS docs since the Z2JH AWS EKS guide was written. Step 8 in the Z2JH doc refers to "step 3 in Getting Started with EKS", but the AWS docs are now split between "Getting started with eksctl" and "Getting started with the AWS Management Console", so it takes some drilling down to find what's actually being referenced. I think eksctl is the preferred approach here, and it'd be better to simply provide .yml files for the cluster and autoscaler configs. Step 9 of the procedure discusses setting up the ConfigMap and again references the EKS Getting Started guide, but all references to it seem to have been removed from that portion, though it's treated here and here. I think the introduction of managed node groups has perhaps removed the need for that?
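As an illustration of the .yml-file approach, a minimal eksctl cluster config might look something like the following - a sketch only, with the cluster name, region, and instance type as placeholder assumptions, not values from the Z2JH docs:

```yaml
# cluster.yml - hypothetical example, applied with: eksctl create cluster -f cluster.yml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: jupyterhub-cluster   # placeholder name
  region: us-east-2          # placeholder region

managedNodeGroups:
  - name: user-workers       # placeholder name
    instanceType: m5.large   # placeholder instance type
    minSize: 1
    maxSize: 4
    desiredCapacity: 2
```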
Helm:
Helm v3 has completely removed Tiller, so most of the "Setting up Helm" doc can be removed. EDIT: The docs now assume Helm v3.
Once I get our final setup finalized, I'll be happy to take a pass at updating the doc and submitting a PR - just thought I'd flag this for anyone else who takes a shot at this in the near future and try to save a few moments of confusion.
Cheers,
Andy
"Yes please!" to helping keep the documentation up to date. Most people on the Z2JH team use GKE, so that is what we have the most experience with. For all other cloud providers we rely on community members to help out.
On Helm 2 vs Helm 3: I wouldn't update the docs to remove v2. In the main guide there is (I think) a section on setting up Helm with hints on how to do it for Helm 3. We are working towards a Helm chart that works with Helm 2 and (with minor tweaks) also with Helm 3. However, being able to use this chart with Helm v2 will continue to be a requirement for the foreseeable future (this means no Helm 3-only features).
@betatim got it - will add explicit notes for the differences between Helm 2 and Helm 3, along with updates for using eksctl. With Helm 2 => 3, it's not an issue with the chart (the chart worked fine); it's that Tiller doesn't exist in Helm 3 at all, so setting up Helm is dramatically simplified. Client-side usage syntax has changed a little bit, too.
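For reference, the main client-side syntax changes look roughly like this (the release name `jhub` is just an example):

```shell
# Helm 2: the release name was passed via a flag
helm install --name jhub jupyterhub/jupyterhub

# Helm 3: the release name is a positional argument
helm install jhub jupyterhub/jupyterhub

# Helm 2: helm delete --purge jhub
# Helm 3:
helm uninstall jhub
```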
Thanks @andybrnr for noting these things. I am trying to set this up right now and running into issues with helm.
helm upgrade --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --version=0.8.2 \
  --values config.yaml
Release "jhub" does not exist. Installing it now.
Error: create: failed to create: namespaces "jhub" not found
And also I expected to see the hub here
helm search hub jupyter
URL CHART VERSION APP VERSION DESCRIPTION
https://hub.helm.sh/charts/pangeo/pangeo 20.01.15-e3086c1 An extention of jupyterhub with extra Pangeo re...
https://hub.helm.sh/charts/stable/tensorflow-no... 0.1.3 1.6.0 A Helm chart for tensorflow notebook and tensor...
https://hub.helm.sh/charts/gradiant/jupyter 0.1.2 6.0.3 Helm for jupyter single server with pyspark su...
UPDATE
From reading the helm docs I figured out how to do this
First I created the namespace
kubectl create namespace $NAMESPACE
Then changed the install command
helm install $RELEASE jupyterhub/jupyterhub --namespace $NAMESPACE --version=0.8.2 --values config.yaml
Hi @valmack,
Glad you figured it out! You can also create the namespace within the helm call by including the --create-namespace flag.
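For example, using the same placeholder variables as above (note that the `--create-namespace` flag requires Helm 3.2+):

```shell
helm upgrade --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.8.2 \
  --values config.yaml
```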
One other thing I'll note here is a "gotcha" we ran into. The EKS docs say "make sure you include at least two availability zones for your cluster", but because the Persistent Volume Claims are built on EBS, if a user logs in for the first time on a node in us-east-2b but on a following login is assigned to a node in us-east-2c, the PVC won't be able to mount, because EBS volumes only live within their own AZ. Not sure if this is different from how things work in GKE. We hacked around this by specifying multiple AZs in the cluster config but anchoring all the worker node groups to a single AZ, ensuring users' PVCs and nodes are always in the same spot. Not sure if there's a better way to handle this in the future - I saw some suggestions that building the PVCs on EFS rather than EBS would solve this, as EFS is available across a region rather than within a specific AZ. This could be overridden in the storage_class spec, I think.
relevant k8s git issues for the PVC issue mentioned:
https://github.com/kubernetes/autoscaler/issues/1431
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md (see the "Common Notes and Gotchas" at the bottom)
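In eksctl terms, the workaround described above can be sketched as pinning each node group to one AZ while the cluster itself spans several - again a hypothetical fragment, with names and zones as placeholders:

```yaml
# Cluster-level: control plane spans multiple AZs, as the EKS docs require
availabilityZones: ["us-east-2a", "us-east-2b", "us-east-2c"]

managedNodeGroups:
  - name: user-workers                 # placeholder name
    # Pin the workers (and therefore the EBS-backed PVCs) to a single AZ
    availabilityZones: ["us-east-2b"]
```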
Hi @valmack & @andybrnr
I also encountered the same problem with helm when setting this up on google cloud:
helm upgrade --install jhub jupyterhub/jupyterhub --namespace jhub --version=0.8.2 --values config.yaml
Release "jhub" does not exist. Installing it now.
Error: render error in "jupyterhub/templates/proxy/deployment.yaml": template: jupyterhub/templates/proxy/deployment.yaml:26:32: executing "jupyterhub/templates/proxy/deployment.yaml" at openssl rand -hex 32!
However, now that I am about to try the suggested solution posted here, I instead see this error:
error: Get https://34.82.20.3/api/v1/namespaces/kube-system/pods?labelSelector=app%3Dhelm%2Cname%3Dtiller: error executing access token command "/google/google-cloud-sdk/bin/gcloud config config-helper --format=json": err=fork/exec /google/google-cloud-sdk/bin/gcloud: no such file or directory output= stderr=
I think my access token has expired? How do I fix this?
(I looked at the access token in the config file and its expiry is:"2020-05-08T05:25:40Z")
Sorry if this isn't the best place to post this; if you could direct me to the best place, I would appreciate it.
UPDATE: ...I solved my expired access token issue with this command:
gcloud container clusters get-credentials MY-CLUSTER --zone MY-ZONE
Dear @andybrnr, thanks for opening this issue! I'm going through the same process and I agree with you: the AWS EKS instructions need a major overhaul 💪 I am now trying to figure out how to install JupyterHub on AWS EKS using the most recent instructions in the AWS EKS User Guide. If you are willing to share your latest workflow, I would be happy to test it and merge it with mine before you submit a PR.
Best wishes,
-Filippo
FYI, I am going the eksctl way!
@filippo82 Did you get anywhere with eksctl? I'd be happy to contribute too, but I can imagine you & @andybrnr might have already solved some of the initial challenges.
Hi @tracek, I was able to learn a lot about eksctl over the past weeks. I was then able to follow a couple of tutorials to set up EKS on Fargate. However, I still need to get to the JupyterHub part. I have been quite busy over the past weeks, and I am off on holiday in 2 weeks. I really want to spend more time on this, but I don't think I will be able to do so until mid-September. I'll try to put together my progress in a document and share it here.
Cheers,
-F
I recently launched a BinderHub, part of which required me to consult the Z2JH docs to set up a cluster with EKS and I found them practically impossible to follow because of updates to the EKS docs (no disrespect to the original author of the Z2JH docs!).
I ended up just following the Amazon EKS guide using eksctl word-for-word and once the cluster was set up, I could simply go to the next step in Z2BH docs (Installing Helm). I'd be happy to contribute to a PR to update the current Z2JH docs - I think much of it could just be offloaded to the Amazon EKS docs.
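For anyone else taking the same route, the eksctl guide boils down to something like this - a sketch, with the cluster name, region, and node count as placeholder assumptions:

```shell
# Create the cluster (eksctl also sets up the VPC, IAM roles, and kubeconfig)
eksctl create cluster \
  --name jhub-cluster \
  --region us-west-2 \
  --nodes 2

# Verify the nodes are up before moving on to the Helm steps
kubectl get nodes
```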
Hi @TomasBeuzen, did you setup EKS on Fargate? or "standard" EKS?
@filippo82 - "standard" EKS :) I'm in a region that does not yet support Fargate + EKS