This is just a note that @mrocklin is interested in figuring out Dask integration with JupyterHub. He's figuring out Kubernetes, and even has a Dask helm chart in the works. Can we figure out a way to connect these two things so that people can stand up a JupyterHub that's connected on the backend with Dask? I think this would be a great first step towards HPC+JupyterHub+Kubernetes!
I'm working actively on this this week. Engagement would be quite welcome.
I know @yuvipanda is away this week at Kubecon so probably won't be super responsive, maybe @minrk has thoughts on connecting JHubs with resources like Dask?
w00t, this is awesome! Most of the work for this probably needs doing in the kubespawner project - we have info and ongoing work on this in https://github.com/jupyterhub/kubespawner/issues/76, https://github.com/jupyterhub/kubespawner/issues/79 and https://github.com/jupyterhub/kubespawner/issues/94.
The fundamental idea is that we allow Kubernetes API access from the notebook pods, and then you can start/stop dask/spark/tf clusters from inside the notebook. We can then use Kubernetes RBAC / Quota primitives to control how many resources each user can use. The missing step now is to allow kubespawner to spawn each user pod in its own namespace. @foxish and @liyinan926 have been working on making this work with Spark / TensorFlow, and all the work being done there would benefit dask too!
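As a sketch of the quota half of that idea (the namespace name and limits below are hypothetical, not from any existing chart): a per-user namespace could carry a ResourceQuota capping what any one user's Dask pods can consume:

```yaml
# Hypothetical ResourceQuota for a per-user namespace; names and
# limits are illustrative only.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: user-quota
  namespace: user-alice        # one namespace per user, as proposed above
spec:
  hard:
    pods: "20"
    requests.cpu: "16"
    requests.memory: 64Gi
```

Any Dask workers the user launched in their namespace would then be rejected by the API server once they exceeded these limits.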
For dask, the way I'm thinking this would work is:
But I haven't used dask (nor Spark or TF!) at all, so this might not be the ideal one for dask. Let us know how you're thinking this would work and I'll try to stay engaged this week as much as I can :)
Awesome to have you working on this @mrocklin!
That approach would work fine for me. Two thoughts:
> Awesome to have you working on this @mrocklin!
Happy to help out, though at the moment my role is probably mostly as a user and active feedback provider. If there is anything that I can do to push on things then please let me know.
Also FYI @mrocklin do you know about this? https://github.com/kubernetes/charts/tree/master/stable/dask-distributed
looks like somebody has already taken a pass at Dask+Kubernetes. Maybe worth reaching out to whoever did that legwork?
Yes, that was done by @danielfrg, who works for Anaconda Inc. My chart builds off of that. I have a PR to that chart as well.
We are interested in this too.
As I see it there are a few ways this would work:
There are pros and cons to each approach. I'm personally leaning towards the second option.
See also this helm chart created by @mkjpryor-stfc which satisfies the third option on that list.
I think that from JupyterHub's perspective they should just think about providing controlled access to notebook pods to start kubernetes deployments within a specified namespace, and then clean up that namespace appropriately when the notebook pod itself decays. This is more or less what @yuvipanda mentioned above. Other considerations (the details about what kind of Dask cluster to spin up) are a separable problem that different groups will no doubt want to handle differently.
@yuvipanda what is the right way to engage JupyterHub devs on this?
In an e-mail conversation @yuvipanda recommended the following steps:
Enable Kubernetes API access for the singleuser pods with the following:

hub:
  extraConfig: |
    c.KubeSpawner.singleuser_service_account = 'default'
Install helm inside the single user image
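For the second step, here is a minimal sketch of what installing the helm client into a singleuser image might look like (the base image and helm version below are assumptions, not from the thread):

```dockerfile
# Hypothetical Dockerfile fragment: add the helm client to a singleuser image.
FROM jupyter/base-notebook
USER root
RUN curl -sSL https://storage.googleapis.com/kubernetes-helm/helm-v2.7.2-linux-amd64.tar.gz \
      | tar -xz -C /tmp \
 && mv /tmp/linux-amd64/helm /usr/local/bin/helm \
 && rm -rf /tmp/linux-amd64
USER $NB_UID
```

With the service account above and the client installed, `helm install` from a notebook terminal should be able to talk to tiller.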
I've given this a shot but am running into errors that I can't diagnose:
mrocklin@carbon:~/workspace/pangeo/gce$ gcloud container clusters get-credentials pangeo --zone us-central1-b --project pangeo-181919
Fetching cluster endpoint and auth data.
kubeconfig entry generated for pangeo.
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=mrocklin@gmail.com
clusterrolebinding "cluster-admin-binding" created
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl --namespace kube-system create sa tiller
serviceaccount "tiller" created
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
clusterrolebinding "tiller" created
mrocklin@carbon:~/workspace/pangeo/gce$ helm init --service-account tiller
$HELM_HOME has been configured at /home/mrocklin/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'
deployment "tiller-deploy" patched
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo add dask https://dask.github.io/helm-chart/
"dask" has been added to your repositories
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
"jupyterhub" has been added to your repositories
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "dask" chart repository
...Successfully got an update from the "jupyterhub" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈ Happy Helming!⎈
mrocklin@carbon:~/workspace/pangeo/gce$ vi jupyter-config.yaml
mrocklin@carbon:~/workspace/pangeo/gce$ helm install jupyterhub/jupyterhub --version=v0.6.0-57c88a6 --name=jupyter --namespace=pangeo -f jupyter-config.yaml
Error: timed out waiting for the condition
proxy:
  secretToken: SECRET
singleuser:
  image:
    name: daskdev/pangeo-notebook
    tag: latest
  extraEnv:
    EXTRA_PIP_PACKAGES: >-
      gcsfs
      git+https://github.com/pydata/xarray.git
      git+https://github.com/alimanfoo/zarr.git
    DASK_SCHEDULER_ADDRESS: dask-scheduler:8786
rbac:
  enabled: false
hub:
  extraConfig: |
    c.KubeSpawner.singleuser_service_account = 'default'
Thanks for checking it out, @mrocklin!
Looks like Google started enforcing RBAC security by default in 1.8. This is great, but for our experimental purposes we can disable it via https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control. Am trying that now.
It might be easier to create a default role and add instructions to do that, given RBAC is going to be the default mode going forward and a de-facto standard. We had a discussion about this for helm/tiller in https://github.com/tensorflow/k8s/issues/135.
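A default role along those lines might look something like the following (this is a hypothetical sketch, not the discussed Role; the namespace, resource list, and service account are all assumptions):

```yaml
# Hypothetical namespaced Role letting singleuser pods manage Dask worker
# pods/services, bound to the default service account those pods run as.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dask-user
  namespace: pangeo
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dask-user
  namespace: pangeo
subjects:
  - kind: ServiceAccount
    name: default
    namespace: pangeo
roleRef:
  kind: Role
  name: dask-user
  apiGroup: rbac.authorization.k8s.io
```

This keeps RBAC enabled cluster-wide while granting notebook pods only the pod/service permissions Dask needs within their own namespace.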
@foxish I totally agree! We're trying to get it set up for a demo later this week however, so trying to get stuff working without RBAC first...
> Looks like Google started enforcing RBAC security by default in 1.8. This is great, but for our experimental purposes we can disable it via https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control. Am trying that now.
@yuvipanda I'd like to do this for a JupyterHub instance that I control to share with some collaborators. Looking briefly at the documentation you've linked to above I'm not immediately seeing how to accomplish this. Can you give me a push in the right direction here?
I'm also still getting this error despite not doing anything particularly exciting
mrocklin@carbon:~/workspace/pangeo/gce$ helm install jupyterhub/jupyterhub --version=v0.5 --name=jupyter --namespace=pangeo -f jupyter-config.yaml
Error: timed out waiting for the condition
proxy:
  secretToken: SECRET
singleuser:
  image:
    name: daskdev/pangeo-notebook
    tag: latest
  extraEnv:
    EXTRA_PIP_PACKAGES: >-
      gcsfs
      git+https://github.com/pydata/xarray.git
      git+https://github.com/alimanfoo/zarr.git
    DASK_SCHEDULER_ADDRESS: dask-scheduler:8786
    GRANT_SUDO: "yes"
History
gcloud container clusters create pangeo --num-nodes=4 --machine-type=n1-standard-2 --zone=us-central1-b --cluster-version=1.8.4-gke.1
gcloud container clusters get-credentials pangeo --zone us-central1-b --project pangeo-181919
kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=mrocklin@gmail.com
kubectl --namespace kube-system create sa tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller
kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'
@mrocklin that one might be because the pangeo image is very big? try running upgrade?
Also, to disable legacy authorization, you need to go to console.cloud.google.com, 'edit' your cluster, and set 'Legacy Authorization' to 'Disabled'.
OK, will do. Is there a way to extend the timeout?
I think you can specify --timeout=.
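For example (the value here is illustrative; helm's default wait timeout is 300 seconds):

```shell
# Retry the install with a longer wait, e.g. 10 minutes:
helm install jupyterhub/jupyterhub --version=v0.6.0-57c88a6 \
    --name=jupyter --namespace=pangeo -f jupyter-config.yaml \
    --timeout=600
```

A longer timeout gives the nodes time to pull a large singleuser image before helm gives up.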
For those following along from home, we're working on kubernetes-native dask integration in https://github.com/yuvipanda/daskernetes.
I'm following along, but not keeping up. This is cool work! Sorry I haven't helped as much as I wanted to.
There is an early prototype at http://pangeo.pydata.org
🎉🎉🎉🎉🎉🎉🎉🎉

so where are we on this issue? are we at a place where we could close this and then open up more actionable/specific issues in the future?
e.g., I think once the tech has stabilized we should write up some docs on how people can do this for their JupyterHubs, but I don't want this issue to get too cluttered to be able to follow
So action items from here are:
Long term, I think we should have good docs in z2jh on 'how do I use dask with my z2jh cluster?' that points to... somewhere else. Not sure what to do with this issue just now though.
@yuvipanda Since you've opened action items elsewhere, perhaps closing now is an appropriate next step with this issue.
I've labeled it as reference in the interim.
Is it possible to share the same Dask cluster between multiple users? I mean, when at least one user selects a Dask image (using a profile), a Dask cluster is started and they all share the same resources?
You might consider installing a single dask cluster using the helm chart in stable/dask. http://dask.pydata.org/en/latest/setup/kubernetes-helm.html
Various people on the cluster could then connect to it. Note though that Dask doesn't do any particular user management. It also expects all users to have the same software environment.
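As a sketch of how a user would attach to such a shared cluster from a notebook (here a LocalCluster stands in for the shared scheduler so the snippet is self-contained; on the real deployment you would instead pass the chart's scheduler address, e.g. `Client("dask-scheduler:8786")` — that service name is an assumption based on the helm chart defaults):

```python
from dask.distributed import Client, LocalCluster

# Stand-in for the shared scheduler; on a real deployment you would write
# Client("dask-scheduler:8786") using the service name from the helm chart.
cluster = LocalCluster(n_workers=2, threads_per_worker=1, dashboard_address=None)
client = Client(cluster)

# Submit work to whatever workers the cluster provides.
total = client.submit(sum, range(10)).result()
print(total)  # 45

client.close()
cluster.close()
```

Every user connecting this way shares the same pool of workers, which is why matching software environments matter.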
@tracek has experience with Dask + JupyterHub and has demonstrated a lot of cool things already! I hope to learn more about how this was set up in time.
Is there something we should summarize from this thread and write up in the docs?
With @consideRatio we were thinking of putting some examples on how all the pieces can be put together to make a rich scientific computing environment. Components I have in mind:
It should include a detailed explanation for those less familiar with Docker / k8s and a few simple use cases. It would then cover this issue. I could start writing the docs in a few weeks.
It'd be nice if you could also include a brief introduction on how this compares with Jupyter Enterprise Gateway.
I've successfully set up and am using a cluster with JupyterLab via the Kubernetes+Helm method. To use JupyterHub, is it recommended to use the same Helm approach (without the Jupyter server), and use an independent JupyterHub setup? Is it tricky to set up communication between them?
I looked briefly at the native Kubernetes work, but it's not clear to me one can make a heterogeneous worker environment (i.e. some GPU workers, some big-disk workers, etc.)
Look at kubespawner; it has a feature to define several profiles where you can add different resource requests :)
Thanks for the pointer, @gsemet. The kubespawner homepage indicates this is for launching different notebooks. My use case, related to this issue, is to launch a single Dask cluster with different types of workers (pods), each with a different set of Dask resources.
I do this currently using a modification of the dask helm repo, with additional worker configurations. It launches several jupyter notebooks by default. I'd like to move to the hub model, where we can have multiple users, persistent disk, etc.
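One way to get heterogeneous workers with today's tooling is Dask's worker resources: each worker group advertises a label when it starts, and tasks can be pinned to workers carrying that label. A GPU worker group's entrypoint might be changed along these lines (the scheduler address is an assumption matching the chart's service name):

```shell
# Hypothetical entrypoint for a GPU-labeled worker group.
dask-worker dask-scheduler:8786 --resources "GPU=1"
```

Tasks can then request those workers, e.g. `client.submit(train, data, resources={'GPU': 1})`.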
Not sure if anyone has mentioned it here yet but you may want to check out Dask Gateway.
cc @jcrist
Dask Gateway is a Helm chart that can be installed alongside Z2JH, which is what the Pangeo helm chart has done recently. I'm closing this issue as outdated and hard to act on.
This marriage of dask (gateway) and jupyterhub looks super awesome:
https://blog.dask.org/2020/08/31/helm_daskhub
We launch the two helm charts separately, but will migrate to daskhub soon!