Zero-to-jupyterhub-k8s: Dask integration with JupyterHub

Created on 6 Dec 2017 · 36Comments · Source: jupyterhub/zero-to-jupyterhub-k8s

This is just a note that @mrocklin is interested in figuring out Dask integration with JupyterHub. He's figuring out Kubernetes, and even has a Dask helm chart in the works. Can we figure out a way to connect these two things so that people can stand up a JupyterHub that's connected on the backend with Dask? I think this would be a great first step towards HPC+JupyterHub+Kubernetes!

reference

Source

choldgraf

👍3

Most helpful comment

With @consideRatio we were thinking of putting some examples on how all the pieces can be put together to make a rich scientific computing environment. Components I have in mind:

Various kernels (Spark included)
Dask
R Server

It should include a detailed explanation for those less familiar with Docker / k8s and a few simple use cases. It would cover then this issue. I could start writing the docs in a few weeks.

tracek on 4 Sep 2018

❤3

All 36 comments

I'm working actively on this this week. Engagement would be quite welcome.

mrocklin on 6 Dec 2017

I know @yuvipanda is away this week at Kubecon so probably won't be super responsive, maybe @minrk has thoughts on connecting JHubs with resources like Dask?

choldgraf on 6 Dec 2017

w00t, this is awesome! Most of the work for this probably needs doing in the kubespawner project - we have info and ongoing work on this in https://github.com/jupyterhub/kubespawner/issues/76, https://github.com/jupyterhub/kubespawner/issues/79 and https://github.com/jupyterhub/kubespawner/issues/94.

The fundamental idea is that we allow kubernetes API access from the notebook pods, and then you can start/stop dask/spark/tf clusters from inside the notebook. We can then use Kubernetes RBAC / Quota primitives to control how much each resources user can use. The missing step now is to allow kubespawner to spawn each user pod in its own namespace. @foxish and @liyinan926 have been working on making this work with Spark / TensorFlow, and all the work being done there would benefit dask too!

For dask, the way I'm thinking this would work is:

We run a sidecar tiller per notebook
We give kubernetes API access to each notebook
You can then use helm from inside the notebook to control a dask cluster

But I haven't used dask (nor Spark or TF!) at all, so this might not be the ideal one for dask. Let us know how you're thinking this would work and I'll try to stay engaged this week as much as I can :)

Awesome to have you working on this @mrocklin!

yuvipanda on 6 Dec 2017

That approach would work fine for me. Two thoughts:

If we expect the user to deploy systems with Helm then we might want to make UI tools to help them with this. This isn't necessary short term though.
Alternatively, we could co-launch a dask (or other) cluster when we launch the singleuser notebook

Awesome to have you working on this @mrocklin!

Happy to help out, though at the moment my role is probably mostly as a user and active feedback provider. If there is anything that I can do to push on things then please let me know.

mrocklin on 6 Dec 2017

Also FYI @mrocklin do you know about this? https://github.com/kubernetes/charts/tree/master/stable/dask-distributed

looks like somebody has already taken a pass at Dask+Kubernetes. Maybe worth reaching out to whoever did that legwork?

choldgraf on 7 Dec 2017

Yes, that was done by @danielfrg, who works for Anaconda Inc. My chart
builds off of that. I have a PR to that chart as well.

On Wed, Dec 6, 2017 at 8:45 PM, Chris Holdgraf notifications@github.com
wrote:

Also FYI @mrocklin https://github.com/mrocklin do you know about this?
https://github.com/kubernetes/charts/tree/master/stable/dask-distributed

looks like somebody has already taken a pass at Dask+Kubernetes. Maybe
worth reaching out to whoever did that legwork?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/326#issuecomment-349835094,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AASszCWoEKAslUbFGbBkw9J1mBRZbbW6ks5s90MfgaJpZM4Q3Rwv
.

mrocklin on 7 Dec 2017

We are interested in this too.

As I see it there are a few ways this would work:

Add UI options to notebooks to allow users to spin up clusters on demand.
Launch an adaptive scheduler per user when a notebook server is started (related to dask/distributed#1619).
Create a shared cluster which is accessible by all users.

There are pro's and cons to each approach. I'm personally leaning towards the second option.

See also this helm chart created by @mkjpryor-stfc which satisfies the third option on that list.

jacobtomlinson on 11 Dec 2017

I think that from JupyterHub's perspective they should just think about providing controlled access to notebook pods to start kubernetes deployments within a specified namespace, and then clean up that namespace appropriately when the notebook pod itself decays. This is more or less what @yuvipanda mentioned above. Other considerations (the details about what kind of Dask cluster to spin up) are a separable problem that different groups will no doubt want to handle differently.

@yuvipanda what is the right way to engage JupyterHub devs on this?

mrocklin on 12 Dec 2017

In an e-mail conversation @yuvipanda recommended the following steps:

Set up a new JupyterHub with at least version 'v0.6.0-57c88a6' on a
cluster with RBAC disabled
(https://zero-to-jupyterhub.readthedocs.io/en/latest/security.html#use-role-based-access-control-rbac)

Enable kubernetes api Access for singleuser with the following:

hub:
extraConfig: |
   c.KubeSpawner.singleuser_service_account = 'default'

Install helm inside the single user image
Login to the hub, open a terminal, and try doing 'helm init' and
tell me what happens?
If (4) works, try doing a 'helm install' of the dask chart and see
what happens?

I've given this a shot but am running into errors that I can't diagnose:

mrocklin@carbon:~/workspace/pangeo/gce$ gcloud container clusters get-credentials pangeo --zone us-central1-b --project pangeo-181919
Fetching cluster endpoint and auth data.
kubeconfig entry generated for pangeo.
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=mrocklinWgmail.com
clusterrolebinding "cluster-admin-binding" created
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl --namespace kube-system create sa tiller
serviceaccount "tiller" created
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
clusterrolebinding "tiller" created
mrocklin@carbon:~/workspace/pangeo/gce$ helm init --service-account tiller
$HELM_HOME has been configured at /home/mrocklin/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!
mrocklin@carbon:~/workspace/pangeo/gce$ kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'
deployment "tiller-deploy" patched
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo add dask https://dask.github.io/helm-chart/
"dask" has been added to your repositories
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
"jupyterhub" has been added to your repositories
mrocklin@carbon:~/workspace/pangeo/gce$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "dask" chart repository
...Successfully got an update from the "jupyterhub" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈ Happy Helming!⎈ 
mrocklin@carbon:~/workspace/pangeo/gce$ vi jupyter-config.yaml 
mrocklin@carbon:~/workspace/pangeo/gce$ helm install jupyterhub/jupyterhub --version=v0.6.0-57c88a6 --name=jupyter --namespace=pangeo -f jupyter-config.yaml
Error: timed out waiting for the condition

proxy:
  secretToken: SECRET

singleuser:
  image:
    name: daskdev/pangeo-notebook
    tag: latest
  extraEnv:
    EXTRA_PIP_PACKAGES: >-
        gcsfs
        git+https://github.com/pydata/xarray.git
        git+https://github.com/alimanfoo/zarr.git
    DASK_SCHEDULER_ADDRESS: dask-scheduler:8786

rbac:
  enabled: false

hub:
  extraConfig: |
    c.KubeSpawner.singleuser_service_account = 'default'

mrocklin on 1 Jan 2018

Thanks for checking it out, @mrocklin!

Looks like Google started enforceing RBAC security by default in 1.8. This is great, but for our experimental purposes we can disable it via https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control. Am trying that now.

yuvipanda on 2 Jan 2018

It might be easier to create a default role and add instructions to do that, given RBAC is going to be the default mode going forward and a de-facto standard. We had a discussion about this for helm/tiller in https://github.com/tensorflow/k8s/issues/135.

foxish on 3 Jan 2018

@foxish I totally agree! We're trying to get it set up for a demo later this week however, so trying to get stuff working without RBAC first...

yuvipanda on 3 Jan 2018

👍1

Looks like Google started enforceing RBAC security by default in 1.8. This is great, but for our experimental purposes we can disable it via https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control. Am trying that now.

@yuvipanda I'd like to do this for a JupyterHub instance that I control to share with some collaborators. Looking breifly at the documentation you've linked to above I'm not immediately seeing how to accomplish this. Can you give me a push in the right direction here?

mrocklin on 3 Jan 2018

I'm also still getting this error despite not doing anything particularly exciting

mrocklin@carbon:~/workspace/pangeo/gce$ helm install jupyterhub/jupyterhub --version=v0.5 --name=jupyter --namespace=pangeo -f jupyter-config.yaml
Error: timed out waiting for the condition

proxy:
  secretToken: SECRET

singleuser:
  image:
    name: daskdev/pangeo-notebook
    tag: latest
  extraEnv:
    EXTRA_PIP_PACKAGES: >-
        gcsfs
        git+https://github.com/pydata/xarray.git
        git+https://github.com/alimanfoo/zarr.git
    DASK_SCHEDULER_ADDRESS: dask-scheduler:8786
    GRANT_SUDO: yes

History

gcloud container clusters create pangeo --num-nodes=4 --machine-type=n1-standard-2 --zone=us-central1-b  --cluster-version=1.8.4-gke.1
gcloud container clusters get-credentials pangeo --zone us-central1-b --project pangeo-181919
kubectl create clusterrolebinding cluster-admin-binding --clusterrole=cluster-admin --user=mrocklinWgmail.com
kubectl --namespace kube-system create sa tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
 helm init --service-account tiller
kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'

mrocklin on 3 Jan 2018

@mrocklin that one might be because the pangeo image is very big? try running upgrade?

Also, to disable legacy authorization, you need to go to console.cloud.google.com, 'edit' your cluster, and set 'Legacy Authorization' to 'Enabled'.

yuvipanda on 3 Jan 2018

OK, will do. Is there a way to extend the timeout?

mrocklin on 3 Jan 2018

I think you can specify --timeout=.

yuvipanda on 3 Jan 2018

For those following along from home, we're working on kubernetes-native dask integration in https://github.com/yuvipanda/daskernetes.

yuvipanda on 3 Jan 2018

I'm following along, but not keeping up. This is cool work! Sorry I haven't helped as much as I wanted to.

minrk on 5 Jan 2018

There is an early prototype at http://pangeo.pydata.org

mrocklin on 8 Jan 2018

👏👏👏👏👏👏👏👏

choldgraf on 8 Jan 2018

so where are we on this issue? are we at a place where we could close this and then open up more actionable/specific issues in the future?

choldgraf on 12 Jan 2018

e.g., I think once the tech has stabilized we should write up some docs on how people can do this for their JupyterHubs, but I don't want this issue to get too cluttered to be able to follow

choldgraf on 12 Jan 2018

So action items from here are:

I've filed issues from the z2jh changes we had to do to get pangeo to work (https://github.com/pangeo-data/pangeo/pull/57#issuecomment-357355449 has a list). Our escape hatches were good enough that we could just do this with config.yaml, which I'm very happy about :)
There's work ongoing in github.com/yuvipanda/daskernetes for dask + kubernetes integration, but is not directly tied to JupyterHub in any form (yay!)
There's ongoing kubespawner work in https://github.com/jupyterhub/kubespawner/pull/115 to actually making this secure :)

Long term, I think we should have good docs in z2jh on 'how do I use dask with my z2jh cluster?' that points to... somewhere else. Not sure what to do with this issue just now though.

yuvipanda on 13 Jan 2018

@yuvipanda Since you've opened action items elsewhere, perhaps closing now is an appropriate next step with this issue.

I've labeled it as reference in the interim.

willingc on 27 Feb 2018

Is it possible to share a same dask cluster from multiple users? I mean, when at least one user select a dask image (using profile), a dask cluster is started and they all share the same resources ?

gsemet on 25 Mar 2018

You might consider installing a single dask cluster using the helm chart in stable/dask. http://dask.pydata.org/en/latest/setup/kubernetes-helm.html

Various people on the clusters could then connect up to it. Note though that Dask doesn't do any particular user management. It also expects all users to have the same software environment.

mrocklin on 25 Mar 2018

👍1

@tracek has experience with Dask + JupyterHub and have demonstrated a lot of cool things already! I hope to learn more how this was setup in time.

Is there something we should summarize from this thread and write up in the docs?

consideRatio on 4 Sep 2018

With @consideRatio we were thinking of putting some examples on how all the pieces can be put together to make a rich scientific computing environment. Components I have in mind:

Various kernels (Spark included)
Dask
R Server

It should include a detailed explanation for those less familiar with Docker / k8s and a few simple use cases. It would cover then this issue. I could start writing the docs in a few weeks.

tracek on 4 Sep 2018

❤3

It'd be nice if you could also include a brief introduction on how this compares with Jupyter Enterprise Gateway.

manics on 2 Oct 2018

👍1

I've successfully setup and am using a cluster w/ jupyterlab via the Kubernetes+Helm method. To use jupyterhub, is it recommend to use the same Helm approach (without jupyter server), and use an independent jupyterhub setup? Is it tricky to setup communication between them?

I looked briefly at the native Kubernetes work, but it's not clear to me one can make a heterogenous worker environment (i.e. some GPU workers, some big-disk workers, etc.)

chrisroat on 8 Feb 2020

Look at the kubespawner, it has a feature to define several profile where you can add different requests :)

gsemet on 8 Feb 2020

Thanks for the point @gsemet. The kubespawner homepage indicates this is for launching different notebooks. My use case, related to this issue, is to launch a single dask cluster with different types of workers (pods), each with a different set of dask resources.

I do this currently using a modification of the dask helm repo, with additional worker configurations. It launches several jupyter notebooks by default. I'd like to move to the hub model, where we can have multiple users, persistent disk, etc.

chrisroat on 8 Feb 2020

Not sure if anyone has mentioned it here yet but you may want to check out Dask Gateway.

cc @jcrist

jacobtomlinson on 10 Feb 2020

👍1

Dask Gateway is a Helm chart that can be installed alongside Z2JH, which is what the Pangeo helm chart has done recently. I'm closing this issue as outdated and hard to take an action with regards to.

consideRatio on 5 Oct 2020

This marriage of dask (gateway) and jupyterhub looks super awesome:
https://blog.dask.org/2020/08/31/helm_daskhub

We launch the two helm charts separately, but will migrate to daskhub soon!

chrisroat on 5 Oct 2020

👍2

Was this page helpful?

0 / 5 - 0 ratings