Spawning a single-user notebook server on our current (bare-bones) Azure configuration takes, in my opinion, far too long (over two minutes). If I remember correctly, the same action has been quite fast on GKE. Is there something relevant that is specific to Azure?
Unfortunately, setting the prePuller value to true, per advice from @yuvipanda and @consideRatio, and then upgrading the release (helm upgrade sbdh-jh-v1 jupyterhub/jupyterhub --version=v0.6 -f config.yaml) did not reduce the delay.
The cluster currently uses the Standard D2 v2 instance type (2 vCPUs, 7 GB RAM), with a single node in the pool.
kubectl --namespace=sbdh-jh-v1-v081 describe pod hub-2186441756-6zdlx

Name:           hub-2186441756-6zdlx
Namespace:      sbdh-jh-v1-v081
Node:           k8s-agent-fbe89a8-0/10.240.0.6
Start Time:     Wed, 28 Mar 2018 07:06:45 +0000
Labels:         app=jupyterhub
                component=hub
                heritage=Tiller
                name=hub
                pod-template-hash=2186441756
                release=sbdh-jh-v1
Annotations:    checksum/config-map=322560cbb40c2bcc16b90b83771c5d77d6e36130f04cd9ccfc558f310099b8b5
                checksum/secret=a892b5324534e6ae7d81056cca6db0e83ffc4b410a9f9b38a8cb72c6b4dc0324
                kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"sbdh-jh-v1-v081","name":"hub-2186441756","uid":"93368284-3256-11e8-8669-000d3a14c...
Status:         Running
IP:             10.244.0.39
Controlled By:  ReplicaSet/hub-2186441756
Containers:
  hub-container:
    Container ID:   docker://fe7fa7b4f2c6e82f5ff14d0381d0316494ae0644ab09eb13050874d75bd2280a
    Image:          jupyterhub/k8s-hub:v0.6
    Image ID:       docker-pullable://jupyterhub/k8s-hub@sha256:04dd7d25d348016ec009647399adcea2833fc53a66e66e0c288d333e7c3aabd8
    Port:           8081/TCP
    Command:
      jupyterhub
      --config
      /srv/jupyterhub_config.py
      --upgrade-db
    State:          Running
      Started:      Wed, 28 Mar 2018 07:06:49 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     200m
      memory:  512Mi
    Environment:
      SINGLEUSER_IMAGE:        jupyter/datascience-notebook:c19283de5a6f
      JPY_COOKIE_SECRET:       <set to the key 'hub.cookie-secret' in secret 'hub-secret'>  Optional: false
      POD_NAMESPACE:           sbdh-jh-v1-v081 (v1:metadata.namespace)
      CONFIGPROXY_AUTH_TOKEN:  <set to the key 'proxy.token' in secret 'hub-secret'>  Optional: false
    Mounts:
      /etc/jupyterhub/config/ from config (rw)
      /etc/jupyterhub/secret/ from secret (rw)
      /srv/jupyterhub from hub-db-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bl7t6 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub-config
    Optional:  false
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-secret
    Optional:    false
  hub-db-dir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  hub-db-dir
    ReadOnly:   false
  default-token-bl7t6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bl7t6
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:          <none>
config.yaml:

hub:
  cookieSecret: "<SECRET>"
proxy:
  secretToken: "<SECRET>"
prePuller:
  hook:
    enabled: true
rbac:
  enabled: false
auth:
  type: github
  github:
    clientId: "<ID>"
    clientSecret: "<SECRET>"
    callbackUrl: "http://<FQDN>/hub/oauth_callback"
    org_whitelist:
      - "<ORG_NAME>"
    scopes:
      - "read:org"
singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: c19283de5a6f
Thanks for opening this!
The relevant info is under 'Events:'. Unfortunately that gets cleared away after a while. Can you stop this user server, start it again, and paste the describe output again?
@yuvipanda Sure. I just restarted the user server and the Events section still reads <none>. Thoughts?
hmm, that's strange. What is the output of 'kubectl --namespace=<namespace> get events'?
Here you go:
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
36m 36m 1 jupyter-ablekh.1521c58479cd75a3 Pod Normal Scheduled default-scheduler Successfully assigned jupyter-ablekh to k8s-agent-fbe89a8-0
36m 36m 1 jupyter-ablekh.1521c584883d786e Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "no-api-access-please"
34m 34m 1 jupyter-ablekh.1521c5a11ded3aae Pod Warning FailedMount kubelet, k8s-agent-fbe89a8-0 Unable to mount volumes for pod "jupyter-ablekh_sbdh-jh-v1-v081(f5f5cc28-36d5-11e8-8669-000d3a14c417)": timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"jupyter-ablekh". list of unattached/unmounted volumes=[volume-ablekh]
34m 34m 1 jupyter-ablekh.1521c5a11df1356f Pod Warning FailedSync kubelet, k8s-agent-fbe89a8-0 Error syncing pod
34m 34m 1 jupyter-ablekh.1521c5a2ef7a466b Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "pvc-e736ba97-2f06-11e8-88cb-000d3a14c417"
34m 34m 1 jupyter-ablekh.1521c5a318237af1 Pod spec.initContainers{block-cloud-metadata} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyterhub/k8s-network-tools:v0.6" already present on machine
34m 34m 1 jupyter-ablekh.1521c5a323728be9 Pod spec.initContainers{block-cloud-metadata} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
34m 34m 1 jupyter-ablekh.1521c5a32dd82654 Pod spec.initContainers{block-cloud-metadata} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
34m 34m 1 jupyter-ablekh.1521c5a34c9fafe8 Pod spec.containers{notebook} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyter/datascience-notebook:c19283de5a6f" already present on machine
34m 34m 1 jupyter-ablekh.1521c5a3588f7342 Pod spec.containers{notebook} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
34m 34m 1 jupyter-ablekh.1521c5a36101233d Pod spec.containers{notebook} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
32m 32m 1 jupyter-ablekh.1521c5b965692490 Pod spec.containers{notebook} Normal Killing kubelet, k8s-agent-fbe89a8-0 Killing container with id docker://notebook:Need to kill Pod
32m 32m 1 jupyter-ablekh.1521c5bcbb7dd208 Pod Normal Scheduled default-scheduler Successfully assigned jupyter-ablekh to k8s-agent-fbe89a8-0
32m 32m 1 jupyter-ablekh.1521c5bcc7d8847b Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "no-api-access-please"
30m 30m 1 jupyter-ablekh.1521c5d95f851a7f Pod Warning FailedMount kubelet, k8s-agent-fbe89a8-0 Unable to mount volumes for pod "jupyter-ablekh_sbdh-jh-v1-v081(85ff598b-36d6-11e8-8669-000d3a14c417)": timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"jupyter-ablekh". list of unattached/unmounted volumes=[volume-ablekh]
30m 30m 1 jupyter-ablekh.1521c5d95f876984 Pod Warning FailedSync kubelet, k8s-agent-fbe89a8-0 Error syncing pod
30m 30m 1 jupyter-ablekh.1521c5db24048526 Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "pvc-e736ba97-2f06-11e8-88cb-000d3a14c417"
30m 30m 1 jupyter-ablekh.1521c5db5179d87f Pod spec.initContainers{block-cloud-metadata} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyterhub/k8s-network-tools:v0.6" already present on machine
30m 30m 1 jupyter-ablekh.1521c5db5b4ba35f Pod spec.initContainers{block-cloud-metadata} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
30m 30m 1 jupyter-ablekh.1521c5db67712d8e Pod spec.initContainers{block-cloud-metadata} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
30m 30m 1 jupyter-ablekh.1521c5db7a7031e8 Pod spec.containers{notebook} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyter/datascience-notebook:c19283de5a6f" already present on machine
30m 30m 1 jupyter-ablekh.1521c5db8535a0cf Pod spec.containers{notebook} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
30m 30m 1 jupyter-ablekh.1521c5db8f8b7dec Pod spec.containers{notebook} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
6m 6m 1 jupyter-ablekh.1521c72aa29bf942 Pod spec.containers{notebook} Normal Killing kubelet, k8s-agent-fbe89a8-0 Killing container with id docker://notebook:Need to kill Pod
5m 5m 1 jupyter-ablekh.1521c733b9bac6d4 Pod Normal Scheduled default-scheduler Successfully assigned jupyter-ablekh to k8s-agent-fbe89a8-0
5m 5m 1 jupyter-ablekh.1521c733c95797fc Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "no-api-access-please"
3m 3m 1 jupyter-ablekh.1521c7505dc0dd51 Pod Warning FailedMount kubelet, k8s-agent-fbe89a8-0 Unable to mount volumes for pod "jupyter-ablekh_sbdh-jh-v1-v081(45f766ca-36da-11e8-8669-000d3a14c417)": timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"jupyter-ablekh". list of unattached/unmounted volumes=[volume-ablekh]
3m 3m 1 jupyter-ablekh.1521c7505dc1ef8a Pod Warning FailedSync kubelet, k8s-agent-fbe89a8-0 Error syncing pod
3m 3m 1 jupyter-ablekh.1521c75252c57777 Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "pvc-e736ba97-2f06-11e8-88cb-000d3a14c417"
3m 3m 1 jupyter-ablekh.1521c7527f812f07 Pod spec.initContainers{block-cloud-metadata} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyterhub/k8s-network-tools:v0.6" already present on machine
3m 3m 1 jupyter-ablekh.1521c7528a972d99 Pod spec.initContainers{block-cloud-metadata} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
3m 3m 1 jupyter-ablekh.1521c75298b07718 Pod spec.initContainers{block-cloud-metadata} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
3m 3m 1 jupyter-ablekh.1521c752cbc7b837 Pod spec.containers{notebook} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyter/datascience-notebook:c19283de5a6f" already present on machine
3m 3m 1 jupyter-ablekh.1521c752d5170f75 Pod spec.containers{notebook} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
3m 3m 1 jupyter-ablekh.1521c752dd9272c9 Pod spec.containers{notebook} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c517abfb77c0 Pod Normal Scheduled default-scheduler Successfully assigned pre-puller-1522714852-sbdh-jh-v1-8-f2sj8 to k8s-agent-fbe89a8-0
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c517b8aa162e Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "default-token-bl7t6"
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c517e8dff15b Pod spec.containers{pre-puller} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyterhub/k8s-pre-puller:v0.6" already present on machine
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c517f48a46d8 Pod spec.containers{pre-puller} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c517fde5a9ed Pod spec.containers{pre-puller} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
42m 42m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c534c661940d Pod Warning FailedMount kubelet, k8s-agent-fbe89a8-0 Unable to mount volumes for pod "pre-puller-1522714852-sbdh-jh-v1-8-f2sj8_sbdh-jh-v1-v081(df70a85e-36d4-11e8-8669-000d3a14c417)": timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"pre-puller-1522714852-sbdh-jh-v1-8-f2sj8". list of unattached/unmounted volumes=[default-token-bl7t6]
42m 42m 1 pre-puller-1522714852-sbdh-jh-v1-8-f2sj8.1521c534c6630fbe Pod Warning FailedSync kubelet, k8s-agent-fbe89a8-0 Error syncing pod
44m 44m 1 pre-puller-1522714852-sbdh-jh-v1-8.1521c517aa376644 Job Normal SuccessfulCreate job-controller Created pod: pre-puller-1522714852-sbdh-jh-v1-8-f2sj8
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c548735fd35c Pod Normal Scheduled default-scheduler Successfully assigned pre-puller-1522715062-sbdh-jh-v1-9-q6kdj to k8s-agent-fbe89a8-0
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c54881660ad1 Pod Normal SuccessfulMountVolume kubelet, k8s-agent-fbe89a8-0 MountVolume.SetUp succeeded for volume "default-token-bl7t6"
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c548adb41953 Pod spec.containers{pre-puller} Normal Pulled kubelet, k8s-agent-fbe89a8-0 Container image "jupyterhub/k8s-pre-puller:v0.6" already present on machine
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c548b8080ef0 Pod spec.containers{pre-puller} Normal Created kubelet, k8s-agent-fbe89a8-0 Created container
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c548bf92dc23 Pod spec.containers{pre-puller} Normal Started kubelet, k8s-agent-fbe89a8-0 Started container
38m 38m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c565927a78d2 Pod Warning FailedMount kubelet, k8s-agent-fbe89a8-0 Unable to mount volumes for pod "pre-puller-1522715062-sbdh-jh-v1-9-q6kdj_sbdh-jh-v1-v081(5c5048ef-36d5-11e8-8669-000d3a14c417)": timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"pre-puller-1522715062-sbdh-jh-v1-9-q6kdj". list of unattached/unmounted volumes=[default-token-bl7t6]
38m 38m 1 pre-puller-1522715062-sbdh-jh-v1-9-q6kdj.1521c565927e425b Pod Warning FailedSync kubelet, k8s-agent-fbe89a8-0 Error syncing pod
41m 41m 1 pre-puller-1522715062-sbdh-jh-v1-9.1521c54871c75ec6 Job Normal SuccessfulCreate job-controller Created pod: pre-puller-1522715062-sbdh-jh-v1-9-q6kdj
The events saying
timeout expired waiting for volumes to attach/mount for pod "sbdh-jh-v1-v081"/"jupyter-ablekh"
make me think this is just Azure disk attach being much slower than Google Cloud's.
I think @ryanlovett has the most experience in this area.
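If you want to quantify it from the events you pasted, a rough sketch (plain grep, nothing chart-specific) is to pull out the scheduling and mount events for the user pod and compare timestamps — the gap between 'Scheduled' and the 'SuccessfulMountVolume' for the pvc-... volume is about two minutes in each spawn above, which matches the delay you're seeing:

  # Rough sketch: list the scheduling/mount events for the user pod, then
  # compare the timestamps of 'Scheduled' and the 'SuccessfulMountVolume'
  # for the pvc-... volume to see how long the disk attach took.
  kubectl --namespace=sbdh-jh-v1-v081 get events \
    | grep jupyter-ablekh \
    | grep -E 'Scheduled|FailedMount|SuccessfulMountVolume'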
Hmm ... that is strange, because the instance type used by ACS by default (Standard D2 v2) has SSD storage and fast networking; see the Azure documentation: https://docs.microsoft.com/en-us/azure/cloud-services/cloud-services-sizes-specs#dv2-series. Thank you for mentioning Ryan - hopefully he will chime in soon ...
The disk itself might be fast, but it might take a long time to attach/detach. We had similar problems with ACS (and Azure in general) when we were setting it up, IIRC...
Hmm ... it seems that Azure and Kubernetes are not playing well with each other in the storage department: this issue (https://github.com/Azure/ACS/issues/12) seems quite illustrative and likely relevant. The whole thing reads like "War and Peace" (https://en.wikipedia.org/wiki/War_and_Peace). :-(
Yeah... Is there a reason you're using ACS instead of AKS or acs-engine (https://github.com/Azure/acs-engine)?
@yuvipanda The only reason is my impression of the relative readiness of AKS versus ACS, which was based directly on feedback from an MS Azure cloud architect I had a chance to speak with. Specifically, he said the following (quoted in the next paragraphs). Thinking about it again, it seems that, for the purpose of deploying a JupyterHub cluster based on the _z2jh_ guide, AKS may well be ready enough ...
AKS roadmap: GA is planned during Q2 CY18, and below are the key features that will likely be supported as part of the GA:
- Custom VNET (BYO VNET): specifically applies to node/agent VMs in an existing VNET.
  - Does NOT include:
    - Private service endpoint for the Kubernetes master API server
    - Private AKS clusters not exposed to the internet via an Azure LB (available in the 0.14 release of acs-engine, though)
- Kubernetes RBAC
- Azure CNI + network policy support
- AKS customer logs, metrics, and diagnostic logs backed up in a storage account
- Terraform support
Features like the VNET service endpoint (i.e., the Kubernetes API server does not have a public FQDN), AAD support for logging in to the Kubernetes cluster, multiple agent pools, and Windows support will be part of a later release, possibly end of year or so. If your use case hinges on AAD authentication, ACS might be the preferred way to go.
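(If anyone wants to try the comparison themselves, a minimal single-node AKS cluster on the same VM size is only a few az commands — the resource group and cluster names below are placeholders, and this is a sketch rather than a recipe:)

  # Sketch: single-node AKS cluster matching the ACS setup above.
  # 'jh-test-rg' and 'jh-test-aks' are placeholder names.
  az group create --name jh-test-rg --location eastus
  az aks create --resource-group jh-test-rg --name jh-test-aks \
    --node-count 1 --node-vm-size Standard_D2_v2 --generate-ssh-keys
  az aks get-credentials --resource-group jh-test-rg --name jh-test-aks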
I never got any of these features to work with Azure ACS either :D (except Kubernetes RBAC, but from your config I see you're disabling that too!).
For the disk attach speeds, I'd recommend looking at Azure Files: https://azure.microsoft.com/en-us/services/storage/files/. @ryanlovett might have his recipe handy.
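Untested sketch of what the Azure Files route might look like. The StorageClass uses the standard kubernetes.io/azure-file provisioner; the chart option for selecting a storage class is an assumption here — its exact name varies across z2jh versions, so check the storage docs for the release you're running:

  # StorageClass backed by Azure Files (SMB shares). Azure Files mounts don't
  # go through the managed-disk attach/detach path that the FailedMount
  # timeouts above are stuck on.
  kind: StorageClass
  apiVersion: storage.k8s.io/v1
  metadata:
    name: azurefile
  provisioner: kubernetes.io/azure-file
  parameters:
    skuName: Standard_LRS

  # config.yaml - point single-user storage at that class. The key name below
  # is an assumption; it differs between chart versions.
  singleuser:
    storage:
      class: azurefile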
@yuvipanda I agree, but I just felt that ACS was more stable when I started looking at all this. Now, after reading some more, I have the impression that Microsoft promotes AKS as the primary way of running K8s on Azure, and most active development seems to target AKS (with ACS in maintenance mode). As for acs-engine, it is definitely an interesting project, but it still seems to me not quite production-ready.
Oooh @ablekh and @yuvipanda
This was very relevant for me to learn about, thank you for sharing!
Closing this as we don't think this is related to the chart, right?