Thanks to @ryanlovett's work we now have Azure AKS instructions (rather than the flakier ACS instructions). We should streamline them and run through them several times to make sure they work properly.
For various reasons I decided to spend this weekend creating an AKS-based JupyterHub cluster for a UW class, and so have now developed expertise! Shall make a PR soon.
This was merged as #412 and #411 and #359.
I think ideally we should have someone other than me and @ryanlovett test this once, and then we can mark this done.
I'm waiting for my azure account to get credits via DSEP then I'll give it
@choldgraf by 'waiting' is there something in process? Or? I have rights to add you to that account too I think.
I got added to the account but somehow it seems the $$$ wasn't connected with my username... @ryanlovett and I are trying to find a time to meet tomorrow
Things to improve:
- The install link doesn't mention using pip, which IME is much easier than apt-installing with a new repository etc.
- The az provider register step takes a long time; we should mention this.
- ssh-keygen -f ssh-key-<cluster-name>: note that it will print something to the terminal, and that it'll also create files. We should mention this, and also probably ask people to create a folder for the files.
- cluster-name point, and maybe have a specific step where we ask people to come up w/ a <cluster-name> that we'll refer to throughout the steps.

BUG:
When running kubectl get node I got the following error: Unable to connect to the server: dial tcp: lookup jhub-jhub-bcf6c7-7079d3dd.hcp.eastus.azmk8s.io on 192.168.1.1:53: no such host

@ryanlovett @yuvipanda gave this another shot and got farther this time, but ran into the bug detailed at the end of the comment above...
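One way to act on the ssh-keygen and <cluster-name> items in the list above, as a rough sketch rather than the documented procedure: pick the name once, keep it in a variable, and give the key files their own folder. The variable names and the example name holdgraf-test-1 are placeholders, not part of the instructions.

```shell
# Assumption: CLUSTER_NAME is a placeholder; choose your own name once and reuse it.
CLUSTER_NAME="${CLUSTER_NAME:-holdgraf-test-1}"

# Keep the generated key files in a dedicated folder named after the cluster.
KEY_DIR="$CLUSTER_NAME"
mkdir -p "$KEY_DIR"
KEY_FILE="$KEY_DIR/ssh-key-$CLUSTER_NAME"

# ssh-keygen prints the key fingerprint to the terminal and writes two files:
# the private key ($KEY_FILE) and the public key ($KEY_FILE.pub).
echo "Next step: ssh-keygen -f $KEY_FILE"
```

The sketch only prepares the folder and prints the command, since ssh-keygen prompts for a passphrase interactively.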
@choldgraf that bug is better! Give it 3-5 more minutes and see how that goes? also did you still get the 'Directory permission is needed' error?
nope I got past that part when using your subscription
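The "give it a few more minutes" advice above could be scripted instead of retyped by hand. This is a minimal sketch with a hypothetical retry helper; the attempt count and delay are guesses, not values from the AKS documentation.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i of $attempts failed" >&2
    # Only sleep if another attempt is coming.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): poll for roughly five minutes.
# retry 10 30 kubectl get node
```

If the helper still fails after the last attempt, the cluster is probably worth deleting and recreating rather than waiting longer.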
@choldgraf I don't think we should specifically mention any installation method that they don't support. az has been pretty unstable and changing a lot, so we should just link there.
Azure specific teardown instructions:
az aks delete --resource-group <resource-group-name> --name <cluster-name> --output table
FYI get node is still returning the same error.
Does the portal show you anything useful?
mmmm, it's saying that jhub is a resourcegroup that exists :-)
Remember to be careful, there's another resourcegroup there called 'jupyterhub' that is actively used by a class, do not delete or mess with that accidentally :)
yeah I just realized that once I looked at the ResourceGroup page - I didn't realize that account was being used by other stuff, so I'll keep my names to holdgraf-XXX from now on
Looks like the jhub cluster is DOA! Do you wanna delete it and try again? (Careful to delete the right one)
lol hooray!
can you tell me the page you looked at to know this?
I didn't do anything, I just tried to 'kubectl get node' myself. It's been long enough that i consider it DOA :)
hehe ok
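Given the warnings above about the shared account, a small guard around the teardown command may help. The sketch below is only an illustration: print_teardown and PROTECTED_GROUPS are made-up names, and it prints the az aks delete command from this thread for review instead of running it.

```shell
# Assumption: PROTECTED_GROUPS lists resource groups that must never be touched;
# 'jupyterhub' is the one mentioned in this thread as used by a class.
PROTECTED_GROUPS="jupyterhub"

# Hypothetical helper: print the teardown command for review instead of running it,
# and refuse outright if the resource group is protected.
print_teardown() {
  group=$1
  cluster=$2
  for protected in $PROTECTED_GROUPS; do
    if [ "$group" = "$protected" ]; then
      echo "refusing: resource group '$group' is protected" >&2
      return 1
    fi
  done
  echo "az aks delete --resource-group $group --name $cluster --output table"
}

print_teardown holdgraf-test-1 holdgraf-test-1
```

Printing rather than executing keeps a human in the loop for the one step that cannot be undone.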
still running...
Tried this again and got further this time, but I missed the note that you should skip step 2, the RBAC command. Is it possible to recover if someone doesn't skip it and runs the command in step 2?
Related perhaps, the helm install command resulted in
Error: clusterroles.rbac.authorization.k8s.io "pre-puller-holdgraf-test-1" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["list"]}] user=&{system:serviceaccount:kube-system:tiller c14254ed-fd59-11e7-b71c-0a58ac1f141b [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[clusterroles.rbac.authorization.k8s.io "cluster-admin" not found]
Yes, if you just run helm init --upgrade that should work
Hmm - that didn't seem to work. I got "happy helming" but then got the same error w/ the helm install command.
Error: clusterroles.rbac.authorization.k8s.io "pre-puller-holdgraf-test-1" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["list"]}] user=&{system:serviceaccount:kube-system:tiller c14254ed-fd59-11e7-b71c-0a58ac1f141b [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[clusterroles.rbac.authorization.k8s.io "cluster-admin" not found]
Also, opened https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/426 with preliminary improvements to instructions (still need to get the darn thing working though)
Try reversing just step 2 with:
kubectl delete clusterrolebinding tiller
Just ran that, ran helm init --upgrade, have same error:
Error: clusterroles.rbac.authorization.k8s.io "pre-puller-holdgraf-test-1" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["list"]}] user=&{system:serviceaccount:kube-system:tiller c14254ed-fd59-11e7-b71c-0a58ac1f141b [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[]
this kubectl get clusterrolebinding --namespace=kube-system doesn't list any tiller account 😕
and for good measure:
kubectl get clusterrolebindings
Out :
NAME AGE
pre-puller-holdgraf-test-1 2m
I've tried this again from scratch (deleting the resourcegroup, not following step 2 of the instructions, etc) and I'm getting the same error. I will try and figure this out next week again...
I just tried this again using the Azure interactive CLI, and ran into the same error. I think fixing this is going to require debugging from somebody more knowledgeable about RBAC / Azure than myself. I'm surprised that you two both got this to work since I have now tried and failed 3 times :-/
@choldgraf sorry to hear that! And thank you for trying!
Are you getting this error: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/410#issuecomment-359088283 or https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/410#issuecomment-359099340?
here's the error:
Error: clusterroles.rbac.authorization.k8s.io "pre-puller-jhub-holdgraf-1" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["nodes"], APIGroups:[""], Verbs:["list"]}] user=&{system:serviceaccount:kube-system:tiller f26d753c-ffa8-11e7-a27a-0a58ac1f11a0 [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[]
on the + side, it looks like the Azure interactive shell works relatively well
Can you give me the exact sequence of helm commands you executed?
The exact sequence is documented in the PR I made:
(skipping step 2)
Just to verify, you skipped running 'kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller' but did everything else, right?
Could this be related to the fact that we don't do the azure equivalent of #7 from the google instructions? (giving yourself admin status)
yes, the only command that I did not run was:
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
I think a cluster I just created is DOA, I filed https://github.com/Azure/AKS/issues/146
@choldgraf I followed them and it worked. Is the following present in your config.yaml:
rbac:
  enabled: false
? It's in the instructions at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/426/files#diff-6b237b4b3d55964c251c52afe6e3c1d9R267 but is easy to miss maybe? That would also explain the error you are having.
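Since the setting is easy to miss, here is a small sketch that adds it to the config file and prints it back. It assumes config.yaml in the current directory is the file later passed to helm, and it only appends when no top-level rbac: key exists yet, so an existing rbac: block would still need editing by hand.

```shell
# Assumption: config.yaml in the current directory is the file later passed to helm.
CONFIG_FILE="${CONFIG_FILE:-config.yaml}"

# Append the override only if no top-level rbac: key exists yet.
if ! grep -q '^rbac:' "$CONFIG_FILE" 2>/dev/null; then
  printf 'rbac:\n  enabled: false\n' >> "$CONFIG_FILE"
fi

# Show the result so the setting is hard to miss.
grep -A1 '^rbac:' "$CONFIG_FILE"
```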
ohhh damn, I totally missed that. I assumed that the link just linked to the section in install-helm and didn't think to click it. We should write that out explicitly I think.
Trying this w/ the fix now
ok that got me past the error, though jupyterhub has been "installing" for like 15 minutes now...is it normal to take that long on azure?
I see that there's a pod called pull-all-nodes etc, and its logs look like:
sh: 0: unknown operand
Pulled of 0 nodes
Pulled of 0 nodes
sh: 0: unknown operand
Pulled of 0 nodes
sh: 0: unknown operand
sh: 0: unknown operand
Pulled of 0 nodes
What's the commandline you are using to install? Which version of the chart? Try with the latest from jupyterhub.github.io/helm-chart, which is v0.6.0-be077bf (https://jupyterhub.github.io/helm-chart/jupyterhub-v0.6.0-be077bf.tgz). Should be 0.6.0 when it is released
It looks like you're using v0.5 of the chart, which won't work.
ah, yes you're right, didn't realize that this only worked on 0.6...trying now
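A quick pre-flight check could catch this before a long install. The following is a sketch under the assumption, taken from this thread only, that AKS needs chart 0.6 or newer; chart_ok_for_aks is a hypothetical helper, not something the chart or helm provides.

```shell
# Hypothetical check: AKS needed chart 0.6 or newer according to this thread.
# Accepts versions like v0.6.0-be077bf, 0.6.0, or v0.5.0.
chart_ok_for_aks() {
  version=${1#v}            # drop a leading "v" if present
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of that remainder
  [ "$major" -gt 0 ] || [ "${minor:-0}" -ge 6 ]
}

CHART_VERSION="v0.6.0-be077bf"
if chart_ok_for_aks "$CHART_VERSION"; then
  echo "chart $CHART_VERSION should be new enough for AKS"
else
  echo "chart $CHART_VERSION is too old for AKS" >&2
fi
```

The version would then be handed to helm install through its --version flag; the remaining flags depend on the install instructions and are not reproduced here.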
woot! ok that worked...it took way longer but it got there eventually. Will update the PR with latest changes
Thanks for all of the work on the Azure documentation. I had trouble getting AKS to launch (seemed maybe capacity related?) but I will try it again.
@jkuruzovich make sure that you follow the instructions in #426, and that you use the helm chart 0.6 (will be released this or next week)!
@yuvipanda @choldgraf I'm running through deploying on AKS on 0.5 so I can upgrade to 0.6. Running into a few errors. Wiping and starting from scratch. Will add a WIP PR for some small doc errors.
that would be super appreciated - I was new to azure when I wrote that PR so I'm sure there are some errors in there :-)
It's actually very good @choldgraf. The doc changes are minor. The bigger issue I was having was some helm and tiller weirdness/errors. Starting over.
> The bigger issue I was having was some helm and tiller weirdness/errors
story of my life over the last year :-P
AKS will probably not work with 0.5 at all, so I think it's ok to not test upgrades there.
Well that would explain it @yuvipanda. Nothing like troubleshooting to get one better at kubectl, helm, and friends. Should we recommend that folks on AKS do a clean install for 0.6?
This is done, I think @willingc and @choldgraf and @ryanlovett did the majority of the work here! <3
@yuvipanda thanks so much for pulling together these instructions. @captainsafia and I are going to investigate Binder + AKS for a project here at Microsoft. Is this a reasonably well-trodden path by other folks in your community? We are happy to update docs etc. as we do our investigation. Thx!
@jflam While we've put effort into getting these instructions out, we really would welcome more contributions from those that use AKS on a regular basis. Always fantastic to get PRs from long-time contributor @captainsafia (you are lucky to have her on your team). :tada:
Oh we know we are lucky to have @captainsafia on our team! Thx!
@jflam it would be awesome to see some cycles going in to improving these instructions and finding pain-points that we can improve upon, thanks you two :-)
@yuvipanda and @ryanlovett have also spent quite a bit of time working with Kubernetes in Azure (not sure if it was the K8S service or not) for our courses at Berkeley, and might be interested in hearing about your experiences
I’ll see if I can take a stab at running through these instructions sometime this weekend and open small PRs as I go along.
Thanks to everyone for getting these docs going!