Microk8s: Unable to enable kubeflow using channel 1.16/edge/kubeflow

Created on 24 Oct 2019 · 32 Comments · Source: ubuntu/microk8s

microk8s_kubeflow.txt
Please run microk8s.inspect and attach the generated tarball to this issue.

We appreciate your feedback. Thank you for using microk8s.
I am using the 1.16/edge/kubeflow channel, and when I run microk8s.enable kubeflow the command hangs at this step:

 microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling juju...
Deploying Kubeflow...

Any suggestions on how I can troubleshoot this?

All 32 comments

Hi @praveen049

Could you attach the tarball created by microk8s.inspect here? Could you also share the logs you get with microk8s.juju debug-log -n 2000? @knkski, can you think of anything else that could help us figure out what might be wrong here?

@ktsakalozos

Attaching it here.
inspection-report-20191024_054807.tar.gz

microk8s.juju debug-log -n 2000 gives ERROR Gateway Timeout.

I am behind a proxy and no_proxy is set correctly.
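Whether a no_proxy list really covers the Juju controller is worth checking, because the controller's service address differs between attempts in this thread (10.152.183.246 in one run, 10.152.183.89 in another). The following is a minimal sketch, not part of microk8s or juju: `in_no_proxy` is a hypothetical helper that only tests for an exact, literal entry in a comma-separated list. Tools differ in how they interpret no_proxy, so treat the result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an address appears verbatim
# in a comma-separated no_proxy-style list.
in_no_proxy() {
  local addr="$1" list="$2" entry
  local -a entries
  # Split on commas without triggering filename globbing.
  IFS=',' read -ra entries <<< "$list"
  for entry in "${entries[@]}"; do
    entry="${entry//[[:space:]]/}"   # drop stray spaces around entries
    if [[ "$entry" == "$addr" ]]; then
      echo "listed"
      return 0
    fi
  done
  echo "not listed"
  return 1
}

# Example (not run here): check the controller address against the environment.
#   in_no_proxy 10.152.183.246 "${no_proxy:-}"
```

An exact-match list has to be updated every time the controller is re-created with a new address, which is one reason a range-based setting is discussed further down the thread.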

@ktsakalozos

Any suggestions on how I can troubleshoot this issue?

Thanks

@praveen049: Are you able to run any microk8s.juju commands at all? Can you post microk8s.juju status if it runs successfully?

@knkski
I have tried a couple of commands, microk8s.juju status and microk8s.juju users, and they both hang.

@praveen049: Can you try microk8s.juju status --debug and see if you get any output? Otherwise, can you post the logs from the juju controller pod?

@knkski

Output of microk8s.juju status --debug:

(base) sims@kubeflow:~$ microk8s.juju status --debug
04:47:50 INFO  juju.cmd supercommand.go:79 running juju [2.7-rc1 gc go1.10.4]
04:47:50 DEBUG juju.cmd supercommand.go:80   args: []string{"/var/snap/microk8s/946/bin/juju", "status", "--debug"}
04:47:50 INFO  juju.juju api.go:67 connecting to API addresses: [10.152.183.246:17070]

The output of kubectl describe pods -n controller-uk8s is attached
juju_pod2.txt

@praveen049: Can you also post the logs from that pod? It looks like it's running normally.

@praveen049: If nothing else, can you try microk8s.disable kubeflow, or microk8s.juju unregister -y uk8s if that doesn't work, and then try microk8s.enable kubeflow again?

@knkski
I have reinstalled microk8s from the 1.16/edge/kubeflow channel. Previously it was installed from 1.14/stable and then switched to the 1.16/edge/kubeflow channel.

Attached are the logs from the mongodb and api-server pods
apiserver.txt
mongodb.txt

This time microk8s.enable kubeflow returns the error below:

(base) sims@trainer:~$ microk8s.enable kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling juju...
Deploying Kubeflow...
Creating Juju controller "uk8s" on microk8s/localhost
Creating k8s resources for controller "controller-uk8s"
Downloading images
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.89 to verify accessibility...
ERROR unable to contact api server after 1 attempts: Gateway Timeout

Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow 

Trying disable and unregister does not help either:

(base) sims@kubeflow:~$ microk8s.disable kubeflow
ERROR controller uk8s not found

Command '('microk8s-juju.wrapper', 'destroy-controller', '-y', 'uk8s', '--destroy-all-models', '--destroy-storage')' returned non-zero exit status 1
Failed to disable kubeflow
(base) sims@kubeflow:~$ microk8s.juju unregister -y uk8s
ERROR controller uk8s not found

@praveen049: Could you try updating the snap (sudo snap refresh microk8s), and running KUBEFLOW_DEBUG=true microk8s.enable kubeflow? That will add in the --debug flag to Juju, which should help diagnose what's going on here.

@knkski
Attached is the log with the debug option.
juju_status_error_debug.txt

@knkski
Any pointers on how to troubleshoot and fix this issue?

Thanks

@praveen049: Sorry about the wait. It looks like you've got a proxy issue. Can you either try it without the proxy involved, or post the output from this command?

microk8s.juju --debug bootstrap microk8s --config juju-no-proxy=10.0.0.1

@knkski thank you for the feedback.

Attached is the output
juju-noproxy.txt

The commands I used:
KUBEFLOW_DEBUG=true microk8s.enable kubeflow
This fails as before. Then I ran:

microk8s.juju --debug bootstrap microk8s --config juju-no-proxy=10.0.0.1

Thanks

@praveen049: It looks like the manual bootstrap command worked for you, so I've added in the flag that should fix things for you in PR #785.
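For anyone who needs to exclude more than one address, the manual bootstrap command above can be assembled from a list. This is a minimal sketch under assumptions: `bootstrap_cmd` is a hypothetical helper, it only prints the command rather than running it, and it assumes juju-no-proxy takes a comma-separated list. Only the single value 10.0.0.1 was actually tested in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the manual bootstrap command from this thread
# with every argument joined into one comma-separated juju-no-proxy value.
bootstrap_cmd() {
  local IFS=','   # "$*" joins arguments with the first character of IFS
  printf 'microk8s.juju --debug bootstrap microk8s --config juju-no-proxy=%s\n' "$*"
}

# Reproduces the command that worked above:
bootstrap_cmd 10.0.0.1
```

Printing the command first makes it easy to review the value before running it by hand.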

@knkski thank you for the fix.

So I need to install again from the channel and enable kubeflow with the commands below?

sudo snap install microk8s --classic --channel 1.16/edge/kubeflow
microk8s.enable kubeflow

@knkski
Based on the discussion on the thread for PR 785, it seems the fix was not merged. Is there any other solution or workaround to get it working?

Thanks

Installing from the 1.16/edge/kubeflow channel worked for me 2 days ago:

sudo snap install microk8s --classic --channel 1.16/edge/kubeflow
microk8s.enable kubeflow

But now I am getting this error:

KUBEFLOW_DEBUG=true sudo  microk8s.enable  kubeflow
Enabling dns...
Enabling storage...
Enabling dashboard...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Located bundle "cs:bundle/kubeflow-134"
ERROR cannot deploy bundle: the provided bundle has the following errors:
empty charm path
invalid charm URL in application "ambassador-auth": cannot parse URL "": name "" not valid

Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpnmhsn4l0')' returned non-zero exit status 1
Failed to enable kubeflow

@charlesa101: Apologies, can you run sudo snap switch microk8s --channel edge && sudo snap refresh? I think that particular channel is no longer getting updated and will disappear due to the feature getting merged into master.

@knkski
I have now tried with the new channel and the proxy issue seems to be resolved. Thanks for that.

I am now running into a different error:

Resolving charm: cs:~kubeflow-charmers/seldon-cluster-manager-47
Resolving charm: cs:~kubeflow-charmers/tensorboard-46
Resolving charm: cs:~kubeflow-charmers/tf-job-dashboard-48
Resolving charm: cs:~kubeflow-charmers/tf-job-operator-46
ERROR cannot deploy bundle: cannot add charm "cs:~kubeflow-charmers/ambassador-47": cannot retrieve charm "cs:~kubeflow-charmers/ambassador-47": cannot get archive: Get https://api.jujucharms.com/charmstore/v5/~kubeflow-charmers/ambassador-47/archive?channel=stable: dial tcp: lookup api.jujucharms.com on 10.152.183.10:53: server misbehaving

Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpu9t_xu6d')' returned non-zero exit status 1
Failed to enable kubeflow

Attached are the full logs:
microk8s-19Nov.txt

Any pointers on how to resolve this?

Thanks

@charlesa101
Are you able to deploy Kubeflow from the edge channel?

@knkski

Any suggestions on how to troubleshoot this issue?

@praveen049 Yes, I was able to get this running from the edge channel.

Before that, though, I had to clean up my snap directory.

I ran sudo snap switch microk8s --channel edge && sudo snap refresh, like @knkski said.

Then microk8s enable storage dns rbac juju kubeflow did it for me.

@charlesa101 thanks for the info

But these commands are not working for me. Deploying kubeflow fails with the error below:

ERROR cannot deploy bundle: cannot add charm "cs:~kubeflow-charmers/ambassador-47": cannot retrieve charm "cs:~kubeflow-charmers/ambassador-47": cannot get archive: Get https://api.jujucharms.com/charmstore/v5/~kubeflow-charmers/ambassador-47/archive?channel=stable: dial tcp: lookup api.jujucharms.com on 10.152.183.10:53: server misbehaving

Command '('microk8s-juju.wrapper', 'deploy', 'kubeflow', '--channel', 'stable', '--overlay', '/tmp/tmpu9t_xu6d')' returned non-zero exit status 1
Failed to enable kubeflow

I am running behind a proxy, and the failure seems to be related to that.

@praveen049: Yeah, that could definitely be a proxy issue. @ktsakalozos, do you know how we should handle that?

@praveen049: Can you post the output from KUBEFLOW_DEBUG=true microk8s.enable kubeflow? That should output some more useful information

@knkski

Attaching the debug output
microk8s-debug-27Nov.txt

@wallyworld, how would we put the pods subnet in no-proxy? Looking at https://discourse.jujucharms.com/t/configuring-models/1151, no-proxy does not take CIDR notation, so it cannot fit a /16 network. What about juju-no-proxy? Could we use that one?
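To make the CIDR question concrete: a /16 covers 65,536 addresses, so a setting that only accepts individual entries cannot realistically enumerate it. The sketch below shows what a CIDR match means for IPv4. `ip_to_int` and `ip_in_cidr` are hypothetical helpers, and the ranges in the examples (10.152.183.0/24, inferred from the service addresses seen in this thread, and 10.1.0.0/16 as a pod subnet) are assumptions, not values confirmed by the participants. It does not show how Juju itself evaluates either setting.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating IPv4 CIDR membership.

# Convert dotted-quad IPv4 to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo "$(( (a << 24) | (b << 16) | (c << 8) | d ))"
}

# Succeed (exit 0) if IP $1 falls inside CIDR $2, e.g. 10.1.0.0/16.
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  # Build the network mask for the given prefix length.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}

# Both controller addresses from this thread fall in the assumed /24.
if ip_in_cidr 10.152.183.246 10.152.183.0/24 && ip_in_cidr 10.152.183.89 10.152.183.0/24; then
  echo "both controller addresses match the assumed range"
fi
```

A single range entry would therefore keep working when the controller comes back on a different address, provided the setting in question accepts ranges at all.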

@ktsakalozos @wallyworld @knkski

Hi, any suggestions on how to address this proxy issue?

Thanks

I think juju-no-proxy may work, but in practice it can be hit and miss depending on the environment in which stuff is running.

Just wanted to add in case anybody got here from Google that in my case, the problem was that I had a folder kubeflow in my home directory from a previous installation (now on 1.17/stable) and the juju command was therefore ambiguous between cs:kubeflow and my local folder. I found this by setting KUBEFLOW_DEBUG=true and saw this message:

/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/application/deploy.go:1340: The charm or bundle "kubeflow" is ambiguous.

So I just changed to a different directory before running the command, and that fixed it. Kubeflow then deployed perfectly.
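A quick pre-flight check for this situation might look like the sketch below. `check_no_local_shadow` is a hypothetical helper, not part of microk8s or juju. It assumes the ambiguity arises whenever a file or directory with the same name as the bundle exists in the directory the command is run from, which matches the case described above but has not been verified against Juju's resolution rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: warn when a local path could shadow the bundle name.
# Usage: check_no_local_shadow [name] [dir]   (defaults: kubeflow, current dir)
check_no_local_shadow() {
  local name="${1:-kubeflow}" dir="${2:-$PWD}"
  if [ -e "$dir/$name" ]; then
    echo "warning: $dir/$name exists and may make the bundle name \"$name\" ambiguous" >&2
    return 1
  fi
  return 0
}

# Example (not run here): only enable kubeflow when nothing shadows the name.
#   check_no_local_shadow kubeflow && microk8s.enable kubeflow
```

If the check fails, changing to another directory, as described above, avoids the clash without touching the old folder.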
