Microk8s: Unable to login into KubeFlow after the addon is enabled

Created on 21 Nov 2019 · 26 Comments · Source: ubuntu/microk8s

inspection-report-20191121_130543.tar.gz

Installed microk8s on a 32-core/64GB workstation running the latest 18.04 LTS. The snap install and addon enablement went fine:

snap install microk8s --classic
microk8s.enable registry dns rbac storage knative kubeflow istio
...all worked...

KubeFlow is at 0.6.0-rc0

At the end of install, I see that KubeFlow is up and running (all objects in kubeflow namespace are RUNNING).

However, I have no way to log in. The best I could come up with is:

$ microk8s.kubectl port-forward svc/ambassador -n kubeflow 8080:80

Which does seem to trigger Ambassador and redirect me to the https://localhost:8080/kflogin page. But at that point the SSL connection fails with either:

SSL_ERROR_RX_RECORD_TOO_LONG (Firefox)
ERR_SSL_PROTOCOL_ERROR (Chrome)

I assume a self-signed certificate is installed via the microk8s addon process.

Again, the dashboard is definitely up since I can port-forward to it directly, but then all links off of that page return 404 (I suspect the redirects are broken because of the direct connection to 8082). Help!

All 26 comments

Update: One little tidbit I forgot to mention is that the original enablement of KubeFlow timed out as my pods were initializing (it just takes TIME for me to download the images and the pods to come up to a RUNNING state).

My guess is that the ingress it normally creates was never created because of the timeout. But that still seems like a bug.

Were you able to manually fix this then? I'm getting the same error and from what I can tell everything was able to get started w/o a timeout, though I still get SSL_ERROR_RX_RECORD_TOO_LONG.

There was A LOT of manual intervention to get this all to work right.

I am re-opening this since it seems there is more to be discussed.

The issue was closed automatically by the merge of the referenced PR.

Yeah, I am having the same error as well.

@knkski any idea?

@charlesa101, @tshauck: If the deploy process timed out previously, you won't have the necessary patch to the Ingress for Ambassador. Can you see if microk8s.kubectl get -n kubeflow ingress -oyaml shows a tls section? A missing tls section will be responsible for the SSL_ERROR_RX_RECORD_TOO_LONG error.

If not, can you try one of these:

  1. Make sure you're on the latest microk8s snap, then run microk8s.disable kubeflow && microk8s.enable kubeflow. That removes the timeout, so it should properly patch in the tls section.

  2. Manually patch in the tls section by passing this in to microk8s.kubectl apply:

    {"kind": "Ingress", "apiVersion": "extensions/v1beta1", "metadata": {"name": "ambassador", "namespace": "kubeflow"}, "spec": {"tls": [{"hosts": ["localhost"], "secretName": "ambassador-tls"}]}}
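The check described above (does the Ingress carry a tls section?) can be sketched in Python. This is only a minimal sketch, not part of MicroK8s: the helper names has_tls_section and ingresses_missing_tls are hypothetical, and it assumes you have already parsed the output of microk8s.kubectl get -n kubeflow ingress -o json into a dict (for example with json.loads). TLS_PATCH is simply the manual patch quoted above.

```python
import json

# The manual patch quoted above, expressed as a Python dict.
TLS_PATCH = {
    "kind": "Ingress",
    "apiVersion": "extensions/v1beta1",
    "metadata": {"name": "ambassador", "namespace": "kubeflow"},
    "spec": {"tls": [{"hosts": ["localhost"], "secretName": "ambassador-tls"}]},
}


def has_tls_section(ingress: dict) -> bool:
    """Return True if a parsed Ingress object has a non-empty spec.tls."""
    return bool(ingress.get("spec", {}).get("tls"))


def ingresses_missing_tls(ingress_list: dict) -> list:
    """Given a parsed `kind: List` of Ingresses, return the names lacking a tls section."""
    return [
        item["metadata"]["name"]
        for item in ingress_list.get("items", [])
        if not has_tls_section(item)
    ]


if __name__ == "__main__":
    # Hypothetical sample input: one Ingress that has rules but no tls section.
    sample = {
        "kind": "List",
        "items": [{"metadata": {"name": "ambassador"}, "spec": {"rules": []}}],
    }
    print(ingresses_missing_tls(sample))
    # Serialized patch, suitable for saving to a file and passing to kubectl apply -f.
    print(json.dumps(TLS_PATCH))
```

If ingresses_missing_tls reports ambassador, the manual patch in option 2 is the one to try.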
    

@knkski thanks for the response...

The output of the get ingress command:

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      ingress.kubernetes.io/rewrite-target: ""
      ingress.kubernetes.io/ssl-passthrough: "false"
      ingress.kubernetes.io/ssl-redirect: "false"
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{},"name":"ambassador","namespace":"kubeflow"},"spec":{"tls":[{"hosts":["localhost"],"secretName":"ambassador-tls"}]}}
      kubernetes.io/ingress.allow-http: "false"
      kubernetes.io/ingress.class: nginx
    creationTimestamp: "2019-11-28T21:38:28Z"
    generation: 5
    labels:
      juju-controller-uuid: f7957347-4c03-42dc-8bc7-4920b7605993
      juju-model-uuid: 9fc33d74-abfd-488c-83e2-c27858204cae
    name: ambassador
    namespace: kubeflow
    resourceVersion: "99139"
    selfLink: /apis/extensions/v1beta1/namespaces/kubeflow/ingresses/ambassador
    uid: aca579ad-a535-4bd5-999f-add373b266e1
  spec:
    rules:
    - host: localhost
      http:
        paths:
        - backend:
            serviceName: ambassador
            servicePort: 80
          path: /
    tls:
    - hosts:
      - localhost
      secretName: ambassador-tls
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

It looks like it does have a tls section from what I can tell. Manually patching with the second command returns the same thing.

If helpful my port forward command and some output...

thauck@mmm:~$ microk8s.kubectl port-forward --address 0.0.0.0 -n kubeflow svc/ambassador 8080:80
Forwarding from 0.0.0.0:8080 -> 80
Handling connection for 8080
Handling connection for 8080
E1128 21:51:19.533905   20901 portforward.go:385] error copying from local connection to remote stream: read tcp4 172.16.0.12:8080->172.16.0.9:56972: read: connection reset by peer

I've also attached the output from microk8s.inspect...
inspection-report-20191128_215323.tar.gz

@tshauck: You shouldn't need to run any port forwarding. After you've enabled Kubeflow, you should be able to just go to https://localhost/ to get to the dashboard. The Kubeflow dashboard requires HTTPS to be enabled, and all of the certificate handling happens inside of Kubernetes via Ingress resources.

I have the same issue.
@knkski: Patching TLS is not necessary. If I patch it, nothing changes because the tls section already exists.
curl against port 80 returns:
user@not-microk8s-machine:~$ curl -k https://192.168.178.51:80 -v

* Rebuilt URL to: https://192.168.178.51:80/
*   Trying 192.168.178.51...
* TCP_NODELAY set
* Connected to 192.168.178.51 (192.168.178.51) port 80 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* error:1408F10B:SSL routines:ssl3_get_record:wrong version number
* stopped the pause stream!
* Closing connection 0
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number

To be honest, I am not sure whether this is a microk8s issue, because when I run:
microk8s-user@definitely-microk8s-machine:~$ curl -k https://localhost -v
it returns:

*   Trying 127.0.0.1:443...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  start date: Jan 29 12:39:37 2020 GMT
*  expire date: Jan 28 12:39:37 2021 GMT
*  issuer: O=Acme Co; CN=Kubernetes Ingress Controller Fake Certificate
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55c82857b1d0)
> GET / HTTP/2
> Host: localhost
> User-Agent: curl/7.65.3
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 307 
< server: openresty/1.15.8.1
< date: Wed, 29 Jan 2020 17:42:46 GMT
< content-type: text/html; charset=utf-8
< content-length: 61
< location: https://localhost/kflogin
< x-envoy-upstream-service-time: 1
< strict-transport-security: max-age=15724800; includeSubDomains
< 
<a href="https://localhost/kflogin">Temporary Redirect</a>.

* Connection #0 to host localhost left intact

Which seems fine (apart from downgrading to TLS 1.2).

Hi!

I was able to log in to Kubeflow once by following https://github.com/kubeflow/website/blob/master/content/docs/other-guides/virtual-dev/getting-started-multipass.md#install-kubeflow-using-microk8s but after a machine reboot, https://localhost/kflogin reports ERR_SSL_PROTOCOL_ERROR in Chrome and SSL_ERROR_INTERNAL_ERROR_ALERT in Firefox.

After the reboot I just ran:

$ microk8s.status --wait-ready

microk8s is running
addons:
cilium: disabled
dashboard: enabled
dns: enabled
fluentd: disabled
gpu: disabled
helm3: disabled
helm: disabled
ingress: enabled
istio: disabled
jaeger: disabled
juju: enabled
knative: disabled
kubeflow: enabled
linkerd: disabled
metallb: disabled
metrics-server: disabled
prometheus: disabled
rbac: enabled
registry: disabled
storage: enabled

Feel free to ask for any extra details that would help.

Hello,
Same error for me:

samy@cube:~$ multipass info kubeflow-vm 
Name:           kubeflow-vm
State:          Running
IPv4:           10.104.38.12
Release:        Ubuntu 18.04.3 LTS
Image hash:     ae2c9391b71a (Ubuntu 18.04 LTS)
Load:           1.49 2.24 2.34
Disk usage:     32,4G out of 38,6G
Memory usage:   4,4G out of 15,7G
samy@cube:~$ curl -k -v https://10.104.38.12:80
* Rebuilt URL to: https://10.104.38.12:80/
*   Trying 10.104.38.12...
* TCP_NODELAY set
* Connected to 10.104.38.12 (10.104.38.12) port 80 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* error:1408F10B:SSL routines:ssl3_get_record:wrong version number
* stopped the pause stream!
* Closing connection 0
curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number

With services configuration:

ubuntu@kubeflow-vm:~$ microk8s.status
microk8s is running
addons:
cilium: disabled
dashboard: enabled
dns: enabled
fluentd: disabled
gpu: disabled
helm3: disabled
helm: disabled
ingress: enabled
istio: disabled
jaeger: disabled
juju: enabled
knative: disabled
kubeflow: enabled
linkerd: disabled
metallb: disabled
metrics-server: disabled
prometheus: disabled
rbac: enabled
registry: disabled
storage: enabled

@pgronau: The issue you're seeing is probably caused by the URL https://192.168.178.51:80, which tries to use HTTPS on an HTTP port. Also, microk8s.enable kubeflow defaults to an external hostname of localhost. If you want to use a particular IP, you'll have to run microk8s.kubectl edit -n kubeflow ingress/ambassador and change spec.rules[0].host to <YOUR IP HERE>.xip.io (the .xip.io suffix is required because Kubernetes does not allow bare IP addresses as Ingress hosts). I'll open a PR that will allow you to set the external hostname during microk8s.enable kubeflow.
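As an illustration of the hostname format described above, here is a small Python sketch that turns an IPv4 address into the value you would put in spec.rules[0].host. The function name xip_hostname is hypothetical; the only thing taken from the thread is the <IP>.xip.io pattern.

```python
import ipaddress


def xip_hostname(ip: str) -> str:
    """Turn an IPv4 address into an <ip>.xip.io hostname for spec.rules[0].host."""
    # ipaddress.IPv4Address raises ValueError for anything that is not a valid
    # IPv4 address, so typos are caught before they reach the Ingress.
    addr = ipaddress.IPv4Address(ip)
    return f"{addr}.xip.io"


if __name__ == "__main__":
    print(xip_hostname("192.168.178.51"))  # 192.168.178.51.xip.io
```

The same hostname then has to be used in the browser, since the Ingress rule only matches requests for that host.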

@dkurt: I believe I'm able to reproduce your issue. Doing sudo snap stop microk8s && sudo snap start microk8s (or probably also rebooting the host in your case) causes the ingress to lose the TLS section. Can you confirm this by pasting the output from microk8s.kubectl get -n kubeflow ingress/ambassador -oyaml? I'm investigating as to why that is.

@SamChb72: The SSL error you're getting is from trying to talk HTTPS on an HTTP port. Does using https://10.104.38.12 make the SSL error go away? You may still get a 404 not found error, in which case you'll have to edit spec.rules[0].host in ingress/ambassador to use 10.104.38.12.xip.io
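The "HTTPS on an HTTP port" mistake is easy to spot mechanically. The following is a minimal Python sketch (the helper name https_on_http_port is hypothetical) that flags URLs such as the ones used in the curl commands above:

```python
from urllib.parse import urlsplit


def https_on_http_port(url: str) -> bool:
    """Return True if the URL uses the https scheme but explicitly targets port 80."""
    parts = urlsplit(url)
    # parts.port is None when the URL has no explicit port, in which case
    # https falls back to its default port 443 and there is no mismatch.
    return parts.scheme == "https" and parts.port == 80


if __name__ == "__main__":
    print(https_on_http_port("https://10.104.38.12:80"))  # True
    print(https_on_http_port("https://10.104.38.12"))     # False
```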

Hello @knkski, I replaced localhost everywhere in the file and it worked :-)

@dkurt: #965 should fix the issue that you're running into with cluster restart. Until that's merged and released, you can manually edit ingress/ambassador to add in a TLS section as shown here:

https://github.com/ubuntu/microk8s/blob/8ec87c8/microk8s-resources/actions/enable.kubeflow.sh

That will require that you edit it each time after you restart the cluster. Otherwise, you can copy ingress/ambassador under a different name and run microk8s.juju unexpose ambassador; any edits you make to the copy will then persist across reboots.
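One way to prepare such a copy is sketched below in Python. It is an assumption-laden sketch rather than the method used by MicroK8s: the function name copy_ingress and the list of fields to drop are hypothetical, and it operates on an Ingress that has already been parsed from microk8s.kubectl get -n kubeflow ingress/ambassador -o json.

```python
import copy

# Fields managed by the API server; assumed here to be safe to drop
# when re-creating the object under a new name.
SERVER_MANAGED = ("creationTimestamp", "generation", "resourceVersion", "selfLink", "uid")


def copy_ingress(ingress: dict, new_name: str) -> dict:
    """Return a deep copy of a parsed Ingress, renamed and stripped of server-managed fields."""
    clone = copy.deepcopy(ingress)
    meta = clone.setdefault("metadata", {})
    for field in SERVER_MANAGED:
        meta.pop(field, None)
    meta["name"] = new_name
    # status is written by the cluster, not by the user.
    clone.pop("status", None)
    return clone


if __name__ == "__main__":
    original = {
        "kind": "Ingress",
        "metadata": {"name": "ambassador", "namespace": "kubeflow", "uid": "example-uid"},
        "spec": {"tls": [{"hosts": ["localhost"], "secretName": "ambassador-tls"}]},
        "status": {"loadBalancer": {}},
    }
    print(copy_ingress(original, "ambassador-manual"))
```

The resulting dict can be serialized with json.dumps and passed to microk8s.kubectl apply. The name ambassador-manual is only an example.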

@SamChb72: Good to hear! #965 will make localhost the default value, but allows setting a KUBEFLOW_HOSTNAME environment variable for microk8s.enable kubeflow that will override localhost.

@knkski, Thanks!

Hello, I am trying to change spec.rules[0].host but I get:
spec.rules[0].host: Invalid value: "192.168.XXX.XXXX": must be a DNS name, not an IP address

@ricpet: Yeah, that's a limitation of how Ingresses work. You'll need to use an address such as 192.168.xxx.xxx.xip.io; xip.io is a DNS service that resolves such a hostname to the IP address embedded in it.
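To make the mapping concrete, this Python sketch recovers the IP address embedded in such a hostname. It is a simplification under stated assumptions: the function name embedded_ip is hypothetical, and it only handles the plain <IP>.xip.io form mentioned in this thread.

```python
import ipaddress
from typing import Optional


def embedded_ip(hostname: str, suffix: str = ".xip.io") -> Optional[str]:
    """Return the IPv4 address embedded in an <ip>.xip.io hostname, or None."""
    if not hostname.endswith(suffix):
        return None
    candidate = hostname[: -len(suffix)]
    try:
        # Validate that what precedes the suffix really is an IPv4 address.
        return str(ipaddress.IPv4Address(candidate))
    except ValueError:
        return None


if __name__ == "__main__":
    print(embedded_ip("192.168.1.10.xip.io"))  # 192.168.1.10
    print(embedded_ip("localhost"))            # None
```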

@knkski thanks, you are right...I made the change but still no dashboard... should I re-enable kubeflow?

@ricpet: How are you accessing the dashboard, and what error are you getting? Also, it won't hurt to re-enable kubeflow if you run into any issues.

@knkski actually in the browser I was not adding .xip.io to the IP address of the machine. Now it works... thanks a lot for your help

@knkski After making these changes, do I need to restart microk8s or re-enable kubeflow? Additionally, if I am running microk8s with kubeflow enabled on a server via ssh from my workstation, do I need to do any additional port-forwarding to access the dashboard from my workstation? I am getting the same 404 issues presently. Thanks!

@TomLaMantia: Kubeflow 1.0 came with the release of MicroK8s 1.18, and removes the need to manually edit ingresses. If you haven't yet updated, run sudo snap refresh microk8s. You may encounter issues with proxying the dashboard via SSH; the preferred method is this:

# First disable kubeflow if you have it enabled
microk8s.disable kubeflow
# Then, set the hostname to use when enabling kubeflow
KUBEFLOW_HOSTNAME=my-other-machines-hostname-here microk8s.enable kubeflow

For example, I have a beefy machine on my network with DNS set up as foo.lan. I SSH over to it and run KUBEFLOW_HOSTNAME=foo.lan microk8s.enable kubeflow, then access it via foo.lan in my browser.
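If you script this step, the environment variable can be set for the child process only. Below is a minimal Python sketch of that idea. The helper name enable_kubeflow_invocation is hypothetical, and the function only builds the command and environment; it does not run anything itself.

```python
import os
from typing import Dict, List, Optional, Tuple


def enable_kubeflow_invocation(
    hostname: str, base_env: Optional[Dict[str, str]] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Build the argv and environment for `microk8s.enable kubeflow` with a custom hostname."""
    # Copy the environment so the caller's (or the process's) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["KUBEFLOW_HOSTNAME"] = hostname
    return ["microk8s.enable", "kubeflow"], env


if __name__ == "__main__":
    argv, env = enable_kubeflow_invocation("foo.lan")
    print(argv, env["KUBEFLOW_HOSTNAME"])
```

To actually run it, pass the result to subprocess.run(argv, env=env, check=True) on the machine where MicroK8s is installed.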

Marking this issue as complete since the original issue should be fixed
