Che: Unable to start Che by chectl with default TLS - keycloak certificate could not be validated

Created on 19 Mar 2020 · 9 Comments · Source: eclipse/che

Describe the bug

After default TLS was added to chectl, I'm unable to start Che. All pods start, but the Che pod is stuck in an infinite restart loop. The logs reveal:

1) Error injecting constructor, java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.10.48.188.27.nip.io/auth/realms/che/.well-known/openid-configuration
Caused by: java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.10.48.188.27.nip.io/auth/realms/che/.well-known/openid-configuration
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Che version

  • [ ] latest
  • [X] nightly
  • [ ] other: please specify

Steps to reproduce

  • Start Che using chectl server:start --multiuser --platform=minikube
  • Wait till startup fails - che pod never gets ready
  • Eclipse Che pod bootstrap
    ✔ scheduling...done.
    ✔ downloading images...done.
    ✖ starting
    → ERR_TIMEOUT: Timeout set to pod ready timeout 130000
    Retrieving Eclipse Che server URL
    Eclipse Che status check
    › Error: Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
    › Installation failed, check logs in '/tmp/chectl-logs/1584613990270'

    1. The Che pod never becomes ready; it is stuck in an infinite restart loop

    [root@czprapd-chenext ~]# kubectl get all --all-namespaces
    NAMESPACE NAME READY STATUS RESTARTS AGE
    che pod/che-5968b96bfd-smnnj 0/1 Running 1 7m11s
    che pod/che-operator-844f4bd4f9-jvvsf 1/1 Running 0 9m3s
    che pod/devfile-registry-79945cb69f-2bjpk 1/1 Running 0 7m34s
    che pod/keycloak-56c8774bbf-qp59p 1/1 Running 0 8m33s
    che pod/plugin-registry-658bb57ff5-fqj5g 1/1 Running 0 7m21s
    che pod/postgres-984c4cd5c-t2nr8 1/1 Running 0 8m59s
    kube-system pod/coredns-6955765f44-5h9wz 1/1 Running 0 13m
    kube-system pod/coredns-6955765f44-zfnwm 1/1 Running 0 13m
    kube-system pod/etcd-czprapd-chenext 1/1 Running 0 13m
    kube-system pod/kube-apiserver-czprapd-chenext 1/1 Running 0 13m
    kube-system pod/kube-controller-manager-czprapd-chenext 1/1 Running 0 13m
    kube-system pod/kube-proxy-qb8vz 1/1 Running 0 13m
    kube-system pod/kube-scheduler-czprapd-chenext 1/1 Running 0 13m
    kube-system pod/nginx-ingress-controller-6fc5bcc8c9-5szfh 1/1 Running 0 9m9s
    kube-system pod/storage-provisioner 1/1 Running 0 13m

    Expected behavior


    Normal start of Che as before TLS was made default

    Runtime

    • [ ] kubernetes (include output of kubectl version)
    • [ ] Openshift (include output of oc version)
    • [X] minikube (include output of minikube version and kubectl version)
    • [ ] minishift (include output of minishift version and oc version)
    • [ ] docker-desktop + K8S (include output of docker version and kubectl version)
    • [ ] other: (please specify)

    [root@czprapd-chenext ~]# minikube version
    minikube version: v1.8.1
    commit: cbda04cf6bbe65e987ae52bb393c10099ab62014

    [root@czprapd-chenext ~]# kubectl version
    Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:07:13Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

    Screenshots

    Installation method

    • [X] chectl
    • [ ] che-operator
    • [ ] minishift-addon
    • [ ] I don't know
      chectl server:start --multiuser --platform=minikube

    Environment

    • [ ] my computer

      • [ ] Windows

      • [ ] Linux

      • [ ] macOS

    • [ ] Cloud

      • [ ] Amazon

      • [ ] Azure

      • [ ] GCE

      • [ ] other (please specify)

    • [X] other: please specify
      Custom VM, CentOS 7, minikube on docker

    Eclipse Che Logs


    che.log
    che-operator.log
    che-devfile-registry.log
    keycloak.log
    che-plugin-registry.log
    postgres.log
    events.txt

    Additional context


    Started happening on our automated nightly-deployed machine, which deploys che-next from nightly builds daily.
    I can replicate the same scenario on different machines.

    area/chectl kind/enhancement kind/question severity/P1

    All 9 comments

    @filipkroupa you missed --self-signed-cert flag while deploying Che (well, I guess you don't have real certificate for minikube).
    Could you please redeploy Che with that flag on?

    @mmorhun Hi, I tried to redeploy with the flag
    chectl server:start --multiuser --platform=minikube --self-signed-cert
    but with little luck. The behavior stays the same, che pod won't get ready. What has changed is the exception I'm getting:

    5) Error injecting constructor, java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.10.48.188.27.nip.io/auth/realms/che/.well-known/openid-configuration
    Caused by: java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.10.48.188.27.nip.io/auth/realms/che/.well-known/openid-configuration
    Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching keycloak-che.10.48.188.27.nip.io found.
    Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching keycloak-che.10.48.188.27.nip.io found.

    Is there a reason why special flags are necessary now that TLS is the default? No TLS-related flag is marked as _required_. The usage for chectl mentions:

    -s, --tls
    Enable TLS encryption.
    Note, that this option is turned on by default for kubernetes infrastructure.
    If it is needed to provide own certificate, 'che-tls' secret with TLS certificate must be
    created in the configured namespace. Otherwise, it will be automatically generated.

    I thought that the auto-generated certificate is always self-signed, am I right?
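
The second exception looks like a hostname mismatch rather than a trust failure: none of the subject alternative names in the served certificate covers keycloak-che.10.48.188.27.nip.io. Below is a minimal sketch of the matching rule, assuming the simplified convention that a leading `*.` in a SAN entry stands for exactly one leftmost label. `san_matches` is a hypothetical helper (not part of chectl or Java), and the wildcard entry `*.10.48.188.27.nip.io` is only an assumed example of what a suitable certificate might contain.

```shell
# Hypothetical helper, for illustration only.
# Returns 0 when hostname $1 is covered by the SAN DNS entry $2.
# Simplification: case-sensitive comparison, and a leading "*." is the
# only wildcard form handled; it matches exactly one leftmost label.
san_matches() {
  host=$1
  pattern=$2
  case $pattern in
    '*.'*)
      suffix=${pattern#\*}     # e.g. ".10.48.188.27.nip.io"
      label=${host%%.*}        # leftmost label of the hostname
      rest=${host#"$label"}    # remainder, starting with "."
      [ -n "$label" ] && [ "$rest" = "$suffix" ]
      ;;
    *)
      # No wildcard: the entry must equal the hostname exactly.
      [ "$host" = "$pattern" ]
      ;;
  esac
}
```

For example, `san_matches keycloak-che.10.48.188.27.nip.io '*.10.48.188.27.nip.io'` succeeds, while the same hostname checked against `'*.nip.io'` fails because the wildcard does not span several labels.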

    @filipkroupa sorry I wasn't clear enough. I assumed that you were following the doc.

    @mmorhun Oh I see now what you mean. I never needed TLS, so I did not know about any prerequisites. I guess I just assumed since TLS was made default it is internally doing everything. So, is there a way to not use TLS? Because I've looked at the guide and this creates a lot of unnecessary overhead, I don't really need https in my dev/test environment.

    @filipkroupa it is still possible, but not recommended. One of the reasons is that some Theia functionality doesn't work properly without TLS.
    So, to disable it, you should provide a patch to the Che custom resource:

    # patch.yaml
    spec:
      server:
        tlsSupport: false
    
    chectl server:start --platform=minikube --multiuser --installer=operator --che-operator-cr-patch-yaml=patch.yaml
    

    P.S. Thank you for pointing out the misleading tls flag description. I'll update it.

    @mmorhun
    In the context of this issue, the pre-flight checks need to be improved for the case of the --tls flag.
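
One possible shape for such a pre-flight check is sketched below. It is an assumption for illustration, not what chectl actually implements: `preflight_tls` is a hypothetical function name, and the default namespace `che` is taken from the pod listing earlier in this issue. It only tests whether the 'che-tls' secret named in the flag description exists, using the standard `kubectl get secret` command.

```shell
# Hypothetical pre-flight helper; not part of chectl.
# Checks whether the 'che-tls' secret exists in the given namespace
# (default: che) before a TLS-enabled deployment is attempted.
preflight_tls() {
  namespace=${1:-che}
  if kubectl get secret che-tls --namespace "$namespace" >/dev/null 2>&1; then
    echo "che-tls secret found in namespace '$namespace'"
  else
    # Per the flag description, a missing secret means the certificate
    # will be generated automatically instead of being user-provided.
    echo "che-tls secret not found in namespace '$namespace'" >&2
    return 1
  fi
}
```

It could be called as `preflight_tls che` before `chectl server:start`; a non-zero exit status signals that no user-provided certificate is in place.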

    @mmorhun I've followed the guide you provided and it works with https for me now, thank you. I have one note regarding the guide: in the section on importing certificates into the Chrome browser, the provided address chrome://settings/certificates opens just a blank page for me and my colleagues. I don't know if this is because of the Chrome version or possibly our company-managed browsers. Anyway, we need to import the certificate into _Trusted Root Certification Authorities_ from a different settings location: Chrome -> Settings -> Privacy and security -> More -> Manage certificates

    @filipkroupa I've just tried to open the chrome://settings/certificates URL and it worked for me. However, I always get there manually as you described.

    I am marking this as resolved, as the docs have been changed and some TLS configuration checks have been added.
