Che: Eclipse Che pod bootstrap timeout on chectl install, when using Che operator with TLS and unsigned certificate on non-OpenShift kube

Created on 6 Mar 2020 · 14 Comments · Source: eclipse/che

Describe the bug

When attempting to install Che 7.9.0 on generic Kubernetes, with TLS enabled and a self-signed certificate, using chectl via the Che operator, the Che pod fails to start due to an inability to connect to Keycloak.

chectl server:start --platform=k8s --installer=operator --domain=(cluster ip).nip.io --che-operator-cr-yaml=./codewind-checluster.yaml --che-operator-image=quay.io/eclipse/che-operator:7.9.0 --tls --self-signed-cert

The Che pod appears not to allow connecting to Keycloak via a self-signed certificate.

As the attached Che pod logs show, the Che pod fails to start with the following exception:

Caused by: java.lang.RuntimeException: Exception while retrieving OpenId configuration from endpoint: https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/.well-known/openid-configuration
  at org.eclipse.che.multiuser.keycloak.server.KeycloakSettings.<init>(KeycloakSettings.java:104)
  at org.eclipse.che.multiuser.keycloak.server.KeycloakSettings$$FastClassByGuice$$e0d0786b.newInstance(<generated>)
  at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89)
  (... edit ...)
  at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4699)
  at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5165)
  at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
  at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:743)
  at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:719)
  at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:714)
  at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:970)
  at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1841)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No subject alternative DNS name matching keycloak-che.9.42.80.171.nip.io found.
  at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
  (... edit ...)
  at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
  at java.net.URL.openStream(URL.java:1067)
  at org.eclipse.che.multiuser.keycloak.server.KeycloakSettings.<init>(KeycloakSettings.java:97)
  ... 124 more
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching keycloak-che.9.42.80.171.nip.io found.
  at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:214)
  at sun.security.util.HostnameChecker.match(HostnameChecker.java:96)
  at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:462)
  (... edit ...)
  at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
  ... 138 more

The Che pod appears to be attempting to access the URL https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/.well-known/openid-configuration, which I am able to access successfully from my browser (albeit behind a self-signed certificate warning) and with curl:

jgw@pulse-orange$ curl https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/.well-known/openid-configuration --insecure

{"issuer":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che","authorization_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/auth","token_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/token","token_introspection_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/token/introspect","userinfo_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/userinfo","end_session_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/logout","jwks_uri":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/certs","check_session_iframe":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/login-status-iframe.html","grant_types_supported":["authorization_code","implicit","refresh_token","password","client_credentials"],"response_types_supported":["code","none","id_token","token","id_token token","code id_token","code token","code id_token 
token"],"subject_types_supported":["public","pairwise"],"id_token_signing_alg_values_supported":["PS384","ES384","RS384","HS256","HS512","ES256","RS256","HS384","ES512","PS256","PS512","RS512"],"userinfo_signing_alg_values_supported":["PS384","ES384","RS384","HS256","HS512","ES256","RS256","HS384","ES512","PS256","PS512","RS512","none"],"request_object_signing_alg_values_supported":["PS384","ES384","RS384","ES256","RS256","ES512","PS256","PS512","RS512","none"],"response_modes_supported":["query","fragment","form_post"],"registration_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/clients-registrations/openid-connect","token_endpoint_auth_methods_supported":["private_key_jwt","client_secret_basic","client_secret_post","client_secret_jwt"],"token_endpoint_auth_signing_alg_values_supported":["RS256"],"claims_supported":["aud","sub","iss","auth_time","name","given_name","family_name","preferred_username","email"],"claim_types_supported":["normal"],"claims_parameter_supported":false,"scopes_supported":["openid","microprofile-jwt","web-origins","roles","phone","address","email","profile","offline_access"],"request_parameter_supported":true,"request_uri_parameter_supported":true,"code_challenge_methods_supported":["plain","S256"],"tls_client_certificate_bound_access_tokens":true,"introspection_endpoint":"https://keycloak-che.9.42.80.171.nip.io/auth/realms/che/protocol/openid-connect/token/introspect"}
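Note that the curl call above passes --insecure, which skips certificate verification, so its success is consistent with the failure on the Java side: the JVM does check that the served certificate carries a DNS subject alternative name covering keycloak-che.9.42.80.171.nip.io. As a rough illustration of that matching rule (a wildcard entry covers exactly one leftmost label), here is a minimal shell sketch. The function name san_matches is hypothetical, and this is not the JDK's actual HostnameChecker code.

```shell
# san_matches HOST PATTERN
# Succeeds if HOST is covered by the DNS SAN entry PATTERN.
# A leading "*." in PATTERN stands for exactly one label.
san_matches() {
  host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$pattern" in
    '*.'*)
      suffix=${pattern#\*}        # e.g. ".9.42.80.171.nip.io"
      first_label=${host%%.*}     # e.g. "keycloak-che"
      [ -n "$first_label" ] && [ "$host" = "$first_label$suffix" ]
      ;;
    *)
      [ "$host" = "$pattern" ]
      ;;
  esac
}

# Example call, using the wildcard from the reproduction steps:
# san_matches keycloak-che.9.42.80.171.nip.io '*.9.42.80.171.nip.io' && echo covered
```

If the certificate actually served for the Keycloak host has no SAN entry that passes a check like this, the handshake fails in the way shown in the stack trace, regardless of what the generated che-tls certificate contains.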

A Helm install against the same cluster, using the following install command, does not exhibit this problem:

chectl server:start --platform=k8s --installer=helm --domain=9.42.80.171.nip.io --multiuser --tls  --self-signed-cert

Che version

7.9.0

Steps to reproduce

  1. Generate Che self-signed certs and create them as secrets in the che namespace:

export CLUSTER_IP=(cluster ip)

CA_CN=eclipse-che-signer
DOMAIN="*.$CLUSTER_IP.nip.io"
OPENSSL_CNF="/usr/lib/ssl/openssl.cnf"

OUT_DIR="`cd ~;pwd`"

openssl genrsa -out rootCA.key 4096

openssl req -x509 \
  -new -nodes \
  -key rootCA.key \
  -sha256 \
  -days 1024 \
  -out rootCA.crt \
  -subj /CN=${CA_CN} \
  -reqexts SAN \
  -extensions SAN \
  -config <(cat ${OPENSSL_CNF} \
      <(printf '[SAN]\nbasicConstraints=critical, CA:TRUE\nkeyUsage=keyCertSign, cRLSign, digitalSignature, keyEncipherment'))

openssl genrsa -out domain.key 2048

openssl req -new -sha256 \
    -key domain.key \
    -subj "/O=EclipseChe/CN=${DOMAIN}" \
    -reqexts SAN \
    -config <(cat ${OPENSSL_CNF} \
        <(printf "\n[SAN]\nsubjectAltName=DNS:${DOMAIN}\nbasicConstraints=critical, CA:FALSE\nkeyUsage=keyCertSign, digitalSignature, keyEncipherment\nextendedKeyUsage=serverAuth")) \
    -out domain.csr

openssl x509 \
        -req \
        -sha256 \
        -extfile <(printf "subjectAltName=DNS:${DOMAIN}\nbasicConstraints=critical, CA:FALSE\nkeyUsage=keyCertSign, digitalSignature, keyEncipherment\nextendedKeyUsage=serverAuth") \
        -days 365 \
        -in domain.csr \
        -CA rootCA.crt \
        -CAkey rootCA.key \
        -CAcreateserial -out "$OUT_DIR/domain.crt"

cp rootCA.crt "$OUT_DIR/ca.crt"


kubectl create namespace che
kubectl create secret tls che-tls --key=domain.key "--cert=$OUT_DIR/domain.crt" -n che
kubectl create secret generic self-signed-cert "--from-file=$OUT_DIR/ca.crt" -n che

  2. Apply the custom clusterrole, which will be referenced in the next step:

kubectl apply -f https://raw.githubusercontent.com/eclipse/codewind-che-plugin/master/setup/install_che/codewind-clusterrole.yaml

  3. Download the CheCluster operator resource YAML for use by chectl:

wget https://raw.githubusercontent.com/eclipse/codewind-che-plugin/master/setup/install_che/che-operator/codewind-checluster.yaml

  • Edit the file and replace ingressDomain: '' with your ingress domain (e.g. ingressDomain: '9.42.80.171.nip.io').

  4. On a non-OpenShift Kubernetes distribution, attempt to install Che using the operator installer from chectl, with a self-signed certificate:

chectl server:start --platform=k8s --installer=operator --domain=(cluster ip).nip.io --che-operator-cr-yaml=./codewind-checluster.yaml --che-operator-image=quay.io/eclipse/che-operator:7.9.0 --tls --self-signed-cert

Output

chectl server:start --platform=k8s --installer=operator --domain=9.42.80.171.nip.io --che-operator-cr-yaml=/home/ibmadmin/codewind-checluster.yaml --che-operator-image=quay.io/eclipse/che-operator:7.9.0 --tls --self-signed-cert
  ✔ Verify Kubernetes API...OK
  ✔ 👀  Looking for an already existing Eclipse Che instance
    ✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
  ✔ ✈️  Kubernetes preflight checklist
    ✔ Verify if kubectl is installed
    ✔ Verify remote kubernetes status...done.
    ✔ Check Kubernetes version: Found v1.17.3+k3s1.
    ✔ Verify domain is set...set to 9.42.80.171.nip.io.
Eclipse Che logs will be available in '/tmp/chectl-logs/1583510159056'
  ✔ Start following logs
    ✔ Start following Eclipse Che logs...done
    ✔ Start following Postgres logs...done
    ✔ Start following Keycloak logs...done
    ✔ Start following Plugin registry logs...done
    ✔ Start following Devfile registry logs...done
  ✔ Start following events
    ✔ Start following namespace events...done
 ›   Warning: Eclipse Che will be deployed in Multi-User mode as 'operator' installer supports only that mode.
  ✔ 🏃  Running the Che Operator
    ✔ Copying operator resources...done.
    ✔ Create Namespace (che)...It already exists.
    ✔ Create ServiceAccount che-operator in namespace che...done.
    ✔ Create Role che-operator in namespace che...done.
    ✔ Create ClusterRole che-operator...done.
    ✔ Create RoleBinding che-operator in namespace che...done.
    ✔ Create ClusterRoleBinding che-operator...done.
    ✔ Create CRD checlusters.org.eclipse.che...done.
    ✔ Waiting 5 seconds for the new Kubernetes resources to get flushed...done.
    ✔ Create deployment che-operator in namespace che...done.
    ✔ Create Eclipse Che Cluster eclipse-che in namespace che...done.
  ❯ ✅  Post installation checklist
    ✔ PostgreSQL pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ✔ Keycloak pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ✔ Devfile registry pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ✔ Plugin registry pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✔ starting...done.
    ❯ Eclipse Che pod bootstrap
      ✔ scheduling...done.
      ✔ downloading images...done.
      ✖ starting
        → ERR_TIMEOUT: Timeout set to pod ready timeout 130000
      Retrieving Eclipse Che Server URL
      Eclipse Che status check
 ›   Error: Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
 ›   Installation failed, check logs in '/tmp/chectl-logs/1583510159056'

See attached logs below.
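The ERR_TIMEOUT message itself says nothing about the cause, so the log directory that chectl prints is the place to look. A minimal sketch for locating the log files that mention the TLS failure follows; the function name find_tls_errors is hypothetical, and the search patterns are simply the exception names from the stack trace in this report.

```shell
# find_tls_errors DIR
# Lists files under DIR that mention a TLS handshake or certificate error.
find_tls_errors() {
  grep -rlE 'SSLHandshakeException|CertificateException' "$1"
}

# Example call, using the directory reported by chectl above:
# find_tls_errors /tmp/chectl-logs/1583510159056
```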

Expected behavior

Che pod to successfully start after connecting to Keycloak endpoint.

Runtime

Kubernetes:

  • Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3+k3s1", GitCommit:"5b17a175ce333dfb98cb8391afeb1f34219d9275", GitTreeState:"clean", BuildDate:"2020-02-27T07:28:53Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
  • Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3+k3s1", GitCommit:"5b17a175ce333dfb98cb8391afeb1f34219d9275", GitTreeState:"clean", BuildDate:"2020-02-27T07:28:53Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Installation method

chectl --platform=k8s --installer=operator, see above for more info.

Environment

Ubuntu 18.04 LTS server

Eclipse Che Logs

ZIP of /tmp/chectl-logs/1583510159056
chectl-logs.zip

Labels: area/chectl, kind/bug, severity/P1, team/deploy

All 14 comments

@tolusha could you please take a look?

@jgwest It would be useful if you could provide the details of the generated certificate. You can get them from your browser [1] or via OpenSSL [2].
1: Click the certificate in your browser.
2: You can find details here: https://serverfault.com/questions/215606/how-do-i-view-the-details-of-a-digital-certificate-cer-file
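For the OpenSSL route, a minimal sketch along these lines should print the relevant details. It assumes the file names from the reproduction steps (domain.crt and ca.crt or rootCA.crt); the function name check_cert is hypothetical. It first verifies the certificate against the CA, then prints the Subject Alternative Name section of the text dump.

```shell
# check_cert CERT CA
# Verifies CERT against the CA file, then prints its SAN entries.
check_cert() {
  cert=$1
  ca=$2
  # Fails (non-zero) if CERT does not chain to CA.
  openssl verify -CAfile "$ca" "$cert" || return 1
  # The SAN values are on the line after the section header.
  openssl x509 -in "$cert" -noout -text | grep -A1 'Subject Alternative Name'
}

# Example call, using the files from the reproduction steps:
# check_cert "$HOME/domain.crt" "$HOME/ca.crt"
```

This only inspects the files on disk. It does not tell you which certificate the ingress is actually serving for the Keycloak host.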

Hi, any way to fix this issue before the PR is ready?

@eder-santos
We are investigating

The issue has been reproduced for installation on minishift 3.11 using custom-resource.yaml with

      selfSignedCert: true
      tlsSupport: true

https://ci.centos.org/view/Devtools/job/devtools-che-pullrequests-java-selenium-tests/186/consoleFull

@jgwest It would be useful if you could provide the details of the generated certificate. You can get them from your browser [1] or via OpenSSL [2].

Sounds like it has been reproduced, but here are example self-signed certs generated by the reproduction steps, if additional information is needed from them: certs.zip

@jgwest I've started investigation of the issue.
First, which is probably a typo, --domain=(cluster ip).nip.io should be --domain=$(cluster ip).nip.io. And if one omits the domain flag Che will try to autodetect it.
Second, the right name for the self signed secret is self-signed-certificate. So the command

kubectl create secret generic self-signed-cert "--from-file=$OUT_DIR/ca.crt" -n che

should be

kubectl create secret generic self-signed-certificate "--from-file=$OUT_DIR/ca.crt" -n che

But it still doesn't help... I continue the investigation.
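To rule out the naming problem quickly, something like the following sketch can confirm that both secrets exist under the names used in this thread (che-tls and self-signed-certificate). The function name check_che_secrets is hypothetical, and the namespace defaults to che as in the reproduction steps.

```shell
# check_che_secrets [NAMESPACE]
# Reports whether the two secrets discussed in this thread exist.
# Returns non-zero if either one is missing.
check_che_secrets() {
  ns=${1:-che}
  rc=0
  for name in che-tls self-signed-certificate; do
    if kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; then
      echo "found: $name"
    else
      echo "missing: $name"
      rc=1
    fi
  done
  return "$rc"
}

# Example call:
# check_che_secrets che
```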

Thanks @mmorhun, re: self-signed-cert, looks like I used the example from the Che docs: https://www.eclipse.org/che/docs/che-7/setup-che-in-tls-mode-with-self-signed-certificate/#procedure-2

Another thing I've found is a wrong ingress-to-secret binding. All Che ingresses should have secretName set to che-tls.
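A minimal sketch for spotting ingresses that are not bound to che-tls is shown below. It assumes the che namespace from the reproduction steps; the function name ingresses_without_tls_secret is hypothetical. It lists each ingress with its TLS secretName via kubectl's jsonpath output and prints the names of those whose secretName differs from the expected one (including those with none).

```shell
# ingresses_without_tls_secret [NAMESPACE] [SECRET]
# Prints the names of ingresses whose TLS secretName is not SECRET.
ingresses_without_tls_secret() {
  ns=${1:-che}
  want=${2:-che-tls}
  kubectl get ingress -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.tls[*].secretName}{"\n"}{end}' |
    awk -F '\t' -v want="$want" '$1 != "" && $2 != want { print $1 }'
}

# Example call:
# ingresses_without_tls_secret che che-tls
```

An empty result means every ingress in the namespace references the expected secret.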

@jgwest with all the changes from the PRs above it should work (tested on minikube though).
And don't forget about self-signed-certificate secret name.
P.S. The certificate generated in the steps to reproduce is not accepted by Chrome, but ok for Firefox, curl, openssl

@eder-santos

Hi, any way to fix this issue before the PR is ready?

One may explicitly set k8s.tlsSecretName to che-tls.
And, of course, the self-signed certificate secret name should be self-signed-certificate, not self-signed-cert.
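As a sketch of that workaround on an already created CheCluster, a merge patch along these lines might do it. Assumptions: the field lives at spec.k8s.tlsSecretName in the custom resource (inferred from the k8s.tlsSecretName name above), the resource is named eclipse-che in namespace che as in the chectl output, and the function name set_tls_secret_name is hypothetical. Setting the same field in the CR YAML passed to chectl before installing should be equivalent.

```shell
# set_tls_secret_name [NAMESPACE] [CR_NAME] [SECRET]
# Patches the CheCluster resource so that k8s.tlsSecretName is SECRET.
# Assumption: the field path is spec.k8s.tlsSecretName.
set_tls_secret_name() {
  ns=${1:-che}
  cr=${2:-eclipse-che}
  secret=${3:-che-tls}
  kubectl patch checlusters.org.eclipse.che "$cr" -n "$ns" --type merge \
    -p "{\"spec\":{\"k8s\":{\"tlsSecretName\":\"$secret\"}}}"
}

# Example call:
# set_tls_secret_name che eclipse-che che-tls
```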

@themr0c @boczkowska we have a mistake in our docs, please see this comment and the one above.

@mmorhun - Looks good, with those two changes I am able to install Che as expected. 👍

Re: docs, it's this page that still suggests to use self-signed-cert as the secret name: https://github.com/eclipse/che-docs/blame/master/src/main/pages/che-7/contributor-guide/proc_deploy-che-with-self-signed-tls-on-kubernetes.adoc#L41

I am going to create a PR into docs.

I think the problem is resolved, so closing this issue.
