Che: Installing Che on Azure does not work, because of invalid secret

Created on 21 Apr 2020 · 25 Comments · Source: eclipse/che

Summary

Installing Che on Azure does not work, because _secret is found but invalid_.

Relevant information

Hi, I am new here and am trying to install Che on an Azure (free trial) instance because I would like to evaluate it for use within OpenADx.
I am using a Mac.
I also started a discussion on Mattermost and was advised to open an issue.

What I did?

I followed the installation manual step-by-step.

The last step does not work for me, and I have no idea what I am doing wrong ...
chectl server:start --installer=helm --platform=k8s --domain=azr.my-ide.cloud --multiuser

Here is the Mac Terminal output:
chectl server:start --installer=helm --platform=k8s --domain=azr.my-ide.cloud --multiuser
Set current context to 'eclipse-che'
✔ Verify Kubernetes API...OK
✔ 👀 Looking for an already existing Eclipse Che instance
✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
✔ ✈️ Kubernetes preflight checklist
✔ Verify if kubectl is installed
✔ Verify remote kubernetes status...done.
✔ Check Kubernetes version: Found v1.15.10.
✔ Verify domain is set...set to azr.my-ide.cloud.
↓ Check if cluster accessible [skipped]
Eclipse Che logs will be available in '/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587286102461'
✔ Start following logs
↓ Start following Operator logs [skipped]
✔ Start following Eclipse Che logs...done
✔ Start following Postgres logs...done
✔ Start following Keycloak logs...done
✔ Start following Plugin registry logs...done
✔ Start following Devfile registry logs...done
✔ Start following events
✔ Start following namespace events...done
❯ 🏃 Running Helm to install Eclipse Che
✔ Verify if helm is installed
✔ Check Helm Version: Found v3.1.2+gd878d4d
✔ Create Namespace (che)...does already exist.
✖ Check Eclipse Che TLS certificate
→ "che-tls" secret is found but it is invalid. The valid self-signed certificate should contain "tls.crt"
…
Check Cluster Role Binding
Preparing Eclipse Che Helm Chart
Updating Helm Chart dependencies
Deploying Eclipse Che Helm Chart
› Error: Error: "che-tls" secret is found but it is invalid. The valid self-signed certificate should
› contain "tls.crt", "tls.key" and "ca.crt" entries.
› Installation failed, check logs in
› '/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587286102461'

After a hint from the chat, I deleted all resources on Azure and repeated the same steps ...

The result was the same ...

Do you have an idea?

BTW, I also tried (more or less during a failed attempt) to install Che in TLS mode, as described in this manual.

Thanks a lot!

area/install kind/bug severity/P1

All 25 comments

@tolusha @mmorhun can you take a look?

There is an inconsistency between the docs and the installation.
The docs describe how to deploy cert-manager and how to create the secret, but chectl with the helm installer also deploys cert-manager and expects a specific che-tls secret.
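
For anyone hitting the same error, one way to see what the existing secret actually contains is to list its keys and compare them with the three entries named in the error message ("tls.crt", "tls.key" and "ca.crt"). This is only a sketch: it assumes the namespace che and the secret name che-tls from the output in the report, and the helper names che_tls_keys and missing_tls_keys are made up for this example.

```shell
# List the data keys of the "che-tls" secret, one per line.
# Assumes kubectl points at the cluster and Che uses the namespace "che".
che_tls_keys() {
  kubectl get secret che-tls -n che \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}

# Print every required entry that is absent from the newline-separated
# key list passed as the first argument.
missing_tls_keys() {
  for key in tls.crt tls.key ca.crt; do
    printf '%s\n' "$1" | grep -qxF "$key" || printf '%s\n' "$key"
  done
}
```

Running missing_tls_keys "$(che_tls_keys)" should print nothing when all three entries exist; any name it prints is an entry that the chectl check would presumably reject.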

@ariexi
Could you try to update to the latest stable version of chectl:
chectl update stable
and:
chectl server:start --installer=operator --platform=k8s --domain=azr.my-ide.cloud

@mmorhun wdyt?

Yes, I think it should work with operator installer.

@tolusha @mmorhun Thanks!
I tried that (the update and the server start) and got the following response:
chectl server:start --installer=operator --platform=k8s --domain=azr.my-ide.cloud
✔ Verify Kubernetes API...OK
✔ 👀 Looking for an already existing Eclipse Che instance
✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
❯ ✈️ Kubernetes preflight checklist
✔ Verify if kubectl is installed
✔ Verify remote kubernetes status...done.
✔ Check Kubernetes version: Found v1.15.10.
✔ Verify domain is set...set to azr.my-ide.cloud.
✖ Check if cluster accessible
→ Cannot reach cluster at "azr.my-ide.cloud". To skip this check add "--skip-cluster-availability-check"
…
› Error: Error: Cannot reach cluster at "azr.my-ide.cloud". To skip this check add
› "--skip-cluster-availability-check" flag.
› Installation failed, check logs in
› '/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587582505596'

... then I tried
chectl server:start --installer=operator --platform=k8s --domain=azr.my-ide.cloud --skip-cluster-availability-check
✔ Verify Kubernetes API...OK
✔ 👀 Looking for an already existing Eclipse Che instance
✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
✔ ✈️ Kubernetes preflight checklist
✔ Verify if kubectl is installed
✔ Verify remote kubernetes status...done.
✔ Check Kubernetes version: Found v1.15.10.
✔ Verify domain is set...set to azr.my-ide.cloud.
↓ Check if cluster accessible [skipped]
Eclipse Che logs will be available in '/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587582624906'
✔ Start following logs
✔ Start following Operator logs...done
✔ Start following Eclipse Che logs...done
✔ Start following Postgres logs...done
✔ Start following Keycloak logs...done
✔ Start following Plugin registry logs...done
✔ Start following Devfile registry logs...done
✔ Start following events
✔ Start following namespace events...done
❯ 🏃 Running the Eclipse Che operator
✔ Copying operator resources...done.
✔ Create Namespace (che)...It already exists.
✔ 🏃 Running the Eclipse Che operator
✔ Copying operator resources...done.
✔ Create Namespace (che)...It already exists.
✔ Checking for pre-created TLS secret... "che-tls" secret found
✔ Checking certificate
✔ Create ServiceAccount che-operator in namespace che...done.
✔ Create Role che-operator in namespace che...done.
✔ Create ClusterRole che-operator...done.
✔ Create RoleBinding che-operator in namespace che...done.
✔ Create ClusterRoleBinding che-operator...done.
✔ Create CRD checlusters.org.eclipse.che...done.
✔ Waiting 5 seconds for the new Kubernetes resources to get flushed...done.
✔ Create deployment che-operator in namespace che...done.
✔ Create Eclipse Che cluster eclipse-che in namespace che...done.
❯ ✅ Post installation checklist
✔ PostgreSQL pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
✔ Keycloak pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
✔ Devfile registry pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
✔ Plugin registry pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✔ starting...done.
❯ Eclipse Che pod bootstrap
✔ scheduling...done.
✔ downloading images...done.
✖ starting
→ ERR_TIMEOUT: Timeout set to pod ready timeout 130000
Retrieving Eclipse Che server URL
Eclipse Che status check
› Error: Error: ERR_TIMEOUT: Timeout set to pod ready timeout 130000
› Installation failed, check logs in
› '/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587582624906'

It looks better, but still there is a problem...
Do you have ideas?
Thx!
Best,
Andy

What does kubectl get ingress -n che show?
Please attach the che-server logs.

Here is the result of kubectl get ingress -n che

NAME               HOSTS                                   ADDRESS         PORTS     AGE
che                che-che.azr.my-ide.cloud                xx.xxx.xxx.xx   80, 443   13h
devfile-registry   devfile-registry-che.azr.my-ide.cloud   xx.xxx.xxx.xx   80, 443   13h
keycloak           keycloak-che.azr.my-ide.cloud           xx.xxx.xxx.xx   80, 443   13h
plugin-registry    plugin-registry-che.azr.my-ide.cloud    xx.xxx.xxx.xx   80, 443   13h

I x-ed the address out ...

Sorry, I am a real newbie and could not find the server logs.
Could you please tell me where to find them?
Thx!

@ariexi on deploy error, chectl prints something like: Installation failed, check logs in: <dir>. Please zip it and attach to the issue.

@mmorhun Thank you and sorry for bothering you.
I guess you mean the dir
/var/folders/sw/b2n1zkm5093dg7x17hsqcfnr0000gn/T/chectl-logs/1587582624906 which is printed.
My problem is that I could not find it on my computer, and I do not know where to look ...
Do you have a hint for me?
Thx!
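
The path chectl prints is an absolute path, so it can be opened from the Terminal even though /var/folders is typically not shown in Finder. A minimal sketch for locating the newest log directory, assuming (as the two runs in this thread suggest) that chectl writes to a chectl-logs folder under the system temp directory and names each run with a millisecond timestamp; the function name latest_chectl_logs is hypothetical.

```shell
# Print the most recent chectl log directory under a base directory.
# The base defaults to $TMPDIR (set on macOS) and falls back to /tmp.
# Directory names are assumed to be timestamps, so the largest number is the newest run.
latest_chectl_logs() {
  base="${1:-${TMPDIR:-/tmp}}"
  newest=$(ls "${base%/}/chectl-logs" 2>/dev/null | sort -n | tail -n 1)
  [ -n "$newest" ] && printf '%s\n' "${base%/}/chectl-logs/$newest"
}
```

With that in place, zip -r chectl-logs.zip "$(latest_chectl_logs)" should produce an archive that can be attached to the issue.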

I am also facing a similar issue on the k8s platform.
It fails at the post-installation step.
I checked the pod and found that no PV is present and the PVC cannot be bound, which appears to be the only reason for the failure.
Do I need to create a PV here?

root@osboxes:~# chectl server:start --platform=k8s --installer=operator --domain=${CHE_DOMAIN}.nip.io --self-signed-cert --cheimage=eclipse/che-server:6.13.0
✔ Verify Kubernetes API...OK
✔ 👀 Looking for an already existing Eclipse Che instance
✔ Verify if Eclipse Che is deployed into namespace "che"...it is not
✔ ✈️ Kubernetes preflight checklist
✔ Verify if kubectl is installed
✔ Verify remote kubernetes status...done.
✔ Check Kubernetes version: Found v1.18.1.
✔ Verify domain is set...set to 10.103.253.119.nip.io.
✔ Check if cluster accessible... ok
Eclipse Che logs will be available in '/tmp/chectl-logs/1587649892261'
✔ Start following logs
✔ Start following Operator logs...done
✔ Start following Eclipse Che logs...done
✔ Start following Postgres logs...done
✔ Start following Keycloak logs...done
✔ Start following Plugin registry logs...done
✔ Start following Devfile registry logs...done
✔ Start following events
✔ Start following namespace events...done
✔ 🏃 Running the Eclipse Che operator
✔ Copying operator resources...done.
✔ Create Namespace (che)...It already exists.
✔ Checking for pre-created TLS secret... "che-tls" secret found
↓ Checking certificate [skipped]
✔ Create ServiceAccount che-operator in namespace che...It already exists.
✔ Create Role che-operator in namespace che...It already exists.
✔ Create ClusterRole che-operator...It already exists.
✔ Create RoleBinding che-operator in namespace che...It already exists.
✔ Create ClusterRoleBinding che-operator...It already exists.
✔ Create CRD checlusters.org.eclipse.che...It already exists.
✔ Waiting 5 seconds for the new Kubernetes resources to get flushed...done.
✔ Create deployment che-operator in namespace che...It already exists.
✔ Create Eclipse Che cluster eclipse-che in namespace che...It already exists.
❯ ✅ Post installation checklist
❯ Eclipse Che pod bootstrap
✖ scheduling
→ ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined
downloading images
starting
Retrieving Eclipse Che server URL
Eclipse Che status check
› Error: Error: ERR_TIMEOUT: Timeout set to pod wait timeout 300000. podExist: false, currentPhase: undefined

› Installation failed, check logs in '/tmp/chectl-logs/1587649892261'

root@osboxes:/tmp/chectl-logs/1587649892261/che# cat events.txt
LAST SEEN TYPE REASON OBJECT MESSAGE
31m Warning FailedScheduling pod/postgres-5487c96577-cbn54 running "VolumeBinding" filter plugin for pod "postgres-5487c96577-cbn54": pod has unbound immediate PersistentVolumeClaims
5s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims
67s Normal SuccessfulCreate replicaset/postgres-5659c44979 Created pod: postgres-5659c44979-4kjds
2m5s Warning FailedScheduling pod/postgres-6fc887f4b9-gwz76 running "VolumeBinding" filter plugin for pod "postgres-6fc887f4b9-gwz76": pod has unbound immediate PersistentVolumeClaims
77s Warning FailedScheduling pod/postgres-6fc887f4b9-gwz76 skip schedule deleting pod: che/postgres-6fc887f4b9-gwz76
30m Normal SuccessfulCreate replicaset/postgres-6fc887f4b9 Created pod: postgres-6fc887f4b9-gwz76
3m32s Normal FailedBinding persistentvolumeclaim/postgres-data no persistent volumes available for this claim and no storage class is set
30m Normal ScalingReplicaSet deployment/postgres Scaled up replica set postgres-6fc887f4b9 to 1
67s Normal ScalingReplicaSet deployment/postgres Scaled up replica set postgres-5659c44979 to 1
1s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims
0s Normal FailedBinding persistentvolumeclaim/postgres-data no persistent volumes available for this claim and no storage class is set
1s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims
1s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims
0s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims
0s Normal FailedBinding persistentvolumeclaim/postgres-data no persistent volumes available for this claim and no storage class is set
0s Warning FailedScheduling pod/postgres-5659c44979-4kjds running "VolumeBinding" filter plugin for pod "postgres-5659c44979-4kjds": pod has unbound immediate PersistentVolumeClaims

Awaiting a solution or any idea for fixing it.

cc @tolusha as the author of the log-collecting mechanism. BTW, I've seen such a problem before, but it is hard to reproduce.

@ariexi thank you for reporting. To look at the logs you can use kubectl, for example (assuming the defaults for Che deployments):

# Get the Che server pod name
kubectl get pods -n che
# A line like the following should be present in the output:
# NAME                                READY   STATUS      RESTARTS   AGE
# che-c64d8bbfb-fjnp9                 1/1     Running     0          3m2s

# Then read the logs:
kubectl logs <pod-name> -n che
# For the example above: kubectl logs che-c64d8bbfb-fjnp9 -n che

But I would suggest that you just start over (in case something got messed up by the different installers):

  • delete the che namespace (or whatever you specified) with kubectl delete namespace che
  • pre-create the secrets as the docs say
  • try to update chectl
  • deploy Che server:
chectl server:start --installer=operator --platform=k8s --domain=<your domain> 

If that fails too, please provide more information (the logs should be present).

@Roshani30 you shouldn't create a PV or PVC manually. That should be done by the installer.
There is a similar issue with minikube 1.9.x (1.8.2 works fine), so maybe this also affects Kubernetes in general, but that is only a guess on my part.
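
One possible reading of the event "no persistent volumes available for this claim and no storage class is set" is that the cluster has no default StorageClass, so the postgres-data claim cannot be provisioned dynamically. A small sketch for checking that, assuming the namespace che; the helper names has_default_storageclass and check_che_storage are invented for this example.

```shell
# Succeeds if the `kubectl get storageclass` output on stdin marks a class
# as default (kubectl appends "(default)" to the name of such a class).
has_default_storageclass() {
  grep -q '(default)'
}

# Report on the default StorageClass and show the state of Che's claims.
check_che_storage() {
  kubectl get storageclass | has_default_storageclass \
    || echo "no default StorageClass found"
  kubectl get pvc -n che
}
```

Calling check_che_storage on the affected cluster should show whether postgres-data is still Pending and whether any class is marked as default.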

@mmorhun Please find attached the log file for the first part (w/o deleting the namespace)
log.txt

After following your instructions it behaves the same. Please find attached the log file.
che-log.txt

Dumb question, what about the URL? Do I need a separate one? Currently I use the default (azr.my-ide.cloud) which is also displayed on the Azure Web UI.

Thank you so much for your help!

Best,
Andy

@Mykola Morhun So should I also follow these steps again?
Because for me it is failing at the stage below:
Post installation checklist
❯ Eclipse Che pod bootstrap
✖ scheduling

  • delete che namespace (or whatever you specified) with kubectl delete namespace che
  • pre-create secrets as docs says
  • try to update chectl
  • deploy Che server:
chectl server:start --installer=operator --platform=k8s --domain=<your domain>

@ariexi thank you for the update.
Unfortunately, I cannot say for sure what is wrong. I suspect that Che server cannot see Keycloak (cannot resolve its host). Can you try to reach your Keycloak from your browser (the link can be found in the logs, e.g. https://keycloak-che.azr.my-ide.cloud/auth/realms/che/.well-known/openid-configuration)? If not, something is wrong with the configuration.
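
The same check can be made from a terminal. The sketch below assumes that the Keycloak host follows the keycloak-che.<domain> pattern seen in the ingress list earlier in the thread; keycloak_oidc_url and probe_keycloak are hypothetical helper names.

```shell
# Build the OpenID configuration URL for a given cluster domain.
# The "keycloak-che." prefix is an assumption taken from the ingress output.
keycloak_oidc_url() {
  printf 'https://keycloak-che.%s/auth/realms/che/.well-known/openid-configuration\n' "$1"
}

# Print the HTTP status code returned by Keycloak.
# "000" means curl could not resolve or connect to the host.
# -k skips certificate verification, since the certificate may be self-signed.
probe_keycloak() {
  curl -sk -o /dev/null -w '%{http_code}\n' --max-time 10 "$(keycloak_oidc_url "$1")"
}
```

For example, probe_keycloak azr.my-ide.cloud printing 200 would suggest Keycloak is reachable from that machine, while 000 points at DNS or connectivity.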

Dumb question, what about the URL? Do I need a separate one? Currently I use the default (azr.my-ide.cloud) which is also displayed on the Azure Web UI.

I am not sure that I understand your question correctly, but if you are talking about the Che server URL, it should be printed by chectl at the end of the installation process. Usually it looks like che-che.domain.com, but it may vary depending on the settings.

@ariexi @Roshani30
I am wondering if CHE_SELF__SIGNED__CERT is set:
kubectl get deployment che -n che -o=yaml

If not, let's reinstall, adding --self-signed-cert to the chectl command.
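
Rather than reading the whole YAML, the relevant lines can be filtered out. This is only a sketch: it assumes the variable appears as a plain env entry in the manifest, and the helper name self_signed_cert_env is made up.

```shell
# Read a deployment manifest (YAML) on stdin and print the
# CHE_SELF__SIGNED__CERT entry plus the few lines that follow it.
# Exits non-zero when the variable is not present at all.
self_signed_cert_env() {
  grep -A 4 'CHE_SELF__SIGNED__CERT'
}
```

Used as kubectl get deployment che -n che -o=yaml | self_signed_cert_env, no output would suggest the variable is not set on the deployment.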

@tolusha
I guess, it is installed ...
Please find here the output of kubectl get deployment che -n che -o=yaml
che-log-2020-04-24.txt

@mmorhun I try to describe it ...
You mentioned che-che.domain.com, I use for domain -> azr.my-ide.cloud, that is what is described in the manual. I use exactly this URL.
I can see in Azure Web UI in resource group eclipseCheResourceGroup the DNS zone azr.my-ide.cloud.
Do I have to allocate this URL somewhere else? E.g. on a separate server? Or should it work like this? Or do I have to allocate another URL, e.g. www.my-che-ide.de

With the command kubectl get ingress -n che I got the result
keycloak keycloak-che.azr.my-ide.cloud with an IP address, but if I use keycloak-che.azr.my-ide.cloud as the URL in the browser, I cannot reach the server ...

Thank you for your help!

@ariexi

You mentioned che-che.domain.com, I use for domain -> azr.my-ide.cloud, that is what is described in the manual. I use exactly this URL.

I didn't get your question right. The domain parameter should specify the public domain of your Kubernetes cluster. That's it. The Che server URL (like che-che.domain.com) will be generated automatically (a new ingress should be created).

With the command kubectl get ingress -n che I got the result
keycloak keycloak-che.azr.my-ide.cloud with an IP address, but if I use keycloak-che.azr.my-ide.cloud as the URL in the browser, I cannot reach the server ...

This is the answer, or rather the problem that prevents Che from starting. Keycloak should be accessible... When Che server is starting it needs to reach Keycloak (and it tries to do so via the external URL), but fails (as you did with the browser). It looks like a DNS/domain settings issue. Subdomains should resolve as well as the parent domain (i.e. both azr.my-ide.cloud and something.azr.my-ide.cloud should point to the cluster IP).
@sleshchenko has experience with Azure, maybe he can give a piece of advice.
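
A rough way to test that requirement from a local machine is to try resolving the parent domain and each service host. The sketch assumes the four host prefixes from the kubectl get ingress -n che output earlier in the thread and relies on the host command exiting non-zero when a lookup fails; che_hostnames and check_che_dns are hypothetical names.

```shell
# Print the hostnames that should resolve for a given cluster domain:
# the domain itself plus one subdomain per Che ingress.
che_hostnames() {
  printf '%s\n' "$1"
  for prefix in che-che keycloak-che devfile-registry-che plugin-registry-che; do
    printf '%s.%s\n' "$prefix" "$1"
  done
}

# Report which of those hostnames resolve from the local machine.
check_che_dns() {
  che_hostnames "$1" | while read -r name; do
    if host "$name" >/dev/null 2>&1; then
      echo "ok      $name"
    else
      echo "FAILED  $name"
    fi
  done
}
```

Running check_che_dns azr.my-ide.cloud lists each name with ok or FAILED; any FAILED entry would be consistent with the DNS problem described above.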

@mmorhun @sleshchenko I also just noticed that the installation manual mentions that "A new DNS challenge is added to the DNS zone for Let's Encrypt."
In the Azure Web UI I cannot find it...

If you use Let's Encrypt, you shouldn't specify --self-signed-cert for chectl

@ariexi If you mean the _acme_challenge one, it is removed as soon as cert-manager obtains the certificate.
To check the actual state, please check the corresponding secrets or the cert-manager logs (informative if it failed for some reason).

@mmorhun, yes, I meant _acme_challenge...

One thing I am struggling with is the URL, in my example azr.my-ide.cloud.
I am not sure about this: will Azure be responsible for resolving this URL?
In other words, when I use azr.my-ide.cloud as the URL in my browser (new tab), nothing happens at the moment; I only get the message "server not reachable".
Will Azure be responsible for that? Or did I miss something (e.g. using a URL that belongs to me)?

@ariexi I am not an Azure expert, so I cannot tell you this.
What you need to run Eclipse Che is a correctly set up Kubernetes or OpenShift cluster. As for the cluster domain URL, it should be resolvable, as should its subdomains (so subdomain.<cluster-url> also points to your cluster; this is needed for exposing different services, including your workspaces). You should set up DNS and TLS correctly. Whether that is done on the Azure side or by third-party providers doesn't matter. The point is that it must work in order to run Eclipse Che.

From my point of view, we can close this issue for now.
Thx for your support!
