Che: Restrict the installation of Eclipse Che only to the 'eclipse-che' namespace

Created on 18 Jun 2020  ·  33 Comments  ·  Source: eclipse/che

Is your task related to a problem? Please describe.

The only namespace allowed for an Eclipse Che installation should be 'eclipse-che'. Installing Eclipse Che in any other namespace should be prohibited.

Describe the solution you'd like

When Eclipse Che is installed via OperatorHub, it should only be possible to install it in the 'eclipse-che' namespace.

List of subtasks:

Describe alternatives you've considered

Continue allowing namespace selection during the installation

Additional context

Pros:

  • there is no particular value in namespace selection from the admin's perspective
  • the installation process would be more predictable
  • avoiding multiple instances of Che in the same cluster (to avoid conflicts and waste of resources, only one Eclipse Che per cluster will be allowed)
  • avoiding the common deployment failure when Che gets installed in the default namespace
  • retrieving operand metrics when installed on OpenShift (note that this will not be possible for Eclipse Che due to the openshift-* namespace restriction: https://github.com/openshift/enhancements/blob/master/enhancements/olm/olm-managed-operator-metrics.md#fulfilling-namespace-and-rbac-requirements)

Cons:

  • only one Eclipse Che instance per cluster will be allowed; the following flow will not be possible: https://github.com/eclipse/che/issues/17187#issuecomment-657588515
  • we need to tackle upgrades:

    • if Eclipse Che was initially installed in a namespace other than 'eclipse-che', after the upgrade it should end up in 'eclipse-che' (the previous namespace should be deleted)

    • if there are multiple Eclipse Che instances installed on the cluster, upgrades should fail with a clear error message.

Labels: area/install, kind/epic, kind/task, severity/P1, status/analyzing, team/hosted-che

All 33 comments

@sleshchenko could you please clarify how you do the trick of installing the web terminal operator into a specific namespace?

I trust there will be a way to override (set a specific namespace) and to disable (not restrict) this, so that we can:

  • set a different default for CRW vs. Che
  • disable telemetry for our own tests (we're not the customer, we're just testing internally)
  • allow installing multiple instances (again for testing since we don't have an unlimited # of clusters into which we can install ONLY one CRW instance)

WDYT?

@ibuziuk the web terminal operator is using the AllNamespaces install mode. This makes it a global operator that will be installed by OperatorHub in the openshift-operators namespace by default.

cf this doc: https://docs.openshift.com/container-platform/4.4/operators/olm-adding-operators-to-cluster.html#olm-installing-from-operatorhub-using-web-console_olm-adding-operators-to-a-cluster

Mainly the following quote:

All namespaces on the cluster (default) installs the Operator in the default openshift-operators namespace to watch and be made available to all namespaces in the cluster. This option is not always available.
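For reference, the set of supported install modes is declared in the operator's ClusterServiceVersion. A global operator like the web terminal one would carry a fragment along these lines (a sketch, not the actual web terminal CSV):

```yaml
# ClusterServiceVersion fragment (sketch): only AllNamespaces is
# supported, so OperatorHub installs the operator into the
# openshift-operators namespace and it watches the whole cluster.
spec:
  installModes:
    - type: OwnNamespace
      supported: false
    - type: SingleNamespace
      supported: false
    - type: MultiNamespace
      supported: false
    - type: AllNamespaces
      supported: true
```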

@ibuziuk something I was discussing this morning with @davidfestal: we may use AllNamespaces for Che as well and then deploy wsmaster in the eclipse-che namespace.

@nickboldt has raised some interesting points. I don't think we should make the namespace configurable though:

  • set a different default for CRW vs. Che

This can be a build option but not a runtime option (i.e. users/customers should not change that).

  • disable telemetry for our own tests (we're not the customer, we're just testing internally)

I think the recommendation here is to receive telemetry even for our own tests. They can be useful: we need to verify that sending metrics works fine, and they are easy to filter out. cc @spaparaju

  • allow installing multiple instances (again for testing since we don't have an unlimited # of clusters into which we can install ONLY one CRW instance)

I would discourage this: why do we want to test scenarios that nobody will ever run? To find, analyse and fix bugs that nobody will ever find? We should not support/test multiple instances of Che on the same Kubernetes cluster. cc @rhopp

I would discourage this: why do we want to test scenarios that nobody will ever run? To find, analyse and fix bugs that nobody will ever find? We should not support/test multiple instances of Che on the same Kubernetes cluster. cc @rhopp

I would love to avoid running multiple instances on the same cluster, but as Nick said, we don't have enough resources for doing so. If this becomes a hard restriction (right now it's a soft restriction, in the sense that it is possible to install multiple instances on a single cluster, but it's discouraged or officially not supported), it will be quite a hurdle for QE to overcome.

I would love to avoid running multiple instances on the same cluster, but as Nick said, we don't have enough resources for doing so. If this becomes a hard restriction (right now it's a soft restriction, in the sense that it is possible to install multiple instances on a single cluster, but it's discouraged or officially not supported), it will be quite a hurdle for QE to overcome.

If we want to restrict users to one namespace and one instance of Che per cluster, we should enforce this restriction when testing as well. @rhopp is this an upstream or downstream tests constraint? Is there an issue that describes the problem (i.e. in what circumstances we need to run multiple instances of Che and what options have been considered)?

Sorry, the Pros aren't convincing enough.

@tolusha could you describe a clear use-case (with configuration) for having 2+ manageable instances of Eclipse Che on the same cluster (instances for testing are not taken into account)?

@ibuziuk
I don't have any use-cases except for testing purposes.
From this point of view it makes sense to me.

@tolusha could you describe a clear use-case (with configuration) for having 2+ manageable instances of Eclipse Che on the same cluster (instances for testing are not taken into account)?

Che-server development.

@skabashnyuk Che-server development, e.g. the che-dev OSD v4 cluster usage case?

@skabashnyuk Che-server development, e.g. the che-dev OSD v4 cluster usage case?

I'm sorry. I didn't understand your question.

@skabashnyuk could you clarify your point about Che-server development in regards to the fixed namespace enforcement?

@ibuziuk you're probably right. This topic is about Che + the operator; che-server development does not require it.

@ibuziuk you're probably right. This topic is about Che + the operator; che-server development does not require it.

Right: we want to prevent users from selecting an arbitrary namespace when deploying Che using the operator or chectl. But the che-server itself should be able to run in any namespace. It should NOT take for granted that it will run in the eclipse-che namespace.
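One well-known way for the server to avoid hard-coding its namespace is the Kubernetes downward API. A minimal sketch, assuming a hypothetical CHE_NAMESPACE variable (the pod and manifest below are illustrative, not the actual che-server deployment):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: che-server-sketch          # illustrative name
spec:
  containers:
    - name: che-server
      image: quay.io/eclipse/che-server:latest
      env:
        # Inject the namespace the pod actually runs in, instead of
        # assuming it is 'eclipse-che'.
        - name: CHE_NAMESPACE      # hypothetical variable name
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
```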

There are at least 3 reasons why we want deployments to happen in one fixed namespace:

  • To avoid multiple instances of Che in the same cluster (to avoid conflicts and waste of resources)
  • To avoid the common deployment failure when it gets installed in the default namespace
  • To retrieve operand metrics when installed on OpenShift

@ibuziuk this is a change that needs to be communicated on different channels (che-dev, Red Hat internal mailing lists, etc.) before it becomes effective.

@ibuziuk this is a change that needs to be communicated on different channels (che-dev, Red Hat internal mailing lists, etc.) before it becomes effective.

@l0rd got it. As the initial step we are going to define 'eclipse-che' as the suggested namespace on the operator / OLM end:

[screenshot: OperatorHub install dialog pre-filled with the 'eclipse-che' namespace]

This should address the second bullet (avoiding the common deployment failure when Che gets installed in the default namespace) and make the overall UX cleaner and more predictable.
PR: https://github.com/eclipse/che-operator/pull/328
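For context, OLM provides this hint via the suggested-namespace CSV annotation; a trimmed sketch of what such a change looks like (the CSV name/version below is illustrative):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: eclipse-che.v7.18.0        # illustrative version
  annotations:
    # The OperatorHub install dialog pre-fills this namespace.
    operatorframework.io/suggested-namespace: eclipse-che
```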

@ibuziuk nice!

More reasons to reconsider this as a new standard / use cases for more than one CRW install on the same cluster:

  • prototyping (trying different settings/configs in parallel, eg., with and without oauth, or with/out external SSO/postgres)
  • cautious migration (eg., 2.1 and 2.2 in parallel to verify nothing explodes or regresses in the new release)
  • new customer experiments
  • QE testing (we only have a small set of available OCP instances – one per version)
  • OCP and CRW load testing (how many CRW instances can we run on the same cluster before it consumes all the resources and crashes?) – this is handy for validating new architectures too, such as Z and Power

https://issues.redhat.com/browse/CRW-467?focusedCommentId=14209983&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14209983

Also,

is this an upstream or downstream tests constraint?

It's both: we use the 3 QE OCP instances for testing Che and CRW alike.

Moving to milestone 7.19 so we have time to announce the deprecation of the old behaviour and adapt to this breaking change.

@nickboldt if there are no blockers/showstoppers during the implementation we are going to enable this for 7.18.0

Another use case suggested by Dom on the Community Call:

  • one cluster contains BOTH a stable deployment (eg., 7.14.x), and
  • a test deployment (7.16.0-SNAPSHOT)

thanks @nickboldt 👍, can we please list the cons of this as well, if possible?

@SDAdham added to the description. Could you clarify the use-case with the stable and snapshot (nightly?) deployments a bit more?
How is the installation managed and updated? How is the test deployment used? What config is used for this setup?

@nickboldt @SDAdham @ibuziuk we should discourage scenarios with 2 versions of CRW (stable and test) installed via the operator on the same cluster because both will share the same CheCluster CRD and that may have some disruptive side effects on the stable instance.

We should continue supporting multiple instances of Che on the same cluster using helm though.

@l0rd just to clarify: by helm support of multiple instances, do you mean the chectl + helm installer?

@SDAdham added to the description. Could you clarify the use-case with the stable and snapshot (nightly?) deployments a bit more?
How is the installation managed and updated? How is the test deployment used? What config is used for this setup?

I don't see any reason to limit the installation to one instance per cluster. I'll speak for K8S deployments, as that's what I'm running Che on, but I'm sure the concept should remain the same for OpenShift, etc.

Use case: Running prod and dev/test environments of Che on the same cluster.

Regarding how-tos, I'm not familiar enough with the architecture of Che to know the best approach, but I can imagine:

  • Che is deployed per namespace.
  • The master node of Che, which lives in e.g. the che namespace, should be responsible for following up on and taking care of its own workspaces by managing their namespaces.
  • A managed deployment tool like chectl should keep a record of the master node (Che instance), as it already does afaik, and the master node should take care of the workspaces created by users.

If, on K8S, a namespace can't make changes to another namespace (i.e. the master node running in the che namespace cannot make changes to workspaces running in xyz-che namespaces), then chectl should request the namespaces to upgrade via the master node.
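For what it's worth, cross-namespace access on K8S is governed by RBAC rather than a hard rule: a RoleBinding such as the hypothetical one below (all names illustrative) would let a Che service account manage objects in a workspace namespace.

```yaml
# Hypothetical RoleBinding: grants the 'che' service account from the
# 'che' namespace the built-in 'admin' role inside a workspace
# namespace, so the master node could manage workspaces there.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: che-workspace-admin        # illustrative name
  namespace: xyz-che               # the workspace namespace
subjects:
  - kind: ServiceAccount
    name: che                      # illustrative service account
    namespace: che                 # the master node's namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin                      # built-in aggregated admin role
```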

Any custom installation, other than a managed installation like chectl, should not be supported and should not be taken into consideration for the auto-update procedures.

@nickboldt @SDAdham @ibuziuk we should discourage scenarios with 2 versions of CRW (stable and test) installed via the operator on the same cluster because both will share the same CheCluster CRD and that may have some disruptive side effects on the stable instance.

What "disruptive side effects" can happen as a result of having multiple instances?

@SDAdham fields of the CheCluster CRD can be added/removed/updated during an update, and that may make one of your two CheCluster CRs incompatible with the new CRD. In other words, you can isolate your Che instances in two different namespaces, but the operator's Custom Resource Definition is defined at the cluster level: if you update it, both Che instances will be affected.
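To make the cluster-scoping concrete: the CRD object itself has no namespace, only the CheCluster CRs do. A heavily trimmed sketch (the real CRD carries a full schema; the permissive placeholder below just keeps the example valid):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # No namespace here: CRDs are cluster-scoped, so every Che
  # instance on the cluster shares this single definition.
  name: checlusters.org.eclipse.che
spec:
  group: org.eclipse.che
  names:
    kind: CheCluster
    plural: checlusters
    singular: checluster
  scope: Namespaced                # the CRs are namespaced, the CRD is not
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          # Placeholder: the real CRD defines the full CheCluster schema.
          x-kubernetes-preserve-unknown-fields: true
```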

@ibuziuk yes, helm charts installed via chectl: no CRD, no cluster privileges required, configurable namespace.

+1 for @nickboldt's ideas about the need for more than one instance per cluster.
Users with a running CRW instance will often want to 'look at' the upgrade version before adopting it (testing the new version, a User Acceptance period, etc.). Users would not be happy having to blindly trust us with the new version; they are going to want a way to install a second CRW/Che version beside the first.
It should be noted that OCP clusters are expensive; some users have only one at their disposal. (Also, some users are restricted to just one cluster because of authentication mechanisms, etc.)

@l0rd @ibuziuk Is there any chance there will be some "backdoor" option to select different namespace as per Nick's and Rick's suggestions? (It would greatly help QE as well)

As discussed a few weeks back with @RickJWagner, a good compromise for users that are cautious about updates is to implement operand rollback in case of an unsuccessful upgrade. Hence this issue is blocked by #18043.

I don't see any explicit mention of rollback in https://github.com/eclipse/che/issues/18043

Will that be implemented as a new chectl command too? Seems like it might be useful & logical to have UI features implemented as CLI features too. cc: @tolusha

