Che: Cannot start workspace with Openshift storage class mode with WaitForFirstConsumer value

Created on 26 Sep 2019 · 6 Comments · Source: eclipse/che

Describe the bug

When default StorageClass is configured to have volumeBindingMode set to WaitForFirstConsumer, workspaces do not start.

Che version

  • [ ] latest
  • [x] nightly
  • [ ] other: please specify

Steps to reproduce

  1. Consume a factory by opening the URL "https://[your path to Che]f?url=https://raw.githubusercontent.com/redhat-developer-demos/guru-night/master/quarkus-tutorial/devfile.yaml".

Expected behavior

The factory is consumed properly and a workspace is created.

Runtime

  • [ ] kubernetes (include output of kubectl version)
  • [x] Openshift 4.1.16 (include output of oc version)
  • [ ] minikube (include output of minikube version and kubectl version)
  • [ ] minishift (include output of minishift version and oc version)
  • [ ] docker-desktop + K8S (include output of docker version and kubectl version)
  • [ ] other: (please specify)

Screenshots

screenshot-Wait_loading_workspace_and_get_time

Installation method

  • [x] chectl
  • [ ] che-operator
  • [ ] minishift-addon
  • [ ] I don't know

Environment

  • [ ] my computer

    • [ ] Windows

    • [ ] Linux

    • [ ] macOS

  • [x] Cloud

    • [x] Amazon

    • [ ] Azure

    • [ ] GCE

    • [ ] other (please specify)

  • [ ] other: please specify

Additional context

Codeready issue - https://issues.jboss.org/browse/CRW-294.

kind/bug severity/P1 team/platform

All 6 comments

To extend the context a bit: this used to work, but after an OpenShift upgrade, OpenShift started to propagate FailedScheduling events when WaitForFirstConsumer is used. The second workspace start works just fine, and if Che waited a couple of seconds, the PVC would be mounted successfully.
So the easiest solution here is to remove FailedScheduling from the default list of unrecoverable events.
It was initially added because of an issue on OSIO, so if we just remove it, some workspaces may run into the workspace timeout instead of failing early.
Maybe the OSIO case should be investigated further and the event message made more specific instead of catching every FailedScheduling, for example "FailedScheduling: no available node is found" (just an example).

cc @ibuziuk
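The narrowing suggested in the comment above could look roughly like the following. This is only a minimal sketch, not Che's actual implementation: the `Event` class, the `UNRECOVERABLE_RULES` list, the `is_unrecoverable` function, and the `HypotheticalFatalReason` entry are all assumptions made up for illustration. The only values taken from the thread are the `FailedScheduling` reason and the example message "no available node is found".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Minimal stand-in for a cluster event (hypothetical shape)."""
    reason: str
    message: str


# Each rule is (reason, message_fragment). A fragment of None matches any
# message with that reason. The entries are illustrative assumptions.
UNRECOVERABLE_RULES = [
    ("HypotheticalFatalReason", None),
    ("FailedScheduling", "no available node is found"),
]


def is_unrecoverable(event, rules=UNRECOVERABLE_RULES):
    """Return True only if the event matches a rule's reason AND fragment."""
    for reason, fragment in rules:
        if event.reason != reason:
            continue
        if fragment is None or fragment.lower() in event.message.lower():
            return True
    return False


# The second message is invented example text, not a real scheduler message.
fatal = Event("FailedScheduling", "No available node is found")
transient = Event("FailedScheduling", "volume is not bound yet (example text)")
print(is_unrecoverable(fatal), is_unrecoverable(transient))
```

With rules like these, a FailedScheduling event caused by a not-yet-bound volume would no longer abort the workspace start, while the specific message that motivated the original rule would still fail early.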

@SkorikSergey not quite sure how to classify this: is "waitForFirstUser" something we need? Or can we just declare that an invalid configuration for Che?

@tsmaeder: WaitForFirstConsumer is a volume binding mode of the OpenShift StorageClass, not of Che itself.
Che can only work around this mode: instead of failing the workspace start immediately, wait for the PV to become ready after the initial PVC mount failure.
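The "wait instead of failing immediately" idea could be sketched as a bounded polling loop. This is a rough illustration under assumptions, not how Che does it: `wait_until_bound` and its parameters are hypothetical names, and `is_bound` stands for a caller-supplied check that would, in a real setup, ask the cluster whether the volume is ready.

```python
import time


def wait_until_bound(is_bound, timeout_s, poll_interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_bound() until it returns True or timeout_s elapses.

    Returns True if the volume became ready in time, False otherwise.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_bound():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated check: not ready on the first two polls, ready on the third.
states = iter([False, False, True])
print(wait_until_bound(lambda: next(states), timeout_s=5, sleep=lambda s: None))
```

A short timeout keeps the early-failure behavior for volumes that never become ready, while giving a delayed binding the couple of seconds mentioned earlier in the thread.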

@dmytro-ndp that is not the question: can we document that "WaitForFirstConsumer" is not a supported setting for running Che, or do we need to support this setting?

"WaitForFirstConsumer" is the default setting of OpenShift 4.x on AWS. It looks like it is the usual mode for instances on AWS, Azure, ... So we need to support it.

Cannot reproduce on OpenShift 4.2. It seems the issue was specific to OpenShift 4.0 and 4.1. Closing this issue as no longer relevant.
Feel free to reopen it or create a new one if the same behavior happens on an OpenShift version greater than 4.2.
