Cloud-on-k8s: Reconciliation is blocked when a Pod can't be created

Created on 13 Dec 2019 · 5 Comments · Source: elastic/cloud-on-k8s

I ran into a situation where:

  1. A Pod could not be created because of some invalid values in the spec.
  2. As a result, ok, err := d.expectationsSatisfied() always returns false.
  3. It is impossible for the user to recover from that situation.

The following manifest helps to recreate the issue:

apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: blocked
spec:
  version: 7.5.0
  nodeSets:
    - name: default
      count: 1
      config:
        node.store.allow_mmap: false
      podTemplate:
        spec:
          volumes:
            - name: BaDVolumeName
              emptyDir: {}

The problem is that we assume a Pod is at least in the Pending state, which is not the case here: the Pod is never created.

>bug

All 5 comments

Oh right... good catch. We don't create Pods ourselves anymore so we don't get a Pod creation error :(
And the error itself only appears in the StatefulSet events. I don't think we want to monitor those?

A way out of the above is for the user to manually delete the StatefulSet... but this is not a good idea if some good Pods already exist for it.
We could also set a timeout on our expectations (that's what k8s deployment expectations do, just in case) so we eventually reconcile when that happens?

I don't have any other idea so far.

This also happened in https://github.com/elastic/cloud-on-k8s/issues/2854. If you set invalid resource requirements (e.g. CPU requests > CPU limits), the corresponding Pod can never be created, and ECK waits forever for that Pod to appear in expectations, even if you have since fixed the requirements in the Elasticsearch manifest.

I hit this one again while playing with topologySpreadConstraints on 1.1.0-rc3 and setting maxSkew to 0:

default     6m3s        Warning   FailedCreate             statefulset/elasticsearch-sample-es-default   create Pod elasticsearch-sample-es-default-0 in StatefulSet elasticsearch-sample-es-default failed error: Pod "elasticsearch-sample-es-default-0" is invalid: spec.topologySpreadConstraints[0].maxSkew: Invalid value: 0: must be greater than zero

I was wondering if we could use the "dry run" feature. It is not available before Kubernetes 1.13, but the API server returns an HTTP 400 error when it is not supported:

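import (
    "fmt"

    appsv1 "k8s.io/api/apps/v1"
    v1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/rand"
    "sigs.k8s.io/controller-runtime/pkg/client"

    "github.com/elastic/cloud-on-k8s/pkg/utils/k8s"
)

// validatePod returns a function that validates the Pod template of the expected
// StatefulSet by submitting a throwaway Pod built from it in dry-run mode.
func validatePod(c k8s.Client, expected appsv1.StatefulSet) func() error {
    return func() error {
        // Create a dummy Pod with the pod template
        dummyPod := &v1.Pod{
            ObjectMeta: metav1.ObjectMeta{
                Namespace:   expected.Namespace,
                Name:        expected.Name + "-dummy-" + rand.String(5),
                Labels:      expected.Spec.Template.Labels,
                Annotations: expected.Spec.Template.Annotations,
            },
            Spec: expected.Spec.Template.Spec,
        }
        // Dry run is beta and available since Kubernetes 1.13
        if err := c.Create(dummyPod, client.DryRunAll); err != nil {
            // OpenShift 3.11 and Kubernetes 1.12 don't support dryRun but gently
            // return "400 - BadRequest" in that case: skip the validation
            if errors.ReasonForError(err) == metav1.StatusReasonBadRequest {
                return nil
            }
            // If the Pod spec is invalid the expected error is "422 - Unprocessable Entity".
            // On Kubernetes >= 1.13 a valid spec should not produce an error here.
            return fmt.Errorf("error while validating Pod for %s/%s: %v", expected.Namespace, expected.Name, err)
        }
        return nil
    }
}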

This could be done in the PreCreate or PreUpdate of the StatefulSets reconciliation _(i.e. not during every reconcile loop)_, which would avoid falling into this situation.

😲 That's a great idea @barkbay!

A few thoughts while I'm working on this one:

  • We can update the Elasticsearch Phase to Invalid, as is already done when one of the existing validations fails:
NAME                                                                 HEALTH   NODES   VERSION   PHASE     AGE
elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-sample-ko   green    3       7.6.0     Invalid   4m50s

And of course generate an event:

50s         Warning   ReconciliationError     elasticsearch/elasticsearch-sample-ko
Failed to apply spec change: Validation of PodTemplate for Elasticsearch default/elasticsearch-sample-ko failed for the following reasons:
[{FieldValueInvalid Invalid value: "BaDVolumeName": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?') spec.volumes[0].name} 
{FieldValueInvalid Invalid value: "4Gi": must be less than or equal to memory limit spec.containers[0].resources.requests}]
  • I think there is no need to do the same for other resources (Kibana, APM, Enterprise Search):

    • Validation is already done when the Deployment is created/updated

    • There is no Phase in the Status

    • There are no "expectations" in these cases, so I don't think it would be really useful
