Microk8s: startupProbe ignored

Created on 2 Nov 2019 · 8 comments · Source: ubuntu/microk8s

As far as I understand, versions >= 1.16 are supposed to support startupProbe, e.g. in template.spec.containers of a Deployment. However, a simple deployment like

apiVersion: v1
kind: Namespace
metadata:
  name: "test"
---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-sleep
  namespace: test
spec:
  selector:
    matchLabels:
      app: postgres-sleep
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 50%
  template:
    metadata:
      labels:
        app: postgres-sleep
    spec:
      containers:
        - name: postgres-sleep
          image: krichter/microk8s-startup-probe-ignored:latest
          ports:
            - name: postgres
              containerPort: 5432
          readinessProbe:
            tcpSocket:
              port: 5432
            periodSeconds: 3
          livenessProbe:
            tcpSocket:
              port: 5432
            periodSeconds: 3
          startupProbe:
            tcpSocket:
              port: 5432
            failureThreshold: 60
            periodSeconds: 10
---

apiVersion: v1
kind: Service
metadata:
  name: postgres-sleep
  namespace: test
spec:
  selector:
    app: postgres-sleep
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
---
with `krichter/microk8s-startup-probe-ignored` being
FROM postgres:11
CMD sleep 30 && postgres
doesn't contain the specified `startupProbe` according to the details from the dashboard
kind: Deployment
apiVersion: apps/v1
metadata:
  name: postgres-sleep
  namespace: test
  selfLink: /apis/apps/v1/namespaces/test/deployments/postgres-sleep
  uid: 507ff4f4-7551-403d-9af5-9d57f4b1f0e8
  resourceVersion: '4175'
  generation: 1
  creationTimestamp: '2019-11-02T22:19:50Z'
  annotations:
    deployment.kubernetes.io/revision: '1'
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"postgres-sleep","namespace":"test"},"spec":{"replicas":2,"selector":{"matchLabels":{"app":"postgres-sleep"}},"strategy":{"rollingUpdate":{"maxSurge":1,"maxUnavailable":"50%"},"type":"RollingUpdate"},"template":{"metadata":{"labels":{"app":"postgres-sleep"}},"spec":{"containers":[{"image":"krichter/microk8s-startup-probe-ignored:latest","livenessProbe":{"periodSeconds":3,"tcpSocket":{"port":5432}},"name":"postgres-sleep","ports":[{"containerPort":5432,"name":"postgres"}],"readinessProbe":{"periodSeconds":3,"tcpSocket":{"port":5432}},"startupProbe":{"failureThreshold":60,"periodSeconds":10,"tcpSocket":{"port":5432}}}]}}}}
spec:
  replicas: 2
  selector:
    matchLabels:
      app: postgres-sleep
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: postgres-sleep
    spec:
      containers:
        - name: postgres-sleep
          image: 'krichter/microk8s-startup-probe-ignored:latest'
          ports:
            - name: postgres
              containerPort: 5432
              protocol: TCP
          resources: {}
          livenessProbe:
            tcpSocket:
              port: 5432
            timeoutSeconds: 1
            periodSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            tcpSocket:
              port: 5432
            timeoutSeconds: 1
            periodSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 50%
      maxSurge: 1
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 1
  replicas: 2
  updatedReplicas: 2
  unavailableReplicas: 2
  conditions:
    - type: Available
      status: 'False'
      lastUpdateTime: '2019-11-02T22:19:50Z'
      lastTransitionTime: '2019-11-02T22:19:50Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2019-11-02T22:19:51Z'
      lastTransitionTime: '2019-11-02T22:19:50Z'
      reason: ReplicaSetUpdated
      message: ReplicaSet "postgres-sleep-d78bd9fdf" is progressing.
and immediately fails due to the failing readiness probe rather than waiting for the startup probe to succeed or reach its failure threshold, which is what I expect.
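As a rough back-of-the-envelope check, the numbers can be compared directly. This sketch assumes a probe's tolerance is approximately failureThreshold × periodSeconds and ignores timeouts and initial delays; the helper name is hypothetical. The liveness probe gives up well before the 30-second sleep in the image ends, while the startupProbe from the manifest would have allowed far more time.

```shell
# Hypothetical helper: approximate seconds of consecutive failures a probe
# tolerates, computed as failureThreshold * periodSeconds.
probe_budget_seconds() {
    failure_threshold=$1
    period_seconds=$2
    echo $(( failure_threshold * period_seconds ))
}

probe_budget_seconds 3 3     # liveness values from the dashboard output: prints 9
probe_budget_seconds 60 10   # startupProbe values from the manifest: prints 600
```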
kubectl describe pods -n test
Name:         postgres-sleep-d78bd9fdf-xlx9v
Namespace:    test
Priority:     0
Node:         richter-lenovo-ideapad-z500/192.168.178.34
Start Time:   Sat, 02 Nov 2019 23:19:51 +0100
Labels:       app=postgres-sleep
              pod-template-hash=d78bd9fdf
Annotations:  
Status:       Running
IP:           10.1.38.42
IPs:
  IP:           10.1.38.42
Controlled By:  ReplicaSet/postgres-sleep-d78bd9fdf
Containers:
  postgres-sleep:
    Container ID:   containerd://6dea9ee3e3133400b0c28dfbddbd4d0b27cfc876a247b1ddf157b0d176d4c152
    Image:          krichter/microk8s-startup-probe-ignored:latest
    Image ID:       docker.io/krichter/microk8s-startup-probe-ignored@sha256:01d48777d2ce168ca8be6b607efd41743f9fd51959a613bae06756c60938cd34
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 02 Nov 2019 23:21:32 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 02 Nov 2019 23:20:58 +0100
      Finished:     Sat, 02 Nov 2019 23:21:28 +0100
    Ready:          False
    Restart Count:  3
    Liveness:       tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Readiness:      tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Environment:    
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jm8k6 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-jm8k6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jm8k6
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                  Message
  ----     ------     ----                 ----                                  -------
  Normal   Scheduled              default-scheduler                     Successfully assigned test/postgres-sleep-d78bd9fdf-xlx9v to richter-lenovo-ideapad-z500
  Normal   Killing    106s                 kubelet, richter-lenovo-ideapad-z500  Container postgres-sleep failed liveness probe, will be restarted
  Normal   Pulling    83s (x2 over 115s)   kubelet, richter-lenovo-ideapad-z500  Pulling image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Pulled     82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Successfully pulled image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Created    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Created container postgres-sleep
  Normal   Started    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Started container postgres-sleep
  Warning  Unhealthy  79s (x4 over 112s)   kubelet, richter-lenovo-ideapad-z500  Liveness probe failed: dial tcp 10.1.38.42:5432: connect: connection refused
  Warning  Unhealthy  77s (x12 over 113s)  kubelet, richter-lenovo-ideapad-z500  Readiness probe failed: dial tcp 10.1.38.42:5432: connect: connection refused


Name:         postgres-sleep-d78bd9fdf-xsx64
Namespace:    test
Priority:     0
Node:         richter-lenovo-ideapad-z500/192.168.178.34
Start Time:   Sat, 02 Nov 2019 23:19:51 +0100
Labels:       app=postgres-sleep
              pod-template-hash=d78bd9fdf
Annotations:  
Status:       Running
IP:           10.1.38.43
IPs:
  IP:           10.1.38.43
Controlled By:  ReplicaSet/postgres-sleep-d78bd9fdf
Containers:
  postgres-sleep:
    Container ID:   containerd://436a39ef012d33b6c88ab4a3761dfe4787e0e62a2be7fa07936ba403684ab78d
    Image:          krichter/microk8s-startup-probe-ignored:latest
    Image ID:       docker.io/krichter/microk8s-startup-probe-ignored@sha256:01d48777d2ce168ca8be6b607efd41743f9fd51959a613bae06756c60938cd34
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 02 Nov 2019 23:21:33 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 02 Nov 2019 23:21:00 +0100
      Finished:     Sat, 02 Nov 2019 23:21:30 +0100
    Ready:          False
    Restart Count:  3
    Liveness:       tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Readiness:      tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Environment:    
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jm8k6 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-jm8k6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jm8k6
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                  Message
  ----     ------     ----                 ----                                  -------
  Normal   Scheduled              default-scheduler                     Successfully assigned test/postgres-sleep-d78bd9fdf-xsx64 to richter-lenovo-ideapad-z500
  Normal   Killing    104s                 kubelet, richter-lenovo-ideapad-z500  Container postgres-sleep failed liveness probe, will be restarted
  Normal   Pulling    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Pulling image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Pulled     80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Successfully pulled image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Created    80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Created container postgres-sleep
  Normal   Started    80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Started container postgres-sleep
  Warning  Unhealthy  77s (x11 over 110s)  kubelet, richter-lenovo-ideapad-z500  Readiness probe failed: dial tcp 10.1.38.43:5432: connect: connection refused
  Warning  Unhealthy  74s (x5 over 110s)   kubelet, richter-lenovo-ideapad-z500  Liveness probe failed: dial tcp 10.1.38.43:5432: connect: connection refused

Experienced with v1.17.0-alpha.3 (1025) and v1.16.2 (1019) on Ubuntu 19.10.
inspection-report-20191102_232756.tar.gz
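
One way to check from the command line whether the API server kept the field, instead of relying on the dashboard, is to query it with a JSONPath expression. This is a minimal sketch: the function name is hypothetical, and on MicroK8s `kubectl` may need to be invoked as `microk8s.kubectl`.

```shell
# Hypothetical helper: succeeds if the first container of a Deployment
# has a startupProbe stored in the API server, fails otherwise.
has_startup_probe() {
    deployment=$1
    namespace=$2
    # A missing field yields empty output with kubectl's default JSONPath settings.
    probe=$(kubectl get deployment "$deployment" -n "$namespace" \
        -o jsonpath='{.spec.template.spec.containers[0].startupProbe}')
    [ -n "$probe" ]
}

# Usage (needs a running cluster, so it is left commented out):
# has_startup_probe postgres-sleep test && echo "startupProbe present" || echo "startupProbe missing"
```

If the function reports the probe as missing right after applying the manifest, the field was most likely dropped on the API server side rather than ignored by the kubelet.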

Q&A

All 8 comments

@krichter722 this requires explicit enabling as it's an alpha feature in 1.16.
In the kubelet args: https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/default-args/kubelet
You can try adding --feature-gates="StartupProbe=true" and then restarting the kubelet with sudo systemctl restart snap.microk8s.daemon-kubelet

@balchua Thanks for the information. Afaik, I need to rebuild the snap in order to be able to edit that file, since snap doesn't allow editing any file, not even configuration files, right?

I'll close this issue now since there's no problem with the software.

@krichter722 sorry, I didn't give the right location for the file. Go to /var/snap/microk8s/current/args/kubelet and edit the file there.

@balchua I was too hasty. I changed

--feature-gates=DevicePlugins=true

to

--feature-gates="DevicePlugins=true,StartupProbe=true"

in /var/snap/microk8s/current/args/kubelet following https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/ and rebooted Ubuntu, but the deployment from the example still doesn't contain the startupProbe, nor does it behave as if it had one. Therefore I assume it's a bug and I'll reopen the issue.

Maybe this feature gate needs to be added to the apiserver too. You can try adding it to /var/snap/microk8s/current/args/kube-apiserver
Then do a microk8s.stop followed by a microk8s.start.

It would be nice if the documentation stated that this is currently alpha and not enabled by default... I just wasted quite some time on this. Thanks for the feature-gates URL @krichter722, I will definitely check that next time something doesn't seem to work.

@krichter722 to enable the StartupProbe you need to edit the /var/snap/microk8s/current/args/kube-apiserver file that holds the arguments of the API server and add the line --feature-gates="StartupProbe=true". After that you have to restart MicroK8s with microk8s.stop; microk8s.start. Finally you have to apply your yaml.

These steps apply for any feature gate not enabled by default in Kubernetes.
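
The steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: the function name is hypothetical, `sed -i` is the GNU variant shipped with Ubuntu, and the args file is assumed to accept the unquoted form of the flag just as it accepts the quoted one. The function either extends an existing --feature-gates line or appends a new one.

```shell
# Hypothetical helper: enable a feature gate in a MicroK8s args file.
# $1 = path to the args file, $2 = feature gate name (e.g. StartupProbe).
add_feature_gate() {
    args_file=$1
    gate=$2
    # Nothing to do if the gate is already enabled.
    if grep -q -- "${gate}=true" "$args_file"; then
        return 0
    fi
    if grep -q -- '^--feature-gates=' "$args_file"; then
        # Append to the existing list, dropping optional surrounding quotes.
        sed -i -e "s/^--feature-gates=\"\{0,1\}\([^\"]*\)\"\{0,1\}\$/--feature-gates=\1,${gate}=true/" "$args_file"
    else
        printf '%s\n' "--feature-gates=${gate}=true" >> "$args_file"
    fi
}

# Usage (needs root and a MicroK8s installation, so it is left commented out;
# deployment.yaml is a placeholder file name):
# add_feature_gate /var/snap/microk8s/current/args/kube-apiserver StartupProbe
# microk8s.stop; microk8s.start
# microk8s.kubectl apply -f deployment.yaml
```

Re-applying the manifest after the restart matters because a Deployment created while the gate was off was stored without the startupProbe field.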

@ktsakalozos Thanks for your support. I'll close this as there's no issue and I have no good idea how to improve the documentation.
