As far as I understand, Kubernetes >= 1.16 is supposed to support `startupProbe`, e.g. in `template.spec.containers` of a Deployment. However, a simple deployment like
apiVersion: v1
kind: Namespace
metadata:
  name: "test"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-sleep
  namespace: test
spec:
  selector:
    matchLabels:
      app: postgres-sleep
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 50%
  template:
    metadata:
      labels:
        app: postgres-sleep
    spec:
      containers:
      - name: postgres-sleep
        image: krichter/microk8s-startup-probe-ignored:latest
        ports:
        - name: postgres
          containerPort: 5432
        readinessProbe:
          tcpSocket:
            port: 5432
          periodSeconds: 3
        livenessProbe:
          tcpSocket:
            port: 5432
          periodSeconds: 3
        startupProbe:
          tcpSocket:
            port: 5432
          failureThreshold: 60
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-sleep
  namespace: test
spec:
  selector:
    app: httpd
  ports:
  - protocol: TCP
    port: 5432
    targetPort: 5432
---
with `krichter/microk8s-startup-probe-ignored` being
FROM postgres:11
CMD sleep 30 && postgres
doesn't contain the specified `startupProbe` according to the details from the dashboard
kind: Deployment
apiVersion: apps/v1
metadata:
  name: postgres-sleep
  namespace: test
  selfLink: /apis/apps/v1/namespaces/test/deployments/postgres-sleep
  uid: 507ff4f4-7551-403d-9af5-9d57f4b1f0e8
  resourceVersion: '4175'
  generation: 1
  creationTimestamp: '2019-11-02T22:19:50Z'
  annotations:
    deployment.kubernetes.io/revision: '1'
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"postgres-sleep","namespace":"test"},"spec":{"replicas":2,"selector":{"matchLabels":{"app":"postgres-sleep"}},"strategy":{"rollingUpdate":{"maxSurge":1,"maxUnavailable":"50%"},"type":"RollingUpdate"},"template":{"metadata":{"labels":{"app":"postgres-sleep"}},"spec":{"containers":[{"image":"krichter/microk8s-startup-probe-ignored:latest","livenessProbe":{"periodSeconds":3,"tcpSocket":{"port":5432}},"name":"postgres-sleep","ports":[{"containerPort":5432,"name":"postgres"}],"readinessProbe":{"periodSeconds":3,"tcpSocket":{"port":5432}},"startupProbe":{"failureThreshold":60,"periodSeconds":10,"tcpSocket":{"port":5432}}}]}}}}
spec:
  replicas: 2
  selector:
    matchLabels:
      app: postgres-sleep
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: postgres-sleep
    spec:
      containers:
        - name: postgres-sleep
          image: 'krichter/microk8s-startup-probe-ignored:latest'
          ports:
            - name: postgres
              containerPort: 5432
              protocol: TCP
          resources: {}
          livenessProbe:
            tcpSocket:
              port: 5432
            timeoutSeconds: 1
            periodSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            tcpSocket:
              port: 5432
            timeoutSeconds: 1
            periodSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 50%
      maxSurge: 1
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 1
  replicas: 2
  updatedReplicas: 2
  unavailableReplicas: 2
  conditions:
    - type: Available
      status: 'False'
      lastUpdateTime: '2019-11-02T22:19:50Z'
      lastTransitionTime: '2019-11-02T22:19:50Z'
      reason: MinimumReplicasUnavailable
      message: Deployment does not have minimum availability.
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2019-11-02T22:19:51Z'
      lastTransitionTime: '2019-11-02T22:19:50Z'
      reason: ReplicaSetUpdated
      message: ReplicaSet "postgres-sleep-d78bd9fdf" is progressing.
and immediately fails due to the failing readiness probe rather than waiting for the startup probe to succeed or reach its failure threshold, which is what I expect.
Name:           postgres-sleep-d78bd9fdf-xlx9v
Namespace:      test
Priority:       0
Node:           richter-lenovo-ideapad-z500/192.168.178.34
Start Time:     Sat, 02 Nov 2019 23:19:51 +0100
Labels:         app=postgres-sleep
                pod-template-hash=d78bd9fdf
Annotations:
Status:         Running
IP:             10.1.38.42
IPs:
  IP:           10.1.38.42
Controlled By:  ReplicaSet/postgres-sleep-d78bd9fdf
Containers:
  postgres-sleep:
    Container ID:   containerd://6dea9ee3e3133400b0c28dfbddbd4d0b27cfc876a247b1ddf157b0d176d4c152
    Image:          krichter/microk8s-startup-probe-ignored:latest
    Image ID:       docker.io/krichter/microk8s-startup-probe-ignored@sha256:01d48777d2ce168ca8be6b607efd41743f9fd51959a613bae06756c60938cd34
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 02 Nov 2019 23:21:32 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 02 Nov 2019 23:20:58 +0100
      Finished:     Sat, 02 Nov 2019 23:21:28 +0100
    Ready:          False
    Restart Count:  3
    Liveness:       tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Readiness:      tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jm8k6 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-jm8k6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jm8k6
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                  Message
  ----     ------     ----                 ----                                  -------
  Normal   Scheduled                       default-scheduler                     Successfully assigned test/postgres-sleep-d78bd9fdf-xlx9v to richter-lenovo-ideapad-z500
  Normal   Killing    106s                 kubelet, richter-lenovo-ideapad-z500  Container postgres-sleep failed liveness probe, will be restarted
  Normal   Pulling    83s (x2 over 115s)   kubelet, richter-lenovo-ideapad-z500  Pulling image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Pulled     82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Successfully pulled image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Created    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Created container postgres-sleep
  Normal   Started    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Started container postgres-sleep
  Warning  Unhealthy  79s (x4 over 112s)   kubelet, richter-lenovo-ideapad-z500  Liveness probe failed: dial tcp 10.1.38.42:5432: connect: connection refused
  Warning  Unhealthy  77s (x12 over 113s)  kubelet, richter-lenovo-ideapad-z500  Readiness probe failed: dial tcp 10.1.38.42:5432: connect: connection refused
Name:           postgres-sleep-d78bd9fdf-xsx64
Namespace:      test
Priority:       0
Node:           richter-lenovo-ideapad-z500/192.168.178.34
Start Time:     Sat, 02 Nov 2019 23:19:51 +0100
Labels:         app=postgres-sleep
                pod-template-hash=d78bd9fdf
Annotations:
Status:         Running
IP:             10.1.38.43
IPs:
  IP:           10.1.38.43
Controlled By:  ReplicaSet/postgres-sleep-d78bd9fdf
Containers:
  postgres-sleep:
    Container ID:   containerd://436a39ef012d33b6c88ab4a3761dfe4787e0e62a2be7fa07936ba403684ab78d
    Image:          krichter/microk8s-startup-probe-ignored:latest
    Image ID:       docker.io/krichter/microk8s-startup-probe-ignored@sha256:01d48777d2ce168ca8be6b607efd41743f9fd51959a613bae06756c60938cd34
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 02 Nov 2019 23:21:33 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 02 Nov 2019 23:21:00 +0100
      Finished:     Sat, 02 Nov 2019 23:21:30 +0100
    Ready:          False
    Restart Count:  3
    Liveness:       tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Readiness:      tcp-socket :5432 delay=0s timeout=1s period=3s #success=1 #failure=3
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jm8k6 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-jm8k6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jm8k6
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                  Message
  ----     ------     ----                 ----                                  -------
  Normal   Scheduled                       default-scheduler                     Successfully assigned test/postgres-sleep-d78bd9fdf-xsx64 to richter-lenovo-ideapad-z500
  Normal   Killing    104s                 kubelet, richter-lenovo-ideapad-z500  Container postgres-sleep failed liveness probe, will be restarted
  Normal   Pulling    82s (x2 over 114s)   kubelet, richter-lenovo-ideapad-z500  Pulling image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Pulled     80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Successfully pulled image "krichter/microk8s-startup-probe-ignored:latest"
  Normal   Created    80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Created container postgres-sleep
  Normal   Started    80s (x2 over 112s)   kubelet, richter-lenovo-ideapad-z500  Started container postgres-sleep
  Warning  Unhealthy  77s (x11 over 110s)  kubelet, richter-lenovo-ideapad-z500  Readiness probe failed: dial tcp 10.1.38.43:5432: connect: connection refused
  Warning  Unhealthy  74s (x5 over 110s)   kubelet, richter-lenovo-ideapad-z500  Liveness probe failed: dial tcp 10.1.38.43:5432: connect: connection refused
Experienced with MicroK8s v1.17.0-alpha.3 (1025) and v1.16.2 (1019) on Ubuntu 19.10.
inspection-report-20191102_232756.tar.gz
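To see why the containers keep restarting, it may help to compare the time budgets implied by the probe settings in this report. The arithmetic below is a rough sketch, assuming a probe gives up after about failureThreshold × periodSeconds. The variable names are illustrative, and the numbers are copied from the manifest, the pod descriptions, and the Dockerfile above.

```shell
# Values from the manifest (startupProbe) and the pod description (liveness defaults).
startup_failure_threshold=60
startup_period_seconds=10
liveness_failure_threshold=3   # "#failure=3" in the pod description
liveness_period_seconds=3
container_sleep_seconds=30     # from "CMD sleep 30 && postgres"

# Approximate time each probe tolerates a closed port before the kubelet acts.
startup_budget=$((startup_failure_threshold * startup_period_seconds))
liveness_budget=$((liveness_failure_threshold * liveness_period_seconds))

echo "startup probe budget: ${startup_budget}s"
echo "liveness probe budget: ${liveness_budget}s"

if [ "$liveness_budget" -lt "$container_sleep_seconds" ]; then
  echo "without an active startupProbe, the liveness probe fails before postgres starts listening"
fi
```

With the startup probe active, the liveness and readiness probes would be held back until the startup probe succeeds, so the 30 second sleep would fit comfortably within the startup budget.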
@krichter722 this requires explicit enabling, as it's an alpha feature in 1.16.
See the kubelet args: https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/default-args/kubelet
You can try adding `--feature-gates="StartupProbe=true"` and then restarting the kubelet with `sudo systemctl restart snap.microk8s.daemon-kubelet`.
@balchua Thanks for the information. AFAIK, I need to rebuild the snap in order to edit that file, since snap doesn't allow editing any file, not even configuration files, right?
I'll close this issue now since there's no problem with the software.
@krichter722 sorry, I didn't give the right location of the file. Edit the file at `/var/snap/microk8s/current/args/kubelet`.
@balchua I was too hasty. I changed
--feature-gates=DevicePlugins=true
to
--feature-gates="DevicePlugins=true,StartupProbe=true"
in `/var/snap/microk8s/current/args/kubelet`, following https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/, and rebooted Ubuntu, but the deployed example still neither contains the `startupProbe` nor behaves as if it had one. Therefore I assume it's a bug and I'll reopen the issue.
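One way to check whether the API server kept the field is to ask for it directly instead of reading the dashboard. The snippet below is a minimal sketch: `report_probe` is a hypothetical helper, and the commented `kubectl` call assumes the deployment name and namespace from the manifest above (on MicroK8s the command is typically `microk8s.kubectl`).

```shell
# Hypothetical helper: print whether a probe definition string is non-empty.
report_probe() {
  if [ -n "$1" ]; then
    echo "startupProbe stored: $1"
  else
    echo "startupProbe missing"
  fi
}

# Usage against a live cluster (not executed here):
#   probe=$(kubectl get deployment postgres-sleep -n test \
#     -o jsonpath='{.spec.template.spec.containers[0].startupProbe}')
#   report_probe "$probe"
```

An empty result for a manifest that does specify the probe suggests the field was dropped when the object was stored, which is what happens to alpha fields while their feature gate is disabled on the API server.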
Maybe this feature gate needs to be added to the API server too. You can try adding it to `/var/snap/microk8s/current/args/kube-apiserver`.
Then run `microk8s.stop` followed by `microk8s.start`.
It would be nice if the documentation stated that this is currently alpha and not enabled by default. I just wasted quite some time on this. Thanks for the feature-gates URL @krichter722, I will definitely check that next time something doesn't seem to work.
@krichter722 to enable the StartupProbe, you need to edit the `/var/snap/microk8s/current/args/kube-apiserver` file, which holds the arguments of the API server, and add the line `--feature-gates="StartupProbe=true"`. After that, you have to restart MicroK8s with `microk8s.stop; microk8s.start`. Finally, you have to apply your YAML again.
These steps apply for any feature gate not enabled by default in Kubernetes.
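The steps above could be scripted roughly as follows. This is only a sketch: `add_feature_gate` is a hypothetical helper, not part of MicroK8s, and it assumes each args file holds one flag per line with at most one `--feature-gates=` line. It merges the new gate into any existing list (such as `DevicePlugins=true`) and does nothing if the gate is already present.

```shell
# Hypothetical helper: enable feature gate $2 in the args file $1.
add_feature_gate() {
  file="$1"
  gate="$2"
  # Current value of the flag without surrounding quotes (empty if absent).
  current=$(sed -n 's/^--feature-gates=//p' "$file" | tr -d '"' | head -n 1)
  case ",$current," in
    *",$gate=true,"*) return 0 ;;   # already enabled, leave the file alone
  esac
  tmp=$(mktemp)
  # Copy every other flag unchanged, then append the merged feature-gates line.
  grep -v -- '^--feature-gates=' "$file" > "$tmp" || true
  if [ -n "$current" ]; then
    printf '%s\n' "--feature-gates=\"$current,$gate=true\"" >> "$tmp"
  else
    printf '%s\n' "--feature-gates=\"$gate=true\"" >> "$tmp"
  fi
  cat "$tmp" > "$file"              # overwrite in place to keep owner and mode
  rm -f "$tmp"
}

# Usage on a MicroK8s host (not executed here, needs root):
#   add_feature_gate /var/snap/microk8s/current/args/kube-apiserver StartupProbe
#   add_feature_gate /var/snap/microk8s/current/args/kubelet StartupProbe
#   microk8s.stop; microk8s.start
```

The usage applies the gate to both the API server and the kubelet, matching what was tried in this thread; after the restart the manifest has to be applied again so the `startupProbe` field is stored.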
@ktsakalozos Thanks for your support. I'll close this as there's no issue and I have no good idea how to improve the documentation.