I've noticed this behavior in my clusters. They are currently on Calico 3.12, but it appears to go back to earlier versions.
Expected: no liveness or readiness warnings.
Actual: all calico-node pods are running 1/1 (no restarts), but the events show repeated probe failures:
Warning Unhealthy 50m (x789 over 39d) kubelet, w4r008 Liveness probe failed: calico/node is not ready: Felix is not live: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 25m (x725 over 38d) kubelet, w4r008 Readiness probe failed: calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 6m17s (x723 over 39d) kubelet, 0r002 Liveness probe failed: calico/node is not ready: Felix is not live: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 52m (x820 over 39d) kubelet, 003 Liveness probe failed: calico/node is not ready: Felix is not live: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 47m (x794 over 39d) kubelet, 003 Readiness probe failed: calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 47m (x778 over 39d) kubelet, 003 Readiness probe failed: calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 9m16s (x744 over 39d) kubelet, 003 Liveness probe failed: calico/node is not ready: Felix is not live: Get http://localhost:9099/liveness: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
When curling the endpoint on the affected node:
$ curl -v http://localhost:9099/readiness
> GET /readiness HTTP/1.1
> Host: localhost:9099
> User-Agent: curl/7.65.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Date: Tue, 07 Apr 2020 12:37:39 GMT
<
* Connection #0 to host localhost left intact
When running the Calico health checks inside the Docker container:
calico-node -bird-ready -bird-live
2020-04-07 12:51:32.464 [INFO][8408] health.go 156: Number of node(s) with BGP peering established = 15
calico-node -felix-ready -felix-live
We are pretty much using the generic YAML from the Calico repo.
I'm trying to make sure the probes are working as expected.
Any suggestions on what to look at? Thanks!
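Since a single curl succeeds while the events show intermittent timeouts, one way to catch the failures might be to hit the endpoint repeatedly with a short client-side timeout and count the misses. The sketch below is only an illustration: probe_loop is a hypothetical helper name, and the URL, attempt count, and timeout you pass in are assumptions to adjust for your own nodes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Calico): probe a health endpoint several
# times with a short timeout and report how many attempts failed.
# Arguments: URL, number of attempts, per-request timeout in seconds.
probe_loop() {
    url=$1
    attempts=$2
    timeout=$3
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: quiet, -o /dev/null: discard body, -w: print only the status code,
        # --max-time: abort the whole request after $timeout seconds.
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time "$timeout" "$url") || code=000
        case "$code" in
            2??) ;;  # any 2xx (the endpoint above answered 204) counts as healthy
            *)
                failures=$((failures + 1))
                echo "$(date '+%H:%M:%S') attempt $i failed (status $code)"
                ;;
        esac
        i=$((i + 1))
    done
    echo "failures=$failures/$attempts"
}
```

For example, running probe_loop http://localhost:9099/readiness 100 1 on an affected node would show whether requests occasionally exceed a 1-second budget. The 1-second value is an assumption here; compare it with the timeoutSeconds actually set on the probes in your manifest.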
needs more investigation on my end..
Any resolution to this? It might help others who encounter the same issue. Thanks!
@cann0nf0dder any updates?
I have a new install and see this error.
I encounter the same issue.
I checked and nothing was listening on port 9099 (using the command: netstat -antpu | grep 9099). May I know which process is supposed to listen on port 9099?