BUG REPORT
kubeadm version: kubeadm v1.17.*
While executing Cluster API tests, kubeadm join was sometimes observed to fail immediately after printing:
...
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
xref https://github.com/kubernetes-sigs/cluster-api/issues/2769
The goal is to make the kubelet-start phase more resilient.
This error happens only sometimes, most probably when there is a temporary blackout of the load balancer that sits in front of the API servers (HAProxy reloading its configuration).
Important: if possible the change should be kept as small as possible and backported.
Ok, there is a bug here. In theory the call to WaitForKubeletAndFunc should wait for both the kubelet healthz and waitForTLSBootstrappedClient to state that things are OK. However:
- WaitForKubeletAndFunc would wait for the first one of those to complete in order to return.
- ~~waitForTLSBootstrappedClient basically only constructs a Clientset from local files that were already written down on disk. Hence, it does not do any network operations and completes with a success in a negligible amount of time.~~ (EDIT: My test env was dirty and I was looking at the wrong file...)
- ~~Thus, WaitForKubeletAndFunc returns when it has only checked that a Clientset is creatable. It does not verify at all whether the kubelet is running.~~
On the question of the healthz timeout: there is an initial wait of 40s, after which the first connection attempt is made. There are a total of 5 attempts with exponential backoff (initial delay of 5s and a factor of 2). Thus the total timeout here is 40s + 155s = 195s, or 3m 15s.
But we don't wait for that.
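For reference, a rough sketch of that arithmetic (the 40s initial wait plus 5 exponentially backed-off attempts; illustrative only, not the actual kubeadm helper):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Initial sleep before the first healthz attempt.
	total := 40 * time.Second

	// 5 attempts with exponential backoff: 5s + 10s + 20s + 40s + 80s = 155s.
	delay := 5 * time.Second
	for i := 0; i < 5; i++ {
		total += delay
		delay *= 2
	}

	fmt.Println(total) // 3m15s
}
```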
/assign
/lifecycle active
In light of my message above, and with a properly clean environment (using kind), I figured out that the healthz check is not appropriate for us. In the kubelet that check is just a ping check (it returns 200 OK as long as the kubelet is running). Hence, it would return OK even if the kubelet is not bootstrapped yet.
However, because WaitForKubeletAndFunc returns on any (not all) of its conditions, the healthz check is executed after just 40 seconds and WaitForKubeletAndFunc returns without waiting for the bootstrap to complete.
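For reference, a ping-style healthz probe like the one described looks roughly like this (assuming the kubelet's default healthz endpoint on 127.0.0.1:10248; a sketch, not the kubeadm code):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// kubeletHealthy returns true as soon as the kubelet process is serving its
// healthz endpoint, regardless of whether the TLS bootstrap has finished.
func kubeletHealthy() bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get("http://127.0.0.1:10248/healthz")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	fmt.Println(kubeletHealthy())
}
```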
In my opinion we should stop using healthz while waiting for the bootstrap to finish and use only the existence of the bootstrapped kubelet.conf file (what waitForTLSBootstrappedClient does).
> In my opinion we should stop using healthz while waiting for the bootstrap to finish and use only the existence of the bootstrapped kubelet.conf file (what waitForTLSBootstrappedClient does).
+1 waiting for the file to appear seems like a good indicator.
i don't think we have to validate the kubeconfig again, because we are doing that in the discovery part.
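For illustration, a minimal sketch of a purely file-based wait in the spirit of waitForTLSBootstrappedClient (the path, interval and timeout here are assumptions, not the actual kubeadm values):

```go
package main

import (
	"fmt"
	"os"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

func main() {
	// Assumed location of the bootstrapped kubeconfig written by the kubelet.
	kubeletKubeConfig := "/etc/kubernetes/kubelet.conf"

	err := wait.PollImmediate(5*time.Second, 4*time.Minute, func() (bool, error) {
		// Done once the kubelet has written its bootstrapped kubeconfig.
		_, err := os.Stat(kubeletKubeConfig)
		return err == nil, nil
	})
	if err != nil {
		fmt.Println("timed out waiting for the kubelet TLS bootstrap:", err)
		os.Exit(1)
	}
	fmt.Println("kubelet TLS bootstrap completed")
}
```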
I'm not sure that getting rid of WaitForKubeletAndFunc is the right thing to do.
As far as I understand, the current logic waits for the TLS bootstrap, but if after some time the kubelet is not healthy, we exit fast.
So a healthz success does not make the wait loop finish, but a healthz failure after a grace period of 40s does.
see also https://github.com/kubernetes/kubernetes/pull/89735#discussion_r401704557
You can see the current implementation here. Basically it waits for the first func (between waitForTLSBootstrappedClient and WaitForHealthyKubelet) to finish and then it returns.
With the current behavior of kubelet's healthz probe and in the case of slow API Server it's entirely possible that WaitForHealthyKubelet would exit first after a 40s wait, but before the bootstrap process is completed.
On the other hand, we can modify WaitForKubeletAndFunc to actually wait for both funcs to finish (instead of just one).
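For illustration, one way such a wait-for-all variant could look (a sketch assuming two func() error checks; not the actual kubeadm signature):

```go
package main

import (
	"fmt"
	"sync"
)

// waitForAll runs every check and returns only after all of them have
// finished, reporting the first error observed (if any). This sketches the
// "wait for both funcs" idea, not the real WaitForKubeletAndFunc.
func waitForAll(fns ...func() error) error {
	var wg sync.WaitGroup
	errCh := make(chan error, len(fns))

	for _, fn := range fns {
		wg.Add(1)
		go func(f func() error) {
			defer wg.Done()
			if err := f(); err != nil {
				errCh <- err
			}
		}(fn)
	}

	wg.Wait()
	close(errCh)
	for err := range errCh {
		return err
	}
	return nil
}

func main() {
	err := waitForAll(
		func() error { return nil }, // e.g. the kubelet healthz wait
		func() error { return nil }, // e.g. the TLS bootstrap wait
	)
	fmt.Println(err)
}
```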
@rosti, the WaitForHealthyKubelet triggers exit (returns to the channel) only in case of errors
https://github.com/kubernetes/kubernetes/blob/8ae26096f7714b8ba95f62cf5018073dde2d1dfb/cmd/kubeadm/app/util/apiclient/wait.go#L163-L165
Sorry about that, I looked into the code again and it indeed was writing to the channel only upon an error. So you are right about it.
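For reference, a minimal sketch of the pattern being described, where the healthz waiter writes to the channel only on failure (the functions here are illustrative stand-ins, not the kubeadm code):

```go
package main

import "fmt"

// Illustrative stand-ins for the real checks.
func waitForHealthyKubelet() error        { return nil }
func waitForTLSBootstrappedClient() error { return nil }

// waitForKubelet mirrors the pattern discussed above: the healthz waiter
// writes to the channel only when it fails, so a healthy kubelet cannot
// short-circuit the wait; the TLS bootstrap waiter is what completes it.
func waitForKubelet() error {
	errC := make(chan error, 2)

	go func() {
		if err := waitForHealthyKubelet(); err != nil {
			errC <- err // failures only
		}
	}()

	go func() {
		errC <- waitForTLSBootstrappedClient() // success or failure ends the wait
	}()

	return <-errC
}

func main() {
	fmt.Println(waitForKubelet())
}
```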
@rosti re-opening because we are still missing:
- changing the code so that annotating the node with the CRI socket is retried as well.
/reopen
This basically means moving the following two calls into a separate func and wrapping them in their own retry statement:
IMO, a good initial setting for this operation is something like:
```go
// An exponential backoff configuration that returns durations for a total time of ~40s.
// Example: 0, .5s, 1.2s, 2.3s, 4s, 6s, 10s, 16s, 24s, 37s
// Jitter is added as a random fraction of the duration multiplied by the jitter factor.
return wait.Backoff{
	Duration: 500 * time.Millisecond,
	Factor:   1.5,
	Steps:    10,
	Jitter:   0.4,
}
```
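For illustration, a hedged sketch of wrapping a retried call in that backoff via wait.ExponentialBackoff (patchNode here is a hypothetical placeholder for the calls being wrapped):

```go
package main

import (
	"errors"
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// patchNode is a placeholder for the calls that would be wrapped in the retry.
func patchNode() error { return errors.New("transient API error") }

func main() {
	backoff := wait.Backoff{
		Duration: 500 * time.Millisecond,
		Factor:   1.5,
		Steps:    10,
		Jitter:   0.4,
	}
	err := wait.ExponentialBackoff(backoff, func() (bool, error) {
		if err := patchNode(); err != nil {
			return false, nil // treat the error as transient and retry
		}
		return true, nil
	})
	fmt.Println(err) // wait.ErrWaitTimeout once the steps are exhausted
}
```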
@fabriziopandini: Reopened this issue.
In response to this:
> @rosti re-opening because we are still missing:
> - changing the code so that annotating the node with the CRI socket is retried as well.
>
> /reopen
>
> This basically means moving the following two calls into a separate func and wrapping them in their own retry statement:
> IMO, a good initial setting for this operation is something like:
>
> ```go
> // An exponential backoff configuration that returns durations for a total time of ~40s.
> // Example: 0, .5s, 1.2s, 2.3s, 4s, 6s, 10s, 16s, 24s, 37s
> // Jitter is added as a random fraction of the duration multiplied by the jitter factor.
> return wait.Backoff{
> 	Duration: 500 * time.Millisecond,
> 	Factor:   1.5,
> 	Steps:    10,
> 	Jitter:   0.4,
> }
> ```
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
PS. Sorry if I missed this when reviewing kubernetes/kubernetes#89735
@fabriziopandini it's already retried using a polling timeout (see here). It's currently at 500ms retry interval for a 2 minute timeout. It can receive the same treatment as above - bump the retry interval to 5s and the total timeout to 5 mins.
@rosti I think we are good for patch node (even if 500ms is somewhat aggressive).
However, ClientSetFromFile is not retried.
If this call is not affected by temporary connection problems, we are fine. Otherwise, it should be retried as well.
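For illustration, a sketch of what retrying the clientset construction could look like (the helper name, interval and timeout are assumptions, and clientcmd.BuildConfigFromFlags stands in for kubeadm's own loader):

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// clientSetFromFileWithRetry retries building a clientset from a kubeconfig
// file, so a transient problem while the file is being (re)written does not
// fail the join immediately.
func clientSetFromFileWithRetry(path string) (*kubernetes.Clientset, error) {
	var cs *kubernetes.Clientset
	err := wait.PollImmediate(5*time.Second, 5*time.Minute, func() (bool, error) {
		config, err := clientcmd.BuildConfigFromFlags("", path)
		if err != nil {
			return false, nil // retry: the file may not be readable yet
		}
		cs, err = kubernetes.NewForConfig(config)
		if err != nil {
			return false, nil // retry on transient construction errors
		}
		return true, nil
	})
	return cs, err
}

func main() {
	cs, err := clientSetFromFileWithRetry("/etc/kubernetes/kubelet.conf")
	fmt.Println(cs != nil, err)
}
```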
@fabriziopandini @rosti
did we:
let me know how i can help.
closing as cherry picks are LGTM/approved
/close
@neolit123: Closing this issue.
In response to this:
> closing as cherry picks are LGTM/approved
>
> /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.