kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:54:15Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Cluster:
1x master
2x worker-nodes
kubectl:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
I have been struggling with an etcd warning that is triggered when I try to reset my Kubernetes (version 1.18) cluster, as shown in the following output:
[root@master ~]# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
[reset] Removing info for node "master" from the ConfigMap "kubeadm-config" in the "kube-system" Namespace
{"level":"warn","ts":"2020-04-23T16:37:29.913+0200","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-d62d7670-0929-417d-9fa7-9e72f59ad0e0/192.168.10.100:2379","attempt":0,"error":"rpc error: code = Unknown desc = etcdserver: re-configuration failed due to not enough started members"}
This warning is printed 10-15 times before the cluster finally finishes resetting. I did not see this warning when working with version 1.16.
Expected behavior:
The cluster is reset without etcd warnings.
Steps to reproduce:
kubeadm init
kubeadm reset
thanks for logging the issue @UgurTheG
to add some detail: we started seeing these warnings (which are printed regardless of the requested verbosity) when kubeadm performs operations using the etcd client we bundle with k8s / kubeadm 1.18:
for example:
> I0423 18:34:18.934385 3883 etcd.go:327] Failed to remove etcd member: etcdserver: re-configuration failed due to not enough started members
{"level":"warn","ts":"2020-04-23T18:34:25.534+0300","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-335d5851-7f90-4dce-8eed-77acc5a459ae/192.168.0.102:2379","attempt":0,"error":"rpc error: code = Unknown desc = etcdserver: re-configuration failed due to not enough started members"}
the issue reports etcd 3.3.11, but i think this is the actual etcd client vendored for 1.18:
go.etcd.io/etcd => go.etcd.io/etcd v0.0.0-20191023171146-3cf2f69b5738 // 3cf2f69b5738 is the SHA for git tag v3.4.3
is there a way to silence these warnings by default?
we had a proposal PR that vendors zap in k/k and passes a nop-zap config to the etcd client, but ideally they should be silenced by default:
https://github.com/kubernetes/kubernetes/pull/90164
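as a minimal sketch of what that approach looks like from the client side (assuming the LogConfig field that clientv3 exposes in etcd 3.4; the package and helper name here are illustrative, not the PR's actual code):

package etcdutil

import (
	"time"

	"go.etcd.io/etcd/clientv3"
	"go.uber.org/zap"
)

// newQuietEtcdClient builds an etcd client whose logger only emits
// ERROR and above, suppressing the retry_interceptor WARN lines.
func newQuietEtcdClient(endpoints []string) (*clientv3.Client, error) {
	logCfg := zap.NewProductionConfig()
	logCfg.Level = zap.NewAtomicLevelAt(zap.ErrorLevel)

	return clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
		// LogConfig overrides the client's default zap logger (etcd 3.4).
		LogConfig: &logCfg,
	})
}

a config with the level raised to ERROR would hide the WARN entries shown above, at the cost of also hiding any other client-side warnings.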
cc @jpbetz @SataQiu
Thanks for answering @neolit123
Yes, I read my version wrong; I do have etcd 3.4.3. Unfortunately, I have not found a way to silence these warnings myself.
etcd's config has a StrictReconfigCheck flag. It is true by default, and with it enabled etcd rejects any member removal that would lead to quorum loss, returning ErrNotEnoughStartedMembers to clients:
func (s *EtcdServer) RemoveMember(ctx context.Context, id uint64) ([]*membership.Member, error) {
	if err := s.checkMembershipOperationPermission(ctx); err != nil {
		return nil, err
	}

	// by default StrictReconfigCheck is enabled; reject removal if leads to quorum loss
	if err := s.mayRemoveMember(types.ID(id)); err != nil {
		return nil, err
	}

	cc := raftpb.ConfChange{
		Type:   raftpb.ConfChangeRemoveNode,
		NodeID: id,
	}
	return s.configure(ctx, cc)
}
// inside (*EtcdServer).mayRemoveMember:
	if !s.cluster.IsReadyToRemoveVotingMember(uint64(id)) {
		if lg := s.getLogger(); lg != nil {
			lg.Warn(
				"rejecting member remove request; not enough healthy members",
				zap.String("local-member-id", s.ID().String()),
				zap.String("requested-member-remove-id", id.String()),
				zap.Error(ErrNotEnoughStartedMembers),
			)
		} else {
			plog.Warningf("not enough started members, rejecting remove member %s", id)
		}
		return ErrNotEnoughStartedMembers
	}
https://github.com/etcd-io/etcd/blob/release-3.4/etcdserver/server.go#L1800:22
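To make the check concrete, here is a simplified model of the quorum test behind IsReadyToRemoveVotingMember (illustrative only, not the exact etcd code, which also special-cases members that have not started):

// Simplified: after excluding the member being removed, the remaining
// started members must still form a quorum of the remaining voters.
func isReadyToRemove(votingMembers, startedMembers int, removingStarted bool) bool {
	remainingVoters := votingMembers - 1
	remainingStarted := startedMembers
	if removingStarted {
		remainingStarted--
	}
	quorum := remainingVoters/2 + 1
	return remainingStarted >= quorum
}

A single-control-plane cluster like the one above (the workers do not run etcd) hits exactly this case: removing the only member leaves 0 started members against a quorum of 1, so etcd returns ErrNotEnoughStartedMembers and the retrying client prints the warning each time.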
@neolit123 @UgurTheG
@neolit123 The default logging level for the etcd client is INFO. I see two options:
1. Lower the logging level of this particular message in etcd, so it is no longer emitted at WARN.
2. Configure the etcd client's logger from kubeadm (e.g. via a zap config) so that WARN messages are suppressed.
Option 1 is probably the right way to do it, unless people feel it is wrong to log this particular message at WARN level. If we do not want to vendor zap into k/k, maybe we can add a flag to configure the logging level for the etcd client.
cc @gyuho
thank you for the explanation, @tangcong and @jingyih
it's unclear if the k/k dependency maintainers will be convinced that zap should now be vendored.
maybe it's going to be included in the future, regardless of the kubeadm case.
making it possible to configure only the etcd client's verbosity level via a field on the client makes sense to me, but if zap.Config is the preferred way in etcd, then i guess we have to go that route.
@neolit123 @tangcong @jingyih I think I found the answer to my problem. I did not have a daemon.json file for Docker in /etc/docker/, so I created one and restarted docker and the kubelet. There are no warnings anymore when resetting the cluster with kubeadm.
for kubeadm 1.19 we added some logic to not try to remove the last member in the cluster, which was the primary trigger for the warnings.
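roughly, the guard looks like this (names are illustrative, not the actual kubeadm code; it assumes the MemberList call from the etcd 3.4 clientv3 package and the usual context import):

// skip the RemoveMember call entirely when this is the last member,
// since etcd can never satisfy quorum after removing it.
func shouldRemoveMember(ctx context.Context, cli *clientv3.Client) (bool, error) {
	resp, err := cli.MemberList(ctx)
	if err != nil {
		return false, err
	}
	return len(resp.Members) > 1, nil
}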
being able to configure the log level per etcd client instance still seems useful, though.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.