Affecting >v3.2.
We might need to make this error message more descriptive on the etcdserver side.
2017-11-01 16:31:59.764751 I | etcdmain: etcd Version: 3.2.9
2017-11-01 16:31:59.764799 I | etcdmain: Git SHA: f1d7dd8
2017-11-01 16:31:59.764803 I | etcdmain: Go Version: go1.8.4
2017-11-01 16:31:59.764806 I | etcdmain: Go OS/Arch: linux/amd64
2017-11-01 16:31:59.764809 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2017-11-01 16:31:59.764837 I | embed: peerTLS: cert = /etc/etcdtls/member/peer-tls/peer.crt, key = /etc/etcdtls/member/peer-tls/peer.key, ca = , trusted-ca = /etc/etcdtls/member/peer-tls/peer-ca.crt, client-cert-auth = true
2017-11-01 16:31:59.765511 I | embed: listening for peers on https://0.0.0.0:2380
2017-11-01 16:31:59.765551 I | embed: listening for client requests on 0.0.0.0:2379
2017-11-01 16:31:59.792577 W | etcdserver: could not get cluster response from https://example-0000.example.default.svc:2380: Get https://example-0000.example.default.svc:2380/members: EOF
2017-11-01 16:31:59.798122 C | etcdmain: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given urls
Related
/cc @hongchaodeng
@hongchaodeng The server-side error message when the client gets rejected with EOF is:
etcdmain: rejected connection from "10.60.8.250:37528" (tls: "10.60.8.250" does not match any of DNSNames [".example.default.svc" ".example.default.svc.cluster.local"] (lookup 250.8.60.10.in-addr.arpa. on 10.63.240.10:53: no such host))
@gyuho I am going to take a stab at this.
@gyuho in this case the client-side "could not get cluster response" warning comes from getClusterFromRemotePeers in etcdserver/cluster_util.go:
    resp, err := cc.Get(u + "/members")
    if err != nil {
        if logerr {
            plog.Warningf("could not get cluster response from %s: %v", u, err)
        }
        continue
    }
The err returned here is EOF, presumably because of https://github.com/golang/go/issues/19874, as you noted above. As for the approach: should we produce a more descriptive error message here, or earlier in the call path? Would checking for a literal EOF make sense?
@hexfusion Thanks for looking it up! Yeah, it seems EOF is expected when the TLS handshake fails on the other side (ref. https://github.com/coreos/etcd/pull/7687#discussion_r110499257). We have a TLS handshake failure handler in the listener.
But in this case, cc.Get(u + "/members") is the dialer on the other side, and there's no way to configure a handshake failure handler for the dialer (unless we add a wrapper around the roundtripper, similar to what we do for the listener, but I don't think it's worth it since the other peer clearly prints out the error).
Closing since there's not much we can do.