I am running into an odd issue with etcd watches and am not able to understand the behavior we are seeing. We connect to etcd over a VPN, and our etcd sits behind an HAProxy. We are observing these 2 scenarios:
Setup: run etcdctl watch <path> in one window, loop etcdctl put <path> in another window, and confirm that the watch receives the messages.
Scenario A: disconnect the VPN from the VPN client.
The watch remains connected even after the VPN is disconnected. Reconnect the VPN and put more messages. The watch does not report any of the new messages, though get confirms that the messages are there.
Scenario B: switch off internet connectivity by turning off WiFi.
The watch remains connected even after the network is disconnected. Reconnect the network and put more messages. In this case the watch reports all the new messages that get put.
The etcd watch API is not meant for detecting connection issues.
You can use a Session (from the clientv3/concurrency package) for that.
Disconnect should be handled in client balancer layer.
We've added HTTP/2 keepalive and client balancer health checking.
Once stabilized, it will be released in v3.3.
@gyuho Fair enough. What about when we use the official golang client? I am broadly under the assumption that the watch blocks until the connection is reestablished, based on this comment: https://github.com/coreos/etcd/issues/7860#issuecomment-317368084
We use the golang client, so I am assuming the following is already handled for us:
> Disconnect should be handled in client balancer layer.
> We've added HTTP/2 keepalive and client balancer health checking.
> Once stabilized, it will be released in v3.3.
Alternatively, can you provide examples or refer me to some documentation?
My understanding is that with the go client the watch remains blocked on connection failure and then continues when the connection comes back. Do you think this is a valid assumption?
@atinsood If the watch is issued with WithCreatedNotify (https://godoc.org/github.com/coreos/etcd/clientv3#WithCreatedNotify), you can wait for the watch-created event, and thus for the initial connection. See also WithProgressNotify (https://godoc.org/github.com/coreos/etcd/clientv3#WithProgressNotify).
I suggest reading this thread https://github.com/coreos/etcd/issues/8495#issuecomment-327242932.
@gyuho I have spent some more time trying to understand this and I am still confused. Does the official golang etcd v3 client auto-reconnect on watch failure due to connectivity issues, or does the client code need to handle that?
Sorry for being persistent with the question, I wasn't able to find any good guidance on this. I see this comment, which broadly talks about handling errors: https://github.com/coreos/etcd/issues/8495#issuecomment-327241587. Are there any examples of how to gracefully reconnect the watch?