Etcd proxy versions:
3.2.17 - installed
3.3.12 - also tested to rule out version issues
Etcd cluster version:
3.2.17
etcd won't allow connections in proxy mode. My environment is the same as previously described in issue https://github.com/zalando/patroni/issues/1009: an etcd cluster of 5 nodes and 4 PostgreSQL bare-metal (BM) servers running https://github.com/zalando/patroni
Two of the BMs are able to start etcd in proxy mode, and Patroni can successfully communicate with the etcd cluster and provision PG. However, the other two nodes (exact same etcd proxy configuration, listening on 127.0.0.1:2379) initially connect to the etcd cluster with DNS provisioning but then reject further connections from Patroni, with the following messages logged on each respective node:
node-001 etcd[20667]: rejected connection from "127.0.0.1:41156" (error "EOF", ServerName "")
node-001 etcd[20667]: rejected connection from "127.0.0.1:41164" (error "EOF", ServerName "")
node-001 etcd[20667]: rejected connection from "127.0.0.1:41162" (error "EOF", ServerName "")
node-001 etcd[20667]: rejected connection from "127.0.0.1:41158" (error "EOF", ServerName "")
node-001 etcd[20667]: rejected connection from "127.0.0.1:41168" (error "EOF", ServerName "")
node-001 etcd[20667]: rejected connection from "127.0.0.1:41170" (error "EOF", ServerName "")
etcdctl is still able to communicate with the cluster without problems from all 4 nodes. We have troubleshot SSL/TLS and don't see any problems with it.
The etcd servers are not logging anything helpful and we are running out of ideas here.
Any help will be appreciated.
Update:
If I point Patroni directly to the etcd cluster, I get the following error in /var/log/syslog for each of the etcd cluster nodes:
patroni[30360]: 2019-04-11 19:16:18,104 ERROR: Failed to get list of machines from https://172.16.45.247:2379/v2: MaxRetryError("HTTPSConnectionPool(host='172.16.45.247', port=2379): Max retries exceeded with url: /v2/machines (Caused by ProtocolError('Connection aborted.', PermissionError(13, 'Permission denied')))",)
Found the problem: the key file was not readable by the Patroni process. So, moral of the story: if you are sharing SSL/TLS certs and keys between two services that work together toward a common goal, make sure both have at least read access to the files.
SSL issues are a pain in the rear to troubleshoot; the error messages are far more deceiving than you would think.
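For anyone hitting the same thing, here is a minimal sketch of a readability check, assuming you run it as the same OS user that runs Patroni. The function name `unreadable_files` is hypothetical, and the certificate and key paths are whatever your own Patroni etcd settings point at; pass them on the command line.

```python
import os
import sys


def unreadable_files(paths):
    """Return the paths that the current process cannot read.

    os.access checks against the real uid/gid of this process, so the
    result is only meaningful when run as the service's own user.
    A path that does not exist is also reported as unreadable.
    """
    return [p for p in paths if not os.access(p, os.R_OK)]


if __name__ == "__main__":
    # Paths are supplied by the caller; nothing is assumed about
    # where the certs and keys live on a given system.
    for path in unreadable_files(sys.argv[1:]):
        print(f"not readable by uid {os.getuid()}: {path}")
```

Running it once as the etcd user and once as the Patroni user against the same cert and key files should show quickly whether only one of the two services can read them. Note that root can read files regardless of mode bits, so running the check as root will not reveal this problem.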