I tried to set up an etcd cluster with HTTPS-only addresses.
My etcd.service file looks like this:
[Unit]
Description=etcd key-value store
Documentation=https://github.com/coreos/etcd
[Service]
User=root
Type=notify
ExecStart=/usr/local/bin/etcd \
--name etcd0 \
--data-dir /var/lib/etcd \
--listen-peer-urls https://0.0.0.0:2380 \
--listen-client-urls https://0.0.0.0:2379 \
--advertise-client-urls https://172.30.102.53:2379,https://172.30.102.53:4001 \
--initial-advertise-peer-urls https://172.30.102.53:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcd0=https://172.30.102.53:2380,etcd1=https://172.30.102.39:2380,etcd2=https://172.30.102.57:2380 \
--initial-cluster-state existing \
--heartbeat-interval 1000 \
--election-timeout 5000 \
--peer-client-cert-auth \
--client-cert-auth \
--cert-file /etc/ssl/etcd/etcd.pem \
--peer-cert-file /etc/ssl/etcd/etcd.pem \
--peer-key-file /etc/ssl/etcd/etcd.key \
--key-file=/etc/ssl/etcd/etcd.key \
--trusted-ca-file=/etc/ssl/etcd/CA.pem \
--peer-trusted-ca-file=/etc/ssl/etcd/CA.pem
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
My output when starting the service is:
ClientTLS: cert = /etc/ssl/etcd/etcd.pem, key = /etc/ssl/etcd/etcd.key, ca = , trusted-ca = /etc/ssl/etcd/CA.pem, client-cert-auth = true, crl-file =
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48228" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48226" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44712" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44714" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48238" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48236" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44720" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44722" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48244" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48246" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44730" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44732" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44734" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48252" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48254" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44740" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44742" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48260" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48262" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44748" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44750" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48268" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.57:48270" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 14:35:40 etcd[19990]: rejected connection from "172.30.102.39:44756" (error "tls: first record does not look like a TLS handshake", ServerName "")
... and so on ...
Between those messages it also says:
```
publish error: etcdserver: request timed out
Sep 27 14:40:31 etcd[20359]: health check for peer 179451360aef27e2 could not connect: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x
Sep 27 14:40:31 etcd[20359]: health check for peer 4ee5b5f533b5a26e could not connect: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x
Sep 27 14:40:31 systemd[1]: etcd.service start operation timed out. Terminating.
Sep 27 14:40:31 systemd[1]: Failed to start etcd key-value store.
```
I already checked the certificates using "openssl s_client -showcerts -connect [IP]:2380 -cert etcd.pem -key etcd.key -CAfile CA.pem", which seems to be working fine.
I'm using the following extensions config:
```
[ ca ]
keyUsage = critical, cRLSign, keyCertSign
basicConstraints = CA:TRUE, pathlen:0
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer:always
[ server ]
keyUsage = critical,digitalSignature,keyEncipherment
extendedKeyUsage = serverAuth,clientAuth
basicConstraints = critical,CA:FALSE
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
```
I really can't tell what the problem is. Any ideas?
@albrr have you verified the configs for nodes 172.30.102.57 and 172.30.102.39? It seems like they are trying to connect via HTTP. Can you maybe attach the startup logs from one of those nodes?
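The two error messages in the logs above look like two sides of one scheme mismatch: a TLS listener that receives plaintext logs "first record does not look like a TLS handshake", while a plain-HTTP client that receives a TLS alert sees a "malformed HTTP response" starting with `\x15\x03\x01`. A minimal sketch that classifies such leading bytes is below; the function name `classify_first_bytes` is made up for illustration and is not part of etcd.

```shell
# classify_first_bytes: guess what kind of traffic starts with the given bytes.
# Argument: the leading bytes as space-separated lowercase hex, e.g. "15 03 01".
classify_first_bytes() {
    case "$1" in
        "16 03"*) echo "TLS handshake record" ;;   # content type 0x16
        "15 03"*) echo "TLS alert record" ;;       # content type 0x15
        "47 45 54"*|"50 4f 53 54"*|"48 54 54 50"*) # "GET", "POST", "HTTP"
            echo "plaintext HTTP" ;;
        *) echo "unknown" ;;
    esac
}

classify_first_bytes "15 03 01 00 02"   # bytes quoted in the health-check error
classify_first_bytes "47 45 54 20 2f"   # "GET /"
```

The first call reports a TLS alert record, which suggests the health check was made over http:// against a peer that listens with TLS.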
Which startup logs do you mean? The etcd logs in journalctl basically look the same on the other nodes.
Systemctl status looks like this on .57:
[root@abc ]# systemctl status etcd -l
● etcd.service - etcd key-value store
Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: activating (start) since Thu 2018-09-27 15:36:36 CEST; 58s ago
Docs: https://github.com/coreos/etcd
Main PID: 18162 (etcd)
CGroup: /system.slice/etcd.service
└─18162 /usr/local/bin/etcd --name etcd2 --data-dir /var/lib/etcd --listen-client-urls https://0.0.0.0:2379 --listen-peer-urls https://0.0.0.0:2380 --advertise-client-urls https://172.30.102.57:2379,https://172.30.102.57:4001 --initial-advertise-peer-urls=https://172.30.102.57:2380 --initial-cluster-token etcd-cluster-1 --initial-cluster etcd0=https://172.30.102.53:2380,etcd1=https://172.30.102.39:2380,etcd2=https://172.30.102.57:2380 --initial-cluster-state existing --heartbeat-interval 1000 --election-timeout 5000 --peer-client-cert-auth --client-cert-auth --cert-file /etc/ssl/etcd/etcd.pem --peer-cert-file /etc/ssl/etcd/etcd.pem --peer-key-file /etc/ssl/etcd/etcd.key --key-file=/etc/ssl/etcd/etcd.key --trusted-ca-file=/etc/ssl/etcd/CA.pem --peer-trusted-ca-file=/etc/ssl/etcd/CA.pem
and this on .39:
[root@def ]# systemctl status etcd -l
● etcd.service - etcd key-value store
Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: activating (start) since Thu 2018-09-27 15:39:59 CEST; 19s ago
Docs: https://github.com/coreos/etcd
Main PID: 16223 (etcd)
Tasks: 37
Memory: 21.1M
CGroup: /system.slice/etcd.service
└─16223 /usr/local/bin/etcd --name etcd1 --data-dir /var/lib/etcd --listen-client-urls https://0.0.0.0:2379 --listen-peer-urls https://0.0.0.0:2380 --advertise-client-urls https://172.30.102.39:2379,https://172.30.102.39:4001 --initial-advertise-peer-urls=https://172.30.102.39:2380 --initial-cluster-token etcd-cluster-1 --initial-cluster etcd0=https://172.30.102.53:2380,etcd1=https://172.30.102.39:2380,etcd2=https://172.30.102.57:2380 --initial-cluster-state existing --heartbeat-interval 1000 --election-timeout 5000 --peer-client-cert-auth --client-cert-auth --cert-file /etc/ssl/etcd/etcd.pem --peer-cert-file /etc/ssl/etcd/etcd.pem --peer-key-file /etc/ssl/etcd/etcd.key --key-file=/etc/ssl/etcd/etcd.key --trusted-ca-file=/etc/ssl/etcd/CA.pem --peer-trusted-ca-file=/etc/ssl/etcd/CA.pem
In /var/log/messages I found the following on .57:
Sep 27 15:43:18 systemd: etcd.service holdoff time over, scheduling restart.
Sep 27 15:43:18 systemd: Starting etcd key-value store...
Sep 27 15:43:18 etcd: etcd Version: 3.3.9
Sep 27 15:43:18 etcd: Git SHA: fca8add78
Sep 27 15:43:18 etcd: Go Version: go1.10.3
Sep 27 15:43:18 etcd: Go OS/Arch: linux/amd64
Sep 27 15:43:18 etcd: setting maximum number of CPUs to 4, total number of available CPUs is 4
Sep 27 15:43:18 etcd: the server is already initialized as member before, starting as etcd member...
Sep 27 15:43:18 etcd: peerTLS: cert = /etc/ssl/etcd/etcd.pem, key = /etc/ssl/etcd/etcd.key, ca = , trusted-ca = /etc/ssl/etcd/CA.pem, client-cert-auth = true, crl-file =
Sep 27 15:43:18 etcd: listening for peers on https://0.0.0.0:2380
Sep 27 15:43:18 etcd: listening for client requests on 0.0.0.0:2379
Sep 27 15:43:18 etcd: name = etcd2
Sep 27 15:43:18 etcd: data dir = /var/lib/etcd
Sep 27 15:43:18 etcd: member dir = /var/lib/etcd/member
Sep 27 15:43:18 etcd: heartbeat = 1000ms
Sep 27 15:43:18 etcd: election = 5000ms
Sep 27 15:43:18 etcd: snapshot count = 100000
Sep 27 15:43:18 etcd: advertise client URLs = https://172.30.102.57:2379,https://172.30.102.57:4001
Sep 27 15:43:18 etcd: restarting member 179451360aef27e2 in cluster 451f234b60b5c18 at commit index 60
Sep 27 15:43:18 etcd: 179451360aef27e2 became follower at term 103688
Sep 27 15:43:18 etcd: newRaft 179451360aef27e2 [peers: [], term: 103688, commit: 60, applied: 0, lastindex: 60, lastterm: 52567]
Sep 27 15:43:18 etcd: simple token is not cryptographically signed
Sep 27 15:43:18 etcd: starting server... [version: 3.3.9, cluster version: to_be_decided]
Sep 27 15:43:18 etcd: added member 179451360aef27e2 [http://172.30.102.57:2380] to cluster 451f234b60b5c18
Sep 27 15:43:18 etcd: added member 4ee5b5f533b5a26e [http://172.30.102.39:2380] to cluster 451f234b60b5c18
Sep 27 15:43:18 etcd: starting peer 4ee5b5f533b5a26e...
Sep 27 15:43:18 etcd: started HTTP pipelining with peer 4ee5b5f533b5a26e
Sep 27 15:43:18 etcd: ClientTLS: cert = /etc/ssl/etcd/etcd.pem, key = /etc/ssl/etcd/etcd.key, ca = , trusted-ca = /etc/ssl/etcd/CA.pem, client-cert-auth = true, crl-file =
Sep 27 15:43:18 etcd: started streaming with peer 4ee5b5f533b5a26e (writer)
Sep 27 15:43:18 etcd: started peer 4ee5b5f533b5a26e
Sep 27 15:43:18 etcd: added peer 4ee5b5f533b5a26e
Sep 27 15:43:18 etcd: added member c891f993c003e831 [http://172.30.102.53:2380] to cluster 451f234b60b5c18
Sep 27 15:43:18 etcd: starting peer c891f993c003e831...
Sep 27 15:43:18 etcd: started HTTP pipelining with peer c891f993c003e831
Sep 27 15:43:18 etcd: started peer c891f993c003e831
Sep 27 15:43:18 etcd: added peer c891f993c003e831
Sep 27 15:43:18 etcd: set the initial cluster version to 3.0
Sep 27 15:43:18 etcd: enabled capabilities for version 3.0
Sep 27 15:43:18 etcd: updated the cluster version from 3.0 to 3.3
Sep 27 15:43:18 etcd: enabled capabilities for version 3.3
Sep 27 15:43:18 etcd: started streaming with peer 4ee5b5f533b5a26e (writer)
Sep 27 15:43:18 etcd: started streaming with peer 4ee5b5f533b5a26e (stream MsgApp v2 reader)
Sep 27 15:43:18 etcd: started streaming with peer 4ee5b5f533b5a26e (stream Message reader)
Sep 27 15:43:18 etcd: started streaming with peer c891f993c003e831 (stream MsgApp v2 reader)
Sep 27 15:43:18 etcd: started streaming with peer c891f993c003e831 (writer)
Sep 27 15:43:18 etcd: started streaming with peer c891f993c003e831 (writer)
Sep 27 15:43:18 etcd: started streaming with peer c891f993c003e831 (stream Message reader)
Sep 27 15:43:18 etcd: rejected connection from "172.30.102.53:39760" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 15:43:18 etcd: rejected connection from "172.30.102.53:39762" (error "tls: first record does not look like a TLS handshake", ServerName "")
Sep 27 15:43:18 etcd: rejected connection from "172.30.102.53:39770" (error "tls: first record does not look like a TLS handshake", ServerName "")
added member 179451360aef27e2 [http://172.30.102.57:2380] to cluster
Seems like the core of the issue is that the cluster was bootstrapped incorrectly?
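The "added member" lines show the peer URLs that the member has stored in its data dir, which can differ from the flags in the unit file. A rough sketch for pulling those URLs out of a log and flagging the plaintext ones follows; the function names are invented here, and the `sed` pattern assumes the etcd 3.3 log wording shown in this thread.

```shell
# member_peer_urls: read etcd log lines on stdin and print
# "<member-id> <peer-url>" for every "added member" line.
member_peer_urls() {
    sed -n 's/.*added member \([0-9a-f]*\) \[\([^]]*\)\].*/\1 \2/p'
}

# plaintext_members: keep only members whose stored peer URL is http://.
plaintext_members() {
    member_peer_urls | grep ' http://' || true
}

# Example: journalctl -u etcd --no-pager | plaintext_members
```

Any output from `plaintext_members` means the stored membership still advertises http:// peer URLs, regardless of what the current flags say.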
What do you mean by "bootstrapped incorrectly"? I don't really know how to fix it. All the setup flags use https addresses, but for some reason http is used.
Although the configs look right now, this line
Sep 27 15:43:18 etcd: the server is already initialized as member before, starting as etcd member...
tells me that the cluster was previously bootstrapped and could have had a misconfiguration leading to this issue.
The simple fix, if this is a test cluster, is to remove the data-dir on each member and bootstrap again with correct configs. But I am curious: what is the output of this command?
ETCDCTL_API=3 etcdctl member list -w table
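The data-dir reset suggested above could be done a little more cautiously by moving the directory aside instead of deleting it. This is only a sketch: `backup_data_dir` is a hypothetical helper, and the path `/var/lib/etcd` is taken from the unit file earlier in the thread.

```shell
# backup_data_dir: move an etcd data directory aside instead of deleting it.
# $1 = data dir; prints the path of the backup on success.
backup_data_dir() {
    dir=${1:-}
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "no such data dir: $dir" >&2
        return 1
    fi
    backup="${dir%/}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "$backup"
}

# Hypothetical usage on each member, as root:
#   systemctl stop etcd
#   backup_data_dir /var/lib/etcd
#   systemctl start etcd
```

Note that etcd's clustering guide uses `--initial-cluster-state new` when bootstrapping a fresh static cluster, while the unit file in this thread sets `existing`, so that flag may also need a look before restarting.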
The -w flag didn't work, but I got this:
# ETCDTL_API=3 etcdctl member list
Error: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: connect: connection refused
; error #1: dial tcp 127.0.0.1:2379: connect: connection refused
I also tried this:
# ETCDTL_API=3 etcdctl --cert-file etcd.pem --key-file etcd.key --ca-file CA.pem --endpoints "https://0.0.0.0:2379" member list
client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint https://0.0.0.0:2379 exceeded header timeout
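One possible reason the -w flag was rejected: the commands above set `ETCDTL_API` rather than `ETCDCTL_API`, so etcdctl 3.3 would have fallen back to its default v2 API, which also explains the v2-style `--cert-file`/`--key-file`/`--ca-file` flags and the 127.0.0.1:4001 default endpoint. A sketch of a v3 wrapper is below; the wrapper name `etcd3` and the `ETCD_ENDPOINT` variable are invented for this example, and the default endpoint is an assumption (it should be an address the server certificate is actually valid for, not 0.0.0.0).

```shell
# etcd3: run etcdctl with the v3 API and TLS client credentials.
# Certificate paths are the ones from the unit file in this thread.
etcd3() {
    ETCDCTL_API=3 etcdctl \
        --endpoints "${ETCD_ENDPOINT:-https://172.30.102.53:2379}" \
        --cacert /etc/ssl/etcd/CA.pem \
        --cert /etc/ssl/etcd/etcd.pem \
        --key /etc/ssl/etcd/etcd.key \
        "$@"
}

# Usage: etcd3 member list -w table
```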
I deleted the data dir, but now I get this:
Sep 27 16:24:56 etcd[32979]: rejected connection from "172.30.102.57:53612" (error "remote error: tls: bad certificate", ServerName "")
Sep 27 16:24:56 etcd[32979]: rejected connection from "172.30.102.57:53614" (error "remote error: tls: bad certificate", ServerName "")
Sep 27 16:24:56 etcd[32979]: health check for peer 31af7f57a018d679 could not connect: x509: cannot validate certificate for 172.30.102.57 because it doesn't contain any IP SANs
Sep 27 16:24:56 etcd[32979]: health check for peer 5447f25780dc5828 could not connect: dial tcp 172.30.102.39:2380: connect: connection refused
Sep 27 16:24:56 etcd[32979]: rejected connection from "172.30.102.57:53620" (error "remote error: tls: bad certificate", ServerName "")
Sep 27 16:24:56 etcd[32979]: rejected connection from "172.30.102.57:53622" (error "remote error: tls: bad certificate", ServerName "")
Sep 27 16:24:56 etcd[32979]: health check for peer 31af7f57a018d679 could not connect: x509: cannot validate certificate for 172.30.102.57 because it doesn't contain any IP SANs
Does the SAN have the proper IPs?
openssl x509 -in /etc/ssl/etcd/etcd.pem -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
c7:a4:e0:74:4d:46:f3:1d
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN=myhost.com
Validity
Not Before: Sep 27 14:47:20 2018 GMT
Not After : Sep 24 14:47:20 2028 GMT
Subject: CN=*.myhost.com
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:ac:01:b2:c0:aa:7b:78:fe:a9:af:e1:fa:59:f7:
c1:ff:f1:0c:fd:ab:0f:94:c4:d2:a1:ec:ec:63:68:
8f:5d:46:31:43:bc:62:44:d4:b3:b8:e1:9b:10:54:
30:49:e8:78:94:6d:93:dc:74:ab:c4:16:10:e3:0a:
6a:fa:3f:d3:7c:74:6e:2e:be:a3:ea:cd:e1:c3:fb:
0d:56:3d:05:b4:a7:d8:15:66:d7:3a:c3:b4:fc:d9:
47:49:04:56:47:9f:4f:1c:5c:8b:d6:c0:e2:00:ee:
f5:58:ac:b2:82:38:d0:26:58:d5:36:52:c3:17:85:
aa:e5:e4:97:29:3d:cf:c6:4b:78:d5:4f:47:cb:05:
24:fe:06:da:9b:20:e8:94:44:19:d8:9b:52:a6:f6:
ba:15:96:3c:61:98:47:85:bd:ea:31:3a:af:14:91:
c9:ca:5c:d8:44:ad:1b:fd:a2:5d:87:57:01:83:b4:
ca:83:5d:8e:55:22:53:c5:4c:01:a0:05:8b:ed:0a:
e2:81:46:16:7f:7e:f8:86:2e:85:57:d2:f8:45:55:
27:d0:2d:94:94:41:d3:7c:b9:bb:8b:20:45:39:74:
ed:1b:08:70:12:4c:ad:a8:18:d3:ce:1a:68:59:24:
f9:1b:b1:94:8e:c8:56:6f:6b:ad:1b:1d:0e:17:e8:
80:5d
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Key Identifier:
96:55:4D:07:D1:63:9D:B7:A5:49:84:F5:18:91:FB:57:F3:C3:44:64
X509v3 Authority Key Identifier:
keyid:95:0D:CC:96:A0:CB:2F:C8:0A:1F:07:12:9A:DE:D9:25:C1:3A:C8:C4
DirName:/CN=myhost.com
serial:D0:CE:3E:FC:36:0D:A8:2A
Signature Algorithm: sha256WithRSAEncryption
53:2a:13:1b:bd:3b:6c:78:a3:4c:bd:f5:4e:6f:c6:23:62:05:
32:26:4c:1d:f3:28:1e:29:64:00:ad:40:7e:e6:95:88:85:df:
11:f3:6b:f5:45:4e:2f:b5:f8:87:db:7d:2f:6f:a4:33:fb:8a:
c6:e8:c8:35:e6:f1:db:02:e5:55:6b:4b:ee:28:18:9f:c0:10:
71:33:c6:6d:6c:21:ba:62:08:54:a9:8b:26:33:6e:69:e7:55:
b5:7a:12:37:69:d6:4c:74:d2:d3:f5:af:99:28:7f:f7:e5:44:
79:29:69:74:2f:e1:d9:30:cb:b2:4a:c7:06:f5:2f:b0:f7:4f:
df:56:20:87:cf:df:f8:75:5d:22:7c:3b:cb:fe:99:6c:7c:ff:
1b:b3:54:a5:73:2d:44:44:81:73:b4:8e:22:cf:98:0a:b9:0d:
5a:0f:81:c3:db:1b:73:01:19:54:be:84:ca:fb:6b:35:a9:fc:
57:27:60:bb:90:81:dd:07:98:1e:24:9c:c8:17:36:a2:62:e3:
98:70:c7:ab:86:0b:5e:4d:66:66:31:50:5e:6a:c3:6e:06:06:
28:04:c5:a2:e3:7f:85:67:e7:98:17:07:25:6b:a8:cf:37:2a:
7d:4e:a9:7a:c6:2e:af:a6:1d:55:bb:c8:4d:76:85:b3:c5:91:
c8:42:5c:f5
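The dump above has no "X509v3 Subject Alternative Name" section at all, which matches the "doesn't contain any IP SANs" error. Rather than reading the whole dump by eye, the SAN entries can be filtered out of the text output. The helper names below are made up for this sketch, and the parsing assumes the usual `openssl x509 -text` layout where the SAN values sit on the line after the extension name.

```shell
# san_entries: read `openssl x509 -text -noout` output on stdin and print each
# Subject Alternative Name entry (e.g. "IP Address:172.30.102.57") on its own line.
san_entries() {
    sed -n '/X509v3 Subject Alternative Name/{n;p;}' | tr ',' '\n' | sed 's/^ *//'
}

# has_ip_san: succeed if the certificate text on stdin lists IP $1 as a SAN.
has_ip_san() {
    san_entries | grep -qxF "IP Address:$1"
}

# Example:
#   openssl x509 -in /etc/ssl/etcd/etcd.pem -text -noout | has_ip_san 172.30.102.57 \
#       || echo "172.30.102.57 is not in the SAN list"
```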
Also, I added the IPs and node names as extension
I found the problem: the certificate extensions (key usage, extended key usage and subject alternative names) were not in the certificate. After adding them it worked fine.
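The thread does not show the exact commands that were used, so the following is only one way this might be done with OpenSSL: put a `subjectAltName` line into the `[ server ]` section and pass the file with `-extfile`/`-extensions` when signing, since `openssl x509 -req` does not add extensions on its own. The file names `CA.key`, `etcd.csr` and `etcd-ext.cnf` and both function names are assumptions for this sketch.

```shell
# write_ext_file: write an OpenSSL extensions file with a [ server ] section
# that includes a subjectAltName. $1 = output path, $2 = SAN list.
write_ext_file() {
    cat > "$1" <<EOF
[ server ]
keyUsage = critical,digitalSignature,keyEncipherment
extendedKeyUsage = serverAuth,clientAuth
basicConstraints = critical,CA:FALSE
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
subjectAltName = $2
EOF
}

# sign_with_extensions: sign a CSR so the extensions end up in the certificate.
# CA.key and etcd.csr are assumed file names, not taken from the thread.
sign_with_extensions() {
    openssl x509 -req -in etcd.csr \
        -CA CA.pem -CAkey CA.key -CAcreateserial \
        -out etcd.pem -days 3650 -sha256 \
        -extfile "$1" -extensions server
}

# Example:
#   write_ext_file etcd-ext.cnf "IP:172.30.102.53,IP:172.30.102.39,IP:172.30.102.57,DNS:etcd0"
#   sign_with_extensions etcd-ext.cnf
```

Afterwards, `openssl x509 -in etcd.pem -text -noout` should list the key usages and a Subject Alternative Name section.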
"Also, I added the IPs and node names as extension": how did you add them?
I have the same problem!
"I found the problem: the certificate extensions (key usage, extended key usage and subject alternative names) were not in the certificate. After adding them it worked fine."
Could you show me how to add them in detail? Thanks!
How can I add them (key usage, extended key usage and subject alternative names) using cfssl? Could you show me?
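Nobody in the thread answered the cfssl question, so this is a hedged sketch rather than the original poster's method. With cfssl, the key usages come from a signing profile and the SANs from the `-hostname` list (or the `hosts` field of the CSR JSON). The profile name `peer`, the file names `ca.pem`, `ca-key.pem` and `csr.json`, and both function names are assumptions.

```shell
# write_cfssl_config: write a cfssl signing config with one profile, "peer",
# that carries server and client auth usages. $1 = output path.
write_cfssl_config() {
    cat > "$1" <<'EOF'
{
  "signing": {
    "default": { "expiry": "87600h" },
    "profiles": {
      "peer": {
        "expiry": "87600h",
        "usages": ["signing", "key encipherment", "server auth", "client auth"]
      }
    }
  }
}
EOF
}

# issue_cert: request a certificate whose SANs are the comma-separated list in $2.
# ca.pem, ca-key.pem and csr.json are assumed file names.
issue_cert() {
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
        -config="$1" -profile=peer \
        -hostname="$2" csr.json | cfssljson -bare etcd
}

# Example:
#   write_cfssl_config ca-config.json
#   issue_cert ca-config.json "172.30.102.53,172.30.102.39,172.30.102.57,etcd0"
```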
@Issac-ZY