RKE version:
0.2.2 and 0.1.17. I started out with 0.2.2, which failed; I saw that some people had it working with 0.1.17, so I tried that as well.
Docker version: (docker version,docker info preferred)
Containers: 5
Running: 3
Paused: 0
Stopped: 2
Images: 3
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: runc docker-runc
Default Runtime: docker-runc
Init Binary: /usr/libexec/docker/docker-init-current
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: df5c38a9167e87f53a9894d77c0950e178a745e7 (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: fec3683b971d9c3ef73f284f176672c44b448662 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
WARNING: You're not using the default seccomp profile
Profile: /etc/docker/seccomp.json
Kernel Version: 3.10.0-957.12.2.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 4
Total Memory: 15.51 GiB
Name: xxx
ID: xxx
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.access.redhat.com/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Registries: registry.access.redhat.com (secure), docker.io (secure), registry.fedoraproject.org (secure), quay.io (secure), registry.centos.org (secure), docker.io (secure)
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Red Hat Enterprise Linux Server 7.6 (Maipo) 3.10.0-957.12.2.el7.x86_64
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
VMware virtual machine
cluster.yml file:
nodes:
*server names changed from real
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
Steps to Reproduce:
Build 3 fresh RHEL 7.6 servers and follow the setup instructions from the Rancher HA install.
Note: between attempts the machines were entirely rebuilt and the config applied from scratch.
rke up
Results:
on 0.1.17
FATA[0259] [controlPlane] Failed to bring up Control Plane: Failed to verify healthcheck: Failed to check https://localhost:6443/healthz for service [kube-apiserver] on host [xxx]: Get https://localhost:6443/healthz: EOF, log:
on 0.2.2
FATA[0212] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
In both cases, this error appears in the etcd container logs:
"transport: authentication handshake failed: remote error: tls: bad certificate"
Running openssl on the certs:
openssl x509 -in kube-etcd-XXX.pem -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 1797566144391285364 (0x18f23ce268b60a74)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN=kube-ca
Validity
Not Before: May 31 12:21:33 2019 GMT
Not After : May 30 12:21:34 2020 GMT
Subject: CN=kube-etcd
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
00:cc:02:a6:65:dc:c4:e5:1c:4a:8f:27:79:b9:d1:
46:42:95:6d:64:28:8b:6e:c8:72:25:bc:21:e9:0f:
45:c2:66:ab:d7:3d:c4:67:0e:55:46:4a:f6:7b:19:
6d:aa:35:0d:f6:9c:ad:b8:8f:2d:c5:e6:07:b8:52:
b7:7a:ff:b5:a5:4e:4a:98:fa:b7:c3:20:ea:fd:09:
04:19:d9:7b:4f:17:a0:05:21:96:3f:eb:24:10:53:
cf:59:99:80:f9:59:be:f9:45:51:fb:7f:12:c8:7d:
ae:c7:30:64:a0:13:32:62:cc:c4:98:72:5e:66:66:
bf:66:da:d0:5f:cb:4f:ef:59:f8:4f:37:9e:42:75:
99:55:2c:cc:2c:15:ff:b5:d9:97:cd:93:d9:e6:55:
44:d8:ac:03:3a:ce:28:c4:2a:af:93:96:58:cb:18:
a8:dd:8f:5e:ee:80:8d:b9:ed:cc:01:25:53:bd:96:
80:6e:69:20:4c:5a:66:0a:b2:5b:c5:6d:3e:72:2a:
e4:68:b8:db:b5:ed:27:24:c8:5c:d3:07:ba:a9:d8:
1d:f4:20:c2:05:97:02:05:85:b7:02:52:fa:69:3b:
38:4d:3e:c3:51:ee:53:31:a0:19:6b:38:00:34:b1:
ee:06:d6:36:88:86:6b:e8:48:f7:d5:d7:17:94:03:
4a:79
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Client Authentication, TLS Web Server Authentication
X509v3 Subject Alternative Name:
DNS:xxx, DNS:localhost, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:10.43.0.1
Signature Algorithm: sha256WithRSAEncryption
25:ba:46:e6:36:2d:74:97:95:ca:9c:e0:bd:92:44:f2:a5:8f:
f6:45:94:52:77:d7:ae:39:5b:3e:95:30:90:f1:bd:07:4d:db:
16:44:fc:4d:44:b6:33:c9:0e:72:27:65:33:4b:57:82:e3:1e:
94:6d:5e:65:90:a4:f4:0b:18:5f:39:82:b8:5f:d5:c7:6a:00:
bc:02:25:0f:68:5e:19:cd:65:83:f8:d0:6f:a6:d3:06:3a:f0:
75:1b:81:4c:36:ff:1a:91:2e:d6:e5:35:94:7c:3c:cd:f7:04:
e4:32:d4:d1:56:57:c1:39:e0:93:f9:9a:69:36:8d:39:60:b2:
0f:ef:b2:73:61:8e:66:45:39:a4:91:1b:6e:df:73:04:60:36:
5f:b5:19:a4:32:1b:1c:62:07:e8:b6:24:5d:68:7c:a2:57:6e:
e4:0d:d5:2a:1c:92:6a:93:4b:60:6a:3e:39:40:5b:56:0a:80:
59:7b:d3:6e:50:b4:bf:ff:5d:f0:36:0d:93:a0:6c:c5:a2:4f:
4c:3b:4e:fe:30:44:14:d8:d5:63:0d:54:c5:66:36:33:06:ae:
88:c4:c2:81:97:8b:c2:63:f8:ef:9d:f2:35:11:55:73:92:47:
ad:0b:11:c6:06:31:9b:ba:95:36:62:80:89:1c:ef:ea:87:f8:
9f:66:42:2a
OK, managed to fix this. For anyone else as silly as me: if you run openssl against your etcd URL, it will tell you what is wrong with the certs.
openssl s_client -showcerts -connect your_servername:2379
In my case, the clock on the machine I was running rke from was 5 hours out, so it was creating certificates that only became valid 5 hours in the future, meaning they were 8 hours out from reality.
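A rough way to spot this kind of clock problem is to compare a certificate's Not Before date with the local clock. The sketch below assumes GNU date (for `-d`) and openssl are available; the function names are hypothetical and are not part of rke.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (-d) and openssl. Function names are made up.

# Convert an openssl-style date ("May 31 12:21:33 2019 GMT") to epoch seconds.
to_epoch() {
  date -u -d "$1" +%s
}

# Print the seconds remaining until a notBefore date, relative to a given
# "now" in epoch seconds. Zero or negative means the window has already opened.
seconds_until_valid() {
  echo $(( $(to_epoch "$1") - $2 ))
}

# Read notBefore from a PEM file and compare it with this machine's clock.
check_not_before() {
  not_before=$(openssl x509 -in "$1" -noout -startdate | cut -d= -f2)
  remaining=$(seconds_until_valid "$not_before" "$(date -u +%s)")
  if [ "$remaining" -gt 0 ]; then
    echo "not valid for another ${remaining}s (notBefore: $not_before)"
    return 1
  fi
  echo "notBefore is in the past: $not_before"
}
```

Running `check_not_before kube-etcd-XXX.pem` on each node would show whether that node's clock is still ahead of or behind the certificate's start time. It only checks the start of the validity window, not expiry.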
Thanks Google for landing me here!
rke cert rotate did the job for me.
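Since the root cause in this thread was a wrong clock, it may be worth checking clock agreement between the rke workstation and the nodes before regenerating or rotating certificates. This is a minimal sketch, assuming passwordless ssh to each node; the host names, the threshold and the function names are placeholders.

```shell
#!/bin/sh
# Sketch only: assumes ssh access to every node. Names are hypothetical.

# Absolute difference between two epoch timestamps, in seconds.
abs_diff() {
  d=$(( $1 - $2 ))
  if [ "$d" -lt 0 ]; then
    d=$(( 0 - d ))
  fi
  echo "$d"
}

# Compare the local clock with each host's clock and flag large differences.
# Usage: check_skew MAX_SECONDS host1 [host2 ...]
check_skew() {
  max="$1"
  shift
  for host in "$@"; do
    # ssh latency is included in the measurement, so keep MAX_SECONDS generous.
    remote=$(ssh "$host" date -u +%s) || { echo "$host: unreachable"; continue; }
    skew=$(abs_diff "$(date -u +%s)" "$remote")
    if [ "$skew" -gt "$max" ]; then
      echo "$host: clock differs by ${skew}s"
    else
      echo "$host: ok (${skew}s)"
    fi
  done
}
```

For example, `check_skew 5 node1 node2 node3` would report any node more than 5 seconds away from the local clock. Once the clocks agree, the certificates can be regenerated with whichever rke command applies to your version.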