RKE version:
# ./rke --version
rke version v0.1.6
Docker version: (docker version,docker info preferred)
# docker version
Client:
Version: 17.03.2-ce
API version: 1.27
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 03:35:14 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.2-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 03:35:14 2017
OS/Arch: linux/amd64
Experimental: false
# docker info
Containers: 33
Running: 19
Paused: 0
Stopped: 14
Images: 14
Server Version: 17.03.2-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 143
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-116-generic
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.859 GiB
Name: t-cloudmaster01i
ID: SL3S:QEIS:NMDH:2ANI:KAIO:HDKO:BD3L:KYQ3:RJFW:VHAX:4NSF:SIIW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://sproxy01.cardservices.no:3128
Https Proxy: http://sproxy01.cardservices.no:3128
No Proxy: localhost,127.0.0.0/8,*.cardservices.no,*.kontorlan.tag.no
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.4 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
# uname -r
4.4.0-116-generic
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
VMWare 6.5 Virtual Machine
cluster.yml file:
# cat cluster.yml
# If you intened to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: 10.10.7.61
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: root
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.10.7.62
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: root
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
- address: 10.10.7.63
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: root
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  labels: {}
services:
  etcd:
    image: rancher/coreos-etcd:v3.1.12
    extra_args: {}
    extra_binds: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
  kube-api:
    image: rancher/hyperkube:v1.10.1-rancher2
    extra_args: {}
    extra_binds: []
    service_cluster_ip_range: 10.43.0.0/16
    pod_security_policy: false
  kube-controller:
    image: rancher/hyperkube:v1.10.1-rancher2
    extra_args: {}
    extra_binds: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: rancher/hyperkube:v1.10.1-rancher2
    extra_args: {}
    extra_binds: []
  kubelet:
    image: rancher/hyperkube:v1.10.1-rancher2
    extra_args: {}
    extra_binds: []
    cluster_domain: cluster.local
    infra_container_image: rancher/pause-amd64:3.1
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
  kubeproxy:
    image: rancher/hyperkube:v1.10.1-rancher2
    extra_args: {}
    extra_binds: []
network:
  plugin: canal
  options: {}
authentication:
  strategy: x509
  options: {}
  sans: []
addons: ""
addons_include: []
system_images:
  etcd: ""
  alpine: ""
  nginx_proxy: ""
  cert_downloader: ""
  kubernetes_services_sidecar: ""
  kubedns: ""
  dnsmasq: ""
  kubedns_sidecar: ""
  kubedns_autoscaler: ""
  kubernetes: ""
  flannel: ""
  flannel_cni: ""
  calico_node: ""
  calico_cni: ""
  calico_controllers: ""
  calico_ctl: ""
  canal_node: ""
  canal_cni: ""
  canal_flannel: ""
  wave_node: ""
  weave_cni: ""
  pod_infra_container: ""
  ingress: ""
  ingress_backend: ""
ssh_key_path: ~/.ssh/id_rsa
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
cluster_name: ""
cloud_provider:
  name: ""
  cloud_config: {}
prefix_path: ""
Steps to Reproduce:
./rke up
Results:
# ./rke up
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [10.10.7.61]
INFO[0000] [dialer] Setup tunnel for host [10.10.7.62]
INFO[0000] [dialer] Setup tunnel for host [10.10.7.63]
INFO[0000] [network] Deploying port listener containers
INFO[0000] [network] Successfully started [rke-etcd-port-listener] container on host [10.10.7.62]
INFO[0000] [network] Successfully started [rke-etcd-port-listener] container on host [10.10.7.63]
INFO[0000] [network] Successfully started [rke-etcd-port-listener] container on host [10.10.7.61]
INFO[0001] [network] Successfully started [rke-cp-port-listener] container on host [10.10.7.63]
INFO[0001] [network] Successfully started [rke-cp-port-listener] container on host [10.10.7.62]
INFO[0001] [network] Successfully started [rke-cp-port-listener] container on host [10.10.7.61]
INFO[0001] [network] Port listener containers deployed successfully
INFO[0001] [network] Running etcd <-> etcd port checks
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.62]
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.63]
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.61]
INFO[0001] [network] Running control plane -> etcd port checks
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.62]
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.63]
INFO[0001] [network] Successfully started [rke-port-checker] container on host [10.10.7.61]
INFO[0002] [network] Running control plane -> worker port checks
INFO[0002] [network] Successfully started [rke-port-checker] container on host [10.10.7.63]
INFO[0002] [network] Successfully started [rke-port-checker] container on host [10.10.7.62]
INFO[0002] [network] Successfully started [rke-port-checker] container on host [10.10.7.61]
INFO[0002] [network] Running workers -> control plane port checks
INFO[0002] [network] Checking KubeAPI port Control Plane hosts
INFO[0002] [network] Removing port listener containers
INFO[0002] [remove/rke-etcd-port-listener] Successfully removed container on host [10.10.7.63]
INFO[0002] [remove/rke-etcd-port-listener] Successfully removed container on host [10.10.7.62]
INFO[0002] [remove/rke-etcd-port-listener] Successfully removed container on host [10.10.7.61]
INFO[0002] [remove/rke-cp-port-listener] Successfully removed container on host [10.10.7.62]
INFO[0002] [remove/rke-cp-port-listener] Successfully removed container on host [10.10.7.61]
INFO[0002] [remove/rke-cp-port-listener] Successfully removed container on host [10.10.7.63]
INFO[0002] [network] Port listener containers removed successfully
INFO[0002] [certificates] Attempting to recover certificates from backup on [etcd] hosts
INFO[0003] [certificates] Successfully started [cert-fetcher] container on host [10.10.7.61]
INFO[0003] [certificates] No Certificate backup found on [etcd] hosts
INFO[0003] [certificates] Generating CA kubernetes certificates
INFO[0004] [certificates] Generating Kubernetes API server certificates
INFO[0004] [certificates] Generating Kube Controller certificates
INFO[0005] [certificates] Generating Kube Scheduler certificates
INFO[0005] [certificates] Generating Kube Proxy certificates
INFO[0006] [certificates] Generating Node certificate
INFO[0006] [certificates] Generating admin certificates and kubeconfig
INFO[0006] [certificates] Generating etcd-10.10.7.61 certificate and key
INFO[0007] [certificates] Generating etcd-10.10.7.62 certificate and key
INFO[0007] [certificates] Generating etcd-10.10.7.63 certificate and key
INFO[0007] [certificates] Temporarily saving certs to [etcd] hosts
INFO[0013] [certificates] Saved certs to [etcd] hosts
INFO[0013] [reconcile] Reconciling cluster state
INFO[0013] [reconcile] This is newly generated cluster
INFO[0013] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0018] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0018] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0018] Pre-pulling kubernetes images
INFO[0018] Kubernetes images pulled successfully
INFO[0018] [etcd] Building up etcd plane..
INFO[0018] [etcd] Successfully started [etcd] container on host [10.10.7.61]
INFO[0019] [etcd] Successfully started [rke-log-linker] container on host [10.10.7.61]
INFO[0019] [remove/rke-log-linker] Successfully removed container on host [10.10.7.61]
INFO[0019] [etcd] Successfully started [etcd] container on host [10.10.7.62]
INFO[0019] [etcd] Successfully started [rke-log-linker] container on host [10.10.7.62]
INFO[0019] [remove/rke-log-linker] Successfully removed container on host [10.10.7.62]
INFO[0020] [etcd] Successfully started [etcd] container on host [10.10.7.63]
INFO[0020] [etcd] Successfully started [rke-log-linker] container on host [10.10.7.63]
INFO[0020] [remove/rke-log-linker] Successfully removed container on host [10.10.7.63]
INFO[0020] [etcd] Successfully started etcd plane..
INFO[0020] [controlplane] Building up Controller Plane..
INFO[0020] [controlplane] Successfully started [kube-apiserver] container on host [10.10.7.62]
INFO[0020] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [10.10.7.62]
INFO[0020] [controlplane] Successfully started [kube-apiserver] container on host [10.10.7.63]
INFO[0020] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [10.10.7.63]
INFO[0020] [controlplane] Successfully started [kube-apiserver] container on host [10.10.7.61]
INFO[0020] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [10.10.7.61]
INFO[0032] [healthcheck] service [kube-apiserver] on host [10.10.7.62] is healthy
INFO[0032] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.62]
INFO[0032] [remove/rke-log-linker] Successfully removed container on host [10.10.7.62]
INFO[0032] [healthcheck] service [kube-apiserver] on host [10.10.7.63] is healthy
INFO[0033] [controlplane] Successfully started [kube-controller-manager] container on host [10.10.7.62]
INFO[0033] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [10.10.7.62]
INFO[0033] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.63]
INFO[0033] [healthcheck] service [kube-apiserver] on host [10.10.7.61] is healthy
INFO[0033] [remove/rke-log-linker] Successfully removed container on host [10.10.7.63]
INFO[0033] [controlplane] Successfully started [kube-controller-manager] container on host [10.10.7.63]
INFO[0033] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [10.10.7.63]
INFO[0033] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.61]
INFO[0034] [remove/rke-log-linker] Successfully removed container on host [10.10.7.61]
INFO[0034] [controlplane] Successfully started [kube-controller-manager] container on host [10.10.7.61]
INFO[0034] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [10.10.7.61]
INFO[0038] [healthcheck] service [kube-controller-manager] on host [10.10.7.62] is healthy
INFO[0038] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.62]
INFO[0038] [remove/rke-log-linker] Successfully removed container on host [10.10.7.62]
INFO[0038] [healthcheck] service [kube-controller-manager] on host [10.10.7.63] is healthy
INFO[0039] [controlplane] Successfully started [kube-scheduler] container on host [10.10.7.62]
INFO[0039] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [10.10.7.62]
INFO[0039] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.63]
INFO[0039] [remove/rke-log-linker] Successfully removed container on host [10.10.7.63]
INFO[0039] [controlplane] Successfully started [kube-scheduler] container on host [10.10.7.63]
INFO[0039] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [10.10.7.63]
INFO[0039] [healthcheck] service [kube-controller-manager] on host [10.10.7.61] is healthy
INFO[0040] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.61]
INFO[0040] [remove/rke-log-linker] Successfully removed container on host [10.10.7.61]
INFO[0040] [controlplane] Successfully started [kube-scheduler] container on host [10.10.7.61]
INFO[0040] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [10.10.7.61]
INFO[0044] [healthcheck] service [kube-scheduler] on host [10.10.7.62] is healthy
INFO[0044] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.62]
INFO[0044] [remove/rke-log-linker] Successfully removed container on host [10.10.7.62]
INFO[0044] [healthcheck] service [kube-scheduler] on host [10.10.7.63] is healthy
INFO[0045] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.63]
INFO[0045] [remove/rke-log-linker] Successfully removed container on host [10.10.7.63]
INFO[0045] [healthcheck] service [kube-scheduler] on host [10.10.7.61] is healthy
INFO[0046] [controlplane] Successfully started [rke-log-linker] container on host [10.10.7.61]
INFO[0046] [remove/rke-log-linker] Successfully removed container on host [10.10.7.61]
INFO[0046] [controlplane] Successfully started Controller Plane..
INFO[0046] [authz] Creating rke-job-deployer ServiceAccount
FATA[0071] Failed to apply the ServiceAccount needed for job execution: Post https://10.10.7.61:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings: Forbidden
I got the same problem. I ran rke remove and then rke up, and the problem is still there.
rke version v0.1.6
Docker version 1.12.6, build 78d1802
Red Hat Enterprise Linux Server release 7.4 (Maipo)
cluster.yml
nodes:
error from rke
FATA[0069] Failed to apply the ServiceAccount needed for job execution: Post https://myhost:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings: Tunnel or SSL Forbidden
log from kube-apiserver:
I0506 10:20:36.198703 1 logs.go:49] http: TLS handshake error from 10.138.111.110:52822: EOF
With rke version v0.1.5 it works.
Case 1) rke v0.1.6 with external etcd, using one signed etcd certificate that covers all three IP addresses
rke v0.1.6 didn't work
rke log
FATA[0068] [controlPlane] Failed to bring up Control Plane: Failed to verify healthcheck: Service [kube-apiserver] is not healthy on host [ultdkr016]. Response code: [403], response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:anonymous\" cannot get path \"/healthz\"","reason":"Forbidden","details":{},"code":403}
kube-apiserver logs
I0507 04:57:36.191753 1 logs.go:49] http: TLS handshake error from 10.138.111.110:36806: remote error: tls: bad certificate
Case 2) rke v0.1.6 without external etcd (etcd defined in cluster.yml)
rke v0.1.6 didn't work, but adding all three IP addresses to Docker's no_proxy setting fixes the problem.
rke v0.1.5 does not seem to need the additional IP addresses or hosts in Docker's no_proxy setting...
Case 3) rke v0.1.5 with external etcd
rke log
DEBU[0043] [healthcheck] Service [kube-apiserver] is not healthy on host [ultdkr016]. Response code: [403], response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:anonymous\" cannot get path \"/healthz\"","reason":"Forbidden","details":{},"code":403}
kube-apiserver logs
I0507 05:06:50.692188 1 logs.go:41] http: TLS handshake error from 10.138.145.231:42526: remote error: tls: bad certificate
kube-apiserver logs
http: TLS handshake error from 10.10.7.61:10682: remote error: tls: bad certificate
After following this guide to sign three certificates instead of one, case 1 and case 3 work OK:
https://kubernetes.io/docs/setup/independent/high-availability/
Create etcd CA certs
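For case 2 above, a quick way to see whether the node addresses would bypass the proxy is to test them against the no_proxy list. The following is only a rough sketch: in_no_proxy is a hypothetical helper, and it implements a simplified matching rule (exact entries and domain suffixes). How rke or Docker actually interpret no_proxy is not confirmed in this thread and may differ.

```shell
#!/usr/bin/env bash
# Sketch only: simplified no_proxy matching (exact entries and "*.domain" or
# ".domain" suffixes). CIDR entries such as 127.0.0.0/8 are compared as plain
# strings and never expanded; real clients may behave differently.
in_no_proxy() {
  local host="$1" list="$2" entry suffix
  local -a entries
  # Split the comma-separated list without triggering filename globbing.
  IFS=',' read -r -a entries <<< "$list"
  for entry in "${entries[@]}"; do
    if [[ "$entry" == "$host" ]]; then
      return 0
    fi
    if [[ "$entry" == \*.* || "$entry" == .* ]]; then
      suffix="${entry#\*}"   # "*.example.org" -> ".example.org"
      if [[ "$host" == *"$suffix" ]]; then
        return 0
      fi
    fi
  done
  return 1
}

# "No Proxy" value taken from the docker info output in the original report.
NO_PROXY_LIST='localhost,127.0.0.0/8,*.cardservices.no,*.kontorlan.tag.no'

for ip in 10.10.7.61 10.10.7.62 10.10.7.63; do
  if in_no_proxy "$ip" "$NO_PROXY_LIST"; then
    echo "$ip: bypasses proxy"
  else
    echo "$ip: goes through proxy"
  fi
done
```

Under this simplified rule, none of the three 10.10.7.x addresses match the reported list, which would be consistent with the observation that adding them to no_proxy makes the error go away.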
@danielbjornadal Are you still seeing this issue with v0.1.9 ?
What's the state of this bug?
This failed for me using rke version v0.1.11. Is it possible that swap not being disabled is the issue?
Resolved by allowing the bastion/jump server to access the control plane nodes on port 6443
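To verify that kind of fix, one could check from the bastion/jump server that each control plane node accepts TCP connections on port 6443. This is a minimal sketch, assuming bash and the coreutils timeout command are available on the bastion; port_open and the script name used below are hypothetical, and the check only covers TCP reachability, not TLS or authorization.

```shell
#!/usr/bin/env bash
# Sketch only: TCP reachability check, run on the bastion/jump server.
# Node addresses are passed as arguments; 6443 is the kube-apiserver port
# that appears in the error messages above.
port_open() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

for node in "$@"; do
  if port_open "$node" 6443; then
    echo "$node:6443 reachable"
  else
    echo "$node:6443 NOT reachable"
  fi
done
```

Saved as, say, check-6443.sh, it could be run as: ./check-6443.sh 10.10.7.61 10.10.7.62 10.10.7.63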
Let us know if this still needs work
Hey @superseb, maybe add the same somewhere in the wiki page https://rancher.com/docs/rke/v0.1.x/en/config-options/bastion-host/
Since RKE uses ssh to connect to nodes, you can configure to use a bastion host. Keep in mind that the port requirements for the RKE node move to the configured bastion host.