RKE version:
v0.2.1
Docker version: (docker version,docker info preferred)
$ docker info
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 3
Server Version: 18.09.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: 09c8266bf2fcf9519a651b04ae54c967b9ab86ec
init version: v0.18.0 (expected: fec3683b971d9c3ef73f284f176672c44b448662)
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-47-generic
Operating System: Ubuntu 18.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 985.5MiB
Name: rke-node1
ID: V4C5:PQRQ:7AY7:E7QP:NQ7A:MLGF:DUHB:FRAI:DGTK:CWMU:NFGR:JZJW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.2 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
VirtualBox
cluster.yml file:
None. rke was run without a cluster.yml, so it falls back to the default cluster configuration.
Steps to Reproduce:
Ran rke up --local
Results:
The command above fails at the etcd health check. Output:
$ rke up --local
INFO[0000] Failed to resolve cluster file, using default cluster instead
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [network] Deploying port listener containers
INFO[0000] [network] Successfully started [rke-cp-port-listener] container on host [127.0.0.1]
INFO[0001] [network] Successfully started [rke-worker-port-listener] container on host [127.0.0.1]
INFO[0001] [network] Port listener containers deployed successfully
INFO[0001] [network] Running control plane -> etcd port checks
INFO[0001] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0001] [network] Running control plane -> worker port checks
INFO[0002] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0002] [network] Running workers -> control plane port checks
INFO[0002] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0002] [network] Checking KubeAPI port Control Plane hosts
INFO[0002] [network] Removing port listener containers
INFO[0002] [remove/rke-etcd-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0002] [remove/rke-cp-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0003] [remove/rke-worker-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0003] [network] Port listener containers removed successfully
INFO[0003] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0008] [reconcile] Rebuilding and updating local kube config
INFO[0008] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0008] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0008] [reconcile] Reconciling cluster state
INFO[0008] [reconcile] This is newly generated cluster
INFO[0008] Pre-pulling kubernetes images
INFO[0008] Kubernetes images pulled successfully
INFO[0008] [etcd] Building up etcd plane..
INFO[0008] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [127.0.0.1]
INFO[0008] [remove/etcd-rolling-snapshots] Successfully removed container on host [127.0.0.1]
INFO[0009] [etcd] Successfully started [etcd-rolling-snapshots] container on host [127.0.0.1]
INFO[0014] [certificates] Successfully started [rke-bundle-cert] container on host [127.0.0.1]
INFO[0014] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [127.0.0.1]
INFO[0015] [etcd] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0015] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0015] [etcd] Successfully started etcd plane.. Checking etcd cluster health
FATA[0015] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
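The FATA line does not say why etcd is unhealthy. A minimal sketch of how the container state could be inspected on the node, assuming the etcd container is named etcd (the name rke prints when it starts the container) and that docker is on the PATH. The function name etcd_debug and the DOCKER override are made up for illustration and are not part of rke:

```shell
# Hypothetical helper, not part of rke.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

etcd_debug() {
  # List containers whose name contains "etcd" (this also matches
  # etcd-rolling-snapshots) and show whether they are running or exited.
  "$DOCKER" ps -a --filter name=etcd
  # Print the last 50 lines of the etcd container log.
  "$DOCKER" logs --tail 50 etcd
}
```

Running etcd_debug right after the failed rke up should show whether the etcd container exists at all and, if it does, what it logged before the health check gave up.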
Tested on rke v0.2.2. It no longer fails at the etcd health check and gets further, but then fails while running the deploy job for the network plugin. Latest logs:
$ ./rke-0.2.2 up --local
INFO[0000] Failed to resolve cluster file, using default cluster instead
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] Generating CA kubernetes certificates
INFO[0000] [certificates] Generating Kubernetes API server aggregation layer requestheader client CA certificates
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] [certificates] Generating Kubernetes API server proxy client certificates
INFO[0000] [certificates] Generating etcd-127.0.0.1 certificate and key
INFO[0001] [certificates] Generating Kube Scheduler certificates
INFO[0001] [certificates] Generating Kube Controller certificates
INFO[0001] [certificates] Generating Kube Proxy certificates
INFO[0001] [certificates] Generating Node certificate
INFO[0001] [certificates] Generating Kubernetes API server certificates
INFO[0001] Successfully Deployed state file at [./cluster.rkestate]
INFO[0001] Building Kubernetes cluster
INFO[0001] [network] Deploying port listener containers
INFO[0001] [network] Pulling image [rancher/rke-tools:v0.1.27] on host [127.0.0.1]
INFO[0009] [network] Successfully pulled image [rancher/rke-tools:v0.1.27] on host [127.0.0.1]
INFO[0010] [network] Successfully started [rke-etcd-port-listener] container on host [127.0.0.1]
INFO[0010] [network] Successfully started [rke-cp-port-listener] container on host [127.0.0.1]
INFO[0011] [network] Successfully started [rke-worker-port-listener] container on host [127.0.0.1]
INFO[0011] [network] Port listener containers deployed successfully
INFO[0011] [network] Running control plane -> etcd port checks
INFO[0011] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0011] [network] Running control plane -> worker port checks
INFO[0012] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0012] [network] Running workers -> control plane port checks
INFO[0013] [network] Successfully started [rke-port-checker] container on host [127.0.0.1]
INFO[0013] [network] Checking KubeAPI port Control Plane hosts
INFO[0013] [network] Removing port listener containers
INFO[0013] [remove/rke-etcd-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [remove/rke-cp-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [remove/rke-worker-port-listener] Successfully removed container on host [127.0.0.1]
INFO[0013] [network] Port listener containers removed successfully
INFO[0013] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0019] [reconcile] Rebuilding and updating local kube config
INFO[0019] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0019] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0019] [reconcile] Reconciling cluster state
INFO[0019] [reconcile] This is newly generated cluster
INFO[0019] Pre-pulling kubernetes images
INFO[0019] [pre-deploy] Pulling image [rancher/hyperkube:v1.13.5-rancher1] on host [127.0.0.1]
INFO[0056] [pre-deploy] Successfully pulled image [rancher/hyperkube:v1.13.5-rancher1] on host [127.0.0.1]
INFO[0056] Kubernetes images pulled successfully
INFO[0056] [etcd] Building up etcd plane..
INFO[0056] [etcd] Pulling image [rancher/coreos-etcd:v3.2.24-rancher1] on host [127.0.0.1]
INFO[0059] [etcd] Successfully pulled image [rancher/coreos-etcd:v3.2.24-rancher1] on host [127.0.0.1]
INFO[0060] [etcd] Successfully started [etcd] container on host [127.0.0.1]
INFO[0060] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [127.0.0.1]
INFO[0060] [etcd] Successfully started [etcd-rolling-snapshots] container on host [127.0.0.1]
INFO[0066] [certificates] Successfully started [rke-bundle-cert] container on host [127.0.0.1]
INFO[0066] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [127.0.0.1]
INFO[0066] [etcd] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0067] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0067] [etcd] Successfully started etcd plane.. Checking etcd cluster health
INFO[0067] [controlplane] Building up Controller Plane..
INFO[0067] [controlplane] Successfully started [kube-apiserver] container on host [127.0.0.1]
INFO[0067] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [127.0.0.1]
INFO[0079] [healthcheck] service [kube-apiserver] on host [127.0.0.1] is healthy
INFO[0080] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0080] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0081] [controlplane] Successfully started [kube-controller-manager] container on host [127.0.0.1]
INFO[0081] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [127.0.0.1]
INFO[0086] [healthcheck] service [kube-controller-manager] on host [127.0.0.1] is healthy
INFO[0086] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0086] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0086] [controlplane] Successfully started [kube-scheduler] container on host [127.0.0.1]
INFO[0086] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [127.0.0.1]
INFO[0091] [healthcheck] service [kube-scheduler] on host [127.0.0.1] is healthy
INFO[0092] [controlplane] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0092] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0092] [controlplane] Successfully started Controller Plane..
INFO[0092] [authz] Creating rke-job-deployer ServiceAccount
INFO[0092] [authz] rke-job-deployer ServiceAccount created successfully
INFO[0092] [authz] Creating system:node ClusterRoleBinding
INFO[0092] [authz] system:node ClusterRoleBinding created successfully
INFO[0092] Successfully Deployed state file at [./cluster.rkestate]
INFO[0092] [state] Saving full cluster state to Kubernetes
INFO[0092] [state] Successfully Saved full cluster state to Kubernetes ConfigMap: cluster-state
INFO[0092] [worker] Building up Worker Plane..
INFO[0092] [sidekick] Sidekick container already created on host [127.0.0.1]
INFO[0093] [worker] Successfully started [kubelet] container on host [127.0.0.1]
INFO[0093] [healthcheck] Start Healthcheck on service [kubelet] on host [127.0.0.1]
INFO[0098] [healthcheck] service [kubelet] on host [127.0.0.1] is healthy
INFO[0098] [worker] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0099] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0099] [worker] Successfully started [kube-proxy] container on host [127.0.0.1]
INFO[0099] [healthcheck] Start Healthcheck on service [kube-proxy] on host [127.0.0.1]
INFO[0104] [healthcheck] service [kube-proxy] on host [127.0.0.1] is healthy
INFO[0105] [worker] Successfully started [rke-log-linker] container on host [127.0.0.1]
INFO[0105] [remove/rke-log-linker] Successfully removed container on host [127.0.0.1]
INFO[0105] [worker] Successfully started Worker Plane..
INFO[0106] [cleanup] Successfully started [rke-log-cleaner] container on host [127.0.0.1]
INFO[0106] [remove/rke-log-cleaner] Successfully removed container on host [127.0.0.1]
INFO[0106] [sync] Syncing nodes Labels and Taints
INFO[0106] [sync] Successfully synced nodes Labels and Taints
INFO[0106] [network] Setting up network plugin: canal
INFO[0106] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0106] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0106] [addons] Executing deploy job rke-network-plugin
FATA[0136] Failed to get job complete status for job rke-network-plugin-deploy-job in namespace kube-system
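The FATA line only reports that the job status could not be read. A sketch of how the job could be inspected with the kubeconfig rke wrote to ./kube_config_cluster.yml, assuming kubectl is installed on the node and the job still exists. The function name network_job_debug and the KUBECTL and KUBECONFIG_FILE variables are made up for illustration:

```shell
# Hypothetical helper, not part of rke.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"
KUBECONFIG_FILE="${KUBECONFIG_FILE:-./kube_config_cluster.yml}"

network_job_debug() {
  # Job status and events (failed pod creations, back-off, and so on).
  "$KUBECTL" --kubeconfig "$KUBECONFIG_FILE" -n kube-system \
    describe job rke-network-plugin-deploy-job
  # Pods created by the job; the Job controller labels them with job-name=<job>.
  "$KUBECTL" --kubeconfig "$KUBECONFIG_FILE" -n kube-system \
    get pods -l job-name=rke-network-plugin-deploy-job -o wide
  # Output of the job's pod, if one got far enough to run.
  "$KUBECTL" --kubeconfig "$KUBECONFIG_FILE" -n kube-system \
    logs job/rke-network-plugin-deploy-job
}
```

If the pod list shows the pod stuck in Pending or ContainerCreating, the events in the describe output are likely the more useful of the three.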
I'll change the issue title to make it more general...
Have you fixed this? I'm having the same issue using v0.3.2.