Minikube: Kubernetes not working after hard shutdown

Created on 27 Jul 2016 · 9 comments · Source: kubernetes/minikube

I just upgraded to minikube 0.7.0 from 0.6.0. I'm on OS X and originally installed using the OS X installation instructions. To upgrade, I re-ran the curl commands from the OS X installation instructions. My minikube machine was running at the time I attempted the upgrade.
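For reference, the upgrade was just the curl sequence from the README, re-run over the existing binary. Roughly this (the exact release URL is from memory and may differ):

curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.7.0/minikube-darwin-amd64
chmod +x minikube
sudo mv minikube /usr/local/bin/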

After upgrade, I see:

[philip@laptop k8]$ minikube version
minikube version: v0.7.0
[philip@laptop k8]$ minikube start
Starting local Kubernetes cluster...
Kubernetes is available at https://192.168.99.100:8443.
Kubectl is now configured to use the cluster.
[philip@laptop k8]$ minikube version
minikube version: v0.7.0
[philip@laptop k8]$ minikube status
Running
[philip@laptop k8]$ minikube get-k8s-versions
The following Kubernetes versions are available: 
    - v1.3.0
[philip@laptop k8]$ kubectl get pods
The connection to the server 192.168.99.100:8443 was refused - did you specify the right host or port?
[philip@laptop k8]$ kubectl config current-context
minikube
[philip@laptop k8]$ cat ~/.kube/config 
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /Users/philip/.minikube/ca.crt
    server: https://192.168.99.100:8443
  name: minikube
contexts:
- context:
    cluster: minikube
    user: minikube
  name: minikube
current-context: minikube
kind: Config
preferences: {}
users:
- name: minikube
  user:
    client-certificate: /Users/philip/.minikube/apiserver.crt
    client-key: /Users/philip/.minikube/apiserver.key
[philip@laptop k8]$ 
[philip@laptop k8]$ minikube ssh 
                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""\___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/
 _                 _   ____     _            _
| |__   ___   ___ | |_|___ \ __| | ___   ___| | _____ _ __
| '_ \ / _ \ / _ \| __| __) / _` |/ _ \ / __| |/ / _ \ '__|
| |_) | (_) | (_) | |_ / __/ (_| | (_) | (__|   <  __/ |
|_.__/ \___/ \___/ \__|_____\__,_|\___/ \___|_|\_\___|_|
Boot2Docker version 1.11.1, build master : 901340f - Fri Jul  1 22:52:19 UTC 2016
Docker version 1.11.1, build 5604cbe
docker@minikubeVM:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
docker@minikubeVM:~$ 
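In hindsight, a quick way to tell whether this is a client-side kubeconfig problem or the apiserver itself is to hit the endpoint directly with the same certs (paths taken from the kubeconfig above); if this is also refused, the problem is on the server side:

curl --cacert /Users/philip/.minikube/ca.crt \
  --cert /Users/philip/.minikube/apiserver.crt \
  --key /Users/philip/.minikube/apiserver.key \
  https://192.168.99.100:8443/version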

My workaround was to run "minikube delete; minikube start" and then re-create my Kubernetes cluster.
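In full, the recovery was roughly this (destructive: it throws away the VM and all cluster state; the manifest path below is just illustrative):

minikube delete              # remove the VM and the corrupted etcd data
minikube start               # create a fresh single-node cluster
kubectl create -f manifests/ # re-create resources from local manifests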

kind/bug

Most helpful comment

@philipn you can continue to use new versions of minikube while pinning a Kubernetes version:

minikube config set kubernetes-version v1.4.5

All 9 comments

Hmm, I just tried and wasn't able to repro this. Did you happen to run "minikube logs" before the delete?

@dlorenc It may be unrelated to the upgrade procedure, as I just had this happen again with minikube.

Here is the output of minikube logs:

[philip@laptop shotwell]$ minikube logs
==> /var/lib/localkube/localkube.err <==
Starting etcd...

==> /var/lib/localkube/localkube.out <==
I0727 03:53:36.145632    1573 server.go:202] Using iptables Proxier.
I0727 03:53:36.145753    1573 server.go:215] Tearing down userspace rules.
E0727 03:53:36.268708    1573 reflector.go:205] pkg/proxy/config/api.go:30: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E0727 03:53:36.268844    1573 reflector.go:205] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: Get http://127.0.0.1:8080/api/v1/endpoints?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
2016-07-27 03:53:36.401476 I | etcdserver: recovered store from snapshot at index 20002
2016-07-27 03:53:36.401529 I | etcdserver: name = kubeetcd
2016-07-27 03:53:36.401540 I | etcdserver: data dir = /var/lib/localkube/etcd
2016-07-27 03:53:36.401552 I | etcdserver: member dir = /var/lib/localkube/etcd/member
2016-07-27 03:53:36.401562 I | etcdserver: heartbeat = 100ms
2016-07-27 03:53:36.401571 I | etcdserver: election = 1000ms
2016-07-27 03:53:36.401580 I | etcdserver: snapshot count = 10000
2016-07-27 03:53:36.401603 I | etcdserver: advertise client URLs = http://localhost:2379
2016-07-27 03:53:36.622914 C | etcdserver: read wal error (walpb: crc mismatch) and cannot be repaired

I had previously done a hard shutdown of my Mac, so my guess is that this caused the irreparable corruption. I wasn't expecting Kubernetes (via etcd) to be corrupted so easily, though. Is it expected that a single-node Kubernetes cluster (e.g. minikube) cannot withstand a hard shutdown?

This may be related to https://github.com/coreos/etcd/issues/5857 and https://github.com/coreos/etcd/pull/5862, but I don't know enough about etcd to say -- if anyone knows better, please chime in!
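For anyone else who hits this, one way to confirm it's the same WAL corruption before wiping everything is to look inside the VM (paths taken from the log above). Removing the etcd data dir is the only local fix I can think of, and it loses all cluster state; I haven't verified that localkube recreates the directory cleanly, so "minikube delete" is probably the safer route:

minikube ssh
ls -l /var/lib/localkube/etcd/member/wal/   # the write-ahead log files etcd refuses to read
tail /var/lib/localkube/localkube.out       # should show the "walpb: crc mismatch" line
# last resort, roughly equivalent to minikube delete but keeps the VM:
# sudo rm -rf /var/lib/localkube/etcd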

I am able to reproduce this by having a misconfigured third-party resource in my Kubernetes cluster. My only solution so far has been to delete and reinstall minikube.

from minikube logs:

2016-10-31 20:23:15.174411 I | etcdserver: recovered store from snapshot at index 590059
2016-10-31 20:23:15.174457 I | etcdserver: name = dns
2016-10-31 20:23:15.174466 I | etcdserver: data dir = /var/lib/localkube/dns
2016-10-31 20:23:15.174475 I | etcdserver: member dir = /var/lib/localkube/dns/member
2016-10-31 20:23:15.174481 I | etcdserver: heartbeat = 100ms
2016-10-31 20:23:15.174487 I | etcdserver: election = 1000ms
2016-10-31 20:23:15.174493 I | etcdserver: snapshot count = 10000
2016-10-31 20:23:15.174506 I | etcdserver: advertise client URLs = http://localhost:49090
panic: runtime error: index out of range

goroutine 191 [running]:
panic(0x30e4aa0, 0xc420010110)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.(*Master).InstallThirdPartyResource(0xc420077800, 0xc421d1fe00, 0xc422143d00, 0x0)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/master.go:694 +0xa86
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.(*ThirdPartyController).SyncOneResource(0xc421e65f20, 0xc421d1fe00, 0x1, 0x1)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/thirdparty_controller.go:90 +0x79
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.(*ThirdPartyController).syncResourceList(0xc421e65f20, 0x5243b40, 0xc4222cf680, 0x0, 0x5243b40)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/thirdparty_controller.go:119 +0x1b7
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.(*ThirdPartyController).SyncResources(0xc421e65f20, 0xc421f1d678, 0x406c64)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/thirdparty_controller.go:101 +0x88
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.ExtensionsRESTStorageProvider.v1beta1Storage.func1()
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/storage_extensions.go:82 +0x30
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait.JitterUntil.func1(0xc42050b050)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go:84 +0x19
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait.JitterUntil(0xc42050b050, 0x2540be400, 0x0, 0x30312d3631303201, 0xc420053200)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go:85 +0xad
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait.Until(0xc42050b050, 0x2540be400, 0xc420053200)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go:47 +0x4d
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait.Forever(0xc42050b050, 0x2540be400)
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/util/wait/wait.go:39 +0x41
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master.ExtensionsRESTStorageProvider.v1beta1Storage
/var/lib/jenkins/go2/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/master/storage_extensions.go:85 +0x1dd5

==> /var/lib/localkube/localkube.out <==
Starting etcd...
Starting apiserver...
Starting controller-manager...
Starting scheduler...
Starting kubelet...
Starting proxy...
Starting dns...

minikube version
minikube version: v0.12.0
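If the apiserver is still reachable (i.e. before localkube gets into the panic loop above), one way to hunt down and remove the offending resource is via kubectl; the resource name here is purely illustrative:

kubectl get thirdpartyresources
kubectl describe thirdpartyresource cron-tab.stable.example.com
kubectl delete thirdpartyresource cron-tab.stable.example.com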

I got the same issue, but I did not hard shut down my machine. I'm running on OS X; after starting the cluster with "minikube start", I get this:

$ minikube logs
==> /var/lib/localkube/localkube.err <==
I1202 01:38:21.528534 2941 server.go:203] Using iptables Proxier.
W1202 01:38:21.529000 2941 server.go:426] Failed to retrieve node info: Get http://127.0.0.1:8080/api/v1/nodes/minikube: dial tcp 127.0.0.1:8080: getsockopt: connection refused
W1202 01:38:21.529063 2941 proxier.go:226] invalid nodeIP, initialize kube-proxy with 127.0.0.1 as nodeIP
I1202 01:38:21.529076 2941 server.go:215] Tearing down userspace rules.
E1202 01:38:21.550651 2941 reflector.go:203] pkg/proxy/config/api.go:30: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
2016-12-02 01:38:21.550707 I | etcdserver: name = kubeetcd
2016-12-02 01:38:21.550726 I | etcdserver: data dir = /var/lib/localkube/etcd
2016-12-02 01:38:21.550733 I | etcdserver: member dir = /var/lib/localkube/etcd/member
2016-12-02 01:38:21.550739 I | etcdserver: heartbeat = 100ms
2016-12-02 01:38:21.550744 I | etcdserver: election = 1000ms
2016-12-02 01:38:21.550749 I | etcdserver: snapshot count = 10000
2016-12-02 01:38:21.550761 I | etcdserver: advertise client URLs = http://localhost:2379
2016-12-02 01:38:21.550769 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2016-12-02 01:38:21.550782 I | etcdserver: initial cluster = kubeetcd=http://localhost:2380
E1202 01:38:21.555261 2941 reflector.go:203] pkg/proxy/config/api.go:33: Failed to list *api.Endpoints: Get http://127.0.0.1:8080/api/v1/endpoints?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
2016-12-02 01:38:21.556434 I | etcdserver: starting member 37807cb0bf7500f6 in cluster 2c833ae9c7555b5e
2016-12-02 01:38:21.556483 I | raft: 37807cb0bf7500f6 became follower at term 0
2016-12-02 01:38:21.556494 I | raft: newRaft 37807cb0bf7500f6 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-12-02 01:38:21.556500 I | raft: 37807cb0bf7500f6 became follower at term 1
2016-12-02 01:38:21.560478 I | etcdserver: starting server... [version: 3.0.6, cluster version: to_be_decided]
2016-12-02 01:38:21.564860 I | membership: added member 37807cb0bf7500f6 [http://localhost:2380] to cluster 2c833ae9c7555b5e
E1202 01:38:21.564900 2941 server.go:75] unable to register configz: register config "componentconfig" twice
E1202 01:38:21.565649 2941 controllermanager.go:125] unable to register configz: register config "componentconfig" twice
E1202 01:38:21.567589 2941 leaderelection.go:252] error retrieving endpoint: Get http://127.0.0.1:8080/api/v1/namespaces/kube-system/endpoints/kube-scheduler: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567639 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:414: Failed to list *extensions.ReplicaSet: Get http://127.0.0.1:8080/apis/extensions/v1beta1/replicasets?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567667 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:409: Failed to list *api.ReplicationController: Get http://127.0.0.1:8080/api/v1/replicationcontrollers?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567692 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:404: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567715 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:399: Failed to list *api.PersistentVolumeClaim: Get http://127.0.0.1:8080/api/v1/persistentvolumeclaims?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567739 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:398: Failed to list *api.PersistentVolume: Get http://127.0.0.1:8080/api/v1/persistentvolumes?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567770 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:394: Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567798 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:391: Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%21%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567825 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:388: Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D%2Cstatus.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567851 2941 leaderelection.go:252] error retrieving endpoint: Get http://127.0.0.1:8080/api/v1/namespaces/kube-system/endpoints/kube-controller-manager: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.567869 2941 server.go:294] unable to register configz: register config "componentconfig" twice
W1202 01:38:21.567897 2941 server.go:549] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig: no such file or directory. Using default client config instead.
I1202 01:38:21.570777 2941 genericapiserver.go:629] Will report 10.0.2.15 as public IP address.
W1202 01:38:21.581167 2941 cacher.go:469] Terminating all watchers from cacher *api.ResourceQuota
I1202 01:38:21.581821 2941 conntrack.go:40] Setting nf_conntrack_max to 131072
I1202 01:38:21.582400 2941 conntrack.go:57] Setting conntrack hashsize to 32768
I1202 01:38:21.582753 2941 conntrack.go:62] Setting nf_conntrack_tcp_timeout_established to 86400
W1202 01:38:21.584116 2941 cacher.go:469] Terminating all watchers from cacher *api.PodTemplate
W1202 01:38:21.584387 2941 cacher.go:469] Terminating all watchers from cacher *api.LimitRange
E1202 01:38:21.584948 2941 event.go:208] Unable to write event: 'Post http://127.0.0.1:8080/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8080: getsockopt: connection refused' (may retry after sleeping)
E1202 01:38:21.585101 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list *api.ResourceQuota: Get http://127.0.0.1:8080/api/v1/resourcequotas?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.585172 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/admission/storageclass/default/admission.go:74: Failed to list *storage.StorageClass: Get http://127.0.0.1:8080/apis/storage.k8s.io/v1beta1/storageclasses?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.585243 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list *api.Secret: Get http://127.0.0.1:8080/api/v1/secrets?fieldSelector=type%3Dkubernetes.io%2Fservice-account-token&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.585683 2941 reflector.go:214] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list *api.ServiceAccount: Get http://127.0.0.1:8080/api/v1/serviceaccounts?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
E1202 01:38:21.585761 2941 reflector.go:203] k8s.io/minikube/vendor/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:154: Failed to list *api.LimitRange: Get http://127.0.0.1:8080/api/v1/limitranges?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
....
....

@philipn It seems my API server on :8080 did not start correctly, just like your :8443. Did you make any progress on this?

@cuiyz It hasn't happened to me since my reports above, and I'm still on the same minikube version as when I reported it. I wasn't able to reproduce similar corruption in repeated hard (partial and full) shutdown tests of our production Kubernetes cluster, so I didn't investigate further. I'll comment here again if I see this happen on a newer minikube release (I'm on v0.7.0 for parity with our production K8s).

@philipn you can continue to use new versions of minikube while pinning a Kubernetes version:

minikube config set kubernetes-version v1.4.5
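As far as I know the pin only applies to clusters created after it is set, so an existing VM has to be recreated to pick it up (the delete step below is an assumption):

minikube config set kubernetes-version v1.4.5
minikube delete    # an already-running cluster keeps its old version otherwise
minikube start
kubectl version    # the server line should now report v1.4.5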

This should be fixed in 0.19.1
