Minikube: Minikube commands sometimes flake out

Created on 16 Aug 2016 · 15 comments · Source: kubernetes/minikube

I've had this happen several times to me:

$ minikube ip
E0816 14:30:32.846687    7922 ip.go:45] Error getting IP:  Something went wrong running an SSH command!
command : ip addr show
err     : exit status 255
output  :

Stopping and restarting Minikube fixes the issue. Kubectl works fine when this happens.

kind/bug

Most helpful comment

I get similar errors when using minikube start after minikube stop and my laptop going to sleep. I'm using osx with docker-machine-driver-xhyve and minikube v0.19.0.

It looks like https://github.com/kubernetes/minikube/issues/1452 and https://github.com/kubernetes/minikube/issues/1400 are tracking issues similar to mine.

All 15 comments

What driver are you using?

Could you attach the output of:

VBoxManage showvminfo minikubeVM | grep port
and
cat ~/.minikube/machines/minikubeVM/config.json | grep Port

I'm using VirtualBox.

VBoxManage showvminfo minikubeVM | grep port:

Teleporter Enabled: off
Teleporter Port: 0
Teleporter Address:
Teleporter Password:
NIC 1:           MAC: 080027A2402E, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Rule(0):   name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 52695, guest ip = , guest port = 22
NIC 2:           MAC: 080027897461, Attachment: Host-only Interface 'vboxnet1', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none

cat ~/.minikube/machines/minikubeVM/config.json | grep Port

        "SSHPort": 52695,

By the way, I had it happen to me again today. I plan to update to 0.8 and will report back if it continues to happen (unless you need me to stay on my current version for debugging purposes).

Feel free to upgrade. I'd be happy if it goes away with 0.8, but I don't think we touched anything that would have fixed it.

When using VirtualBox, the IP command runs ip addr show over SSH on localhost:$PORT to figure out the correct host-only IP.

It looks like that SSH command isn't working for some reason. Minikube configures virtualbox to forward a port from localhost outside the VM to port 22 on localhost inside the VM. Minikube then stores the external port.

It looks like they both match here though: 52695. So I'm not sure what else could be messed up. It sounds like kubectl can still access the VM, so the VM itself is still running. Maybe sshd crashed inside it?
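To illustrate, the same check can be done by hand: pull the stored port out of config.json and SSH through the forwarded port yourself. This is only a sketch; the /tmp sample file below stands in for the real config, and the docker user and key path in the comment are assumptions based on minikube's default layout.

```shell
# Sketch: extract the forwarded SSH port the same way one would check it by hand.
# A sample config fragment (value taken from the report above) stands in for
# ~/.minikube/machines/minikubeVM/config.json.
cat > /tmp/minikube-config.json <<'EOF'
{ "SSHPort": 52695 }
EOF

# Grab the numeric port from the SSHPort line.
PORT=$(grep '"SSHPort"' /tmp/minikube-config.json | tr -dc '0-9')
echo "forwarded SSH port: $PORT"

# Against a live VM one could then probe the tunnel directly (assumed key
# path and user):
#   ssh -i ~/.minikube/machines/minikubeVM/id_rsa -p "$PORT" docker@127.0.0.1 'ip addr show'
```

If the manual SSH succeeds while minikube ip fails, that would point at minikube's client code rather than the tunnel or sshd.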

It happened again with 0.8, though this time kubectl didn't work either:

$ kubectl get pod
Unable to connect to the server: net/http: TLS handshake timeout

The VirtualBox process is at 100% CPU usage.

Also, this time minikube stop hung and couldn't stop the VM. I had to manually kill the VirtualBox process, at which point minikube start worked.

I updated my VirtualBox version to the latest, but that hasn't fixed the issue.

Latest flake out:

minikube ip doesn't work, as above. Kubectl doesn't work, but with a different error:

$ kubectl get pod
The connection to the server 192.168.99.100:8443 was refused - did you specify the right host or port?

VirtualBox process CPU is not high. minikube stop stops the cluster as expected and minikube start brings the cluster back into a working state.

I find it confusing that the error behaviour is so inconsistent (I've seen at least 3 different variations so far). The only consistent part is that minikube ip (or any minikube SSH command) doesn't work and restarting the cluster fixes the issue.

I've narrowed down the problem to a bad container. When I run Minikube without this container, it never crashes. When I deploy the container to my cluster Minikube crashes within a day or so. Closing this issue as it doesn't seem to be a Minikube problem.

I'd be interested in knowing what the root cause is, is the container something you can share?

It's just an HAProxy container that exposes a few of our services (so we can have a single load balancer instead of one per service). It doesn't seem to be doing anything crazy, so I'm not sure what's going on.

We're moving to Ingress anyway, so this container is being deprecated. That's actually how I discovered this was the issue: when running just Ingress, Minikube stopped crashing for a week; when I redeployed the proxy container, Minikube crashed the next day.

I'd be curious to know the root cause as well, but given that it takes a day or two to see Minikube crash, it will be annoying to come up with a minimal reproducible setup I can give you (I might try briefly though).

I am also facing this issue. Any help would be greatly appreciated.

minikube version: v0.14.0

(minikube) DBG | Not there yet 14/60, error: IP not found for MAC ae:3b:d:de:84:b6 in DHCP leases
(minikube) DBG | 192.168.64.10
(minikube) DBG | IP found in DHCP lease table: 192.168.64.10
(minikube) DBG | Got an ip: 192.168.64.10
(minikube) DBG | Getting to WaitForSSH function...
(minikube) DBG | Using SSH client type: external
(minikube) DBG | Using SSH private key: /Users/syali/.minikube/machines/minikube/id_rsa (-rw-------)
(minikube) DBG | &{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none [email protected] -o IdentitiesOnly=yes -i /Users/syali/.minikube/machines/minikube/id_rsa -p 22] /usr/bin/ssh }
(minikube) DBG | About to run SSH command:
(minikube) DBG | exit 0
(minikube) DBG | SSH cmd err, output: exit status 255:
(minikube) DBG | Error getting ssh command 'exit 0' : Something went wrong running an SSH command!
(minikube) DBG | command : exit 0
(minikube) DBG | err : exit status 255
(minikube) DBG | output :
(minikube) DBG |

We finally resolved it in our case. It turned out that we had a container which used a lot of CPU for no reason. We fixed the problem with that container and minikube ssh never failed again. There should probably be an option to set longer timeouts so commands would still work under higher resource usage.
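Until such a timeout option exists, one crude workaround is to retry the SSH-backed command a few times while the VM is briefly unresponsive. A minimal sketch; the retry helper and the minikube ssh usage line are my own, not part of minikube:

```shell
# Sketch: retry a flaky command a few times before giving up, as a stopgap
# while the VM is under CPU pressure and SSH commands time out.
retry() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0   # stop as soon as the command succeeds
    i=$((i + 1))
    sleep 1            # brief pause between attempts
  done
  return 1             # all attempts failed
}

# Hypothetical usage against a busy VM:
#   retry 5 minikube ssh -- ip addr show
retry 3 true && echo "command eventually succeeded"
```

This only papers over the symptom, of course; fixing the CPU-hungry container is the real cure.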

I get the same issue with minishift as well as minikube

(minishift) DBG | Getting to WaitForSSH function...
(minishift) DBG | Using SSH client type: external
(minishift) DBG | Using SSH private key: /Users/syali/.minishift/machines/minishift/id_rsa (-rw-------)
(minishift) DBG | &{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none [email protected] -o IdentitiesOnly=yes -i /Users/syali/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh }
(minishift) DBG | About to run SSH command:
(minishift) DBG | exit 0
(minishift) DBG | SSH cmd err, output: exit status 255:
(minishift) DBG | Error getting ssh command 'exit 0' : Something went wrong running an SSH command!
(minishift) DBG | command : exit 0
(minishift) DBG | err : exit status 255
(minishift) DBG | output :
(minishift) DBG |
(minishift) DBG | Getting to WaitForSSH function...
(minishift) DBG | Using SSH client type: external
(minishift) DBG | Using SSH private key: /Users/syali/.minishift/machines/minishift/id_rsa (-rw-------)
(minishift) DBG | &{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none [email protected] -o IdentitiesOnly=yes -i /Users/syali/.minishift/machines/minishift/id_rsa -p 22] /usr/bin/ssh }
(minishift) DBG | About to run SSH command:
(minishift) DBG | exit 0

I also had this issue when attempting to complete the tutorial at https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address-service/

I had to force quit the docker-machine-driver-xhyve process, otherwise minikube would hang.
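Force-quitting a hung driver process like that can be done with pkill, which matches against the full command line. A sketch only: a throwaway sleep process stands in for the real driver here, and the actual driver process name is taken from the comment above.

```shell
# Sketch: kill a hung process by matching its command line with pkill -f.
# A background `sleep` stands in for the hung driver in this demo.
sleep 300 &
HUNG_PID=$!

# -f matches the full command line, not just the process name.
pkill -f 'sleep 300'

# Real-world equivalent for the hung xhyve driver (assumed to need root,
# since the xhyve driver runs privileged):
#   sudo pkill -f docker-machine-driver-xhyve
```

After killing the driver, minikube start should be able to bring the VM back up, as described above.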

I get similar errors when using minikube start after minikube stop and my laptop going to sleep. I'm using osx with docker-machine-driver-xhyve and minikube v0.19.0.

It looks like https://github.com/kubernetes/minikube/issues/1452 and https://github.com/kubernetes/minikube/issues/1400 are tracking issues similar to mine.
