Minikube: fatal error: concurrent map read and map write

Created on 21 Jul 2016 · 26 comments · Source: kubernetes/minikube

Running minikube logs shows the following:

fatal error: concurrent map read and map write

goroutine 274604 [running]:
runtime.throw(0x3ce8ca0, 0x21)
    /usr/lib/google-golang/src/runtime/panic.go:550 +0x99 fp=0xc8246efaf8 sp=0xc8246efae0 pc=0x42f639
runtime.mapaccess1_faststr(0x291e0a0, 0xc82225a840, 0xc820d91fb0, 0x25, 0x589a000)
    /usr/lib/google-golang/src/runtime/hashmap_fast.go:202 +0x5b fp=0xc8246efb58 sp=0xc8246efaf8 pc=0x40e18b
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).queueActionLocked(0xc821fe04d0, 0x38c58c8, 0x4, 0x3867a00, 0xc82752d700, 0x0, 0x0)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:303 +0x2e2 fp=0xc8246efcc0 sp=0xc8246efb58 pc=0x110f8e2
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).Resync(0xc821fe04d0, 0x0, 0x0)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:498 +0x4f7 fp=0xc8246efe38 sp=0xc8246efcc0 pc=0x1112027
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc821547290, 0xc82006cf00, 0xc8219ae780, 0xc8232bd020, 0xc821547298)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:289 +0x252 fp=0xc8246eff88 sp=0xc8246efe38 pc=0x1127a72
runtime.goexit()
    /usr/lib/google-golang/src/runtime/asm_amd64.s:2002 +0x1 fp=0xc8246eff90 sp=0xc8246eff88 pc=0x464961
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3
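
For context: this fatal error is the Go runtime's built-in detection of unsynchronized map access; it calls throw() (as in the first frame above) and cannot be recovered, so the whole process dies. A minimal sketch, unrelated to the minikube code, that triggers the same error:

// Minimal reproduction of the Go runtime's "concurrent map read and map write"
// fatal error; an illustration only, not the minikube code path.
package main

import "time"

func main() {
    items := map[string]int{} // shared map with no synchronization

    // Writer goroutine keeps mutating the map.
    go func() {
        for i := 0; ; i++ {
            items["key"] = i
        }
    }()

    // Reader goroutine keeps reading the same map.
    go func() {
        for {
            _ = items["key"]
        }
    }()

    // Give the goroutines time to collide; the runtime detects the
    // unsynchronized access and aborts with the fatal error above.
    time.Sleep(2 * time.Second)
}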

Steps to reproduce: unknown at this point, but it has occurred 3 times in the past 2 days.

minikube version: v0.6.0

kubectl version: Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"283137936a498aed572ee22af6774b6fb6e9fd94", GitTreeState:"clean", BuildDate:"2016-07-01T19:26:38Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"darwin/amd64"}
The connection to the server 192.168.99.100:8443 was refused - did you specify the right host or port? (The minikube VM is no longer responsive to kubectl commands.)

OS: Darwin Kernel 15.3.0

Labels: kind/bug

All 26 comments

There was a watch-related bug fixed in 1.3.1; this may be resolved once we update in #380.

As an FYI: after the failure, restarting with minikube stop; minikube start brings previously running pods back up fine, but they no longer show events in kubectl describe pods. To see the events, one needs to delete the running pod deployment and then recreate it.

@dlorenc Thanks for the info.

I looked at the watch fix in 1.3.1 / 1.3.2, but I don't think it addresses the problem.

Now that I have a few more stack dumps to look at, my guess is that there is a race condition that the following PR might address:

https://github.com/kubernetes/kubernetes/pull/28744

I tried out the latest release minikube v0.7.0, but the problem persists.

fatal error: concurrent map read and map write

goroutine 460377 [running]:
runtime.throw(0x3ceee60, 0x21)
    /usr/local/go/src/runtime/panic.go:547 +0x90 fp=0xc824f57ae8 sp=0xc824f57ad0
runtime.mapaccess2_faststr(0x2924c00, 0xc821629b60, 0xc824b56830, 0xc, 0x1, 0x1)
    /usr/local/go/src/runtime/hashmap_fast.go:307 +0x5b fp=0xc824f57b48 sp=0xc824f57ae8
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).queueActionLocked(0xc82123abb0, 0x38cb990, 0x4, 0x384dba0, 0xc8245bb000, 0x0, 0x0)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:309 +0x4dc fp=0xc824f57cb0 sp=0xc824f57b48
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).Resync(0xc82123abb0, 0x0, 0x0)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:498 +0x4f7 fp=0xc824f57e28 sp=0xc824f57cb0
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc820026468, 0xc8215d6900, 0xc82273ca00, 0xc8248234a0, 0xc820026478)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:289 +0x252 fp=0xc824f57f78 sp=0xc824f57e28
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc824f57f80 sp=0xc824f57f78
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

Next week I will try to cherry-pick kubernetes/kubernetes#28744 and see if that helps.

I had a bit of time to work on this issue, but I think I may be missing some fundamental debugging steps.

I cloned the minikube repo and cherry-picked the relevant code I wanted to try.
I then ran make, which produced out/minikube.

I currently have the 0.7.1 (release) minikube VM installed. Does it need to be deleted before I run out/minikube stop; out/minikube start, or is a simple out/minikube stop/start OK?

I am trying not to delete the current VM because my internet speed is slow. Does minikube always work with the same eval $(minikube docker-env) settings?

Sorry for the slow response! Looks like this has finally been cherry-picked, but it didn't make it into 1.3.5: https://github.com/kubernetes/kubernetes/issues/29960

We'll be able to pull in the fix once this makes it to an official Kubernetes release.

It looks like this is caused by having deployments with overlapping selectors, so removing the overlap might be a workaround for now: https://github.com/kubernetes/kubernetes/issues/29960#issuecomment-237390158
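
For reference, "overlapping selectors" means two deployments whose label selectors both match the same pods. A minimal sketch of what that overlap looks like, using plain Go maps rather than the Kubernetes API types (the names here are illustrative only):

// Illustrative only: shows two deployment selectors that both match the
// same pod labels, i.e. "overlapping selectors".
package main

import "fmt"

// matches reports whether every key/value in selector is present in labels.
func matches(selector, labels map[string]string) bool {
    for k, v := range selector {
        if labels[k] != v {
            return false
        }
    }
    return true
}

func main() {
    podLabels := map[string]string{"app": "web", "tier": "frontend"}

    // Hypothetical selectors from two different deployments.
    deploymentA := map[string]string{"app": "web"}
    deploymentB := map[string]string{"app": "web", "tier": "frontend"}

    // Both selectors match the same pod, so the deployments overlap.
    fmt.Println(matches(deploymentA, podLabels)) // true
    fmt.Println(matches(deploymentB, podLabels)) // true
}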

You can just re-run "minikube start" to update the VM/localkube binary after you recompile. You don't need to stop or delete.

We should include those in our offline section: #391

@andrewgdavis just curious, did you have multiple deployments? Do those deployments have overlapping label selectors?

At one time that may have been the case, but last week I was only deploying one pod that makes use of pod.alpha.kubernetes.io/init-containers and different types of persistent volumes (and fsGroups) backing the containers. Things don't seem to work as advertised :/

The YAML looks like this:

kind: Pod
apiVersion: v1
metadata:
  name: hot
  labels:
    app: hot
  annotations:
    pod.alpha.kubernetes.io/init-containers:
  ...
  ... a couple of init-containers doing stuff to prep tomcat
spec:
  containers:
  - name: run
    image: mytomcat:8.0.36
    imagePullPolicy: "IfNotPresent"
    volumeMounts:
    - name: workdir
      mountPath: /usr/share/tomcat/webapps/
  volumes:
    - name: workdir
      emptyDir:
        medium: "Memory"

The fatal error: concurrent map writes still occurs when running it (a simple kubectl delete -f tomcat.yaml followed by kubectl create -f tomcat.yaml).

Update: to be honest, the last time this occurred was Aug 9th; I may have been doing deployments alongside this pod at the time.
I will make a note when I have multiple deployments running.

@dlorenc this looks different from https://github.com/kubernetes/kubernetes/issues/29960, since the panic comes from the client cache DeltaFIFO instead of the deployment controller?

The minikube logs show a lot of goroutines that start with:
/go/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

for example:

grep "reflector.go:296" minikube-aug9err.log | wc -l
628

most of which look like this:

goroutine 1309 [select, 380 minutes]:
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc820100718, 0xc82006cea0, 0xc821f96780, 0xc8239da000, 0xc820100720)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:283 +0x3c8
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

goroutine 414518 [select, 172 minutes]:
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc821dd8cd0, 0xc822021740, 0xc821cff220, 0xc824550cc0, 0xc821dd8cd8)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:283 +0x3c8
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

goroutine 327671 [select, 212 minutes]:
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc82248a0d0, 0xc821f00fc0, 0xc820400460, 0xc827fd92c0, 0xc82248a0e0)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:283 +0x3c8
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

goroutine 1256 [select, 380 minutes]:
k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc820100638, 0xc82006cea0, 0xc821fc45a0, 0xc822ccd680, 0xc820100640)
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:283 +0x3c8
created by k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch
    /usr/local/google/home/aprindle/go/src/k8s.io/minikube/_gopath/src/k8s.io/minikube/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:296 +0xde3

Not sure if that is helpful or not. Let me know if there is anything else that I should look for.

Looks like DeltaFIFO's items map is read/written concurrently while the reflector is doing Resync in ListAndWatch: https://github.com/kubernetes/kubernetes/blob/b0deb2eb8f4037421077f77cb163dbb4c0a2a9f5/pkg/client/cache/delta_fifo.go#L303 @lavalamp @wojtek-t, maybe you know more details about DeltaFIFO?
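
For illustration, a minimal sketch of the invariant being discussed, using a simplified keyed queue rather than the real DeltaFIFO: every read and write of the items map has to go through the same lock, so a Resync-style walk can never overlap with a concurrent write.

// A simplified keyed queue (not the real DeltaFIFO) showing the locking
// invariant: all access to the items map happens under one mutex.
package main

import (
    "fmt"
    "sync"
)

type simpleFIFO struct {
    lock  sync.Mutex
    items map[string][]string // key -> pending actions (simplified "deltas")
}

// queueAction appends an action for a key while holding the lock,
// loosely analogous to queueActionLocked in the trace above.
func (f *simpleFIFO) queueAction(key, action string) {
    f.lock.Lock()
    defer f.lock.Unlock()
    f.items[key] = append(f.items[key], action)
}

// resync walks the map under the same lock, loosely analogous to Resync;
// no unsynchronized read of items can happen while a write is in flight.
func (f *simpleFIFO) resync() {
    f.lock.Lock()
    defer f.lock.Unlock()
    for key := range f.items {
        f.items[key] = append(f.items[key], "Sync")
    }
}

func main() {
    f := &simpleFIFO{items: map[string][]string{}}

    var wg sync.WaitGroup
    wg.Add(2)
    go func() { // concurrent writer
        defer wg.Done()
        for i := 0; i < 10000; i++ {
            f.queueAction("pod-a", "Update")
        }
    }()
    go func() { // concurrent resync
        defer wg.Done()
        for i := 0; i < 10000; i++ {
            f.resync()
        }
    }()
    wg.Wait()
    fmt.Println("no fatal error:", len(f.items["pod-a"]), "queued actions")
}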

@andrewgdavis do you know how to reproduce this?

DeltaFIFO has the queue locked, so I'd expect the culprit to actually be something else. @andrewgdavis, if you post the entire stack dump it might be helpful.

Here is one of the 3 logs (I think they were all using minikube 0.7.1):
https://www.dropbox.com/s/4ng4p9aq4w0djpg/aug9err.log?dl=0

Let me know if you also would like to see the others.

@janetkuo no, I don't know how this was reproduced, and I have not seen the issue since Aug 9. That said, my environment changed: I cherry-picked kubernetes/kubernetes#28744 and built minikube, and instead of using VirtualBox I went with xhyve.

Just got another dump today, so the cherry-pick and the use of xhyve don't seem to matter.
If interested, here is the latest from today (Aug 17):
https://www.dropbox.com/s/qwbdqyntpaeeieq/aug17err.log?dl=0

See this one:
https://github.com/kubernetes/kubernetes/issues/30759#issuecomment-240366958

It seems to be the same issue.

Thanks for the stack dump.

For closure, I think this PR finally fixed the issue:

https://github.com/kubernetes/kubernetes/pull/30948

@andrewgdavis thanks for the confirmation!

@aaron-prindle would it be possible to get out a 0.8.1 with this in it? Restarting multiple times a day because of panics isn't much fun.

Hey @jezell, we can do a release soon with this fix, but it would be much nicer if this made it into a real Kubernetes release first. It doesn't look like it's been cherry-picked into a 1.3.* release yet; would you be fine with using a 1.4-alpha release to get this fix?

I would definitely

Cool, we'll publish one soon. You won't need 0.8.1 for that; you'll be able to use it with a --k8s_version flag and your existing build.

awesome! thanks

Looks like this issue impacts the ingress controller, resulting in frequent restarts:

NAME                            READY     STATUS    RESTARTS   AGE
nginx-ingress-4280m             1/1       Running   25         5d
nginx-ingress-5dv2y             1/1       Running   25         5d
nginx-ingress-6pwtz             1/1       Running   16         3d
nginx-ingress-vt8cr             1/1       Running   24         5d

I am hitting this issue in v1.3.6:

fatal error: concurrent map read and map write

goroutine 140812 [running]:
runtime.throw(0x1b9f100, 0x21)
        /usr/local/go/src/runtime/panic.go:547 +0x90 fp=0xc82059eaf8 sp=0xc82059eae0
runtime.mapaccess1_faststr(0x148a4a0, 0xc820317f50, 0xc8208fd0e0, 0x15, 0x1)
        /usr/local/go/src/runtime/hashmap_fast.go:202 +0x5b fp=0xc82059eb58 sp=0xc82059eaf8
k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).queueActionLocked(0xc8200e5b80, 0x1a3e8c8, 0x4, 0x19dde80, 0xc820d01348, 0x0, 0x0)
        /usr/local/google/home/beeps/goproj/src/k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:305 +0x1dd fp=0xc82059ecc0 sp=0xc82059eb58
k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache.(*DeltaFIFO).Resync(0xc8200e5b80, 0x0, 0x0)
        /usr/local/google/home/beeps/goproj/src/k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache/delta_fifo.go:511 +0x4f7 fp=0xc82059ee38 sp=0xc82059ecc0
k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch.func1(0xc8200c01e0, 0xc82008ec00, 0xc820612000, 0xc820cf04e0, 0xc8200c01e8)
        /usr/local/google/home/beeps/goproj/src/k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache/reflector.go:289 +0x252 fp=0xc82059ef88 sp=0xc82059ee38
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82059ef90 sp=0xc82059ef88
created by k8s.io/contrib/ingress/vendor/k8s.io/kubernetes/pkg/client/cache.(*Reflector).ListAndWatch