Calico: Calico fails to start after GKE 1.11.3-gke.18 upgrade

Created on 4 Dec 2018 · 5 Comments · Source: projectcalico/calico

After upgrading my GKE cluster from 1.11.2-gke.18 to 1.11.3-gke.18, all pods refuse to start and calico-node is stuck in CrashLoopBackOff.

Expected Behavior

Calico to work and pods to start.

Current Behavior

The pods show the following Calico-related error:

Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "0d850f25f72d6d0c0436d1b58ac1310fc0626a03d4f93d098486c798737f7c53" network for pod "xxx": NetworkPlugin cni failed to set up pod "xxx" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/

calico-node and calico-typha both show errors during the v1-to-v3 data migration:

kubectl logs calico-node-9qkct -n kube-system -c calico-node
[...]
2018-12-03 21:14:33.670 [INFO][12] migrate.go 875: data converted successfully
2018-12-03 21:14:33.670 [INFO][12] migrate.go 866: Storing v3 data
2018-12-03 21:14:33.670 [INFO][12] migrate.go 875: Storing resources in v3 format
2018-12-03 21:14:33.707 [INFO][12] migrate.go 1151: Failed to create resource Key=BGPConfiguration(default) error=resource does not exist: BGPConfiguration(default) with error: the server could not find the requested resource (post BGPConfigurations.crd.projectcalico.org)
2018-12-03 21:14:33.707 [ERROR][12] migrate.go 884: Unable to store the v3 resources
2018-12-03 21:14:33.707 [INFO][12] migrate.go 875: cause: resource does not exist: BGPConfiguration(default) with error: the server could not find the requested resource (post BGPConfigurations.crd.projectcalico.org)
2018-12-03 21:14:33.707 [ERROR][12] startup.go 107: Unable to ensure datastore is migrated. error=Migration failed: error storing converted data: resource does not exist: BGPConfiguration(default) with error: the server could not find the requested resource (post BGPConfigurations.crd.projectcalico.org)
2018-12-03 21:14:33.707 [WARNING][12] startup.go 1066: Terminating
kubectl logs calico-typha-5977794b76-skhqv -n kube-system
2018-12-04 07:12:01.277 [ERROR][6] migrate.go 884: Unable to store the v3 resources
2018-12-04 07:12:01.277 [ERROR][6] daemon.go 288: Failed to migrate Kubernetes v1 configuration to v3 error=error storing converted data: resource does not exist: BGPConfiguration(default) with error: the server could not find the requested resource (post BGPConfigurations.crd.projectcalico.org)
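Both logs name the same missing API resource in the "(post ...)" part of the error. A minimal sketch for pulling that name out of the log output, assuming a POSIX shell with grep, sed, tr and sort available; the function name `missing_calico_crds` is made up for illustration and is not part of Calico or kubectl:

```shell
# Hypothetical helper: read calico-node or calico-typha logs on stdin and
# print each resource the API server reported as missing, lowercased so it
# matches the usual form of a CRD name.
missing_calico_crds() {
  grep -o '(post [A-Za-z]*\.crd\.projectcalico\.org)' \
    | sed -e 's/^(post //' -e 's/)$//' \
    | tr '[:upper:]' '[:lower:]' \
    | sort -u
}
```

Piping `kubectl logs calico-node-9qkct -n kube-system -c calico-node` into this function prints the resource name, and `kubectl get crd` with that name should then show whether the definition exists on the cluster.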
kind/bug

All 5 comments

We are running into the same problem after upgrading GKE from 1.11.2-gke.?? to 1.11.3-gke.18.

We have found a thread discussing a fix:
https://issuetracker.google.com/issues/120255782

This fixed it for us.

@eveld Thanks. As this was just a test cluster I already deleted it and created a new one. That fix would have saved me some time, though :) Hopefully they will patch this bug in GKE soon.

Yep, the workaround in that issue is the correct one.

I've also submitted a fix upstream to add this CRD to the addon manager: https://github.com/kubernetes/kubernetes/pull/71682
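For readers who cannot open the tracker link: judging only from the error message and the upstream PR, the missing piece appears to be the BGPConfiguration CRD. The sketch below is not copied from the linked issue. It assumes the field values used in the Calico v3 manifests for Kubernetes 1.11 (`apiextensions.k8s.io/v1beta1`, cluster scope, version `v1`), and the file name is arbitrary, so verify everything against the tracker thread before applying it:

```shell
# Write a CRD manifest for the resource named in the error message.
# Assumption: these field values follow the Calico v3 manifests for
# Kubernetes 1.11; they are not taken from the linked GKE issue.
cat > bgpconfiguration-crd.yaml <<'EOF'
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: bgpconfigurations.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: BGPConfiguration
    plural: bgpconfigurations
    singular: bgpconfiguration
EOF

# Apply it against the affected cluster (commented out; needs cluster access):
# kubectl apply -f bgpconfiguration-crd.yaml
```

If the definition is accepted, the migration in calico-node and calico-typha should be able to store the BGPConfiguration resource on the next restart.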

This has been fixed in upstream GKE, thanks all!
