Calico: Calico readiness and liveness probes fail

Created on 18 Jun 2018 · 20 Comments · Source: projectcalico/calico

It seems that Calico is trying to start the worker node process on the same IPv4 address as the one on the master node, so it fails and errors out. How can I force the worker node process to use a different IPv4 address?

Kube version : 1.10.4

### Describe pod
`kubectl describe pods calico-node-9kftd -n kube-system`
```
Name: calico-node-9kftd
Namespace: kube-system
Node: worker1.k8s/192.168.99.7
Start Time: Sun, 17 Jun 2018 18:49:10 +0530
Labels: controller-revision-hash=1808776410
k8s-app=calico-node
pod-template-generation=1
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 192.168.99.7
Controlled By: DaemonSet/calico-node
Containers:
calico-node:
Container ID: docker://2d88c0d7f10601aef1229e8c79023ce06743fbe5507b39d8b964e7d909ec78c9
Image: quay.io/calico/node:v3.1.3
Image ID: docker-pullable://quay.io/calico/node@sha256:a35541153f7695b38afada46843c64a2c546548cd8c171f402621736c6cf3f0b
Port:
Host Port:
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 18 Jun 2018 10:00:18 +0530
Finished: Mon, 18 Jun 2018 10:00:18 +0530
Ready: False
Restart Count: 23
Requests:
cpu: 250m
Liveness: http-get http://:9099/liveness delay=10s timeout=1s period=10s #success=1 #failure=6
Readiness: http-get http://:9099/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
DATASTORE_TYPE: kubernetes
FELIX_LOGSEVERITYSCREEN: info
CLUSTER_TYPE: k8s,bgp
CALICO_DISABLE_FILE_LOGGING: true
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
FELIX_IPV6SUPPORT: false
FELIX_IPINIPMTU: 1440
WAIT_FOR_DATASTORE: true
CALICO_IPV4POOL_CIDR: 192.168.0.0/16
CALICO_IPV4POOL_IPIP: Always
FELIX_IPINIPENABLED: true
FELIX_TYPHAK8SSERVICENAME: Optional: false
NODENAME: (v1:spec.nodeName)
IP: autodetect
FELIX_HEALTHENABLED: true
Mounts:
/lib/modules from lib-modules (ro)
/var/lib/calico from var-lib-calico (rw)
/var/run/calico from var-run-calico (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-zggt6 (ro)
install-cni:
Container ID: docker://76a0c72b569b99bcb4ad0c82a7b899c4034f258c907befee4dee5154fd6713f8
Image: quay.io/calico/cni:v3.1.3
Image ID: docker-pullable://quay.io/calico/cni@sha256:ed172c28bc193bb09bce6be6ed7dc6bfc85118d55e61d263cee8bbb0fd464a9d
Port:
Host Port:
Command:
/install-cni.sh
State: Running
Started: Mon, 18 Jun 2018 09:48:52 +0530
Ready: True
Restart Count: 2
Environment:
CNI_CONF_NAME: 10-calico.conflist
CNI_NETWORK_CONFIG: Optional: false
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-zggt6 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
var-run-calico:
Type: HostPath (bare host directory volume)
Path: /var/run/calico
HostPathType:
var-lib-calico:
Type: HostPath (bare host directory volume)
Path: /var/lib/calico
HostPathType:
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-net-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
calico-node-token-zggt6:
Type: Secret (a volume populated by a Secret)
SecretName: calico-node-token-zggt6
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: :NoSchedule
:NoExecute
:NoSchedule
:NoExecute
CriticalAddonsOnly
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-net-dir"
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-run-calico"
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-lib-calico"
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "lib-modules"
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-bin-dir"
Normal SuccessfulMountVolume 15h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "calico-node-token-zggt6"
Warning Failed 15h kubelet, worker1.k8s Failed to pull image "quay.io/calico/cni:v3.1.3": rpc error: code = Unknown desc = Error response from daemon: Get https://quay.io/v2/calico/cni/manifests/v3.1.3: Get https://quay.io/v2/auth?scope=repository%3Acalico%2Fcni%3Apull&service=quay.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 15h kubelet, worker1.k8s Error: ErrImagePull
Warning Failed 15h kubelet, worker1.k8s Failed to pull image "quay.io/calico/node:v3.1.3": rpc error: code = Unknown desc = Error response from daemon: Get https://quay.io/v2/calico/node/manifests/v3.1.3: dial tcp 50.17.235.205:443: i/o timeout
Normal Pulling 15h (x2 over 15h) kubelet, worker1.k8s pulling image "quay.io/calico/cni:v3.1.3"
Normal Pulled 15h kubelet, worker1.k8s Successfully pulled image "quay.io/calico/cni:v3.1.3"
Normal Created 15h kubelet, worker1.k8s Created container
Normal Started 15h kubelet, worker1.k8s Started container
Normal Pulling 15h (x3 over 15h) kubelet, worker1.k8s pulling image "quay.io/calico/node:v3.1.3"
Warning Failed 15h (x3 over 15h) kubelet, worker1.k8s Error: ErrImagePull
Warning Failed 15h (x2 over 15h) kubelet, worker1.k8s Failed to pull image "quay.io/calico/node:v3.1.3": rpc error: code = Unknown desc = Error response from daemon: Get https://quay.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 15h (x2 over 15h) kubelet, worker1.k8s Error: ImagePullBackOff
Normal BackOff 15h (x16 over 15h) kubelet, worker1.k8s Back-off pulling image "quay.io/calico/node:v3.1.3"
Normal Pulled 14h (x12 over 14h) kubelet, worker1.k8s Container image "quay.io/calico/node:v3.1.3" already present on machine
Warning BackOff 14h (x121 over 14h) kubelet, worker1.k8s Back-off restarting failed container
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-lib-calico"
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-run-calico"
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-bin-dir"
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-net-dir"
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "lib-modules"
Normal SuccessfulMountVolume 3h kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "calico-node-token-zggt6"
Normal SandboxChanged 3h kubelet, worker1.k8s Pod sandbox changed, it will be killed and re-created.
Normal Pulled 3h kubelet, worker1.k8s Container image "quay.io/calico/cni:v3.1.3" already present on machine
Normal Created 3h kubelet, worker1.k8s Created container
Normal Started 3h kubelet, worker1.k8s Started container
Warning Unhealthy 3h (x2 over 3h) kubelet, worker1.k8s Readiness probe failed: Get http://192.168.99.7:9099/readiness: dial tcp 192.168.99.7:9099: getsockopt: connection refused
Warning Unhealthy 3h (x2 over 3h) kubelet, worker1.k8s Liveness probe failed: Get http://192.168.99.7:9099/liveness: dial tcp 192.168.99.7:9099: getsockopt: connection refused
Normal Started 3h (x2 over 3h) kubelet, worker1.k8s Started container
Normal Pulled 3h (x2 over 3h) kubelet, worker1.k8s Container image "quay.io/calico/node:v3.1.3" already present on machine
Normal Created 3h (x2 over 3h) kubelet, worker1.k8s Created container
Warning BackOff 3h (x47 over 3h) kubelet, worker1.k8s Back-off restarting failed container
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-net-dir"
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-lib-calico"
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "cni-bin-dir"
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "var-run-calico"
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "lib-modules"
Normal SuccessfulMountVolume 12m kubelet, worker1.k8s MountVolume.SetUp succeeded for volume "calico-node-token-zggt6"
Normal SandboxChanged 12m kubelet, worker1.k8s Pod sandbox changed, it will be killed and re-created.
Normal Pulled 12m kubelet, worker1.k8s Container image "quay.io/calico/cni:v3.1.3" already present on machine
Normal Created 12m kubelet, worker1.k8s Created container
Normal Started 12m kubelet, worker1.k8s Started container

Warning Unhealthy 12m (x2 over 12m) kubelet, worker1.k8s Liveness probe failed: Get http://192.168.99.7:9099/liveness: dial tcp 192.168.99.7:9099: getsockopt: connection refused

Warning Unhealthy 11m (x3 over 12m) kubelet, worker1.k8s Readiness probe failed: Get http://192.168.99.7:9099/readiness: dial tcp 192.168.99.7:9099: getsockopt: connection refused

Normal Started 11m (x2 over 12m) kubelet, worker1.k8s Started container
Normal Created 11m (x2 over 12m) kubelet, worker1.k8s Created container
Normal Pulled 11m (x2 over 12m) kubelet, worker1.k8s Container image "quay.io/calico/node:v3.1.3" already present on machine
Warning BackOff 2m (x47 over 11m) kubelet, worker1.k8s Back-off restarting failed container
```

### Container Log:
`kubectl logs calico-node-9kftd -n kube-system -c calico-node`
```
2018-06-18 04:45:36.720 [INFO][9] startup.go 251: Early log level set to info
2018-06-18 04:45:36.720 [INFO][9] startup.go 267: Using NODENAME environment for node name
2018-06-18 04:45:36.720 [INFO][9] startup.go 279: Determined node name: worker1.k8s
2018-06-18 04:45:36.724 [INFO][9] startup.go 302: Checking datastore connection
2018-06-18 04:45:36.754 [INFO][9] startup.go 326: Datastore connection verified
2018-06-18 04:45:36.755 [INFO][9] startup.go 99: Datastore is ready
2018-06-18 04:45:36.783 [INFO][9] startup.go 564: Using autodetected IPv4 address on interface enp0s8: 10.0.3.15/24
2018-06-18 04:45:36.783 [INFO][9] startup.go 432: Node IPv4 changed, will check for conflicts
2018-06-18 04:45:36.798 [WARNING][9] startup.go 861: Calico node 'master' is already using the IPv4 address 10.0.3.15.
2018-06-18 04:45:36.798 [INFO][9] startup.go 205: Clearing out-of-date IPv4 address from this node IP="10.0.3.15/24"
2018-06-18 04:45:36.826 [WARNING][9] startup.go 1058: Terminating
```
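
The warning near the end of the log is the line that names the conflict. A minimal sketch for pulling the conflicting node name and address out of such a log, assuming the message wording matches the v3.1.3 output shown above (the function name `extract_ip_conflicts` is made up for this example):

```shell
# Hypothetical helper: read calico-node log lines on stdin and print
# "<node> <ip>" for every IPv4 conflict warning.
# Assumes the message format seen in the log above.
extract_ip_conflicts() {
  sed -n "s/.*Calico node '\([^']*\)' is already using the IPv4 address \([0-9.]*\)\..*/\1 \2/p"
}

# Example with the warning line from this issue:
printf '%s\n' \
  "2018-06-18 04:45:36.798 [WARNING][9] startup.go 861: Calico node 'master' is already using the IPv4 address 10.0.3.15." \
  | extract_ip_conflicts
```

Against a live pod, the input would come from `kubectl logs calico-node-9kftd -n kube-system -c calico-node` piped into the function.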

Most helpful comment

Had the exact same issue. @tmjd, thanks for the hint.
You need to set the autodetection to use another method suitable for your network, e.g. by adding the following to the Calico YAML:

            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth.*"

This worked for me.

All 20 comments

I am stuck with this issue :( and unable to progress. Any input to resolve this will be really helpful

Is 10.0.3.15 the address you would expect the master and worker to use to communicate over? I'm guessing that is not what you expect.
Do you have multiple interfaces on your hosts? You might need to change the IP Autodetection so that the proper interface/address is selected. https://docs.projectcalico.org/v3.1/reference/node/configuration#ip-autodetection-methods
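
One way to change the method without editing the manifest by hand is `kubectl set env` on the DaemonSet. The sketch below assumes the DaemonSet is named calico-node in the kube-system namespace, as in the pod description above. The wrapper name `set_autodetect` and the `DRY_RUN` switch are made up for this example. By default it only prints the command so it can be reviewed first.

```shell
# Hypothetical wrapper around `kubectl set env`.
# $1 is the autodetection method, e.g. interface=enp0s8.
# With DRY_RUN unset or set to 1, the command is printed, not executed.
set_autodetect() {
  set -- kubectl set env daemonset/calico-node -n kube-system \
    "IP_AUTODETECTION_METHOD=$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print the command that would select enp0s8 on every node:
set_autodetect 'interface=enp0s8'
```

Running `DRY_RUN=0 set_autodetect 'interface=enp0s8'` would apply the change, after which the DaemonSet should roll out new calico-node pods with the updated environment.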

@tmjd

The probe fails against the real IP, but works against 127.0.0.1.

kubectl describe po -n kube-system calico-node-6nkrj

Warning Unhealthy 6s (x2 over 16s) kubelet, master1 Readiness probe failed: Get http://192.168.17.207:9099/readiness: dial tcp 192.168.17.207:9099: connect: connection refused
Warning Unhealthy 0s (x2 over 10s) kubelet, master1 Liveness probe failed: Get http://192.168.17.207:9099/liveness: dial tcp 192.168.17.207:9099: connect: connection refused

root@master1:/etc/kubernetes/addons/calico/etcd# curl http://192.168.17.207:9099/readiness
curl: (7) Failed to connect to 192.168.17.207 port 9099: Connection refused

root@master1:/etc/kubernetes/addons/calico/etcd# curl http://127.0.0.1:9099/readiness

I figured it out.
I'm in hostNetwork mode, so I should set host: 127.0.0.1 for the probes.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes


@lurenjia528 Please include some logs from one of the calico-nodes that the readiness probe is failing against.

@uvnikgupta did you get to the bottom of this?

Thanks to all for the replies.
I was finally able to resolve the issue, thanks to @tmjd for the hint. I had two interfaces on each of my Ubuntu VMs, enp0s3 and enp0s8. The enp0s8 interface had the same IP on all three VMs, hence the calico nodes on the worker nodes were complaining about the IP conflict. To resolve this, I edited my /etc/network/interfaces file and assigned static IPs to the enp0s8 interface. This resolved the problem.
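
A rough way to spot this kind of clash before Calico does is to collect the interface addresses from every VM and look for an IP that appears more than once. The sketch below is only an illustration: the helper name `find_duplicate_ips` and the "host interface address/prefix" input format are assumptions, and the sample data is made up apart from the addresses quoted in this thread.

```shell
# Hypothetical helper: read "host interface ip/prefix" lines on stdin and
# print every IP that shows up more than once, with the hosts that use it.
# Lines in this format could be gathered on each VM with something like:
#   ip -4 -o addr show | awk -v h="$(hostname)" '{print h, $2, $4}'
find_duplicate_ips() {
  awk '{
         split($3, parts, "/")          # strip the prefix length
         ip = parts[1]
         hosts[ip] = hosts[ip] " " $1   # remember which hosts use this IP
         count[ip]++
       }
       END {
         for (ip in count)
           if (count[ip] > 1) print ip ":" hosts[ip]
       }'
}

# Sample data (illustrative): both VMs carry 10.0.3.15 on enp0s8.
printf '%s\n' \
  'master enp0s8 10.0.3.15/24' \
  'worker1 enp0s8 10.0.3.15/24' \
  'worker1 enp0s3 192.168.99.7/24' \
  | find_duplicate_ips
```

Any address reported here cannot serve as a per-node Calico address, so either the interface needs a unique static IP or autodetection has to be pointed at a different interface.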

I even tried the suggestion by @mridup but it did not work for me as it would always discover the enp0s8 and start complaining about the conflicting IP.

@tmjd
I have exactly the same issue. My VMs have both enp0s3 and enp0s8. enp0s3 gets a dynamic IP, so enp0s3 has the same address on both machines (listed below).
I am using the default YAML file from https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml. There were no extra environment settings when I ran 'kubectl apply -f calico.yaml'. This seems to default to 'IP_AUTODETECTION_METHOD=first-found', so enp0s3 was picked up for both calico-node-xxxx pods and both have the same IP address, 10.0.2.15/24. The one on the master works fine, while the one on the node fails its readiness probe with connection refused.
I am looking for a way to nominate an interface when running kubectl apply -f calico.yaml, so that both pods pick up their IP address from enp0s8, which should be 192.168.56.110 for master and 192.168.56.110 for node in my case.
Many thanks for any suggestions.

2: enp0s3: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:77:f6:71 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic enp0s3
valid_lft 85007sec preferred_lft 85007sec
inet6 fe80::8156:9f5c:4c27:3cc7/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp0s8: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:41:a1:22 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.110/24 brd 192.168.56.255 scope global noprefixroute enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::340b:6749:6680:6ea8/64 scope link noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::4929:b71b:11ed:3d86/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever

Have you reviewed the docs at https://docs.projectcalico.org/v3.1/reference/node/configuration#ip-autodetection-methods? Have you tried changing the autodetection method? It sounds like you want to select enp0s8, so you should be able to set IP_AUTODETECTION_METHOD to interface=enp0s8. Though you said the IP for both master and client is 192.168.56.110, which would not work: the IP selected needs to be different for each node, so you will need to update your hosts to have different addresses that they can use to communicate over.

On my two machines, one enp0s8 has 192.168.56.110 and the other has 192.168.56.117; these are the addresses I would like the calico-nodes to use. I have read the link again and again and just don't know how to do it.

  1. I first tried modifying calico.yaml by adding IP_AUTODETECTION_METHOD under IP; I hope this is correct:
    # Auto-detect the BGP IP address.
    - name: IP
      value: "autodetect"
    - name: IP_AUTODETECTION_METHOD
      value: "interface=enp0s8"
  2. I ran the shell command 'export IP_AUTODETECTION_METHOD=interface=enp0s8'.

After either of the above, I ran:
kubectl delete -f calico.yaml
kubectl apply -f calico.yaml

I am still seeing calico-node-xxxx pick up its IP address from enp0s3:
[root@centos7b2 ~]# kubectl describe pods -n kube-system calico-node-mkv8t
Name: calico-node-mkv8t
Namespace: kube-system
Priority: 0
PriorityClassName:
Node: centos7g2/10.0.2.15
Start Time: Mon, 24 Sep 2018 13:43:15 -0700
Labels: controller-revision-hash=1427857993
k8s-app=calico-node
pod-template-generation=1
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 10.0.2.15
Controlled By: DaemonSet/calico-node

More info from the pod description is shown below:

  1. The pod IP comes from enp0s3.
  2. The environment variable IP has the same value as set in the YAML file, matching enp0s8. This means the environment variable either didn't take effect or got overwritten by something else. Could the address be taken from the node IP instead?

[root@centos7b2 ~]# kubectl describe pod/calico-node-plc4j -n kube-system
Name: calico-node-plc4j
Namespace: kube-system
Priority: 0
PriorityClassName:
Node: centos7g2/10.0.2.15
Start Time: Mon, 24 Sep 2018 15:24:22 -0700
Labels: controller-revision-hash=3382383129
k8s-app=calico-node
pod-template-generation=1
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Status: Running
IP: 10.0.2.15
Controlled By: DaemonSet/calico-node
Containers:
calico-node:
Container ID: docker://2eaf30ae5e8d589f7611f0b955bca73529b55d54d21de452247cc653013e7b3a
Image: quay.io/calico/node:v3.1.3
Image ID: docker-pullable://quay.io/calico/node@sha256:a35541153f7695b38afada46843c64a2c546548cd8c171f402621736c6cf3f0b
Port:
Host Port:
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 24 Sep 2018 15:28:26 -0700
Finished: Mon, 24 Sep 2018 15:28:36 -0700
Ready: False
Restart Count: 5
Requests:
cpu: 250m
Liveness: http-get http://:9099/liveness delay=10s timeout=1s period=10s #success=1 #failure=6
Readiness: http-get http://:9099/readiness delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ETCD_ENDPOINTS: Optional: false
CALICO_NETWORKING_BACKEND: Optional: false
CLUSTER_TYPE: kubeadm,bgp
CALICO_DISABLE_FILE_LOGGING: true
CALICO_K8S_NODE_REF: (v1:spec.nodeName)
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
CALICO_IPV4POOL_CIDR: 192.168.0.0/16
CALICO_IPV4POOL_IPIP: Always
FELIX_IPV6SUPPORT: false
FELIX_IPINIPMTU: 1440
FELIX_LOGSEVERITYSCREEN: info
IP: 192.168.56.117
FELIX_HEALTHENABLED: true
Mounts:
/lib/modules from lib-modules (ro)
/var/lib/calico from var-lib-calico (rw)
/var/run/calico from var-run-calico (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-cni-plugin-token-pn5r8 (ro)
install-cni:
Container ID: docker://d6d2ec602652d05d07e34a8e1c9c3a1e029b76617b6ea1d9b30d56caef6ede39
Image: quay.io/calico/cni:v3.1.3
Image ID: docker-pullable://quay.io/calico/cni@sha256:ed172c28bc193bb09bce6be6ed7dc6bfc85118d55e61d263cee8bbb0fd464a9d
Port:
Host Port:
Command:
/install-cni.sh
State: Running
Started: Mon, 24 Sep 2018 15:24:23 -0700
Ready: True
Restart Count: 0
Environment:
CNI_CONF_NAME: 10-calico.conflist
ETCD_ENDPOINTS: Optional: false
CNI_NETWORK_CONFIG: Optional: false
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-cni-plugin-token-pn5r8 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
var-run-calico:
Type: HostPath (bare host directory volume)
Path: /var/run/calico
HostPathType:
var-lib-calico:
Type: HostPath (bare host directory volume)
Path: /var/lib/calico
HostPathType:
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-net-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
calico-cni-plugin-token-pn5r8:
Type: Secret (a volume populated by a Secret)
SecretName: calico-cni-plugin-token-pn5r8
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: :NoSchedule
:NoExecute
CriticalAddonsOnly
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/unreachable:NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started 5m kubelet, centos7g2 Started container
Normal Pulled 5m kubelet, centos7g2 Container image "quay.io/calico/cni:v3.1.3" already present on machine
Normal Created 5m kubelet, centos7g2 Created container
Warning Unhealthy 4m (x2 over 5m) kubelet, centos7g2 Readiness probe failed: Get http://10.0.2.15:9099/readiness: dial tcp 10.0.2.15:9099: connect: connection refused
Warning BackOff 4m (x2 over 4m) kubelet, centos7g2 Back-off restarting failed container
Normal Started 4m (x3 over 5m) kubelet, centos7g2 Started container
Normal Pulled 4m (x3 over 5m) kubelet, centos7g2 Container image "quay.io/calico/node:v3.1.3" already present on machine
Normal Created 4m (x3 over 5m) kubelet, centos7g2 Created container
Warning DNSConfigForming 7s (x37 over 5m) kubelet, centos7g2 Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.1.7.51 10.78.100.8 10.1.7.52

@tmjd
Hi Eric,
Any update? thanks.

    - name: IP
      value: "autodetect"
    - name: IP_AUTODETECTION_METHOD
      value: "interface=enp0s8"

What you have above is what you should have in your calico.yaml.

Is there perhaps another entry in the manifest with IP being set to something else?
I'm guessing there is since when you do a describe on the pod you see IP: 192.168.56.117

@tmjd Thanks for your quick response.
Is there any way to check whether another entry in the manifest sets IP? It is beyond me, as I am not very familiar with the manifest. Thanks.
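
One simple check is to search the manifest for every env entry with that exact name and print the line that follows it. This is a minimal sketch, assuming the manifest uses the usual `- name: X` / `value: Y` layout on consecutive lines; the helper name `show_env_entries` is made up for this example.

```shell
# Hypothetical helper: list every "- name: <VAR>" entry in a manifest,
# with line numbers and the line that follows (normally the value).
# $1 = path to the manifest, $2 = exact env var name.
show_env_entries() {
  grep -n -A1 -E "^[[:space:]]*-[[:space:]]*name:[[:space:]]*$2[[:space:]]*\$" "$1"
}

# Usage, assuming calico.yaml is in the current directory:
#   show_env_entries calico.yaml IP
```

The pattern is anchored at the end of the line, so searching for IP does not also report IP_AUTODETECTION_METHOD. More than one hit for IP in the calico-node container would suggest that a hard-coded address is overriding "autodetect".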

For people like me, that have just started to explore the world of k8s with a bunch of virtual boxes, and actually just want to see something like 2 nginx pods running on 2 different nodes, the network setup has turned out to be a real nightmare. Coming from docker swarm, everything was easy. Now, I see myself digging into iptables and yaml files, that are interconnected and need to be tweaked in a very special way. Don't get me wrong - I am willing to learn whatever is necessary to manage my 3 nodes, but I am also frustrated to be sidetracked by the network, which just needs to know 2 things: where is the cluster, and which addresses can I use for my components.

Hey everyone, thanks for your help, because I was getting hopeless. I followed the same workaround, but in my case I have 3 different nodes: two of them have the interface name "eth0", but my worker node's interface name is "enp3s0f0".
In the calico.yaml file, I added the lines

            - name: IP_AUTODETECTION_METHOD
              value: "can-reach=8.8.8.8"

but still no result.
I also tried the regex as mentioned here, like the following:

            - name: IP_AUTODETECTION_METHOD
              value: "interface=e*"
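
A quick way to see what a pattern such as `e*` selects is to try it against the interface names offline. The sketch below uses `grep -E` as a stand-in for Calico's matcher, so it is only an approximation. It assumes the pattern is applied unanchored, which should be checked against the Calico documentation. The helper name `matching_ifaces` and the list of names are made up for this example.

```shell
# Hypothetical helper: print the interface names (one per line on stdin)
# that an unanchored extended regex matches.
matching_ifaces() {
  grep -E -- "$1"
}

names='lo
eth0
enp3s0f0
docker0'

# 'e*' means "zero or more e", so it matches every name, including lo.
printf '%s\n' "$names" | matching_ifaces 'e*'

# '^e' only matches names that start with e: eth0 and enp3s0f0.
printf '%s\n' "$names" | matching_ifaces '^e'
```

If the matcher really is unanchored, a pattern like `e*` does not narrow the choice at all, which may explain why it made no difference here.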

calico version : 3.9

kubectl version
Client Version:version.Info{
   Major:"1",
   Minor:"17",
   GitVersion:"v1.17.2",
   GitTreeState:"clean",
   GoVersion:"go1.13.5",
   Compiler:"gc",
   Platform:"linux/amd64"
}
Server Version:version.Info{
   Major:"1",
   Minor:"17",
   GitVersion:"v1.17.2",
   GitTreeState:"clean",
   GoVersion:"go1.13.5",
   Compiler:"gc",
   Platform:"linux/amd64"
}

@madmesi you can give multiple interface names like,

            - name: IP_AUTODETECTION_METHOD
              value: "interface=enp8s0,ens192"

Amazing... thank you very much for this. I was running into the same problem with version 1.19.0 and the latest Calico release, and I was wondering what the workaround was. I am deploying on multiple nodes, and some have different interfaces that are listening. Your solution is the only thing that worked for me :)

Adding localhost entries to the /etc/hosts file helped to resolve this issue:

[root@centos7 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
