Linkerd2: All linkerd components in Init:CrashLoopBackOff

Created on 11 Oct 2019  ·  39 Comments  ·  Source: linkerd/linkerd2

I'm seeing the issue below with all Linkerd components, starting roughly 48 hours after pod creation.

Environment:
kubernetes: GitVersion:"v1.13.2"
docker package: docker-1.13.1-103.git7f2769b.el7.centos.x86_64

linkerd version
Client version: edge-19.10.1
Server version: edge-19.10.1

Kubernetes running on virtual machines in Google Cloud Platform, with host OS: CentOS Linux release 7.6.1810 (Core)

kubectl get pod -n linkerd
NAME                                      READY   STATUS                  RESTARTS   AGE
linkerd-controller-69d84c4f8c-nd96z       0/2     Init:CrashLoopBackOff   544        2d
linkerd-destination-77bcd7497c-57gqf      0/2     Init:CrashLoopBackOff   544        2d
linkerd-grafana-69b7c55969-mf4h5          0/2     Init:CrashLoopBackOff   544        2d
linkerd-identity-6b6854c8f7-mcw74         0/2     Init:Error              545        2d
linkerd-prometheus-9d59769cc-rjmf8        0/2     Init:CrashLoopBackOff   545        2d
linkerd-proxy-injector-686fd49d85-p2cfc   0/2     Init:CrashLoopBackOff   544        2d
linkerd-sp-validator-77867c74fd-8zgw7     0/2     Init:CrashLoopBackOff   545        2d
linkerd-tap-6c647878c5-bpc2l              0/2     Init:CrashLoopBackOff   545        2d
linkerd-web-7dc9c4b794-vlhqg              0/2     Init:CrashLoopBackOff   544        2d

Snippet from pod description:

kubectl describe pod linkerd-destination-77bcd7497c-57gqf -n linkerd

---
Init Containers:
  linkerd-init:
    Container ID:  docker://e0cd95a592055a5f8e3a758a324a7706a90f74e44d5f753ff697e7a3a379086b
    Image:         gcr.io/linkerd-io/proxy-init:v1.2.0
    Image ID:      docker-pullable://gcr.io/linkerd-io/proxy-init@sha256:c0174438807cdd711867eb1475fba3dd959d764358de4e5f732177e07a75925b
    Port:          <none>
    Host Port:     <none>
    Args:
      --incoming-proxy-port
      4143
      --outgoing-proxy-port
      4140
      --proxy-uid
      2102
      --inbound-ports-to-ignore
      4190,4191
      --outbound-ports-to-ignore
      443
    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   2019/10/11 12:52:18 < iptables: Too many links.

2019/10/11 12:52:18 Will ignore port 4190 on chain PROXY_INIT_REDIRECT
2019/10/11 12:52:18 Will ignore port 4191 on chain PROXY_INIT_REDIRECT
2019/10/11 12:52:18 Will redirect all INPUT ports to proxy
2019/10/11 12:52:18 > iptables -t nat -F PROXY_INIT_OUTPUT
2019/10/11 12:52:18 <
2019/10/11 12:52:18 > iptables -t nat -X PROXY_INIT_OUTPUT
2019/10/11 12:52:18 < iptables: Too many links.

2019/10/11 12:52:18 Ignoring uid 2102
2019/10/11 12:52:18 Will ignore port 443 on chain PROXY_INIT_OUTPUT
2019/10/11 12:52:18 Redirecting all OUTPUT to 4140
2019/10/11 12:52:18 Executing commands:
2019/10/11 12:52:18 > iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1570798338
2019/10/11 12:52:18 < iptables: Chain already exists.

2019/10/11 12:52:18 Aborting firewall configuration
Error: exit status 1
---

All 39 comments

I assume you're using linkerd-cni?

No, I am using projectcalico's canal.

Please share the full pod yaml, how you installed k8s and the full install command that you used for linkerd.

K8s is being installed / run using hyperkube (https://stackoverflow.com/questions/33953254/what-is-hyperkube)

Linkerd is being installed using commands below:

curl https://run.linkerd.io/install-edge | sh
linkerd install --cluster-domain=licence.local --identity-trust-domain licence.local | kubectl apply -f -

Below is the yaml for a canal pod:

kubectl get pod -n kube-system canal-knr66 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: "2019-10-03T12:35:31Z"
  generateName: canal-
  labels:
    controller-revision-hash: 74fd5c88bb
    k8s-app: canal
    pod-template-generation: "1"
  name: canal-knr66
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: canal
    uid: 8c07d63d-e5d9-11e9-85ef-42010a8c000a
  resourceVersion: "5142287"
  selfLink: /api/v1/namespaces/kube-system/pods/canal-knr66
  uid: 49dbd06d-e5da-11e9-85ef-42010a8c000a
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - ab2bn07
  containers:
  - env:
    - name: DATASTORE_TYPE
      value: kubernetes
    - name: WAIT_FOR_DATASTORE
      value: "true"
    - name: NODENAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: CALICO_NETWORKING_BACKEND
      value: none
    - name: CLUSTER_TYPE
      value: k8s,canal
    - name: FELIX_IPTABLESREFRESHINTERVAL
      value: "60"
    - name: IP
    - name: CALICO_IPV4POOL_CIDR
      value: 192.168.0.0/16
    - name: CALICO_DISABLE_FILE_LOGGING
      value: "true"
    - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
      value: ACCEPT
    - name: FELIX_IPV6SUPPORT
      value: "false"
    - name: FELIX_LOGSEVERITYSCREEN
      value: info
    - name: FELIX_HEALTHENABLED
      value: "true"
    image: quay.io/calico/node:v3.2.3
    imagePullPolicy: Always
    livenessProbe:
      failureThreshold: 6
      httpGet:
        host: localhost
        path: /liveness
        port: 9099
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: calico-node
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: localhost
        path: /readiness
        port: 9099
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      requests:
        cpu: 250m
    securityContext:
      privileged: true
      procMount: Default
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /lib/modules
      name: lib-modules
      readOnly: true
    - mountPath: /var/run/calico
      name: var-run-calico
    - mountPath: /var/lib/calico
      name: var-lib-calico
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: canal-token-wmcp9
      readOnly: true
  - command:
    - /install-cni.sh
    env:
    - name: CNI_CONF_NAME
      value: 10-canal.conflist
    - name: KUBERNETES_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: CNI_NETWORK_CONFIG
      valueFrom:
        configMapKeyRef:
          key: cni_network_config
          name: canal-config
    image: quay.io/calico/cni:v3.2.3
    imagePullPolicy: Always
    name: install-cni
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /host/opt/cni/bin
      name: cni-bin-dir
    - mountPath: /host/etc/cni/net.d
      name: cni-net-dir
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: canal-token-wmcp9
      readOnly: true
  - command:
    - /opt/bin/flanneld
    - --ip-masq
    - --kube-subnet-mgr
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: FLANNELD_IFACE
      valueFrom:
        configMapKeyRef:
          key: canal_iface
          name: canal-config
    - name: FLANNELD_IP_MASQ
      valueFrom:
        configMapKeyRef:
          key: masquerade
          name: canal-config
    image: quay.io/coreos/flannel:v0.9.1
    imagePullPolicy: Always
    name: kube-flannel
    resources: {}
    securityContext:
      privileged: true
      procMount: Default
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /run
      name: run
    - mountPath: /etc/kube-flannel/
      name: flannel-cfg
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: canal-token-wmcp9
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: ab2bn07
  nodeSelector:
    beta.kubernetes.io/os: linux
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: canal
  serviceAccountName: canal
  terminationGracePeriodSeconds: 0
  tolerations:
  - effect: NoSchedule
    operator: Exists
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoExecute
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - hostPath:
      path: /lib/modules
      type: ""
    name: lib-modules
  - hostPath:
      path: /var/run/calico
      type: ""
    name: var-run-calico
  - hostPath:
      path: /var/lib/calico
      type: ""
    name: var-lib-calico
  - hostPath:
      path: /run
      type: ""
    name: run
  - configMap:
      defaultMode: 420
      name: canal-config
    name: flannel-cfg
  - hostPath:
      path: /opt/cni/bin
      type: ""
    name: cni-bin-dir
  - hostPath:
      path: /etc/cni/net.d
      type: ""
    name: cni-net-dir
  - name: canal-token-wmcp9
    secret:
      defaultMode: 420
      secretName: canal-token-wmcp9
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-10-03T12:35:31Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-10-03T14:43:21Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-10-03T14:43:21Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-10-03T12:35:31Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://c93cf934d6d0d27642aa79717a3cf713ae98c1607ec773436e4ef183fa8cf446
    image: quay.io/calico/node:v3.2.3
    imageID: docker-pullable://quay.io/calico/node@sha256:e546014887cd5663cdd199d3a84dfc355cfe704dedfd1b489c799e040e7e828f
    lastState: {}
    name: calico-node
    ready: true
    restartCount: 1
    state:
      running:
        startedAt: "2019-10-03T14:43:07Z"
  - containerID: docker://ae377bdd8663e7735d842ca19e29ed6986fce3894241b13513122cf4c8f8d01f
    image: quay.io/calico/cni:v3.2.3
    imageID: docker-pullable://quay.io/calico/cni@sha256:ae3352d2c5dc1631a82777b4f584655d78089d9c8aa5bcdf535dfc0f5deea87a
    lastState: {}
    name: install-cni
    ready: true
    restartCount: 1
    state:
      running:
        startedAt: "2019-10-03T14:43:12Z"
  - containerID: docker://34f785afb9a04801131074de60cdf3fc3ae8fc9e7e6f57c7f76bec42a92f2e6c
    image: quay.io/coreos/flannel:v0.9.1
    imageID: docker-pullable://quay.io/coreos/flannel@sha256:60d77552f4ebb6ed4f0562876c6e2e0b0e0ab873cb01808f23f55c8adabd1f59
    lastState: {}
    name: kube-flannel
    ready: true
    restartCount: 1
    state:
      running:
        startedAt: "2019-10-03T14:43:18Z"
  hostIP: 10.140.0.27
  phase: Running
  podIP: 10.140.0.27
  qosClass: Burstable
  startTime: "2019-10-03T12:35:31Z"


@jurgengrech Does this 48-hour failure happen only when using hyperkube?

The too many links output is interesting because it means that some resources related to iptables are not being cleaned up. The next time this happens, can you ssh into the VM and get the contents of the iptables rules with the sudo iptables -L command? The output from journalctl -u kubelet might have valuable information as well.

Is there much "pod churn" happening on this cluster? If there is a high frequency of stopping and starting pods, then that will affect iptables.

Finally, can you share the log output from the canal pods? I'd like to see how canal is interacting with the iptables.
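
For anyone gathering the same diagnostics, here is a hedged sketch of the commands involved (the canal pod and container names are taken from the yaml above; substitute your own):

```
# On the affected node: dump the nat table, where the PROXY_INIT_* chains live
sudo iptables -t nat -L -v -n

# Recent kubelet logs on the node
sudo journalctl -u kubelet --since "1 hour ago"

# Logs from the canal pod's containers (pod name from this thread; adjust to yours)
kubectl logs -n kube-system canal-knr66 -c calico-node
kubectl logs -n kube-system canal-knr66 -c kube-flannel
```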

I'm seeing the same issue, though my control plane breaks after 100 seconds, not 2 days. The control plane comes up, everything Ready, then around 90 seconds in, things break one by one, resulting in the same status / linkerd-init log output as posted above.

Kube version: 1.14.8
Linkerd version: stable-2.6.0

Not using linkerd-cni, calico for CNI, installed via linkerd install | kubectl apply -f -. Very minimal pod churn on this cluster.

And actually, when I reinstalled with linkerd-cni enabled, the control plane came up cleanly.

@pinkertonpg where is your cluster running?

The behavior of things breaking around 90 seconds sounds like there are resource limitations.

The linkerd-cni information is interesting as well. Which network overlay are you using?

@cpretzer This cluster is on EC2, four m4.xlarge workers. It's only used for testing, so there are no active workloads running on it that would be using up resources.

I'm just using Calico for networking.

@pinkertonpg have you checked the kubelet logs using journalctl?

I haven't heard of a configuration where linkerd-cni is required. Do you have a PodSecurityPolicy in place that would prevent the proxy-init container from writing iptables rules?

@cpretzer I didn't see anything obvious in the kubelet logs, just the errors from the CrashLoopBackOffs of the control plane.

I don't believe I have an explicit psp preventing it, and the linkerd-linkerd-control-plane psp created via linkerd install seems to give the appropriate capabilities of NET_ADMIN and NET_RAW.
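
For anyone checking the same thing, a quick way to inspect the capabilities granted by that PSP (name taken from the comment above) might be:

```
kubectl get psp linkerd-linkerd-control-plane -o yaml | grep -A 3 allowedCapabilities
```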

All this said, I'm planning on attempting a re-install in a fresh cluster. This particular one could be in an odd state due to recent upgrades/other service mesh testing.

@pinkertonpg sounds good.

Please update this issue when you run on the fresh cluster

I am facing the same issue. Running in HA mode, Kubernetes 1.12.3, Calico.

```
State:       Waiting
  Reason:    CrashLoopBackOff
Last State:  Terminated
  Reason:    Error
  Message:   2019/11/05 00:10:25 < iptables: Too many links.

2019/11/05 00:10:25 Will ignore port 4190 on chain PROXY_INIT_REDIRECT
2019/11/05 00:10:25 Will ignore port 4191 on chain PROXY_INIT_REDIRECT
2019/11/05 00:10:25 Will redirect all INPUT ports to proxy
2019/11/05 00:10:25 > iptables -t nat -F PROXY_INIT_OUTPUT
2019/11/05 00:10:25 <
2019/11/05 00:10:25 > iptables -t nat -X PROXY_INIT_OUTPUT
2019/11/05 00:10:25 < iptables: Too many links.

2019/11/05 00:10:25 Ignoring uid 2102
2019/11/05 00:10:25 Will ignore port 443 on chain PROXY_INIT_OUTPUT
2019/11/05 00:10:25 Redirecting all OUTPUT to 4140
2019/11/05 00:10:25 Executing commands:
2019/11/05 00:10:25 > iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1572912625
2019/11/05 00:10:25 < iptables: Chain already exists.

2019/11/05 00:10:25 Aborting firewall configuration
Error: exit status 1
```

@samlabs821 thanks for adding this information.

Would you mind answering a few questions about your cluster?

  • About how many injected pods are running?
  • Does this error occur at the same time that a deployment is rolled out?
  • Are you running on a managed Kubernetes provider (e.g. GKE, EKS, or AKS)?

I used the default installation that comes with the linkerd CLI, with HA mode enabled.

  • Currently only one pod outside the linkerd namespace is injected, and all pods inside the linkerd namespace are injected
  • I was testing for about a week, and it seems to occur after a while (maybe one day)
  • We are running self-hosted Kubernetes, deployed with kubespray (k8s 1.12.3, Calico, CentOS)

The thing is, the dashboard works fine.

dev-0   Ready    master        327d   v1.12.3
dev-1   Ready    master,node   327d   v1.12.3
dev-2   Ready    master,node   327d   v1.12.3
dev-3   Ready    node          327d   v1.12.3
dev-4   Ready    node          327d   v1.12.3

@samlabs821 @cpretzer I wonder if the too many links error is just a side effect of multiple restarts of the pods, where the restarts are caused by some other errors.

Does kubectl -n linkerd get po show many restarts? Anything interesting in the logs of the control plane pods (linkerd logs --control-plane-component=[controller|identity|destination|prometheus|etc.])? Can you check if your nodes are under memory pressure? Thanks.

UPDATE: Actually, there probably won't be anything in linkerd logs since the crash happens during init.
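
A hedged set of commands for those checks (kubectl top requires metrics-server, or heapster on older clusters):

```
# Restart counts for the control plane pods
kubectl -n linkerd get po

# Node conditions, including MemoryPressure
kubectl describe nodes | grep -A 6 'Conditions:'

# Live resource usage, if a metrics backend is available
kubectl top nodes
```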

Does this kind of error say something?

```
~ ❯❯❯ linkerd logs --control-plane-component=identity | grep ERR
linkerd-identity-5bc5667bc8-dw47f linkerd-proxy ERR! [315604.011711s] linkerd2_proxy::app::errors unexpected error: connection closed before message completed
linkerd-identity-5bc5667bc8-dw47f linkerd-proxy ERR! [315824.011750s] linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
linkerd-identity-5bc5667bc8-dw47f linkerd-proxy ERR! [316504.008836s] linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
linkerd-identity-5bc5667bc8-dpsx6 linkerd-proxy ERR! [319017.458966s] linkerd2_proxy::app::errors unexpected error: connection closed before message completed
linkerd-identity-5bc5667bc8-wvsrb linkerd-proxy ERR! [318693.482984s] linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
linkerd-identity-5bc5667bc8-dw47f linkerd-proxy ERR! [319834.009773s] linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
linkerd-identity-5bc5667bc8-wvsrb linkerd-proxy ERR! [319273.484515s] linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
```

And many errors on prometheus, destination and controller:

```
linkerd-destination-6844d9675-m9kzp linkerd-proxy ERR! [336040.010943s] admin={bg=identity} linkerd2_proxy::app::identity Failed to certify identity: grpc-status: Unknown, grpc-message: "the request could not be dispatched in a timely fashion"
```
Memory and CPU usage seem fine.

Small update: I wanted to reproduce this again on my cluster. I did a clean install in HA mode and injected 12 pods.

I was able to install and run all components; after one day I got:

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
× [prometheus] control plane can talk to Prometheus
    Error calling Prometheus from the control plane: server_error: server error: 502
    see https://linkerd.io/checks/#l5d-api-control-api for hints
~ โฏโฏโฏ kubectl get po -n linkerd
NAME                                      READY   STATUS                  RESTARTS   AGE
linkerd-controller-595b65547f-78b6n       0/3     Init:CrashLoopBackOff   70         18h
linkerd-controller-595b65547f-k7r2v       0/3     Init:CrashLoopBackOff   70         18h
linkerd-controller-595b65547f-kk52d       0/3     Init:CrashLoopBackOff   70         18h
linkerd-destination-5d8f7bf64c-929r5      0/2     Init:CrashLoopBackOff   70         18h
linkerd-destination-5d8f7bf64c-9bs5f      0/2     Init:CrashLoopBackOff   70         18h
linkerd-destination-5d8f7bf64c-n4fv8      0/2     Init:CrashLoopBackOff   70         18h
linkerd-grafana-7bfdd6f4bb-4cqfj          0/2     Init:CrashLoopBackOff   70         18h
linkerd-identity-5db687c589-qg8bd         0/2     Init:CrashLoopBackOff   70         18h
linkerd-identity-5db687c589-sdtqv         0/2     Init:CrashLoopBackOff   70         18h
linkerd-identity-5db687c589-vlcz9         0/2     Init:CrashLoopBackOff   70         18h
linkerd-prometheus-684649fb7c-k9ntk       0/2     Init:CrashLoopBackOff   70         18h
linkerd-proxy-injector-7f699b7fcd-4wzdq   0/2     Init:CrashLoopBackOff   70         18h
linkerd-proxy-injector-7f699b7fcd-ktm8g   0/2     Init:CrashLoopBackOff   70         18h
linkerd-proxy-injector-7f699b7fcd-sqgx4   0/2     Init:CrashLoopBackOff   70         18h
linkerd-sp-validator-7bd9fbcccf-6phtt     0/2     Init:CrashLoopBackOff   70         18h
linkerd-sp-validator-7bd9fbcccf-99kg5     0/2     Init:CrashLoopBackOff   70         18h
linkerd-sp-validator-7bd9fbcccf-99mpz     0/2     Init:CrashLoopBackOff   70         18h
linkerd-tap-647b845cc4-4qfc8              0/2     Init:CrashLoopBackOff   70         18h
linkerd-tap-647b845cc4-96sg5              0/2     Init:CrashLoopBackOff   70         18h
linkerd-tap-647b845cc4-c2qdc              0/2     Init:CrashLoopBackOff   70         18h
linkerd-web-76b8577569-bwcl8              0/2     Init:CrashLoopBackOff   70         18h

and logs:

~ โฏโฏโฏ kubectl logs -f linkerd-prometheus-684649fb7c-k9ntk -n linkerd linkerd-init
2019/11/07 05:34:17 Tracing this script execution as [1573104857]
2019/11/07 05:34:17 State of iptables rules before run:
2019/11/07 05:34:17 > iptables -t nat -vnL
2019/11/07 05:34:17 < Chain PREROUTING (policy ACCEPT 123 packets, 7380 bytes)
 pkts bytes target     prot opt in     out     source               destination
28364 1702K PROXY_INIT_REDIRECT  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* proxy-init/install-proxy-init-prerouting/1573037158 */                                  

Chain INPUT (policy ACCEPT 123 packets, 7380 bytes)
 pkts bytes target     prot opt in     out     source               destination                                                                                                               

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)                                                                                                                                               
 pkts bytes target     prot opt in     out     source               destination                                                                                                               
  604 39256 PROXY_INIT_OUTPUT  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* proxy-init/install-proxy-init-output/1573037158 */                                        

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)                                                                                                                                          
 pkts bytes target     prot opt in     out     source               destination                                                                                                               

Chain PROXY_INIT_OUTPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain PROXY_INIT_REDIRECT (1 references)
 pkts bytes target     prot opt in     out     source               destination         

2019/11/07 05:34:17 > iptables -t nat -F PROXY_INIT_REDIRECT
2019/11/07 05:34:17 < 
2019/11/07 05:34:17 > iptables -t nat -X PROXY_INIT_REDIRECT
2019/11/07 05:34:17 < iptables: Too many links.

2019/11/07 05:34:17 Will ignore port 4190 on chain PROXY_INIT_REDIRECT
2019/11/07 05:34:17 Will ignore port 4191 on chain PROXY_INIT_REDIRECT
2019/11/07 05:34:17 Will redirect all INPUT ports to proxy
2019/11/07 05:34:17 > iptables -t nat -F PROXY_INIT_OUTPUT
2019/11/07 05:34:17 < 
2019/11/07 05:34:17 > iptables -t nat -X PROXY_INIT_OUTPUT
2019/11/07 05:34:17 < iptables: Too many links.

2019/11/07 05:34:17 Ignoring uid 2102
2019/11/07 05:34:17 Will ignore port 443 on chain PROXY_INIT_OUTPUT
2019/11/07 05:34:17 Redirecting all OUTPUT to 4140
2019/11/07 05:34:17 Executing commands:
2019/11/07 05:34:17 > iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1573104857
2019/11/07 05:34:17 < iptables: Chain already exists.

2019/11/07 05:34:17 Aborting firewall configuration
Error: exit status 1
Usage:
  proxy-init [flags]

Flags:
  -h, --help                            help for proxy-init
      --inbound-ports-to-ignore ints    Inbound ports to ignore and not redirect to proxy. This has higher precedence than any other parameters.
  -p, --incoming-proxy-port int         Port to redirect incoming traffic (default -1)
      --netns string                    Optional network namespace in which to run the iptables commands
      --outbound-ports-to-ignore ints   Outbound ports to ignore and not redirect to proxy. This has higher precedence than any other parameters.
  -o, --outgoing-proxy-port int         Port to redirect outgoing traffic (default -1)
  -r, --ports-to-redirect ints          Port to redirect to proxy, if no port is specified then ALL ports are redirected
  -u, --proxy-uid int                   User ID that the proxy is running under. Any traffic coming from this user will be ignored to avoid infinite redirection loops. (default -1)
      --simulate                        Don't execute any command, just print what would be executed
  -w, --use-wait-flag                   Appends the "-w" flag to the iptables commands

@samlabs821 Thanks for this information, I believe these are the relevant parts:

2019/11/07 05:34:17 > iptables -t nat -F PROXY_INIT_REDIRECT
2019/11/07 05:34:17 <
2019/11/07 05:34:17 > iptables -t nat -X PROXY_INIT_REDIRECT
2019/11/07 05:34:17 < iptables: Too many links.
...
2019/11/07 05:34:17 > iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1573104857
2019/11/07 05:34:17 < iptables: Chain already exists.

In [iptables.go](https://github.com/linkerd/linkerd2-proxy-init/blob/master/iptables/iptables.go#L86), the addIncomingTrafficRules function attempts to flush and delete the PROXY_INIT_REDIRECT chain, then create a new PROXY_INIT_REDIRECT chain.

I suspect that the Chain already exists message is displayed because the chain was not deleted, which is what the Too many links message indicates. If a chain still has references, iptables won't delete it. In fact, we can see that the PREROUTING chain references PROXY_INIT_REDIRECT:

Chain PREROUTING (policy ACCEPT 123 packets, 7380 bytes)
 pkts bytes target     prot opt in     out     source               destination
28364 1702K PROXY_INIT_REDIRECT  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* proxy-init/install-proxy-init-prerouting/1573037158 */                                  

We'll have to dig some more on this to figure out why that reference still exists in the PREROUTING chain.
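
For context, iptables refuses to delete (-X) a chain that is still referenced; the jump rule has to be removed first. A minimal sketch of the required order (the rule number below is hypothetical):

```
# Show the PREROUTING rules with their numbers
iptables -t nat -L PREROUTING --line-numbers -n

# Delete the jump to PROXY_INIT_REDIRECT first (rule number 1 is hypothetical)
iptables -t nat -D PREROUTING 1

# Only then can the chain be flushed and deleted
iptables -t nat -F PROXY_INIT_REDIRECT
iptables -t nat -X PROXY_INIT_REDIRECT
```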

In the meantime, I noticed that you are using a multi-master cluster. Have you been able to reproduce this with a single master?

I'm going to have to set up a test environment with Calico and multi-master, depending on whether you can reproduce this with a single master.

When I said clean install, I meant Linkerd; the cluster itself was unchanged. I found that my OS is the same as @jurgengrech's, CentOS Linux release 7.5.1804 (Core), and hyperkube is used.

@jurgengrech How did you fix the issue? Any updates?

I built an init container with extended logs and was able to reproduce it again:

```
Tracing this script execution as [1574303985]

current state

:; iptables-save

Generated by iptables-save v1.6.0 on Thu Nov 21 02:39:45 2019

*mangle
:PREROUTING ACCEPT [252572:84112530]
:INPUT ACCEPT [252572:84112530]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [238790:90772525]
:POSTROUTING ACCEPT [238790:90772525]
COMMIT

Completed on Thu Nov 21 02:39:45 2019

Generated by iptables-save v1.6.0 on Thu Nov 21 02:39:45 2019

*raw
:PREROUTING ACCEPT [252572:84112530]
:OUTPUT ACCEPT [238790:90772525]
COMMIT

Completed on Thu Nov 21 02:39:45 2019

Generated by iptables-save v1.6.0 on Thu Nov 21 02:39:45 2019

*nat
:PREROUTING ACCEPT [122:7320]
:INPUT ACCEPT [122:7320]
:OUTPUT ACCEPT [3:180]
:POSTROUTING ACCEPT [3:180]
:PROXY_INIT_OUTPUT - [0:0]
:PROXY_INIT_REDIRECT - [0:0]
-A PREROUTING -m comment --comment "proxy-init/install-proxy-init-prerouting/1574237316" -j PROXY_INIT_REDIRECT
-A OUTPUT -m comment --comment "proxy-init/install-proxy-init-output/1574237316" -j PROXY_INIT_OUTPUT
COMMIT

Completed on Thu Nov 21 02:39:45 2019

Generated by iptables-save v1.6.0 on Thu Nov 21 02:39:45 2019

*filter
:INPUT ACCEPT [252572:84112530]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [238790:90772525]
COMMIT

Completed on Thu Nov 21 02:39:45 2019

cleanup

:; iptables -t nat -D OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-prerouting/1574303985
iptables: No chain/target/match by that name.

:; iptables -t nat -D PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting/1574303985
iptables: No chain/target/match by that name.

:; iptables -t nat -F PROXY_INIT_OUTPUT
:; iptables -t nat -X PROXY_INIT_OUTPUT
iptables: Too many links.

:; iptables -t nat -F PROXY_INIT_REDIRECT
:; iptables -t nat -X PROXY_INIT_REDIRECT
iptables: Too many links.

configuration

Will ignore port 4190 on chain PROXY_INIT_REDIRECT
Will ignore port 4191 on chain PROXY_INIT_REDIRECT
Will redirect all INPUT ports to proxy
Ignoring uid 2102
Will ignore port 443 on chain PROXY_INIT_OUTPUT
Redirecting all OUTPUT to 4140

adding rules

:; iptables -t nat -N PROXY_INIT_REDIRECT -m comment --comment proxy-init/redirect-common-chain/1574303985
Error: exit status 1
iptables: Chain already exists.

Aborting firewall configuration
Usage:
proxy-init [flags]

Flags:
-h, --help help for proxy-init
--inbound-ports-to-ignore ints Inbound ports to ignore and not redirect to proxy. This has higher precedence than any other parameters.
-p, --incoming-proxy-port int Port to redirect incoming traffic (default -1)
--netns string Optional network namespace in which to run the iptables commands
--outbound-ports-to-ignore ints Outbound ports to ignore and not redirect to proxy. This has higher precedence than any other parameters.
-o, --outgoing-proxy-port int Port to redirect outgoing traffic (default -1)
-r, --ports-to-redirect ints Port to redirect to proxy, if no port is specified then ALL ports are redirected
-u, --proxy-uid int User ID that the proxy is running under. Any traffic coming from this user will be ignored to avoid infinite redirection loops. (default -1)
--simulate Don't execute any command, just print what would be executed
-w, --use-wait-flag Appends the "-w" flag to the iptables commands
```

@samlabs821 thanks for this!

We've got a pull request I hope will address this: https://github.com/linkerd/linkerd2-proxy-init/pull/4

I plan to test the changes soon to merge them into the next linkerd edge and get this resolved

Yep, I am testing his PR right now; you can see it if you look carefully at the logs.

@samlabs821 I totally missed that you had built the container with the changes in that PR; that's super helpful.

After reading through the PR, we expect that the PROXY_INIT_REDIRECT chain will have been flushed and deleted, but the output shows that both the PROXY_INIT_OUTPUT and PROXY_INIT_REDIRECT chains still have links.

I'll have a look into the command that will tell us which links still reference the chain. If we can add that into the code, it will help handle the case where the chains are not deleted.

What do you think @grampelberg ?
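
One hedged way to list which rules still jump to the chains is to dump the ruleset in rule-spec form:

```
# Any "-A <chain> ... -j PROXY_INIT_*" lines are the remaining references
iptables -t nat -S | grep PROXY_INIT
```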

:; iptables -t nat -D OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-prerouting/1574303985

Should have deleted the chains. I must be missing something about iptables and need to take a look again.

Is there anything special with this cluster that you can think of? My understanding is that init containers only run once, when the pod is first scheduled/started. In that state, the network namespace should be completely empty (but has two rules already there, obviously from a previous run). I'm starting to think that we should be spending more time looking at why the pods are restarting after 24 hours as the iptables issue might just be a symptom of something more fundamentally wrong with the cluster's configuration.

Found a similar issue in Istio: https://github.com/istio/istio/issues/16768

Thank you guys, I think I found the problem: my kube nodes were cleaning unused images every night. But can we fix the problem with cleaning up old firewall rules anyway?

~ crontab -l
0 0 * * * yes | docker system prune -a --volumes

Yup! https://github.com/linkerd/linkerd2-proxy-init/pull/4 needs a little more tweaking.

Found a similar issue in Istio: istio/istio#16768

That points to kubernetes/kubernetes#67261, whose cause was indeed having an external process clean up containers instead of having kubelet do its GC, which isn't recommended.
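
For reference, kubelet's built-in garbage collection can replace an external prune job; a sketch of the relevant kubelet flags (the threshold values here are illustrative, not recommendations):

```
# Illustrative kubelet flags:
#   --image-gc-high-threshold: disk usage percent at which image GC starts
#   --image-gc-low-threshold:  disk usage percent image GC tries to free down to
#   --maximum-dead-containers: global cap on retained dead containers
kubelet --image-gc-high-threshold=85 \
        --image-gc-low-threshold=80 \
        --maximum-dead-containers=100
```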

I also agree with that Istio issue, where deletion of existing rules is skipped to avoid downtime when recreating them; if the rules are detected to already be there, we could just leave them alone.

@alpeb I'm reasonably sure that the rules I've been seeing would result in a broken install. @adleong and I think that erroring out hard is a better solution. That way folks know they're doing something that they shouldn't be.

I think we probably could just leave the rules alone if we detected them and things would "just work" but...

  1. erroring out can be helpful for detecting a larger problem: why is the init-container running more than once?
  2. it introduces some weird edge cases: what if there are rules that look like the Linkerd rules? What if there are older Linkerd rules from a previous version? Failing fast at the first sign of trouble is the safest thing to do from a debugging standpoint (see the sketch below).
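
A minimal shell sketch of that fail-fast idea (not the actual PR code):

```
# Abort early if a previous run left the chain behind, instead of trying
# to flush and recreate it
if iptables -t nat -nL PROXY_INIT_REDIRECT >/dev/null 2>&1; then
  echo "PROXY_INIT_REDIRECT already exists; refusing to reconfigure" >&2
  exit 1
fi
iptables -t nat -N PROXY_INIT_REDIRECT
```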

Do we have a workaround while we are waiting for the long-term solution, guys?

@ptualek the long-term solution is fixing the underlying problem: letting the kubelet do garbage collection instead of a separate script. We can't fix this problem from a linkerd perspective.

@ptualek sorry that we can't do anything from a linkerd perspective.

I'm going to close this for now; please reopen or submit a new issue if you want to add anything

@samlabs821
Thank you guys, I think I found the problem: my kube nodes were cleaning unused images every night. But can we fix the problem with cleaning up old firewall rules anyway?

I think we faced the same issue immediately after enabling a cron command with docker system prune inside. Could someone explain why this is happening? How is the problem related to cleaned Docker resources? Is there a Docker network or something that must not be cleaned?

@kivagant-ba someone referenced a k8s issue: https://github.com/kubernetes/kubernetes/issues/67261
It means Docker images should not be cleaned externally; otherwise containers will keep running after cleanup.

andyzhangx commented on Dec 11, 2018
close this issue since it's already resolved

Thank you, @samlabs821 .

Do you know, by any chance, which Kubernetes version includes the fix?

@kivagant-ba looking through the kubernetes issue, it looks like it wasn't fixed.

My take after reading the issue is that kubelet should be relied upon to remove terminated containers. Said another way, when a process external to kubelet (e.g. docker prune) removes a terminated init container, this behavior can occur.

@jurgengrech @pinkertonpg @samlabs821, can you all let us know whether you have an external process that removes terminated containers from your clusters?
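
A hedged way to check a node for such external cleanup jobs:

```
# Look for prune jobs in user and system crontabs
crontab -l 2>/dev/null | grep -i prune
grep -ri prune /etc/cron* 2>/dev/null

# Check for systemd timers that might run cleanup scripts
systemctl list-timers --all
```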

I had a hunch at some point that this might be related to docker running with live-restore.
Never really did completely rule that out :|

Closing because it appears this specific case is a k8s bug.
