Kubespray: enabling net.bridge.bridge-nf-call-iptables not working on CentOS 7

Created on 3 Nov 2017 · 6 Comments · Source: kubernetes-sigs/kubespray

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Environment:

  • Cloud provider or hardware configuration:
    hardware configuration
  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 3.10.0-514.16.1.el7.x86_64 x86_64
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Version of Ansible (ansible --version):
    ansible 2.4.1.0
    ansible python module location = /usr/lib/python2.7/site-packages/ansible
    python version = 2.7.5 (default, Aug 4 2017, 00:39:18) [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]

Kubespray version (commit) (git rev-parse --short HEAD):
ba0a03a

Network plugin used:
flannel_backend_type: host-gw

Copy of your inventory file:

Command used to invoke ansible:

Output of ansible run:

Anything else do we need to know:

The issue:
The "Enable bridge-nf-call tables" task in roles/kubernetes/node/tasks/main.yml does not run on my default CentOS 7.4 install.
This breaks any bridge-based Kubernetes networking that relies on the iptables kube-proxy (flannel in my case).

This can be fixed by changing the task's execution conditions, modinfo_br_netfilter.rc == 1 in particular. A better fix might be a variable that enables the sysctl, or automatic detection of the network backends that require it.
I'd rather kubespray fail if it can't load br_netfilter or set net.bridge.bridge-nf-call-iptables than have the current situation, where kubespray succeeds but the Kubernetes network is broken.
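
The fail-fast behaviour asked for above could look roughly like the following shell sketch. It is not kubespray code: the function names and the PROC_BRIDGE_DIR override are hypothetical and exist only for illustration. The sketch assumes that the net.bridge keys appear under /proc/sys/net/bridge once br_netfilter is loaded, which matches the "cannot stat" error reported later in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of kubespray. PROC_BRIDGE_DIR is an
# illustrative override so the check can be pointed at a scratch directory.
PROC_BRIDGE_DIR="${PROC_BRIDGE_DIR:-/proc/sys/net/bridge}"

# Return 0 only if the kernel currently exposes bridge-nf-call-iptables.
bridge_key_present() {
    [ -e "$PROC_BRIDGE_DIR/bridge-nf-call-iptables" ]
}

# Load br_netfilter if the key is missing, enable the key, and return
# non-zero (with a message on stderr) instead of silently continuing.
ensure_bridge_netfilter() {
    if ! bridge_key_present; then
        modprobe br_netfilter || { echo "cannot load br_netfilter" >&2; return 1; }
    fi
    bridge_key_present || { echo "net.bridge keys still missing" >&2; return 1; }
    sysctl -w net.bridge.bridge-nf-call-iptables=1 >/dev/null || return 1
}
```

On a real node, ensure_bridge_netfilter would have to run as root. Only bridge_key_present can be exercised without privileges.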

All 6 comments

We're seeing this exact same issue on our CentOS 7.3 implementation as well. The issue is with the line of code you mentioned located at:
https://github.com/kubernetes-incubator/kubespray/blob/a6975c18506f533ba9e67c7fb233e9898b77502b/roles/kubernetes/node/tasks/main.yml#L101
modinfo_br_netfilter.rc == 1 needs to be changed to modinfo_br_netfilter.rc == 0 to function correctly.

For us, this initially presented itself as cluster DNS issues since Pods were unable to perform lookups on kube-dns running on the same host. Since the bridge netfilter isn't being correctly enabled, it results in Pods being unable to communicate with other Pods on the same host.

This is a breaking bug for anyone using CentOS 7 and Flannel.
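
To check which value modinfo_br_netfilter.rc would take on a particular host, it may help to print the exit status directly. The helper below is a minimal sketch with a hypothetical name (rc_of). It relies only on modinfo exiting with 0 when it finds the module and non-zero otherwise.

```shell
#!/bin/sh
# rc_of CMD [ARGS...]: run CMD with its output discarded and print only
# its exit status. Hypothetical diagnostic helper, not part of kubespray.
rc_of() {
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    echo "$rc"
}

# Example calls on a node (left commented out so the sketch has no side effects):
#   rc_of modinfo br_netfilter
#   rc_of sysctl net.bridge.bridge-nf-call-iptables
```

Note that a shell reports 127 for a command it cannot find, so the numbers printed here are not guaranteed to match what Ansible registers in every failure case.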

@paravz @nickkeyzer @Atoms Can you explain the resolution to this issue for me?
I'm having the issue with CentOS 7 master where the WS2019 worker pods can't connect to the kube-dns unless firewalld is stopped.

The recommended sudo sysctl net.bridge.bridge-nf-call-iptables=1 command fails with sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory.
I tried editing /etc/sysctl.conf as per https://github.com/kubernetes/kubernetes/issues/33798#issuecomment-251661435 and rebooting but it's still not working. As soon as I sudo systemctl stop firewalld the resolution starts working. Adding DNS service and 53/tcp and 53/udp to the firewall zone had no effect either.

try to modprobe br_netfilter in your centos, then sysctl should work

> try to modprobe br_netfilter in your centos, then sysctl should work

I did, and the command reports success, but it doesn't add the setting to /etc/sysctl.conf, so I ended up modifying the file directly. It also didn't resolve my underlying issue: my worker pods can't access kube-dns unless firewalld is disabled on the master.

Worth noting (not sure if it's required): I also created /etc/modules-load.d/br_netfilter.conf to load the module at boot.

I'm not using kubespray, I used kubeadm, so this is off topic. If you have any suggestions, however, I'd appreciate them. I created an MS forum thread with more details.
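
The persistence steps described in this comment (loading the module at boot and keeping the sysctl values across reboots) could be sketched as below. Neither modprobe nor sysctl -w writes to configuration files, so both settings need a file of their own. The function name, the ROOT argument, and the file name 99-bridge-nf.conf are assumptions made for illustration. Only /etc/modules-load.d/br_netfilter.conf comes from the thread.

```shell
#!/bin/sh
# Sketch only: write_bridge_persistence, ROOT, and 99-bridge-nf.conf are
# illustrative names, not something kubespray or this thread prescribes.

# write_bridge_persistence ROOT
# Writes the boot-time module list and sysctl settings under ROOT.
# Pass "" on a real host (as root) or a scratch directory for a dry run.
write_bridge_persistence() {
    root="$1"
    mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d" || return 1
    # modules-load.d files list one kernel module name per line.
    printf '%s\n' br_netfilter > "$root/etc/modules-load.d/br_netfilter.conf" || return 1
    # The same three keys that the kubespray sysctl task sets.
    for key in iptables arptables ip6tables; do
        printf 'net.bridge.bridge-nf-call-%s = 1\n' "$key"
    done > "$root/etc/sysctl.d/99-bridge-nf.conf"
}
```

After writing the files on a real host, running modprobe br_netfilter followed by sysctl --system as root should apply them without a reboot.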

Still experiencing this on CentOS 7.7 with kubespray v2.12.1, running Ansible with the following command:

ansible-playbook test.yml -b -u centos -i <hostfile> --private-key=<key-file>


The issue we see is related to the following Ansible task:
https://github.com/kubernetes-sigs/kubespray/blob/v2.12.1/roles/kubernetes/node/tasks/main.yml#L73

# kube-proxy needs net.bridge.bridge-nf-call-iptables enabled when found if br_netfilter is not a module
    - name: Check if bridge-nf-call-iptables key exists
      command: "sysctl net.bridge.bridge-nf-call-iptables"
      failed_when: false
      changed_when: false
      register: sysctl_bridge_nf_call_iptables


Running this task returns the following (output for single host):

ok: [p3k8stestwk003] => {
    "msg": {
        "changed": false,
        "cmd": "sysctl net.bridge.bridge-nf-call-iptables",
        "failed": false,
        "failed_when_result": false,
        "msg": "[Errno 2] No such file or directory",
        "rc": 2
    }
}


In the output above, rc is 2, so the condition sysctl_bridge_nf_call_iptables.rc == 0 is false and the following task is skipped:
https://github.com/kubernetes-sigs/kubespray/blob/v2.12.1/roles/kubernetes/node/tasks/main.yml#L80

- name: Enable bridge-nf-call tables
  sysctl:
    name: "{{ item }}"
    state: present
    sysctl_file: "{{ sysctl_file_path }}"
    value: "1"
    reload: yes
  when: sysctl_bridge_nf_call_iptables.rc == 0
  with_items:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-arptables
    - net.bridge.bridge-nf-call-ip6tables




I was able to get a valid response by changing the command in the check task above from:
sysctl net.bridge.bridge-nf-call-iptables
to
/usr/sbin/sysctl net.bridge.bridge-nf-call-iptables

ok: [p3k8stestwk003] => {
    "msg": {
        "changed": false,
        "cmd": [
            "/usr/sbin/sysctl",
            "net.bridge.bridge-nf-call-iptables"
        ],
        "delta": "0:00:00.008156",
        "end": "2020-03-26 11:58:25.959879",
        "failed": false,
        "failed_when_result": false,
        "rc": 0,
        "start": "2020-03-26 11:58:25.951723",
        "stderr": "",
        "stderr_lines": [],
        "stdout": "net.bridge.bridge-nf-call-iptables = 0",
        "stdout_lines": [
            "net.bridge.bridge-nf-call-iptables = 0"
        ]
    }
}

I'm not a heavy Linux user, so I'm not sure whether the /usr/sbin/sysctl location is CentOS-only, but I'm happy to take a stab at a pull request if that would help.
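
One way to avoid hard-coding /usr/sbin/sysctl is to look the binary up first. The sketch below assumes the failure comes from the sbin directories being absent from PATH in the session Ansible uses, which the thread does not confirm. find_tool is a hypothetical helper name.

```shell
#!/bin/sh
# find_tool NAME: print the path of NAME, looking in PATH first and then in
# the sbin directories that some non-login sessions leave out of PATH.
# Hypothetical helper for illustration only.
find_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
        return 0
    fi
    for dir in /usr/sbin /sbin /usr/local/sbin; do
        if [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    return 1
}

# Example call on a node (commented out so the sketch has no side effects):
#   "$(find_tool sysctl)" net.bridge.bridge-nf-call-iptables
```

Within the playbook itself, the equivalent choice would be between an absolute path in the command task and extending PATH for that task. Which of the two suits kubespray better is left open here.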

Happens on CentOS 7.6; okay on CentOS 7.7. Lesson: use an updated OS when installing Kubernetes.
