Describe the bug
I set up k3s via services.k3s.enable = true. DNS was not working in the cluster.
To further debug, I followed the instructions on: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
With the following results:
# k3s kubectl exec -ti dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5
# k3s kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; reply from unexpected source: 10.42.0.3#53, expected 10.43.0.10#53
;; reply from unexpected source: 10.42.0.3#53, expected 10.43.0.10#53
;; reply from unexpected source: 10.42.0.3#53, expected 10.43.0.10#53
;; connection timed out; no servers could be reached
command terminated with exit code 1
Additional context
The host machine uses a static IP address configuration with DNS servers "8.8.8.8" and "8.8.4.4".
Notify maintainers
Maintainer information:
# a list of nixpkgs attributes affected by the problem
attribute:
# a list of nixos modules affected by the problem
module:
- k3s
- services.k3s
There are actually a couple of issues with networking and the k3s package:

1. The issue reported above, which is solved by `modprobe br_netfilter`. This needs to be added to `boot.kernelModules`.
2. If the firewall is off, `ip_conntrack` is not automatically loaded. k3s tries to activate it but can fail. @DavHau provided a fix in #98743. (It might also be useful to add `iptables` to the list of dependencies: having the iptables utility makes networking much easier to debug, and if the firewall is disabled, it is uninstalled.)
3. The k3s logs have messages about kube-proxy wanting to activate `ip_vs`, `ip_vs_rr`, `ip_vs_wrr`, and `ip_vs_sh`, with messages similar to `Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules`. I activated these just in case, as k3s does not run kube-proxy inside a container.
4. Finally, when I tried to use the prebuilt k3s binary, I also had to `modprobe overlay`, as containerd depended on it. I think this isn't an issue here, as containerd seems to activate it correctly, but I added it to my `boot.kernelModules` regardless, just in case.
The k3s derivation as a whole works nicely and is much better than naively using the prebuilt k3s binary. There are just a couple of edge cases with networking that mostly boil down to kernel module issues.
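Pulling the module-related workarounds above together, a minimal NixOS sketch might look like this (module list taken from the points above; whether you need all of them depends on your setup):

```nix
{
  # Kernel modules k3s and kube-proxy expect to be available:
  # br_netfilter fixes the DNS issue reported above, ip_conntrack
  # covers the firewall-off case, the ip_vs* modules silence the
  # kube-proxy warnings, and overlay is needed by containerd.
  boot.kernelModules = [
    "br_netfilter"
    "ip_conntrack"
    "ip_vs"
    "ip_vs_rr"
    "ip_vs_wrr"
    "ip_vs_sh"
    "overlay"
  ];
}
```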
/cc @euank
I can confirm that 1 resolves the issue described here. I ran into this a couple of weeks ago when building out a multi-node k3s cluster.
Ultimately, I also had no networking between pods across nodes with the vxlan and host-gw flannel backends, which led me to scrap my cluster. I also saw the module error in 3 and wonder if that was my original issue.
Adding br_netfilter to boot.kernelModules does not seem to resolve it for me, unfortunately, although the issue looks quite similar. For some reason no inter-pod communication is possible, which seems strange to me, since I haven't done anything special in my configuration. If more information would help, please ask. I've been trying for quite a while without finding a solution anywhere, so I think I need some help. I'm on the 20.09 channel, by the way.
UPDATE: I think it has something to do with the firewall rules; after disabling the firewall, it seems to work. So for people wondering why it won't work: try setting networking.firewall.enable = false; and letting the cluster start up. Once all pods in kube-system are running, the firewall can be enabled again. Still, this might be something to look at: it seems that k3s adds some rules to the firewall during startup which prevent pods from connecting to each other. For documentation purposes I've added the offending firewall rules, which blocked pods from connecting, at the bottom of this comment.
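For reference, the temporary workaround described above corresponds to this NixOS option (a diagnostic step only, not a fix; set it back to true once the kube-system pods are running):

```nix
{
  # Temporary workaround: disable the firewall while k3s
  # bootstraps, then re-enable it (true is the default).
  networking.firewall.enable = false;
}
```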
I have error messages of the same format:
$ k3s kubectl exec -ti dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5
$ k3s kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
$ kubectl logs --namespace=kube-system -l k8s-app=kube-dns
E1024 10:32:57.848690 1 reflector.go:125] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:98: Failed to list *v1.Service: Get https://10.43.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.43.0.1:443: i/o timeout
I1024 10:32:57.850469 1 trace.go:82] Trace[1556590182]: "Reflector pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:98 ListAndWatch" (started: 2020-10-24 10:32:27.849799857 +0000 UTC m=+69691.612882863) (total time: 30.000607943s):
Trace[1556590182]: [30.000607943s] [30.000607943s] END
E1024 10:32:57.850498 1 reflector.go:125] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: Get https://10.43.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.43.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
$ kubectl logs -n kube-system -f helm-install-traefik-t5dw2
CHART=$(sed -e "s/%{KUBERNETES_API}%/${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/g" <<< "${CHART}")
set +v -x
+ cp /var/run/secrets/kubernetes.io/serviceaccount/ca.crt /usr/local/share/ca-certificates/
+ update-ca-certificates
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
+ export HELM_HOST=127.0.0.1:44134
+ HELM_HOST=127.0.0.1:44134
+ helm_v2 init --skip-refresh --client-only
+ tiller --listen=127.0.0.1:44134 --storage=secret
Creating /root/.helm
Creating /root/.helm/repository
Creating /root/.helm/repository/cache
Creating /root/.helm/repository/local
Creating /root/.helm/plugins
Creating /root/.helm/starters
Creating /root/.helm/cache/archive
Creating /root/.helm/repository/repositories.yaml
Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
Adding local repo with URL: http://127.0.0.1:8879/charts
$HELM_HOME has been configured at /root/.helm.
Not installing Tiller due to 'client-only' flag having been set
Happy Helming!
++ helm_v2 ls --all '^traefik$' --output json
++ jq -r '.Releases | length'
[main] 2020/10/24 10:40:01 Starting Tiller v2.12.3 (tls=false)
[main] 2020/10/24 10:40:01 GRPC listening on 127.0.0.1:44134
[main] 2020/10/24 10:40:01 Probes listening on :44135
[main] 2020/10/24 10:40:01 Storage driver is Secret
[main] 2020/10/24 10:40:01 Max history per release is 0
[storage] 2020/10/24 10:40:01 listing all releases with filter
[storage/driver] 2020/10/24 10:40:31 list: failed to list: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%3DTILLER: dial tcp 10.43.0.1:443: i/o timeout
Error: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%!D(MISSING)TILLER: dial tcp 10.43.0.1:443: i/o timeout
+ EXIST=
+ '[' '' == 1 ']'
+ '[' '' == v2 ']'
+ helm_repo_init
+ grep -q -e 'https\?://'
chart path is a url, skipping repo update
+ echo 'chart path is a url, skipping repo update'
+ helm_v3 repo remove stable
Error: no repositories configured
+ true
+ return
+ helm_update install
+ '[' helm_v3 == helm_v3 ']'
++ helm_v3 ls --all-namespaces --all -f '^traefik$' --output json
++ tr '[:upper:]' '[:lower:]'
++ jq -r '"\(.[0].app_version),\(.[0].status)"'
+ LINE=null,null
++ echo null,null
++ cut -f1 -d,
+ INSTALLED_VERSION=null
++ echo null,null
++ cut -f2 -d,
+ STATUS=null
+ '[' -e /config/values.yaml ']'
+ VALUES='--values /config/values.yaml'
+ '[' install = delete ']'
+ '[' -z null ']'
+ '[' null = deployed ']'
+ '[' null = failed ']'
+ '[' null = deleted ']'
+ helm_v3 install traefik https://10.43.0.1:443/static/charts/traefik-1.81.0.tgz --values /config/values.yaml
Error: failed to download "https://10.43.0.1:443/static/charts/traefik-1.81.0.tgz" (hint: running `helm repo update` may help)
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
nixos-fw all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- anywhere anywhere /* kubernetes forwarding rules */
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
ACCEPT all -- nixos-server/16 anywhere
ACCEPT all -- anywhere nixos-server/16
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain KUBE-EXTERNAL-SERVICES (1 references)
target prot opt source destination
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- anywhere anywhere /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
Chain KUBE-FORWARD (1 references)
target prot opt source destination
DROP all -- anywhere anywhere ctstate INVALID
ACCEPT all -- anywhere anywhere /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain KUBE-PROXY-CANARY (0 references)
target prot opt source destination
Chain KUBE-SERVICES (3 references)
target prot opt source destination
REJECT tcp -- anywhere 10.43.0.10 /* kube-system/kube-dns:dns-tcp has no endpoints */ tcp dpt:domain reject-with icmp-port-unreachable
REJECT tcp -- anywhere 10.43.0.10 /* kube-system/kube-dns:metrics has no endpoints */ tcp dpt:9153 reject-with icmp-port-unreachable
REJECT udp -- anywhere 10.43.0.10 /* kube-system/kube-dns:dns has no endpoints */ udp dpt:domain reject-with icmp-port-unreachable
Chain nixos-fw (1 references)
target prot opt source destination
nixos-fw-accept all -- anywhere anywhere
nixos-fw-accept all -- anywhere anywhere ctstate RELATED,ESTABLISHED
nixos-fw-accept tcp -- anywhere anywhere tcp dpt:ssh
nixos-fw-accept udp -- anywhere anywhere udp dpt:ssh
nixos-fw-accept udp -- anywhere anywhere udp dpt:51820
nixos-fw-accept icmp -- anywhere anywhere icmp echo-request
nixos-fw-log-refuse all -- anywhere anywhere
Chain nixos-fw-accept (6 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
Chain nixos-fw-log-refuse (1 references)
target prot opt source destination
LOG tcp -- anywhere anywhere tcp flags:FIN,SYN,RST,ACK/SYN LOG level info prefix "refused connection: "
nixos-fw-refuse all -- anywhere anywhere PKTTYPE != unicast
nixos-fw-refuse all -- anywhere anywhere
Chain nixos-fw-refuse (2 references)
target prot opt source destination
DROP all -- anywhere anywhere
@martijnjanssen k3s specifies a list of ports that need to be opened, and I don't think they're opened by the k3s module. Chain nixos-fw does not include an accept rule for UDP 8472, which is the port used for in-cluster communication, so that's probably why disabling the firewall fixes it. If doing that solves your issue, we should probably also add those ports to the k3s module. (Maybe we should open another, more general issue to work out the kinks in k3s.)
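If your network is trusted, opening that port can be sketched declaratively instead of via raw iptables commands (UDP 8472 is flannel's default vxlan port; as discussed later in this thread, only open it on a network you trust):

```nix
{
  # flannel's default vxlan backend carries inter-node pod
  # traffic over UDP 8472. Opening it is insecure on an
  # untrusted network, so this is opt-in, not a default.
  networking.firewall.allowedUDPPorts = [ 8472 ];
}
```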
@aditsachde Thanks for the response! I actually don't think opening ports by default is smart (and it's not the fix after all), since these are clearly documented in the k3s readme. What fixed it for me was adding:
networking.firewall.extraCommands = ''
iptables -I INPUT 3 -s 10.42.0.0/16 -j ACCEPT
iptables -I INPUT 3 -d 10.42.0.0/16 -j ACCEPT
'';
I found this solution in a k3s issue where users were facing the same problem: https://github.com/rancher/k3s/issues/977#issuecomment-552504848. I've checked, and without changing anything else, this is the fix for inter-pod communication. To check which subnet to allow, the ip route command can help you determine it. Here the cni0 device is the k3s interface (I think; I am not very experienced with networking yet):
$ ip route
default via 192.168.1.1 dev enp1s0 proto dhcp src 192.168.1.3 metric 202
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
192.168.1.0/24 dev enp1s0 proto dhcp scope link src 192.168.1.3 metric 202
I think the kernel-module related issues you mention above, including the overlay one, might be fixed by #101744 @aditsachde. Thanks for the detailed info on those module issues and for the ping!
I'm optimistic that those kernel module issues were the reason coredns couldn't start up, and were thus the root cause of this issue.
The firewall stuff is unfortunately more complicated.
Part of the problem is that k3s has so many knobs (i.e. the cluster cidr can be configured, you can use any of several overlay networks, etc).
Another part of the problem is that k3s defaults to flannel vxlan, but opening up the vxlan port (udp 8472) is insecure unless you trust your network, so we can't exactly default to that either.
We can definitely still improve the usability of the nixos module, even if we don't have a good secure default.
@martijnjanssen Thank you for linking that issue. I was still having some networking issues, but they were occurring on an Ubuntu-based cluster as well so I just assumed it was because of something on my network. Adding those rules fixed it.
@euank When I get a chance sometime later, I'll test out #101744 and see if the kernel modules work out of the box. I don't think we can possibly cover all firewall details, but we can probably assume that if someone is configuring the cidr or swapping out the overlay network, they are capable of configuring the firewall. I agree that opening the firewall by default is probably a bad idea. However, an openFirewall option that defaults to false, like many other modules have, might be a decent compromise.
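A hypothetical sketch of what such an openFirewall option could look like in the k3s module (this is not the actual module code; the option name follows the convention other NixOS modules use, and the port list is an assumption based on this thread):

```nix
{ config, lib, ... }:

let
  cfg = config.services.k3s;
in
{
  # Hypothetical opt-in option, defaulting to closed, mirroring
  # the openFirewall pattern found in other NixOS modules.
  options.services.k3s.openFirewall = lib.mkOption {
    type = lib.types.bool;
    default = false;
    description = "Open the ports k3s needs (e.g. UDP 8472 for flannel vxlan).";
  };

  config = lib.mkIf (cfg.enable && cfg.openFirewall) {
    # UDP 8472: flannel vxlan inter-node traffic (trusted networks only).
    networking.firewall.allowedUDPPorts = [ 8472 ];
    # TCP 6443: the Kubernetes API server.
    networking.firewall.allowedTCPPorts = [ 6443 ];
  };
}
```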
I've been having some different networking issues, where all requests result in a socket error or i/o timeout, seemingly across all pods but definitely with coredns and ingress-nginx, both with k3s on unstable and with #101744, so I can't say whether or not the PR solves the kernel issues.
I've managed to have tons of networking issues with K3s on Ubuntu as well as NixOS, with only K3OS working reliably. I'd love to get to the bottom of it, but I'm really not sure where to start debugging.
I've (fingers crossed) got k3s into a state where it seems to be working properly on 20.09. Here's the module I'm using:
{ lib, ... }:
{
services.k3s = {
enable = true;
extraFlags = "--no-deploy traefik";
};
# https://github.com/NixOS/nixpkgs/issues/103158
systemd.services.k3s.after = [ "network-online.service" "firewall.service" ];
systemd.services.k3s.serviceConfig.KillMode = lib.mkForce "control-group";
# https://github.com/NixOS/nixpkgs/issues/98766
boot.kernelModules = [ "br_netfilter" "ip_conntrack" "ip_vs" "ip_vs_rr" "ip_vs_wrr" "ip_vs_sh" "overlay" ];
networking.firewall.extraCommands = ''
iptables -A INPUT -i cni+ -j ACCEPT
'';
}
Of particular note, since I'm starting k3s after the firewall I can use a simple -A INPUT instead of -I INPUT 3, and I'm matching on interface prefix instead of subnet mask, so it should be less fragile.