UPDATE as of Feb 7th, 2018, by request of @bboreham: I've edited the title so as not to mislead people looking for an unrelated issue.
When I deploy some demo application, I get the same message as above (Error syncing pod, skipping: failed to "SetupNetwork").
When I check the logs of the proxy pod (kubectl logs kube-proxy-g7qh1 --namespace=kube-system), I get the following info: proxier.go:254] clusterCIDR not specified, unable to distinguish between internal and external traffic
@damaspi I have opened this issue and provided a fix. Waiting on feedback!
Also, moving to userspace mode brings quite a performance penalty.
Sorry, I commented in the wrong issue.
Thanks for the fix. I won't be able to test it soon though (I was working on this during the holidays, and I'm back at work now), and I was only using the official stable version (so I don't have the environment to build it).
I copied it here now and deleted it in the other...
I worked around this temporarily by configuring proxy-mode to userspace, but any advice is welcome...
(inspired by this issue )
kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq ".items[0].spec.template.spec.containers[0].command |= .+ [\"--proxy-mode=userspace\"]" | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"
Again, @damaspi
Also, moving to userspace mode brings quite a performance penalty.
I had the same issue.
My Kube-Proxy would not install the Service related rules, making any service unavailable from the pods.
My fix was to modify the kubeadm DaemonSet for kube-proxy and explicitly add the --cluster-cidr= option.
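For reference, a minimal sketch of that kind of change (the DaemonSet name and the CIDR value below are assumptions; substitute whatever pod network CIDR your cluster actually uses):
kubectl -n kube-system edit ds kube-proxy
# then add a flag like the following to the kube-proxy container's command:
#   --cluster-cidr=10.244.0.0/16   (example value only)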
/cc @luxas
@spxtr you are closing a bunch of issues in this repo
@mikedanese PRs are being merged, and there was a PR merged that fixed the lack of the --cluster-cidr flag in controller-manager.
@pires, the merge of the PR in the main repo is not what closed this PR. It was the merge in @spxtr's branch. That's what concerns me.
Ah I've seen it before indeed.
I have seen this on 1.5.2. I am manually building a cluster (to learn). I am unclear what the fix is, as there is mention of controller-manager and a daemon set; that implies to me that people are launching kube-proxy via a DaemonSet. Just to clarify, the actual fix is to add the flag (--cluster-cidr) to kube-proxy, correct? Just trying to make sure I am not missing something. Also, to refresh my memory, didn't kube-proxy use to get this from the kube-apiserver? Was it always needed? I can't remember. If it doesn't, can someone clarify the difference between --service-cluster-ip-range=10.0.0.0/16 (api) and --cluster-cidr (proxy)? Thanks. (Sorry to add here, not sure where else to ask about this issue.)
Where did the API server expose the cluster pod CIDR? This was a misconception on my side as well.
Hi @pires, I thought --service-cluster-ip-range=10.0.0.0/16 on the api-server set it all up, since the proxies would talk to the k8s server to get that information. Maybe --cluster-cidr was meant to be a subset of --service-cluster-ip-range; otherwise it seems redundant, or there is a use case that I am unclear about (or I just don't know what I am talking about, which could be true!)
Service CIDR is the subnet used for virtual IPs (used by kube-proxy). The problem is that kube-proxy doesn't know about the pod network CIDR, which is different from the service CIDR.
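To make the distinction concrete, here is an illustrative pairing of the two ranges (the values are examples only; the ranges must not overlap, and the pod CIDR depends on your CNI choice):
--service-cluster-ip-range=10.96.0.0/12   # kube-apiserver: virtual IPs for Services (service CIDR)
--cluster-cidr=10.32.0.0/12               # kube-proxy: pod network CIDR (e.g. Weave Net's default)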
Ah, so would that be the overlay?
Would this issue cause communication problems between a pod and the api-server? For example, if I run the curl command from a kube pod to the apiserver, curl https://10.96.0.1:443/api, the result is: curl: (7) Failed to connect to 10.96.0.1 port 443: Connection timed out...
@bamb00, yes that symptom is caused by this bug.
For those interested, what's happening is that without knowledge of the cluster subnet, kube-proxy can't generate iptables conditions to match external traffic. Without those conditions, the traffic doesn't get marked for SNAT and gets put on the wire with the correct destination address but incorrect source.
Demonstration of the missing rules:
--- /root/ipt.old 2017-02-22 09:26:48.666151853 +0000
+++ /root/ipt.new 2017-02-22 09:25:52.010151853 +0000
@@ -27,8 +27,11 @@
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-EHDRCCD3XO3VA5ZU -s 192.168.1.4/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-EHDRCCD3XO3VA5ZU -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-EHDRCCD3XO3VA5ZU --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 192.168.1.4:6443
+-A KUBE-SERVICES ! -s 10.32.0.0/12 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
+-A KUBE-SERVICES ! -s 10.32.0.0/12 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
+-A KUBE-SERVICES ! -s 10.32.0.0/12 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-EHDRCCD3XO3VA5ZU --mask 255.255.255.255 --rsource -j KUBE-SEP-EHDRCCD3XO3VA5ZU
@@ -72,4 +75,6 @@
-A WEAVE-NPC -d 224.0.0.0/4 -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
+-A WEAVE-NPC-DEFAULT -m set --match-set weave-k?Z;25^M}|1s7P3|H9i;*;MhG dst -j ACCEPT
+-A WEAVE-NPC-DEFAULT -m set --match-set weave-iuZcey(5DeXbzgRFs8Szo]<@p dst -j ACCEPT
COMMIT
This can be fixed at runtime by modifying @damaspi's command from above, replacing --proxy-mode=userspace with --cluster-cidr=your_cidr
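For example, the adapted one-liner would look something like this (your_cidr is a placeholder for your actual pod network CIDR):
kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq ".items[0].spec.template.spec.containers[0].command |= .+ [\"--cluster-cidr=your_cidr\"]" | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"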
Currently building kubeadm with the merged patch; I will re-bootstrap with that and report back on its success.
@predakanga, thanks for responding and for the explanation. I'm struggling to understand a connection timing out to the apiserver from a pod. What's puzzling to me is that the timeout error occurs for a pod running on a non-master node (AWS), while a pod running on the master node does not time out. I want to apply the suggested workaround but have a question: how do I get the value of your_cidr for --cluster-cidr?
Workaround:
kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq '.items[0].spec.template.spec.containers[0].command |= .+ ["--cluster-cidr=your_cidr"]' | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"
Here is the timed out log:
2017-02-22T16:23:44.200770003Z 2017-02-22 16:23:44 +0000 [info]: starting fluentd-0.12.31
2017-02-22T16:23:44.281836006Z 2017-02-22 16:23:44 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '1.9.2'
2017-02-22T16:23:44.281862309Z 2017-02-22 16:23:44 +0000 [info]: gem 'fluent-plugin-journal-parser' version '0.1.0'
2017-02-22T16:23:44.281867643Z 2017-02-22 16:23:44 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '0.26.2'
2017-02-22T16:23:44.281873256Z 2017-02-22 16:23:44 +0000 [info]: gem 'fluent-plugin-record-reformer' version '0.8.3'
2017-02-22T16:23:44.281876742Z 2017-02-22 16:23:44 +0000 [info]: gem 'fluentd' version '0.12.31'
2017-02-22T16:23:44.281976520Z 2017-02-22 16:23:44 +0000 [info]: adding filter pattern="kubernetes." type="kubernetes_metadata"
2017-02-22T16:24:44.639919409Z 2017-02-22 16:24:44 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error="Invalid Kubernetes API v1 endpoint https://10.96.0.1:443/api: Timed out connecting to server"
2017-02-22T16:24:44.641926923Z 2017-02-22 16:24:44 +0000 [info]: process finished code=256
2017-02-22T16:24:44.641936546Z 2017-02-22 16:24:44 +0000 [error]: fluentd main process died unexpectedly. restarting.
As you can see, the timeout is pointing to https://10.96.0.1:443/api, but according to the kubernetes service the apiserver endpoint is 10.43.0.20:6443. From what I understand from your explanation, the timeout error is because kube-proxy can't generate iptables conditions to match external traffic.
Why is the connection going through 10.96.0.1:443 and not the endpoint 10.43.0.20:6443?
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Selector:
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
Endpoints: 10.43.0.20:6443
Session Affinity: ClientIP
No events.
Update: Applying the workaround fixes the "clusterCIDR not specified, unable to distinguish between internal and external traffic" warning.
Thanks.
Sorry for the newbie question, but how do I apply the fix permanently?
We're using kube-proxy v1.5.3 (gcr.io/google_containers/kube-proxy-amd64:v1.5.3) but still seeing the error "clusterCIDR not specified, unable to distinguish between internal and external traffic".
According to the URL below, the fix is in for kube-proxy v1.5:
https://github.com/dchen1107/kubernetes-1/commit/9dedf92d42028e1bbb4d6aae66b353697afaa55b
Is this correct?
@bamb00 this is not a kube-proxy fix but a kubeadm change that sets a flag in the kube-proxy pod manifest. kubeadm doesn't follow (yet!) the kubernetes release process, so there's no kubeadm 1.5.3. There will be a 1.6.
I deployed 1.5.2 and see this warning: proxy: clusterCIDR not specified, unable to distinguish between internal and external traffic
@timchenxiaoyu With v1.6 you can pass the --pod-network-cidr flag to set that.
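For example (the value is only an illustration; use the range your pod network expects, e.g. Weave Net's default):
kubeadm init --pod-network-cidr=10.32.0.0/12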
Hi, I work on Weave Net, and I cannot understand why a change would be necessary as described at https://github.com/kubernetes/kubeadm/issues/102#issuecomment-281617189
Weave Net installs its own masquerading rules on exit from the pod network, so should not need kube-proxy to do it too.
@bboreham I can't speak to the why, but from memory the issue was that the weave-net daemonset pods couldn't talk to each other.
I'll re-up my environment so I can give you more details, but that may take a while (Australian internet)
Can't see why this issue should be closed - could a maintainer please re-open it?
/cc @luxas
I've just checked a working Kubernetes+WeaveNet cluster, and it has the same message in kube-proxy's logs
W0404 12:24:17.175005 1 proxier.go:309] clusterCIDR not specified, unable to distinguish between internal and external traffic
So I would conclude that the warning is unnecessarily scaring people.
the issue was that the weave-net daemonset pods couldn't talk to each other.
@predakanga the Weave Net implementation runs in the host network namespace; it definitely shouldn't be impacted by (or going anywhere near) kube-proxy rules when the pods talk to each other.
@bboreham Ah, I was misremembering, my apologies.
I think I know what the actual issue was, but I'll hold off until two of these nodes finish bootstrapping so I can confirm
@bboreham The issue is that the weave pod is trying to reach the API server through its VIP.
I've just bootstrapped a two-node cluster with no special options and https://git.io/weave-kube-1.6 applied, and reached an error condition.
Syslog output:
Apr 4 13:48:45 frontend kubelet[17891]: I0404 13:48:45.775425 17891 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/a96a9cfb-193b-11e7-b8e0-02fc636bbb90-weave-net-token-p3qrx" (spec.Name: "weave-net-token-p3qrx") pod "a96a9cfb-193b-11e7-b8e0-02fc636bbb90" (UID: "a96a9cfb-193b-11e7-b8e0-02fc636bbb90").
Apr 4 13:48:45 frontend kubelet[17891]: I0404 13:48:45.984975 17891 kuberuntime_manager.go:458] Container {Name:weave Image:weaveworks/weave-kube:1.9.4 Command:[/home/weave/launch.sh] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:10 scale:-3} d:{Dec:<nil>} s:10m Format:DecimalSI}]} VolumeMounts:[{Name:weavedb ReadOnly:false MountPath:/weavedb SubPath:} {Name:cni-bin ReadOnly:false MountPath:/host/opt SubPath:} {Name:cni-bin2 ReadOnly:false MountPath:/host/home SubPath:} {Name:cni-conf ReadOnly:false MountPath:/host/etc SubPath:} {Name:dbus ReadOnly:false MountPath:/host/var/lib/dbus SubPath:} {Name:lib-modules ReadOnly:false MountPath:/lib/modules SubPath:} {Name:weave-net-token-p3qrx ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/status,Port:6784,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:30,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Apr 4 13:48:45 frontend kubelet[17891]: I0404 13:48:45.986470 17891 kuberuntime_manager.go:742] checking backoff for container "weave" in pod "weave-net-x18xm_kube-system(a96a9cfb-193b-11e7-b8e0-02fc636bbb90)"
Apr 4 13:48:45 frontend kubelet[17891]: I0404 13:48:45.987392 17891 kuberuntime_manager.go:752] Back-off 5m0s restarting failed container=weave pod=weave-net-x18xm_kube-system(a96a9cfb-193b-11e7-b8e0-02fc636bbb90)
Apr 4 13:48:45 frontend kubelet[17891]: E0404 13:48:45.987440 17891 pod_workers.go:182] Error syncing pod a96a9cfb-193b-11e7-b8e0-02fc636bbb90 ("weave-net-x18xm_kube-system(a96a9cfb-193b-11e7-b8e0-02fc636bbb90)"), skipping: failed to "StartContainer" for "weave" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=weave pod=weave-net-x18xm_kube-system(a96a9cfb-193b-11e7-b8e0-02fc636bbb90)"
Apr 4 13:48:50 frontend kubelet[17891]: W0404 13:48:50.545383 17891 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 4 13:48:50 frontend kubelet[17891]: E0404 13:48:50.546074 17891 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Weave container logs:
2017/04/04 13:45:24 error contacting APIServer: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout; trying with fallback: http://localhost:8080
2017/04/04 13:45:24 Could not get peers: Get http://localhost:8080/api/v1/nodes: dial tcp 127.0.0.1:8080: getsockopt: connection refused
Failed to get peers
And the node never reaches "Ready" state
The issue is that the weave pod is trying to reach the API server through it's VIP
what is the evidence for that?
Get http://localhost:8080/api/v1/nodes
This is the weave pod trying to reach the api-server on an unsecured local address. This is not the configuration you get from current kubeadm with no options.
My bad again - my formatting cut off the first line of each log. I've amended them both.
ok, I have also just set up a cluster with no special options and have no trouble contacting tcp://10.96.0.1:443
However my comment about Weave Net setting up masquerading rules is not relevant. Let me see if I can figure it out.
For comparison, I've just re-run the kubeadm bootstrap with the addition of --pod-network-cidr 10.32.0.0/12, and the weave pod starts properly and the node transitions to Ready.
I suspect that you're not experiencing it because it only applies to certain network configurations - in my case I'm using Vagrant machines with the kube cluster established over secondary private-only interfaces.
In a separate conversation, I have seen this happening:
There are two network interfaces, eth0 and eth1: eth0 has the default route, but we want all traffic to kubernetes to go via eth1.
- A process, such as Weave Net, opens a connection to the service address 10.96.0.1.
- The destination address is re-mapped to the master's eth1 address 192.168.10.90 (the re-mapping is done by iptables rules created by kube-proxy).
- The packet is sent on the node's eth1 interface.
- However, Linux has already picked the eth0 source address for this packet, based on the original destination matching the default route.
- At the destination it is dropped as coming from the wrong place.
Adding the --pod-network-cidr causes an extra iptables rule to rewrite the source address, so it will now go out over the eth1 interface. [EDIT: I do not recommend this, because it's essentially an accident that it makes it work]
Another way to get it to work is to add a route telling Linux that all kubernetes service addresses are to go via eth1, like this:
ip route add 10.96.0.0/16 dev eth1 src 192.168.10.100
Personally I find the route more attractive since it makes the right decision earlier. But I'm looking for other voices to comment on whether this is valid.
@bboreham thanks for debugging with me, and thanks for updating with the findings here!
Will test the fixes in my environment.
Interesting - I can confirm that this route approach works on a single network segment, but would it cause problems across broadcast domains?
Lachlan
@bboreham I can confirm that I now have a working ansible setup by adding the routes... :-)
@predakanga all I am trying to do with the route is get Linux to pick a better source address; since we expect all service IPs to get DNATted we don't expect to actually use that route. However I can see that if underlying addresses went out on two different network adapters then my suggestion wouldn't be good.
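One way to see which source address Linux picks is ip route get; the output below is only a sketch using the addresses from the earlier example, before and after adding the route:
$ ip route get 10.96.0.1
# before the extra route: the default route wins, so something like
#   10.96.0.1 via <default-gateway> dev eth0 src <eth0-address>
# after "ip route add 10.96.0.0/16 dev eth1 src 192.168.10.100":
#   10.96.0.1 dev eth1 src 192.168.10.100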
@thockin I would value your input on my analysis at https://github.com/kubernetes/kubeadm/issues/102#issuecomment-291532883 and the two suggestions to configure SNAT (for connections originating in the host network namespace) or add a route for the service IP range.
@bboreham could you document these findings somehow and somewhere, please, so that more users know about it?
I only finished debugging @obnoxxx's system a couple of hours ago!
I'm documenting it here so someone can say "no! no! you've completely misunderstood".
If that doesn't happen, I'll happily elevate it to proper documentation :slightly_smiling_face:
In a multi-NIC multi-path case, yeah, I think you'd need a route like you suggest. Not sure how to automatically figure that out...
One more thought came to mind: this has nothing to do with the pod network (Weave Net or otherwise), because the thing that is failing is a process in a node's host namespace trying to talk to the api-server on the master. So the finding that setting the clusterCIDR makes it work must be accidental.
@bboreham @thockin Anything we can do here or can I go ahead and close this?
It's possible to set --cluster-cidr on kube-proxy by passing --pod-network-cidr to kubeadm init
As I have already described, setting --cluster-cidr is not a valid response to the issue originally reported in a comment on #74. (Although it happens to make the problem go away).
The title here is unhelpful; it relates to a warning message that has absolutely nothing to do with the underlying problem.
I don't really know what kubeadm could do, since the solution seems to relate to the underlying network. Maybe add options to inform your desired "public interface" and "private interface" and have kubeadm recommend network config changes?
I just had a look at the clusterCIDR logic in kube-proxy, and I agree that is a weird corner case.
I agree the static route is appropriate for the 2nd interface, but it's unfortunate. It feels like the kernel should be smarter than that.
I'm running v1.6.1 and thought the error "clusterCIDR not specified, unable to distinguish between internal and external traffic" would be addressed.
2017-06-06T17:49:17.113224501Z I0606 17:49:17.112870 1 server.go:225] Using iptables Proxier.
2017-06-06T17:49:17.139584294Z W0606 17:49:17.139190 1 proxier.go:309] clusterCIDR not specified, unable to distinguish between internal and external traffic
2017-06-06T17:49:17.139607413Z I0606 17:49:17.139223 1 server.go:249] Tearing down userspace rules.
2017-06-06T17:49:17.251412491Z I0606 17:49:17.251115 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 524288
2017-06-06T17:49:17.252499164Z I0606 17:49:17.252359 1 conntrack.go:66] Setting conntrack hashsize to 131072
2017-06-06T17:49:17.253220249Z I0606 17:49:17.253057 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
2017-06-06T17:49:17.253246216Z I0606 17:49:17.253124 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
How are internal and external traffic defined?
This error specifically refers to anything outside the cluster's pod IPs.
I've seen this problem too. Adding a route for the pod network via the second NIC resolved the issue for me. Feels a little fragile though...
Hi,
I'm running Kubernetes v1.6.6 and v1.7.0 kube-proxy and getting the same error.
kube-proxy:
W0914 00:15:41.627710 1 proxier.go:298] clusterCIDR not specified, unable to distinguish between internal and external traffic
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:34:20Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
I tried the workaround from @damaspi but it failed in v1.6.6 and v1.7.0; it used to work in v1.5.4.
# kubectl -n kube-system get ds -l "component=kube-proxy" -o json | jq '.items[0].spec.template.spec.containers[0].command |= .+ ["--cluster-cidr=10.96.0.0/12"]' | kubectl apply -f - && kubectl -n kube-system delete pods -l "component=kube-proxy"
error: error validating "STDIN": error validating data: items[0].apiVersion not set; if you choose to ignore these errors, turn validation off with --validate=false
Need guidance to resolve in v1.6.6 & v1.7.0. Thanks.
@bboreham
I don't really know what kubeadm could do, since the solution seems to relate to the underlying network. Maybe add options to inform your desired "public interface" and "private interface" and have kubeadm recommend network config changes?
I don't think kubeadm should be spitting out OS or distro-specific configuration instructions for host networking. I think it's the responsibility of the operator to configure their host appropriately because otherwise it becomes a rabbit hole. We can certainly make it a requirement, though.
What should kubeadm expect for things to work? That if the user wants to use a non-default NIC, they need to add a static route in Linux? Is this a general enough use-case for us to add it as a system requirement?
@bboreham Any ideas on how we can improve our documentation here? Otherwise I'm in favour of closing this because:
[Aside: it bugs me I have to read up and down and through other issues to page the context back in. The problem people wanted resolved is absolutely nothing to do with the title of this issue]
In the setup docs you could say "if you have more than one network adapter, and your Kubernetes components are not reachable on the default route, we recommend you add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter".
[Aside: it bugs me I have to read up and down and through other issues to page the context back in. The problem people wanted resolved is absolutely nothing to do with the title of this issue]
You are not the only one! 😅
In the setup docs you could say "if you have more than one network adapter, and your Kubernetes components are not reachable on the default route, we recommend you add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter".
Cool, I'll try to submit a docs PR for this tomorrow and close this out.
This is now documented in https://github.com/kubernetes/website/pull/6265, so I'm going to close.
This issue seems to track a few different problems at once, so if you're still running into a potential bug, please open a new issue so we can better target the root cause.
FWIW, if you use kubeadm to start the cluster and you specify --pod-network-cidr, that gets passed to kube-proxy when it starts as --cluster-cidr. For example, weave defaults to using 10.32.0.0/12, so I used kubeadm init --kubernetes-version=v1.8.4 --pod-network-cidr=10.32.0.0/12, which started kube-proxy with cluster-cidr=10.32.0.0/12.
@bboreham I'm new to this...Would there be an example on how to implement your suggestion "add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter"?
@bamb00 scroll up; there is an example at https://github.com/kubernetes/kubeadm/issues/102#issuecomment-291532883
Caution: if you make a wrong step it may result in your machine becoming inaccessible. Generally this will come back after a reboot, unless you configured the bad route to be there on startup.
I do not know an easy way to learn Linux network configuration.
@mindscratch do note this issue has nothing to do with "cluster-cidr"; that was a red herring eliminated around seven months ago. Please open a new issue if you are having new problems.
Semi-serious suggestion for fixing this specific case without requiring kube-proxy to use ! -s $podCIDR to distinguish host source addresses:
$ sudo ip ro add local 10.96.0.0/12 table local dev lo
$ sudo iptables -t nat -I KUBE-SERVICES -s 10.96.0.0/12 -d 10.96.0.0/12 -j KUBE-MARK-MASQ
(or possibly some variation with an explicit ... src 10.96.0.0 on the local route... the table local is probably also unnecessary and a bad idea)
$ ip ro get 10.96.0.1
local 10.96.0.1 dev lo src 10.96.0.1
cache <local>
$ curl -vk https://10.96.0.1
...
* Connected to 10.96.0.1 (10.96.0.1) port 443 (#0)
11:32:20.671085 0c:c4:7a:54:0a:e6 > 44:aa:50:04:3d:00, ethertype IPv4 (0x0800), length 74: 10.80.4.149.59334 > 10.80.4.147.6443: Flags [S], seq 2286812584, win 43690, options [mss 65495,sackOK,TS val 209450 ecr 0,nop,wscale 8], length 0
11:32:20.671239 44:aa:50:04:3d:00 > 0c:c4:7a:54:0a:e6, ethertype IPv4 (0x0800), length 74: 10.80.4.147.6443 > 10.80.4.149.59334: Flags [S.], seq 1684666695, ack 2286812585, win 28960, options [mss 1460,sackOK,TS val 208877 ecr 209450,nop,wscale 8], length 0
11:32:20.671315 0c:c4:7a:54:0a:e6 > 44:aa:50:04:3d:00, ethertype IPv4 (0x0800), length 66: 10.80.4.149.59334 > 10.80.4.147.6443: Flags [.], ack 1, win 171, options [nop,nop,TS val 209450 ecr 208877], length 0
However, I have no idea if that covers all of the expected behaviors of those source-specific kube-proxy MASQ rules...
EDIT: this also has all kinds of side-effects for connections to unconfigured service VIPs... they will end up connecting to any matching host network namespace services.
EDIT2: However, even that is probably better than the current behavior of leaking connections to unconfigured 10.96.X.Y service VIPs out via the default route... which is vaguely unsettling