Today, I created a new cluster from scratch with three instance groups: "nodes" (m4.large), "compute" (c4.large), and "memory" (r4.large). After deploying kube2iam, I noticed one pod was crashing with "route ip+net: no such network interface". I SSH'd into the instance and saw that, indeed, there was no cbr0 interface. I couldn't find anything in /var/log/daemon.log to indicate a cause: cbr0 is first mentioned around the point where (I think) the bridge should be created, and the remaining mentions all come from the crashing kube2iam pods. dmesg -w had nothing interesting, either.
This might be a larger, more systemic issue in Kops. Out of curiosity, I switched the cluster to use an AMI based on Ubuntu 16.04 and kept only the m4.large and c4.large instance groups. This time, however, only the c4.large nodes correctly created the cbr0 interface, while the m4.large nodes did not; when I recreated the r4.large group, it did not create the interface, either. I'm going to try terminating the instances to see if I can find any rhyme or reason to when nodes come up successfully, or maybe just switch to a different networking solution entirely.
I think I finally figured out the actual root cause: it seems that, until a pod without `hostNetworking` is scheduled onto a node, the `cbr0` interface isn't created for that node! After extending the manifest to add another daemonset that runs `sleep infinity`, all of the kube2iam pods start successfully.
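In case it helps anyone else, here's roughly what that workaround looks like. This is only a sketch: the name, namespace, labels, and image are placeholders, not taken from my actual manifest. The point is just that the pod does *not* set hostNetwork, so scheduling it forces the kubelet to set up pod networking (and the cbr0 bridge) on every node.

```yaml
# Hypothetical "warm-up" DaemonSet: one non-hostNetwork pod per node,
# so the cbr0 bridge exists before kube2iam tries to use it.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cbr0-warmup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: cbr0-warmup
  template:
    metadata:
      labels:
        app: cbr0-warmup
    spec:
      # hostNetwork is intentionally left unset (defaults to false),
      # so this pod gets a pod-network IP and triggers bridge creation.
      containers:
      - name: sleep
        image: ubuntu:22.04
        command: ["sleep", "infinity"]
```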
Thanks @kinghajj.
Any better ideas than this?
For me, adding `--host-interface=cbr+` to the kube2iam arguments fixed it.
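For reference, this is roughly where that flag goes in the kube2iam DaemonSet. Only the `--host-interface=cbr+` line is the point here; the rest is a stock-style sketch and may differ from your setup (image tag, service account, other flags).

```yaml
# Sketch of a kube2iam DaemonSet with the interface flag pointed at the
# kubenet bridge (cbr0) instead of the default docker0.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
      - name: kube2iam
        image: jtblin/kube2iam:latest
        args:
        - --host-ip=$(HOST_IP)
        - --host-interface=cbr+   # match cbr0 rather than the default docker0
        - --iptables=true
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        securityContext:
          privileged: true
```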