Hello,
I have deployed aws-vpc-cni-k8s on Kubernetes 1.18. The pods are getting their secondary IPs assigned properly, but the pods are not reachable. I am able to ping the pods only from the machine the pod is running on. Can someone please help me with this?
Can you run the debugging script, aws-cni-support.sh, located under /opt/cni/bin? It might help to investigate this further. If you see any specific error lines, let us know and we can help you further. More details about using the aws-cni-support.sh script can be found here: https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting.html
Similar issue here on EKS. All IP addresses allocated on the secondary interface are only reachable from the same EC2 instance.
Hi, I have attached the output of the debugging script. For example, if I try to ping/curl a pod whose IP address is attached to the secondary interface of its node, from a node in a different VPC (and therefore a different subnet), I can see the traffic coming in on the node, but no return traffic.
With this plugin, do all interfaces share the same characteristics?
My issue got resolved after updating the CNI plugin to version 1.5.7 and recycling all the worker nodes.
@mirkop-mattr What was the initial CNI version? Also, was updating the version not enough? Was restarting the workers the thing that resolved the issue?
Also, if you have a peered VPC, you might need to set AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS to include those peers, or the pod traffic going to those peers will get SNATed to the node IP (eth0).
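To make the SNAT point concrete, here is a rough sketch of the decision as I understand it: pod traffic to a destination outside the VPC CIDRs is SNATed to the node IP unless the destination falls inside one of the excluded CIDRs. This is only an illustration in Python, not the plugin's actual code; the function name needs_snat and the CIDR values are made up for the example.

```python
import ipaddress


def needs_snat(dst_ip, vpc_cidrs, exclude_snat_cidrs=()):
    """Return True if pod traffic to dst_ip would be SNATed to the node IP.

    Hypothetical helper: destinations inside the VPC CIDRs or inside the
    excluded CIDRs keep the pod IP as source; everything else is SNATed.
    """
    dst = ipaddress.ip_address(dst_ip)
    no_snat = [ipaddress.ip_network(c) for c in (*vpc_cidrs, *exclude_snat_cidrs)]
    return not any(dst in net for net in no_snat)


# Assumed addressing, for illustration only.
VPC_CIDRS = ["10.0.0.0/16"]       # the cluster's own VPC
PEERED_CIDRS = ["10.1.0.0/16"]    # a peered VPC

# Without the exclusion, traffic to the peered VPC is SNATed to the node IP.
print(needs_snat("10.1.4.20", VPC_CIDRS))
# With the peered CIDR excluded, the pod IP is kept as the source address.
print(needs_snat("10.1.4.20", VPC_CIDRS, PEERED_CIDRS))
```

In this sketch the value you would put into AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS corresponds to PEERED_CIDRS, written as a comma-separated list.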
@mogren The initial CNI version was 1.5.5. I am not sure why that had been used, since I did not build the infrastructure in the first place. If I remember correctly, rolling out the new plugin was not enough. I recycled the nodes so that new IP addresses would be allocated to the secondary interfaces. I did not try restarting the nodes, I just replaced them.
Thanks, I will try the setting you have recommended as well.
fwiw I ran into the same problem a while ago. Restarting the node that the secondary ENI was attached to seemed to fix the problem.
@kzidane Did you upgrade the plugin version as well or did you just reboot the node?
@mogren The flag AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS is only available from v1.6.0. I will try a version upgrade as well, since I am currently running v1.5.7.
@mirkop-mattr Thanks for the follow-up! Please try v1.6.1, and be sure to have the dockershim.sock mounted rather than just updating the image tag.
OK, I can confirm that using v1.6.1 and the flag AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS fixed the issue for a running pod without restarting the pod or the node.
Thanks a lot for testing this @mirkop-mattr!
@mirkop-mattr I just rebooted the node.