Amazon-vpc-cni-k8s: Pods with Secondary IPs are not reachable

Created on 4 Apr 2020  ·  13 Comments  ·  Source: aws/amazon-vpc-cni-k8s

Hello,

I have deployed aws-vpc-cni-k8s on Kubernetes 1.18. The pods are assigned secondary IPs properly, but they are not reachable: I can ping a pod only from the machine it is running on. Can someone please help me with this?

needs investigation

All 13 comments

Can you run the debugging script aws-cni-support.sh, located under /opt/cni/bin? Its output might help to investigate this further. If you see any specific error lines, let us know and we can help you further. More details about using the aws-cni-support.sh script can be found here: https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting.html

Similar issue here on EKS. All IP addresses allocated on the secondary interface are reachable only from the same EC2 instance.

Hi, I have attached the output of the debugging script. For example, if I try to ping/curl a pod whose IP address is attached to the secondary interface of its node, from a node in a different VPC (and therefore a different subnet), I can see the traffic coming in on the node, but no return traffic.

With this plugin, do all interfaces share the same characteristics?

aws-cni-support.tar.gz

My issue was resolved after updating the CNI plugin to version 1.5.7 and recycling all the worker nodes.

@mirkop-mattr What was the initial CNI version? Also, was updating the version not enough? Was restarting the workers the thing that resolved the issue?

Also, if you have a peered VPC, you might need to set AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS to include the CIDRs of those peers, or the pod traffic going to those peers will get SNATed to the node IP (eth0).
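To illustrate the effect of that setting, here is a minimal Python sketch of the decision it influences. This is a simplified model, not the plugin's code: the real plugin enforces this with iptables rules on the node, and the function names and CIDR values below are hypothetical and chosen only for the example. It assumes that traffic to the VPC's own CIDRs is never SNATed and that every CIDR in the exclude list gets the same treatment.

```python
import ipaddress


def parse_cidr_list(value):
    """Split a comma-separated CIDR string (the env var's format) into a list."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_snat_applied(dest_ip, vpc_cidrs, exclude_snat_cidrs):
    """Return True if, in this simplified model, pod traffic to dest_ip
    would be rewritten to the node's primary IP.

    Hypothetical helper: it only mirrors the rule described in the thread.
    """
    dest = ipaddress.ip_address(dest_ip)
    # Destinations inside the VPC or inside an excluded CIDR keep the pod IP.
    no_snat_networks = [
        ipaddress.ip_network(cidr)
        for cidr in list(vpc_cidrs) + list(exclude_snat_cidrs)
    ]
    return not any(dest in network for network in no_snat_networks)


if __name__ == "__main__":
    vpc_cidrs = ["10.0.0.0/16"]                # assumed CIDR of the cluster VPC
    excluded = parse_cidr_list("10.1.0.0/16")  # assumed CIDR of the peered VPC

    # Without the exclusion, traffic to the peer leaves with the node IP.
    print(is_snat_applied("10.1.2.3", vpc_cidrs, []))
    # With the exclusion, the pod IP is preserved towards the peer.
    print(is_snat_applied("10.1.2.3", vpc_cidrs, excluded))
```

In this model, a host in the peered VPC sees the node IP as the source until the peer CIDR is added to the exclude list, and the pod's own IP afterwards.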

@mogren The initial CNI version was 1.5.5. I am not sure why that version had been used, since I did not build the infrastructure in the first place. If I remember correctly, rolling out the new plugin was not enough. I recycled the nodes so that new IP addresses would be allocated to the secondary interfaces. I did not try restarting the nodes, I just replaced them.
Thanks, I will try the setting you have recommended as well.

fwiw I ran into the same problem a while ago. Restarting the node that the secondary ENI was attached to seemed to fix the problem.

@kzidane Did you upgrade the plugin version as well or did you just reboot the node?

@mogren The flag AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS is only available from v1.6.0. I am currently running v1.5.7, so I will try a version upgrade as well.

@mirkop-mattr Thanks for the follow-up! Please try v1.6.1, and be sure to have dockershim.sock mounted rather than just updating the image tag.

OK, I can confirm that using v1.6.1 and the flag AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS fixed the issue for a running pod without restarting the pod or the node.

Thanks a lot for testing this @mirkop-mattr!

@mirkop-mattr I just rebooted the node.
