Problem statement:
As of today, VPC CIDR ranges are cached during initialization. If new CIDR ranges are added afterwards to address an IP space issue, the CNI has to be restarted so that it fetches the new CIDR ranges, updates the cache, and adds the ip rules/routes needed to reach other pods in the cluster that use the new subnet IP range.
Solution:
Refresh VPC CIDR ranges cache every 2 seconds to avoid staleness.
Steps to replicate the issue:
1) Create EKS cluster in a VPC which has just one CIDR range (10.10.0.0/16)
2) Create worker nodes in the above VPC CIDR range
3) Add secondary CIDR range (100.10.0.0/16) to existing VPC
4) Launch new worker nodes in the subnet which has 100.10.0.0/24 CIDR
5) Pod 1, with IP 100.10.12.13 from the secondary VPC CIDR subnet, cannot talk to the coreDNS pod with IP 10.10.23.45, which is part of the primary VPC CIDR subnet.
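To make the failure in step 5 concrete, here is a minimal sketch that checks the pod IP against the cached and the refreshed CIDR lists. The helper name `inAnyCIDR` is hypothetical and is not taken from the CNI code base; it only illustrates why a cache that still holds just 10.10.0.0/16 does not treat 100.10.12.13 as in-VPC traffic.

```go
package main

import (
	"fmt"
	"net"
)

// inAnyCIDR (hypothetical helper) reports whether ip falls inside
// at least one of the given CIDR ranges.
func inAnyCIDR(ip string, cidrs []string) (bool, error) {
	addr := net.ParseIP(ip)
	if addr == nil {
		return false, fmt.Errorf("invalid IP %q", ip)
	}
	for _, c := range cidrs {
		_, network, err := net.ParseCIDR(c)
		if err != nil {
			return false, err
		}
		if network.Contains(addr) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Cache as populated at initialization, before the secondary CIDR was added.
	cached := []string{"10.10.0.0/16"}
	// What a fresh lookup would return after step 3.
	refreshed := []string{"10.10.0.0/16", "100.10.0.0/16"}

	stale, _ := inAnyCIDR("100.10.12.13", cached)
	fresh, _ := inAnyCIDR("100.10.12.13", refreshed)
	fmt.Println(stale, fresh) // prints "false true"
}
```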
Every two seconds may be a bit excessive. Certainly we should refresh the VPC CIDR ranges periodically, but I think 15 or 30 seconds might be a better interval. Alternately, is there a way we can be notified of VPC subnets being created, deleted or modified instead of periodically refreshing our view?
> Every two seconds may be a bit excessive. Certainly we should refresh the VPC CIDR ranges periodically, but I think 15 or 30 seconds might be a better interval.
Agreed! I will go with 15 seconds then.
> Alternately, is there a way we can be notified of VPC subnets being created, deleted or modified instead of periodically refreshing our view?
I did some research around adding a watch/poll to get notifications about VPC changes without making AWS API calls, but did not find a way.
Before the fix rolls out in the next couple of versions, I would like to post a couple of suggestions/workarounds for users hitting the cache issue after adding a secondary VPC CIDR:
The aws-node Pod should be made aware of the secondary VPC CIDR; then replace the old one to ensure it won't impact the existing environment.
Just to add a note: simply refreshing the cache every 2 seconds or every 15 seconds won't solve this problem. It will only fix new pods, not old pods.
@M00nF1sh, I was planning to add a ticker function to update the cache every x seconds and, if there are any updates such as additional CIDRs, re-configure the rules/routes, which should fix both old and new pods.
We have also been facing this problem recently and have temporarily solved it using https://github.com/giantswarm/aws-cni-restarter. We can help test this once PR #903 has been merged.
@paurosello #903 has been merged, along with some follow-up PRs, so in order to test these changes you would have to use a build of the latest master branch. Be sure to use the configs in /config/master, since the configs have changed quite a bit, including the addition of an init container. Not sure yet when we will have a v1.7.0-rc1 ready for testing.
Has this been fixed with 1.7.0?
Ah yes, I see it in the release notes. https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.7.0
Then I recommend closing this issue.
Yes @marians, resolving.