Is it possible to create a config file on the host, such as /etc/eks/xxx.config, and dynamically populate those environment variables when the CNI pod starts?
By default, the EKS CNI daemonset runs in the kube-system namespace. When launching worker nodes behind a proxy, or when using multiple subnets or HTTP_PROXY variables, we have to go through the hassle of running kubectl edit/patch ds/aws-node -n kube-system and setting those environment variables manually, which requires the user/role to have system:masters permission.
One more advantage of having a config file on the host: if I have multiple autoscaling groups with multiple instance types and need to set the WARM_ENI_TARGET or WARM_IP_TARGET variables differently for each, I can manage those by reading from the config file instead of relying on a single centralized set of environment variables in the daemonset manifest YAML.
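To make the request concrete, here is a minimal sketch of how per-node overrides could be layered on top of the daemonset-wide environment. It is an illustration only, not how the CNI plugin behaves today: the KEY=VALUE file format and the helper names (`parse_overrides`, `load_host_overrides`, `effective_settings`) are assumptions made for this example, and the file path is left to the caller.

```python
def parse_overrides(text):
    """Parse KEY=VALUE lines (assumed format); skip blanks and '#' comments."""
    overrides = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        overrides[key.strip()] = value.strip()
    return overrides


def load_host_overrides(path):
    """Read the host config file; a missing file simply means no overrides."""
    try:
        with open(path, encoding="utf-8") as f:
            return parse_overrides(f.read())
    except FileNotFoundError:
        return {}


def effective_settings(daemonset_env, host_overrides):
    """Start from the daemonset-wide values, then apply per-node overrides."""
    merged = dict(daemonset_env)
    merged.update(host_overrides)
    return merged


if __name__ == "__main__":
    # Hypothetical contents of a host config file on a GPU node group.
    node_file = "# GPU nodes run few pods\nWARM_IP_TARGET=2\n"
    cluster_wide = {"WARM_ENI_TARGET": "1", "WARM_IP_TARGET": "5"}
    print(effective_settings(cluster_wide, parse_overrides(node_file)))
```

With this layering, each autoscaling group can ship its own file while the daemonset manifest keeps the cluster-wide defaults.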
I'd like to see WARM_IP_TARGET set as a tag in the ASG.
I edited the issue title to better describe what the feature request is. @vsiddharth and I talked about this today. We think there are pros and cons to both approaches mentioned by @nithu0115 and @craftyc0der, and think that the CNI plugin should be enhanced to check for both a well-known config-file path and an ASG tag. The pro for a well-known config override filepath is that other things in the aws-sdk-go use this pattern (STS AssumeRoleWithWebIdentity comes to mind), so that would be consistent. The pro for the ASG tag is that, unlike the well-known config file approach, it would not require orchestration with cloud-init or a centralized config-management system to lay down a file at the well-known filepath.
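If the plugin checked both sources, it would need a lookup order. The sketch below shows one possible order (config file, then ASG tag, then daemonset environment). That order, the function name `resolve_setting`, and the idea of passing tags in as a plain dictionary are all assumptions for illustration; the thread does not specify a precedence.

```python
def resolve_setting(name, file_overrides, asg_tags, daemonset_env, default=None):
    """Return (value, source) for a setting, checking sources in an assumed order.

    Assumed precedence, most specific first:
      1. host config file
      2. ASG tag
      3. daemonset environment variable
    """
    sources = (
        ("config-file", file_overrides),
        ("asg-tag", asg_tags),
        ("daemonset-env", daemonset_env),
    )
    for source_name, values in sources:
        if name in values:
            return values[name], source_name
    return default, "default"


if __name__ == "__main__":
    # Hypothetical inputs: no host file on this node, but the ASG carries a tag.
    value, source = resolve_setting(
        "WARM_IP_TARGET",
        file_overrides={},
        asg_tags={"WARM_IP_TARGET": "2"},
        daemonset_env={"WARM_IP_TARGET": "5"},
    )
    print(value, source)
```

Returning the source alongside the value makes it easy to log where a setting came from, which helps when a node behaves differently from the rest of the cluster.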
Another use case for this is to allow tuning based on the workloads on the target worker node types. Particularly for GPU instances, where the pod count is bound based on the number of available GPU resources (say 4x for a p3.8xlarge) rather than the ENI/IP space. We see lots of unused IP addresses on our GPU nodes as these node types tend to only run a handful of pods.
It would be hugely beneficial to allow these different instance types to have a different warm pool of IP addresses, where we know that only a handful of pods are running on there, even though the instance is capable of much more from an ENI/IP perspective.
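A rough back-of-the-envelope model shows why this matters on nodes that run only a few pods. The sketch below is a deliberate simplification: it assumes a warm-ENI target keeps that many whole spare ENIs attached, that a warm-IP target keeps exactly that many spare addresses, and it ignores per-instance ENI limits and any minimum-IP setting. The figure of 29 pod IPs per ENI is an illustrative input, not a statement about a particular instance type.

```python
import math


def idle_ips_with_warm_eni_target(pods, ips_per_eni, warm_eni_target=1):
    """Idle IPs when whole spare ENIs are kept warm (simplified model)."""
    enis_in_use = math.ceil(pods / ips_per_eni)
    attached_enis = enis_in_use + warm_eni_target
    return attached_enis * ips_per_eni - pods


def idle_ips_with_warm_ip_target(warm_ip_target):
    """Idle IPs when only a fixed number of spare addresses is kept warm."""
    return warm_ip_target


if __name__ == "__main__":
    pods = 4           # e.g. one pod per GPU on a 4-GPU node
    ips_per_eni = 29   # illustrative value, an assumption for this example
    print(idle_ips_with_warm_eni_target(pods, ips_per_eni))  # 54
    print(idle_ips_with_warm_ip_target(2))                   # 2
```

Under these assumptions, a node with four pods would hold dozens of unused addresses with a warm-ENI target of 1, versus a couple with a small warm-IP target, which is the gap a per-node-group setting would close.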