Terraform-aws-eks: Support CNI Custom Networking

Created on 19 Sep 2019 · 6Comments · Source: terraform-aws-modules/terraform-aws-eks

I'm submitting a...

[ ] bug report
[x] feature request
[ ] support request
[ ] kudos, thank you, warm fuzzy

to add the ability to configure CNI custom networking

What is the current behavior?

Currently the module does not allow custom k8s configuration (like the auth config) to be applied before the worker nodes are created.

For CNI custom networking to work on a particular node, it is necessary for the aws-node daemonset to be patched (e.g. AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG env var set), ENIConfig applied and the kubelet parameters (--node-labels) modified before any Pod not using the host's networking namespace is scheduled to it.

Currently it's possible for some pods not using host networking (e.g. coredns applied by EKS) to land on a node before the above has happened, so nodes can come up in different states (i.e. some using the node's subnet for Pod IPs, some using a secondary subnet and some using both) depending on where these Pods land. Problem nodes (i.e. any using the node's subnet for Pod IPs) then need to be terminated.

What's the expected behavior?

To allow for CNI custom networking to be configurable in the module and to order the creation of the worker nodes so that the necessary steps happen before Pods are scheduled.

Are you able to fix this problem and submit a PR? Link here if you have already.

We (me and others) would be keen to submit a fix, but this is not yet done. Our proposed solution would be to do something similar to how the aws auth config is applied, but wanted to gather feedback as to how narrow the aim of this PR should be (i.e. should it aim to just address the CNI specific stuff, or should arbitrary resources be allowed).

I feel the former (just the CNI config) would be best as most configuration doesn't have the same race condition as described above.

Any other relevant info

There may be a recommended way of addressing this problem in the current state of module, but have not come across it yet.

Adding the --node-labels flag to the kubelet is not an issue as it can be solved already by modifying the pre_userdata, but just included it here for completeness.

Source

dippynark

👍8

Most helpful comment

Closing. You can use the helm chart: https://github.com/aws/eks-charts/tree/master/stable/aws-vpc-cni

Or do something outside of this module.

max-rocket-internet on 20 Nov 2019

👍2

All 6 comments

max-rocket-internet on 25 Sep 2019

Hey @dippynark 👋

Essentially, we try to resist requests like this one because:

It adds too much complexity to the module. This module is probably already the most (!??) complex in the registry
It can probably be achieve by wrapping this module in another module (or outside of this module)
There is MUCH better tools for this (Helm, Addons, Kustomize etc)
Where does it end? Which CNIs? What about Helm/Addons/storage-classes/SSM/insights/etc?

You can see some other examples referenced in this issue: https://github.com/terraform-aws-modules/terraform-aws-eks/issues/99

Please convince me otherwise though 😄

max-rocket-internet on 25 Sep 2019

1) I agree it's already quite complex and this PR won't help with that
2) If you use this module to provision worker nodes, you can't (afaik) guarantee the worker nodes come up with the correct CNI configuration unless you terminate all the worker nodes after the ENI config is applied as described above, which is why I feel this module is the only sensible place this can be handled
3) Not too sure how those tools can help in this case - the ENI config can be applied by them but it's the ordering of the application of the config before any pods are scheduled that they cannot solve. IMO it's the implementation of the AWS CNI plugin that is at fault as its configuration converges to a different place depending on when the ENI config is applied, but not sure if that is going to change any time soon
4) This only concerns the AWS CNI plugin

I feel this is different to what's discussed in #99 as it's not functionality you could add as a wrapper as it requires the module to know about the ordering requirements - our plan is to maintain a fork with this ability, definitely understand the desire to keep complexity at a minimum here.

dippynark on 26 Sep 2019

👍1

I feel this is different to what's discussed in #99

I don't just mean discussion in that issue, I also mean in all the referenced issues.

This only concerns the AWS CNI plugin

I know but look at all the other requests we get 🙂

Not too sure how those tools can help in this case

https://github.com/aws/amazon-vpc-cni-k8s/issues/336 😅

If you use this module to provision worker nodes, you can't (afaik) guarantee the worker nodes come up with the correct CNI configuration

What about if a dependency is created like this: cluster > external thing > all worker groups where external thing is something outside of the module that runs and sets up the CNI after cluster is created?

Maybe we could make that dependency in an a nice and elegant way that would enable you to run some kubectl patch xxx commands to edit the CNI daemonset? It could also be super handy for other things.

I think what we don't want is to tie this module in any way to release process of the CNI or any other unrelated things. For example getting PRs every time there's a new release or setting in the CNI.

max-rocket-internet on 26 Sep 2019

I would really like this feature as well, but maybe it would be better to pressure amazon to allow cni variables to be defined at cluster creation time instead of after cluster creation. It seems hacky for aws to suggest patching then terminating instances in their eks workshop

Then it would be a simple update.