Containers-roadmap: [EKS] [request]: Add/Delete/Update Subnets Registered with the Control Plane

Created on 23 Feb 2019 · 58 comments · Source: aws/containers-roadmap

Tell us about your request
The ability to update the Subnets that the EKS Control plane is registered with.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
https://twitter.com/julien_fabre/status/1099071498621411329

Are you currently working around this issue?

Additional context

EKS Proposed

Most helpful comment

This would be a nice improvement

All 58 comments

/cc @Pryz

This would be a nice improvement

To add some color, here are some use cases :

  1. You have a multi-tenant cluster configured with X number of subnets, but you are getting close to IP exhaustion and want to extend the setup with Y more subnets, without losing the current configuration of course.

  2. You are expanding your setup to new availability zones and so want to use new subnets to schedule pods there.

  3. You were using your cluster on private subnets only and now want to extend to use some public subnets.

Generally, in many environments the network setup evolves over time, and EKS needs to be flexible enough to embrace such changes.

Thanks !

@Pryz Your worker nodes don't have to be in the same subnets that your control plane is configured for. The control plane subnets are used for creating the ENIs that serve kubectl log|exec|attach and for ELB/NLB placement.

@devkid yes, but that's a problem. You can schedule pods on subnets which are not configured on the master, but then the master can't access them (logs, proxy, whatever).

@Pryz If you have proper routing between the different subnets, this is not a problem. We have configured our control plane for one set of subnets, our workers run in a second, disjoint set of subnets, and logs, proxy, and exec are working just fine.

@Pryz Could you explain in detail how to set up routing between the different subnets? I am just wondering how to access a disjoint set of subnets. I hope to hear from you! Thanks

@hanjunlee I'm not sure I understand your question. My setup is quite simple: 1 VPC, up to 5 CIDRs with 6 subnets each (3 private subnets, 3 public subnets).
Each AZ gets 2 routing tables (1 for private subnets and 1 for public subnets). There is no issue in this routing setup. Any IP from a subnet can talk with any other IP from any of the other subnets.

I am seeing the same issue. I configured 2 subnets initially, but my CIDR range was too small for IP assignments from the cluster. So I added new subnets to the VPC, and the worker nodes are running fine in these new subnets.

When using kubectl proxy and accessing the URL I get the error:

Error: 'Address is not allowed'
Trying to reach: 'https://secondary-ip-of-worker-node:8443/'

The control plane ENIs are in the old subnets, the worker nodes are in the new subnets, and the kubectl host, worker nodes, and control plane all have inbound and outbound rules for each other. I would think this is related to the new subnets not having an attached ENI for the control plane. Any help would be appreciated.

@aschonefeld did you tag the new subnets with the correct kubernetes.io/cluster/<clustername> tag set to shared?
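For reference, a minimal Terraform sketch of applying that tag to an existing subnet; the resource name, subnet ID, and cluster name below are placeholders, and it assumes an AWS provider version that includes the aws_ec2_tag resource:

# Hypothetical sketch: tag an already-existing subnet so EKS treats it as usable.
# The subnet ID and cluster name are placeholders.
resource "aws_ec2_tag" "eks_shared" {
  resource_id = "subnet-0123456789abcdef0"
  key         = "kubernetes.io/cluster/my-cluster"
  value       = "shared"
}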

I could not reproduce the error when I tagged the subnet that way. Spun up a node in the subnet and started a pod on that node. Afterwards I could run kubectl logs|exec|port-forward on that Pod.

@ckassen the CloudFormation template for the worker nodes tagged the subnets as shared. kubectl logs, for example, is also working for me, but proxying to the dashboard is not: the command goes through but no connection to the dashboard is possible. Is kubectl proxy working for you?

If you decide your load balancers are in the wrong subnets and create new ones, as far as I can tell EKS doesn't detect the new subnets and still creates the load balancers in the old subnets, even though they're no longer tagged with kubernetes.io/role/internal-elb. Being able to add the new subnets to EKS would be useful.

Any work-around known so far? Terraform wants to create a new EKS cluster for me after adding new subnets :-/

@thomasjungblut are you using this module? https://github.com/terraform-aws-modules/terraform-aws-eks the cluster aws_eks_cluster.this resource wants to be replaced?

We built our own module, but effectively it's the same TF resource that wants to be replaced yes.

It makes sense, since the API doesn't support changing the subnets: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html

Would be cool to at least have the option of adding new subnets. The background is that we want to switch from public to private subnets, so we could add our private subnets on top of the existing public ones and just change the routes a bit. Would certainly make our life a bit easier :)

We just ran into this exact issue, using that EKS TF Module too. A workaround that seems to work:

1) Create the new subnets, setup the routes, etc
2) Manually edit the ASG for the worker nodes, add the subnets
3) Edit the control plane SG, add the CIDR ranges of the new subnets
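For anyone doing this in Terraform rather than by hand, a minimal sketch of step 3; the security group ID and CIDR ranges are placeholders, not values from this thread:

# Hypothetical sketch of step 3: allow the new subnet CIDRs to reach the
# control plane security group. The SG ID and CIDRs are placeholders.
resource "aws_security_group_rule" "allow_new_subnet_cidrs" {
  security_group_id = "sg-0123456789abcdef0"           # control plane SG
  type              = "ingress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["10.0.64.0/19", "10.0.96.0/19"] # new subnet CIDRs
}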

This, of course, breaks running TF with the EKS module for that cluster again. We're hoping to mitigate that by tightening up the TF code so we can just create new, properly sized VPCs/subnets and kill the old EKS cluster entirely.

We're trying to make a custom TF module that will do the above work without using the EKS module, so at least we can apply that programmatically in the future if needed while that cluster is still around.

We are having the same problem. We needed to add new subnets to the EKS cluster and we had to rebuild it, since aws eks update-cluster-config --resources-vpc-config does not allow updating subnets or security groups once the cluster has been built.

An error occurred (InvalidParameterException) when calling the UpdateClusterConfig operation: subnetIds and securityGroupIds cannot be updated.

Do we really need to rebuild the entire cluster to add a subnet?

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.
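If the cluster itself is managed with plain Terraform, a hedged sketch of the tag-then-upgrade idea follows; the cluster name, role ARN, version numbers, and subnet IDs are placeholders, and bumping the version is simply one way to trigger a control plane update:

# Hypothetical sketch: after tagging the new subnets, bump the cluster version
# so the control plane restarts and re-reads the tags. All values are placeholders.
resource "aws_eks_cluster" "this" {
  name     = "my-cluster"
  role_arn = "arn:aws:iam::111122223333:role/my-eks-cluster-role"
  version  = "1.18" # upgrading (e.g. from 1.17) forces a control plane update

  vpc_config {
    subnet_ids = ["subnet-0aaaxxxx", "subnet-0bbbxxxx"]
  }
}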

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround didn't work for us :( We created new subnets, tagged them with the kubernetes.io/cluster/eks: shared tag, and ran the EKS upgrade, but there were no changes to the subnets attached to EKS. Anything we missed?

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):
The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

Same as @henkka, the workaround didn't work for me either.
What should I do?

@QCU Did you modify the control plane security groups with the new CIDRs?

@qcu Did you modify the control plane security groups with the new CIDRs?

^-- @QCU266: this was for you.

The process that worked for us using terraform:
Created new subnet(s)
Tagged new subnets with kubernetes.io/cluster/ tag set to shared -- our subnets share the same route table, but if they didn't, we'd have tagged that too.
Modify security group for the control plane to add the new CIDR.
Cluster schedules pods just fine, and stuff like logs|proxy|etc work with no issue.

We use the EKS terraform module, and all of this was doable with terraform. The worker node block will happily accept a subnet that isn't one of those declared with the cluster initially. No manual changes required.

Assuming that the assets are properly tagged, I'd venture that the kubectl issues encountered above are down to SG configuration, not inherently EKS-related.

Is there any plan for this proposal? It has been open for almost a year and AWS doesn't seem to have any plan for it.

@BertieW using the TF module with the new subnets added to the worker groups "forces replacement" of the EKS cluster.

All subnets (with tags) are added to the VPC and SGs, and they share the same route table and are within the same VPC. As @thomasjungblut pointed out, the TF module appears to be restricted by the AWS API limitations and cannot add the additional subnets to the existing cluster. I don't see any way to get around replacing the EKS cluster if deploying from the TF module :(

Alright found a solution for the TF EKS module.

You can't make changes to the module's subnets parameter. So if you were passing these subnets via a variable, as I was, like so:

subnets = module.vpc.private_subnets

You'll need to provide a static definition for precisely those subnets used when creating your cluster, for example:

subnets = ["subnet-085cxxxx", "subnet-0a60xxxx", "subnet-0be8xxxx"]

Then within your worker groups you can add your new subnets. Afterwards you'll be able to apply the TF without forcing a cluster replacement.
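Putting it together, a hedged sketch assuming the terraform-aws-modules/eks/aws module as it existed at the time; the cluster name, VPC ID, group name, and the new-subnet IDs are placeholders, while the pinned subnet IDs reuse the example values above:

# Hypothetical sketch, assuming the terraform-aws-modules/eks/aws module of that
# era. Everything except the statically pinned subnet IDs is a placeholder.
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name = "my-cluster"
  vpc_id       = "vpc-0123456789abcdef0"

  # Pin exactly the subnets the cluster was created with, so Terraform
  # does not try to replace the cluster.
  subnets = ["subnet-085cxxxx", "subnet-0a60xxxx", "subnet-0be8xxxx"]

  worker_groups = [
    {
      name    = "workers-new"
      # New subnets can go here without touching the control plane subnets above.
      subnets = ["subnet-0newaxxxx", "subnet-0newbxxxx"]
    },
  ]
}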

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround is not working anymore. The subnets chosen while creating the EKS cluster will be used by the control plane nodes only. However, users can use different subnets for worker nodes. EKS will be able to register worker nodes on different subnets which were not part of the initial set of subnets when the cluster was created.

You may attach a new CIDR range to the VPC, carve subnets out of the newly created CIDR range, and add tags to them to make sure EKS discovers those new subnets.

Worker nodes can be launched in new CIDR range subnets in the same VPC after cluster creation without any issues.
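As a hedged Terraform sketch of that approach (the VPC ID, CIDR ranges, availability zone, and cluster name are placeholders):

# Hypothetical sketch: attach a secondary CIDR block to the VPC and carve a
# tagged subnet out of it. All IDs, CIDRs, and the cluster name are placeholders.
resource "aws_vpc_ipv4_cidr_block_association" "secondary" {
  vpc_id     = "vpc-0123456789abcdef0"
  cidr_block = "100.64.0.0/16"
}

resource "aws_subnet" "extra" {
  vpc_id            = aws_vpc_ipv4_cidr_block_association.secondary.vpc_id
  cidr_block        = "100.64.0.0/19"
  availability_zone = "us-east-1a"

  tags = {
    "kubernetes.io/cluster/my-cluster" = "shared"
  }
}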

+1

+1

+1

This has been one of those issues that should just not exist, IMHO; having to rebuild a cluster just to provide it with some additional target subnets is really undesirable. We are in the position of having to rebuild one of our test clusters (luckily not a cluster hosting production workloads).

Been searching for a way to do this for a while.

We have 6 different EKS clusters that were built with either only public or only private subnets. I'd very much like to correct all 6 of them. They were all built with terraform-aws-modules/eks/aws, and I'd like to keep maintaining them with such.

Another issue: an unused subnet was deleted, but EKS still has it listed under Networking. We tried to turn on CloudWatch logging, as we have problems with the K8s API, but it fails with the message:

InvalidRequestException The subnet ID 'subnet-xxxx892f' does not exist (Service: AmazonEC2; Status Code: 400; Error Code: InvalidSubnetID.NotFound; Request ID: ce958c55-3c8a-4154-8cfa-25891bf8ca3f)

EDIT: why was deleting the subnet allowed, when this possibly broke the whole EKS cluster? K8s API i/o timeout issues, nodes not joining the cluster, resolution issues, inability to turn on logging. And no way to solve it, even via support.

We are facing the same issue. We are trying to increase the number of subnets on the EKS cluster, and this is not possible without cluster recreation. This scenario has been tested by AWS support too, with CloudFormation, and they reached the same result I did with terraform. The only solution is to change the core behavior of EKS to allow updating the subnets on the fly. We can't afford to redeploy an entire production cluster just because EKS does not support the "normal action" of updating the subnets it is using. Please provide feedback on when this functionality will be implemented, because production clusters are affected by this missing feature.

+1, this is very much needed. Is there any timeline on the addition of subnets without recreation?
Is there any workaround for it?

Just don't delete any of the subnets (it will let you if there are no used IPs), or you will be forced to recreate an entire EKS cluster.

@thorro So we cannot start new pods in the cluster since there are not enough IPs. We have created some more subnets in the same VPC; is it possible to use those subnets to solve this use case without recreating the cluster?

@manikandanmuthu44 we had no problems with that, just tag them.

We also ran out of IP addresses due to initially choosing subnets that were too small for EKS. We created bigger subnets, tagged them, untagged the smaller subnets, and terminated the nodes one by one, waiting for each new node to accept pods before going on to the next one. It solved our problem in production with no downtime.

The only issue is that the control plane is stuck with the old subnets, meaning we can't tidy them up.

@tomfotherby Thanks, what you described worked out. We did something similar: created new, bigger subnets and tagged them, the booted nodes got associated with the cluster, and pods started picking up IPs in the new, bigger subnets.

The behaviour of not allowing subnets to be modified after creation is highly undesirable.

$ aws eks update-cluster-config --name eks-cluster --resources-vpc-config subnetIds=subnet-abc,subnet-def,subnet-123

An error occurred (InvalidParameterException) when calling the UpdateClusterConfig operation: subnetIds and securityGroupIds cannot be updated.

In my case a one-character typo in terraform resulted in a cluster being created in the wrong AZ:
availability_zone = var.azs[2]
I have tried a couple of things but in the end it was faster to rebuild the cluster.
It would appear there is no easy way to edit the subnets on EKS.

+1

+1 for EKS feature

I was able to add AZs/subnets manually, but it's a hack that the EKS cluster itself can't be configured the same way.

+1

@nitrag how did you accomplish this? you were able to add subnets to an existing EKS cluster?

@Ticiano-mw

Added subnets, added route table associations. Updated the ASGs with additional vpc_zone_identifier subnets.
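In Terraform terms that is roughly the sketch below (the subnet and route table IDs are placeholders); the ASG part would be extra entries in the worker Auto Scaling Group's vpc_zone_identifier list:

# Hypothetical sketch: associate a newly created subnet with the existing
# route table. The subnet and route table IDs are placeholders.
resource "aws_route_table_association" "new_subnet" {
  subnet_id      = "subnet-0newaxxxx"
  route_table_id = "rtb-0123456789abcdef0"
}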

@Ticiano-mw how did you accomplish this?

@Ticiano-mw

Added subnets, added route table associations. Updated the ASGs with additional vpc_zone_identifier subnets.

But there are no ASGs for the EKS control plane hosts

Does anyone have a workaround for this? None of the solutions presented worked for me...
Btw, this issue has been open a long time; AWS should address this ASAP.

Unfortunately it is necessary to recreate the cluster to update the subnets.

This is very bad; I need to include a subnet and I will need to recreate my 5 clusters.

We used to have /20 subnets but the IP pool was not enough. We created extra subnets (/19), routing, and ASGs with the EKS-required tags, and it works very well; the new subnets don't have to be registered with the control plane.

I was in a similar position recently. A single worker group in private subnets without enough IPs in the CIDR.

FWIW here are the steps I took with the EKS module:

  1. Added a secondary VPC CIDR and three additional private subnets that use that range. Key at this stage was hard-coding the subnet IDs that are passed to the EKS module (as mentioned by cazter above), otherwise terraform wants to destroy and recreate the existing autoscaling group(s), which would destroy the worker nodes too.
  2. Added a second worker group in the new subnets that were created in step 1. Again, the subnet IDs for these need to be hard-coded when passed to the EKS module (see the sketch after this list).
  3. Disable cluster-autoscaler on the original worker group. (This is done with the autoscaling_enabled = false option in the EKS module)
  4. Gracefully drain all the nodes in the original worker group, cluster-autoscaler will bring up new nodes in the new group as needed and pods will migrate to them.
  5. Scale the original group to zero (set asg_min_size=0 and asg_max_size=0, apply, and then delete the nodes manually). See the note in the FAQ.
  6. At this point the hardcoded subnets can be swapped out for the appropriate resource attributes and the terraform code is 'clean' of the nasty hard-coding.
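A hedged sketch of what steps 2, 3, and 5 look like inside the module "eks" block, assuming the worker_groups keys the EKS module accepted at the time; subnet IDs and group names are placeholders:

  # Hypothetical fragment inside the existing module "eks" block.
  worker_groups = [
    {
      name                = "workers-original"
      subnets             = ["subnet-0oldaxxxx", "subnet-0oldbxxxx"] # hard-coded original subnets
      autoscaling_enabled = false # step 3: disable cluster-autoscaler for this group
      # step 5, later: set asg_min_size = 0 and asg_max_size = 0 on this group
    },
    {
      name    = "workers-secondary-cidr"
      subnets = ["subnet-0newaxxxx", "subnet-0newbxxxx"] # step 2: group in the new subnets
    },
  ]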

This doesn't change the fact that only the original subnets are registered with the control plane of course, but as others have said it does give fully operational pods and those pods are accessible by the control plane. Plus the terraform code is still sane when you're finished.

There is, however, a big gotcha: custom APIServices no longer work, as the pods that back them are now in the new subnets and the control plane can't discover them. This breaks autoscaling, for example, because HorizontalPodAutoscaler resources rely on metrics-server.

$ kubectl get apiservice
NAME                                   SERVICE                                      AVAILABLE                      AGE
[...]
v1beta1.metrics.k8s.io                 kube-system/metrics-server                   False (FailedDiscoveryCheck)   104d

I worked around this by bringing back up the original worker group (in the original subnets) and using taints/tolerations and node selectors to pin the metrics-server pods to the workers in that group. Unfortunately, bringing back that worker group also means going back to hard-coded subnet definitions. So it's a workable but very undesirable solution.

$ kubectl get apiservice
NAME                                   SERVICE                                      AVAILABLE                      AGE
[...]
v1beta1.metrics.k8s.io                 kube-system/metrics-server                   True                           104d

As everyone else has already said, this needs fixing properly.

+1

I can confirm that a control plane restart has fixed the "Address is not allowed" issue, assuming the new subnets in the secondary CIDR have the proper tags (kubernetes.io/cluster/mycluster = shared).
Enable the custom CNI config in the aws-node daemonset:

        - "name": "AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG"
          "value": "true"
        - "name": "ENI_CONFIG_LABEL_DEF"
          "value": "failure-domain.beta.kubernetes.io/zone"

Apply the ENIConfig custom resources:

apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a
spec:
  securityGroups:
    - sg-0f50a320<id>
  subnet: subnet-03bfe23a<id>
...
# and similar for other AZs

Roll the worker nodes and see that pods are now being assigned IPs in the new subnets (the nodes' subnets remain the "old" ones; there is no reason to change the subnets in the ASG).

metrics-server-b6866958f-q9fk2             1/1     Running   0          20h   10.7.13.140   ip-10-7-2-113.ec2.internal   <none>           <none>
metrics-server-b6866958f-w7cfq             1/1     Running   0          20h   10.7.10.132   ip-10-7-1-157.ec2.internal   <none>           <none>

Check the APIService status and find that it has failed with https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Address is not allowed:

k describe apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
...
    Message:               failing or missing response from https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Get https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Address is not allowed
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

After that, simply update the EKS version from 1.17 to 1.18 and check the status after updating:

k describe apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
...
Status:
  Conditions:
    Last Transition Time:  2020-10-15T19:26:42Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>

I suppose the EKS control plane reads the subnet tags at startup time and builds a list to pass to the --proxy-cidr-whitelist parameter. Adding new subnets while the cluster is running doesn't trigger an API server config reload, so the new CIDRs are not in the --proxy-cidr-whitelist list.
On the other hand, a version update causes the EKS control plane to restart, so it regenerates the IP ranges that are allowed to connect/proxy to the control plane.

This behavior has been tested on at least 3 different clusters.
Unfortunately, there's no way to restart the EKS control plane on demand, so a version upgrade is the only way to do it.

I just upgraded my EKS managed node groups from 1.17 -> 1.18. During the upgrade process, my node groups became degraded because one of the AZs no longer had instance type availability. I could not change the node group network settings to work around this. I tried to adjust it through the ASG directly, which was not reflected on the control plane side. This really needs to be fixed; an upgrade process should not be this painful for a managed service.

+1

Same problem: HPA is completely broken when a custom ENI config is used after enabling a secondary CIDR block for an existing EKS cluster. We are on 1.18 already, so there is no workaround for us to force a control plane restart.
