Containers-roadmap: [EKS] [request]: Add/Delete/Update Subnets Registered with the Control Plane

Created on 23 Feb 2019 · 58 comments · Source: aws/containers-roadmap

Tell us about your request
The ability to update the Subnets that the EKS Control plane is registered with.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
https://twitter.com/julien_fabre/status/1099071498621411329

Are you currently working around this issue?

Additional context

EKS Proposed

Most helpful comment

This would be a nice improvement

All 58 comments

/cc @Pryz

This would be a nice improvement

To add some color, here are some use cases :

  1. You have a multi-tenant cluster configured with X number of subnets, but you are getting close to IP exhaustion and want to extend the setup with Y more subnets, without losing the current configuration of course.

  2. You are expanding your setup to new availability zones and so want to use new subnets to schedule pods there.

  3. You were using your cluster on private subnets only and now want to extend to use some public subnets.

Generally, in many environments the network setup evolves over time, and EKS needs to be flexible enough to embrace such changes.

Thanks !

@Pryz Your worker nodes don't have to be in the same subnets that your control plane is configured for. The control plane subnets are used for creating the ENIs that serve kubectl log|exec|attach and for ELB/NLB placement.

@devkid yes, but that's a problem. You can schedule pods on subnets which are not configured on the master, but then the master can't access them (logs, proxy, whatever).

@Pryz If you have proper routing between the different subnets, this is not a problem. We have configured our control plane for one set of subnets, our workers run in a second, disjoint set of subnets, and logs, proxy, and exec are working just fine.

@Pryz Could you explain in detail how to set up routing between the different subnets? I am just wondering how to access a disjoint set of subnets. I hope to hear from you! Thanks

@hanjunlee I'm not sure I understand your question. My setup is quite simple: 1 VPC, up to 5 CIDRs with 6 subnets each (3 private subnets, 3 public subnets).
Each AZ gets 2 routing tables (1 for private subnets and 1 for public subnets). There is no issue in this routing setup. Any IP from a subnet can talk with any other IP from any of the other subnets.

I am seeing the same issue. I configured 2 subnets initially, but my CIDR range was too small for IP assignments from the cluster. So I added new subnets to the VPC, and the worker nodes are running fine in these new subnets.

When using kubectl proxy and accessing the URL I get the error:

Error: 'Address is not allowed'
Trying to reach: 'https://secondary-ip-of-worker-node:8443/'

The control plane ENIs are in the old subnets, the worker nodes are in the new subnets, and the kubectl host, worker nodes, and control plane all have inbound and outbound rules for each other. I would think this is related to the new subnets not having an attached ENI for the control plane. Any help would be appreciated.

@aschonefeld did you tag the new subnets with the correct kubernetes.io/cluster/<clustername> tag set to shared?
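For reference, a minimal Terraform sketch of applying that tag to an existing subnet; the resource name, subnet ID, and cluster name below are placeholders, and it assumes an AWS provider version that includes the aws_ec2_tag resource:

# Hypothetical sketch: tag an already-existing subnet so EKS treats it as usable.
# The subnet ID and cluster name are placeholders.
resource "aws_ec2_tag" "eks_shared" {
  resource_id = "subnet-0123456789abcdef0"
  key         = "kubernetes.io/cluster/my-cluster"
  value       = "shared"
}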

I could not reproduce the error when I tagged the subnet that way. Spun up a node in the subnet and started a pod on that node. Afterwards I could run kubectl logs|exec|port-forward on that Pod.

@ckassen the CloudFormation template for the worker nodes tagged the subnets as shared. kubectl logs, for example, is also working for me, but proxying to the dashboard is not: the command goes through but no connection to the dashboard is possible. Is kubectl proxy working for you?

If you decide your load balancers are in the wrong subnets and create new ones, as far as I can tell EKS doesn't detect the new subnets and still creates the load balancers in the old subnets, even though they're no longer tagged with kubernetes.io/role/internal-elb. Being able to add the new subnets to EKS would be useful.

Any work-around known so far? Terraform wants to create a new EKS cluster for me after adding new subnets :-/

@thomasjungblut are you using this module? https://github.com/terraform-aws-modules/terraform-aws-eks the cluster aws_eks_cluster.this resource wants to be replaced?

We built our own module, but effectively it's the same TF resource that wants to be replaced yes.

It makes sense, since the API doesn't support changing the subnets: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html

Would be cool to at least have the option of adding new subnets. The background is that we want to switch from public to private subnets, so we could add our private subnets on top of the existing public ones and just change the routes a bit. Would certainly make our life a bit easier :)

We just ran into this exact issue, using that EKS TF Module too. A workaround that seems to work:

1) Create the new subnets, setup the routes, etc
2) Manually edit the ASG for the worker nodes, add the subnets
3) Edit the control plane SG, add the CIDR ranges of the new subnets
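For anyone doing this in Terraform rather than by hand, a minimal sketch of step 3; the security group ID and CIDR ranges are placeholders, not values from this thread:

# Hypothetical sketch of step 3: allow the new subnet CIDRs to reach the
# control plane security group. The SG ID and CIDRs are placeholders.
resource "aws_security_group_rule" "allow_new_subnet_cidrs" {
  security_group_id = "sg-0123456789abcdef0"           # control plane SG
  type              = "ingress"
  protocol          = "-1"
  from_port         = 0
  to_port           = 0
  cidr_blocks       = ["10.0.64.0/19", "10.0.96.0/19"] # new subnet CIDRs
}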

This, of course, breaks running TF with the EKS module for that cluster again. We're hoping to mitigate that by tightening up the TF code so we can just create new, properly sized VPCs/subnets and kill the old EKS cluster entirely.

We're trying to make a custom TF module that will do the above work without using the EKS module, so at least we can apply that programmatically in the future if needed while that cluster is still around.

We are having the same problem. We needed to add new subnets to the EKS cluster and we had to rebuild it, since aws eks update-cluster-config --resources-vpc-config does not allow updating subnets or security groups once the cluster has been built.

An error occurred (InvalidParameterException) when calling the UpdateClusterConfig operation: subnetIds and securityGroupIds cannot be updated.

Do we really need to rebuild the entire cluster to add a subnet?

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.
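If the cluster itself is managed with plain Terraform, a hedged sketch of the tag-then-upgrade idea follows; the cluster name, role ARN, version numbers, and subnet IDs are placeholders, and bumping the version is simply one way to trigger a control plane update:

# Hypothetical sketch: after tagging the new subnets, bump the cluster version
# so the control plane restarts and re-reads the tags. All values are placeholders.
resource "aws_eks_cluster" "this" {
  name     = "my-cluster"
  role_arn = "arn:aws:iam::111122223333:role/my-eks-cluster-role"
  version  = "1.18" # upgrading (e.g. from 1.17) forces a control plane update

  vpc_config {
    subnet_ids = ["subnet-0aaaxxxx", "subnet-0bbbxxxx"]
  }
}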

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround didn't work for us :( We created new subnets, tagged them with the kubernetes.io/cluster/eks: shared tag, and ran the EKS upgrade, but there were no changes to the subnets attached to EKS. Anything we missed?

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):
The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

Same as @henkka, the workaround didn't work for me either.
What should I do?

@QCU Did you modify the control plane security groups with the new CIDRs?

@qcu Did you modify the control plane security groups with the new CIDRs?

^-- @QCU266: this was for you.

The process that worked for us using terraform:
Created new subnet(s)
Tagged new subnets with kubernetes.io/cluster/ tag set to shared -- our subnets share the same route table, but if they didn't, we'd have tagged that too.
Modify security group for the control plane to add the new CIDR.
Cluster schedules pods just fine, and stuff like logs|proxy|etc work with no issue.

We use the EKS terraform module, and all of this was doable with terraform. The worker node block will happily accept a subnet that isn't one of those declared with the cluster initially. No manual changes required.

Assuming that the assets are properly tagged, I'd venture that the kubectl issues encountered above are down to SG configuration, not inherently EKS-related.

Is there any plan for this proposal? It has been open for almost a year and AWS doesn't seem to have any plan for it.

@BertieW using the TF module with the new subnets added to the worker groups "forces replacement" of the EKS cluster.

All subnets (with tags) are added to the VPC and SGs, and they share the same route table and are within the same VPC. As @thomasjungblut pointed out, the TF module appears to be restricted by the AWS API limitations and cannot add the additional subnets to the existing cluster. I don't see any way to get around replacing the EKS cluster if deploying from the TF module :(

Alright found a solution for the TF EKS module.

You can't make changes to the module's subnets parameter. So if you were passing these subnets via a variable, as I was, like so:

subnets = module.vpc.private_subnets

You'll need to provide a static definition for precisely those subnets used when creating your cluster, for example:

subnets = ["subnet-085cxxxx", "subnet-0a60xxxx", "subnet-0be8xxxx"]

Then within your worker groups you can add your new subnets. Afterwards you'll be able to apply the TF without forcing a cluster replacement.
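Putting it together, a hedged sketch assuming the terraform-aws-modules/eks/aws module as it existed at the time; the cluster name, VPC ID, group name, and the new-subnet IDs are placeholders, while the pinned subnet IDs reuse the example values above:

# Hypothetical sketch, assuming the terraform-aws-modules/eks/aws module of that
# era. Everything except the statically pinned subnet IDs is a placeholder.
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name = "my-cluster"
  vpc_id       = "vpc-0123456789abcdef0"

  # Pin exactly the subnets the cluster was created with, so Terraform
  # does not try to replace the cluster.
  subnets = ["subnet-085cxxxx", "subnet-0a60xxxx", "subnet-0be8xxxx"]

  worker_groups = [
    {
      name    = "workers-new"
      # New subnets can go here without touching the control plane subnets above.
      subnets = ["subnet-0newaxxxx", "subnet-0newbxxxx"]
    },
  ]
}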

Workaround (if your EKS cluster is behind the current latest version and you are planning to upgrade):

The control plane discovers the subnets during initialization. Tag the new subnets, then upgrade the cluster; the newly tagged subnets will be discovered automatically.

This workaround is not working anymore. The subnets chosen while creating the EKS cluster will be used by the control plane nodes only. However, users can use different subnets for worker nodes. EKS will be able to register worker nodes on different subnets which were not part of the initial set of subnets when the cluster was created.

You may attach a new CIDR range to the VPC, carve subnets out of the newly created CIDR range, and add tags to them to make sure EKS discovers those new subnets.

Worker nodes can be launched in new CIDR range subnets in the same VPC after cluster creation without any issues.
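As a hedged Terraform sketch of that approach (the VPC ID, CIDR ranges, availability zone, and cluster name are placeholders):

# Hypothetical sketch: attach a secondary CIDR block to the VPC and carve a
# tagged subnet out of it. All IDs, CIDRs, and the cluster name are placeholders.
resource "aws_vpc_ipv4_cidr_block_association" "secondary" {
  vpc_id     = "vpc-0123456789abcdef0"
  cidr_block = "100.64.0.0/16"
}

resource "aws_subnet" "extra" {
  vpc_id            = aws_vpc_ipv4_cidr_block_association.secondary.vpc_id
  cidr_block        = "100.64.0.0/19"
  availability_zone = "us-east-1a"

  tags = {
    "kubernetes.io/cluster/my-cluster" = "shared"
  }
}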

+1

+1

+1

This has been one of those issues that should just not exist, IMHO; having to rebuild a cluster just to provide it with some additional target subnets is really undesirable. We are in the position of having to rebuild one of our test clusters (luckily not a cluster hosting production workloads).

Been searching for a way to do this for a while.

We have 6 different EKS clusters that were built with either only public or only private subnets. I'd very much like to correct all 6 of them. They were all built with terraform-aws-modules/eks/aws, and I'd like to keep maintaining them with such.

Another issue: an unused subnet was deleted, but EKS still has it listed under Networking. We tried to turn on CloudWatch logging, as we have problems with the K8s API, but it fails with the message:

InvalidRequestException The subnet ID 'subnet-xxxx892f' does not exist (Service: AmazonEC2; Status Code: 400; Error Code: InvalidSubnetID.NotFound; Request ID: ce958c55-3c8a-4154-8cfa-25891bf8ca3f)

EDIT: why was deleting the subnet allowed, when this possibly broke the whole EKS cluster? K8s API i/o timeout issues, nodes not joining the cluster, resolution issues, inability to turn on logging. And no way to solve it, even via support.

We are facing the same issue. We are trying to increase the number of subnets on the EKS cluster, and this is not possible without cluster recreation. This scenario has been tested by AWS support too, with CloudFormation, and they reached the same result I did with terraform. The only solution is to change the core behavior of EKS to allow updating the subnets on the fly. We can't afford to redeploy an entire production cluster just because EKS does not support the "normal action" of updating the subnets it is using. Please provide feedback on when this functionality will be implemented, because production clusters are affected by this missing feature.

+1, this is very much needed. Is there any timeline on the addition of subnets without recreation?
Is there any workaround for it?

Just don't delete any of the subnets (it will let you if there are no used IPs), or you will be forced to recreate an entire EKS cluster.

@thorro So we cannot start new pods in the cluster since there are not enough IPs. We have created some more subnets in the same VPC; is it possible to use those subnets to solve this use case without recreating the cluster?

@manikandanmuthu44 we had no problems with that, just tag them.

We also ran out of IP addresses due to initially choosing subnets that were too small for EKS. We created bigger subnets, tagged them, untagged the smaller subnets, and terminated the nodes one by one, waiting for each new node to accept pods before going on to the next one. It solved our problem in production with no downtime.

The only issue is that the control plane is stuck with the old subnets, meaning we can't tidy them up.

@tomfotherby Thanks, what you described worked out. We did something similar: created new, bigger subnets and tagged them, the booted nodes got associated with the cluster, and pods started picking up IPs in the new, bigger subnets.

The behaviour of not allowing subnets to be modified after creation is highly undesirable.

$ aws eks update-cluster-config --name eks-cluster --resources-vpc-config subnetIds=subnet-abc,subnet-def,subnet-123

An error occurred (InvalidParameterException) when calling the UpdateClusterConfig operation: subnetIds and securityGroupIds cannot be updated.

In my case a one-character typo in terraform resulted in a cluster being created in the wrong AZ:
availability_zone = var.azs[2]
I have tried a couple of things but in the end it was faster to rebuild the cluster.
It would appear there is no easy way to edit the subnets on EKS.

+1

+1 for EKS feature

I was able to add AZs/subnets manually, but it's a hack that the EKS cluster itself can't be configured the same way.

+1

@nitrag how did you accomplish this? you were able to add subnets to an existing EKS cluster?

@Ticiano-mw

Added subnets, added route table associations. Updated the ASGs with additional vpc_zone_identifier subnets.
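In Terraform terms that is roughly the sketch below (the subnet and route table IDs are placeholders); the ASG part would be extra entries in the worker Auto Scaling Group's vpc_zone_identifier list:

# Hypothetical sketch: associate a newly created subnet with the existing
# route table. The subnet and route table IDs are placeholders.
resource "aws_route_table_association" "new_subnet" {
  subnet_id      = "subnet-0newaxxxx"
  route_table_id = "rtb-0123456789abcdef0"
}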

@Ticiano-mw how did you accomplish this?

@Ticiano-mw

Added subnets, added route table associations. Updated the ASGs with additional vpc_zone_identifier subnets.

But there are no ASGs for the EKS control plane hosts

Does anyone have a workaround for this? None of the solutions presented worked for me...
Btw, this issue has been open a long time; AWS should address this ASAP.

Unfortunately it is necessary to recreate the cluster to update the subnets.

This is very bad; I need to include a subnet and I will need to recreate my 5 clusters.

We used to have /20 subnets but the IP pool was not enough. We created extra subnets (/19), routing, and ASGs with the EKS-required tags, and it works very well; the new subnets don't have to be registered with the control plane.

I was in a similar position recently. A single worker group in private subnets without enough IPs in the CIDR.

FWIW here are the steps I took with the EKS module:

  1. Added a secondary VPC CIDR and three additional private subnets that use that range. Key at this stage was hard-coding the subnet IDs that are passed to the EKS module (as mentioned by cazter above), otherwise terraform wants to destroy and recreate the existing autoscaling group(s), which would destroy the worker nodes too.
  2. Added a second worker group in the new subnets that were created in step 1. Again, the subnet IDs for these need to be hard-coded when passed to the EKS module (see the sketch after this list).
  3. Disable cluster-autoscaler on the original worker group. (This is done with the autoscaling_enabled = false option in the EKS module)
  4. Gracefully drain all the nodes in the original worker group, cluster-autoscaler will bring up new nodes in the new group as needed and pods will migrate to them.
  5. Scale the original group to zero (set asg_min_size=0 and asg_max_size=0, apply, and then delete the nodes manually). See the note in the FAQ.
  6. At this point the hardcoded subnets can be swapped out for the appropriate resource attributes and the terraform code is 'clean' of the nasty hard-coding.
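A hedged sketch of what steps 2, 3, and 5 look like inside the module "eks" block, assuming the worker_groups keys the EKS module accepted at the time; subnet IDs and group names are placeholders:

  # Hypothetical fragment inside the existing module "eks" block.
  worker_groups = [
    {
      name                = "workers-original"
      subnets             = ["subnet-0oldaxxxx", "subnet-0oldbxxxx"] # hard-coded original subnets
      autoscaling_enabled = false # step 3: disable cluster-autoscaler for this group
      # step 5, later: set asg_min_size = 0 and asg_max_size = 0 on this group
    },
    {
      name    = "workers-secondary-cidr"
      subnets = ["subnet-0newaxxxx", "subnet-0newbxxxx"] # step 2: group in the new subnets
    },
  ]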

This doesn't change the fact that only the original subnets are registered with the control plane of course, but as others have said it does give fully operational pods and those pods are accessible by the control plane. Plus the terraform code is still sane when you're finished.

There is, however, a big gotcha: custom APIServices no longer work, as the pods that back them are now in the new subnets and the control plane can't discover them. This breaks autoscaling, for example, because HorizontalPodAutoscaler resources rely on metrics-server.

$ kubectl get apiservice
NAME                                   SERVICE                                      AVAILABLE                      AGE
[...]
v1beta1.metrics.k8s.io                 kube-system/metrics-server                   False (FailedDiscoveryCheck)   104d

I worked around this by bringing back up the original worker group (in the original subnets) and using taints/tolerations and node selectors to pin the metrics-server pods to the workers in that group. Unfortunately, bringing back that worker group also means going back to hard-coded subnet definitions. So it's a workable but very undesirable solution.

$ kubectl get apiservice
NAME                                   SERVICE                                      AVAILABLE                      AGE
[...]
v1beta1.metrics.k8s.io                 kube-system/metrics-server                   True                           104d

As everyone else has already said, this needs fixing properly.

+1

I can confirm that a control plane restart has fixed the "Address is not allowed" issue, assuming the new subnets in the secondary CIDR have the proper tags (kubernetes.io/cluster/mycluster = shared).
Enable the custom CNI config in the aws-node daemonset:

        - "name": "AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG"
          "value": "true"
        - "name": "ENI_CONFIG_LABEL_DEF"
          "value": "failure-domain.beta.kubernetes.io/zone"

Apply the ENIConfig custom resources:

apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a
spec:
  securityGroups:
    - sg-0f50a320<id>
  subnet: subnet-03bfe23a<id>
...
# and similar for other AZs

Roll the worker nodes and see that pods are now being assigned IPs in the new subnets (the nodes' subnets remain the "old" ones; there is no reason to change the subnets in the ASG).

metrics-server-b6866958f-q9fk2             1/1     Running   0          20h   10.7.13.140   ip-10-7-2-113.ec2.internal   <none>           <none>
metrics-server-b6866958f-w7cfq             1/1     Running   0          20h   10.7.10.132   ip-10-7-1-157.ec2.internal   <none>           <none>

Check the APIService status and find that it has failed with https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Address is not allowed:

k describe apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
...
    Message:               failing or missing response from https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Get https://10.7.13.71:8443/apis/metrics.k8s.io/v1beta1: Address is not allowed
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

After that, simply update the EKS version from 1.17 to 1.18 and check the status after updating:

k describe apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
...
Status:
  Conditions:
    Last Transition Time:  2020-10-15T19:26:42Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>

I suppose the EKS control plane reads the subnet tags at startup time and builds a list to pass to the --proxy-cidr-whitelist parameter. Adding new subnets while the cluster is running doesn't trigger an API server config reload, so the new CIDRs are not in the --proxy-cidr-whitelist list.
On the other hand, a version update causes the EKS control plane to restart, so it regenerates the IP ranges that are allowed to connect/proxy to the control plane.

This behavior has been tested on at least 3 different clusters.
Unfortunately, there's no way to restart the EKS control plane on demand, so a version upgrade is the only way to do it.

I just upgraded my EKS managed node groups from 1.17 -> 1.18. During the upgrade process, my node groups became degraded because one of the AZs no longer had instance type availability. I could not change the node group network settings to work around this. I tried to adjust it through the ASG directly, which was not reflected on the control plane side. This really needs to be fixed; an upgrade process should not be this painful for a managed service.

+1

Same problem: HPA is completely broken when a custom ENI config is used after enabling a secondary CIDR block for an existing EKS cluster. We are on 1.18 already, so there is no workaround for us to force a control plane restart.
