What happened?
Using the current version of eksctl (0.5.2), creating two nodegroups from a config file creates both nodegroups, but only the first one joins the cluster. Checking the aws-auth ConfigMap shows that only one nodegroup's instance role is present.
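A quick way to see which instance roles actually made it into the ConfigMap is to print mapRoles directly; after creating both nodegroups, it lists only one rolearn entry:

$ kubectl get configmap -n kube-system aws-auth -o jsonpath='{.data.mapRoles}'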
What you expected to happen?
I would expect all of the nodegroups to be added to the aws-auth ConfigMap, and therefore to show up in kubectl get nodes.
How to reproduce it?
Using a 1.13 EKS cluster named development, create the nodegroups with eksctl create nodegroup -f test.yaml, where test.yaml is the following config file:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: development
  region: eu-west-1
  version: "1.13"

nodeGroups:
  - name: ng-1
    instanceType: t3.small
    desiredCapacity: 2
    amiFamily: AmazonLinux2
    privateNetworking: true
  - name: ng-2
    instanceType: t3.small
    desiredCapacity: 2
    amiFamily: AmazonLinux2
    privateNetworking: true
Anything else we need to know?
OS: macOS
eksctl: Installed using brew.
This issue seems to have been introduced between 0.4.1 and 0.5.2, as 0.4.1 works fine.
Versions
$ eksctl version
[ℹ]  version.Info{BuiltAt:"", GitCommit:"", GitTag:"0.5.2"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T12:36:28Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.10-eks-5ac0f1", GitCommit:"5ac0f1d9ab2c254ea2b0ce3534fd72932094c6e1", GitTreeState:"clean", BuildDate:"2019-08-20T22:39:46Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
$ go version
go version go1.13 darwin/amd64
Logs
eksctl 0.5.2$ eksctl create nodegroup -f test.yaml
[ℹ]  using region eu-west-1
[ℹ]  nodegroup "ng-1" will use "ami-0199284372364b02a" [AmazonLinux2/1.13]
[ℹ]  nodegroup "ng-2" will use "ami-0199284372364b02a" [AmazonLinux2/1.13]
[ℹ]  2 nodegroups (ng-1, ng-2) were included (based on the include/exclude rules)
[ℹ]  will create a CloudFormation stack for each of 2 nodegroups in cluster "development"
[ℹ]  2 parallel tasks: { create nodegroup "ng-1", create nodegroup "ng-2" }
[ℹ]  building nodegroup stack "eksctl-development-nodegroup-ng-1"
[ℹ]  building nodegroup stack "eksctl-development-nodegroup-ng-2"
[ℹ]  --nodes-min=2 was set automatically for nodegroup ng-2
[ℹ]  --nodes-max=2 was set automatically for nodegroup ng-2
[ℹ]  --nodes-min=2 was set automatically for nodegroup ng-1
[ℹ]  --nodes-max=2 was set automatically for nodegroup ng-1
[ℹ]  deploying stack "eksctl-development-nodegroup-ng-2"
[ℹ]  deploying stack "eksctl-development-nodegroup-ng-1"
[ℹ]  adding role "arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-N75PLALYFBL1" to auth ConfigMap
[ℹ]  nodegroup "ng-1" has 0 node(s)
[ℹ]  waiting for at least 2 node(s) to become ready in "ng-1"
[ℹ]  nodegroup "ng-1" has 2 node(s)
[ℹ]  node "ip-10-3-43-18.eu-west-1.compute.internal" is ready
[ℹ]  node "ip-10-3-6-25.eu-west-1.compute.internal" is ready
$ eksctl get nodegroups --cluster development
CLUSTER NODEGROUP CREATED MIN SIZE MAX SIZE DESIRED CAPACITY INSTANCE TYPE IMAGE ID
development ng-1 2019-09-08T19:41:02Z 2 2 2 t3.small ami-0199284372364b02a
development ng-2 2019-09-08T19:41:02Z 2 2 2 t3.small ami-0199284372364b02a
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-3-43-18.eu-west-1.compute.internal Ready <none> 93s v1.13.8-eks-cd3eb0
ip-10-3-6-25.eu-west-1.compute.internal Ready <none> 93s v1.13.8-eks-cd3eb0
$ kubectl get configmap -n kube-system aws-auth -o yaml
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-N75PLALYFBL1
      username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2019-09-08T19:45:10Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "918"
  selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
  uid: 2b278b81-d271-11e9-987b-0a80449407fe
eksctl 0.4.1$ eksctl create nodegroup -f test.yaml
[ℹ]  using region eu-west-1
[ℹ]  nodegroup "ng-1" will use "ami-00ac2e6b3cb38a9b9" [AmazonLinux2/1.13]
[ℹ]  nodegroup "ng-2" will use "ami-00ac2e6b3cb38a9b9" [AmazonLinux2/1.13]
[ℹ]  2 nodegroups (ng-1, ng-2) were included
[ℹ]  will create a CloudFormation stack for each of 2 nodegroups in cluster "development"
[ℹ]  2 parallel tasks: { create nodegroup "ng-1", create nodegroup "ng-2" }
[ℹ]  building nodegroup stack "eksctl-development-nodegroup-ng-2"
[ℹ]  building nodegroup stack "eksctl-development-nodegroup-ng-1"
[ℹ]  --nodes-min=2 was set automatically for nodegroup ng-1
[ℹ]  --nodes-max=2 was set automatically for nodegroup ng-1
[ℹ]  --nodes-min=2 was set automatically for nodegroup ng-2
[ℹ]  --nodes-max=2 was set automatically for nodegroup ng-2
[ℹ]  deploying stack "eksctl-development-nodegroup-ng-1"
[ℹ]  deploying stack "eksctl-development-nodegroup-ng-2"
[ℹ]  adding role "arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-2RX93EXNE729" to auth ConfigMap
[ℹ]  nodegroup "ng-1" has 0 node(s)
[ℹ]  waiting for at least 2 node(s) to become ready in "ng-1"
[ℹ]  nodegroup "ng-1" has 2 node(s)
[ℹ]  node "ip-10-3-20-98.eu-west-1.compute.internal" is ready
[ℹ]  node "ip-10-3-47-125.eu-west-1.compute.internal" is ready
[ℹ]  adding role "arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-14X8P4JO00HZO" to auth ConfigMap
[ℹ]  nodegroup "ng-2" has 0 node(s)
[ℹ]  waiting for at least 2 node(s) to become ready in "ng-2"
[ℹ]  nodegroup "ng-2" has 2 node(s)
[ℹ]  node "ip-10-3-41-213.eu-west-1.compute.internal" is ready
[ℹ]  node "ip-10-3-7-52.eu-west-1.compute.internal" is ready
[✔]  created 2 nodegroup(s) in cluster "development"
[ℹ]  checking security group configuration for all nodegroups
[ℹ]  all nodegroups have up-to-date configuration
$ eksctl get nodegroups --cluster development
CLUSTER NODEGROUP CREATED MIN SIZE MAX SIZE DESIRED CAPACITY INSTANCE TYPE IMAGE ID
development ng-1 2019-09-08T19:50:26Z 2 2 2 t3.small ami-00ac2e6b3cb38a9b9
development ng-2 2019-09-08T19:50:26Z 2 2 2 t3.small ami-00ac2e6b3cb38a9b9
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-3-20-98.eu-west-1.compute.internal Ready <none> 2m20s v1.13.7-eks-c57ff8
ip-10-3-41-213.eu-west-1.compute.internal Ready <none> 2m7s v1.13.7-eks-c57ff8
ip-10-3-47-125.eu-west-1.compute.internal Ready <none> 2m19s v1.13.7-eks-c57ff8
ip-10-3-7-52.eu-west-1.compute.internal Ready <none> 2m2s v1.13.7-eks-c57ff8
$ kubectl get configmap -n kube-system aws-auth -o yaml
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-2RX93EXNE729
      username: system:node:{{EC2PrivateDNSName}}
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::930347582273:role/eksctl-development-nodegroup-ng-NodeInstanceRole-14X8P4JO00HZO
      username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2019-09-08T19:45:10Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "1951"
  selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
  uid: 2b278b81-d271-11e9-987b-0a80449407fe
From the logs, it appears that you terminated the command before it could add ng-2's role ARN to the aws-auth ConfigMap. Could you please confirm this?
eksctl waits for the stacks to become ready and serially adds each instance role ARN to the ConfigMap, so if you terminate the command before it has added a nodegroup's role ARN, that nodegroup won't join the cluster.
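Roughly, the flow looks like this (a minimal sketch for illustration only, not eksctl's actual code; all names here are hypothetical):

package main

import "fmt"

// nodeGroup stands in for a nodegroup entry from the config file.
type nodeGroup struct{ name, roleARN string }

// waitForStack stands in for waiting on the nodegroup's CloudFormation stack.
func waitForStack(ng nodeGroup) {
	fmt.Printf("stack for %q is ready\n", ng.name)
}

// addRoleToAuthConfigMap stands in for appending the instance role ARN
// to mapRoles in the aws-auth ConfigMap.
func addRoleToAuthConfigMap(ng nodeGroup) {
	fmt.Printf("adding role %q to auth ConfigMap\n", ng.roleARN)
}

func main() {
	groups := []nodeGroup{
		// Example ARNs, not real ones.
		{name: "ng-1", roleARN: "arn:aws:iam::111122223333:role/ng-1-NodeInstanceRole"},
		{name: "ng-2", roleARN: "arn:aws:iam::111122223333:role/ng-2-NodeInstanceRole"},
	}
	// The stacks deploy in parallel, but the auth updates run one after
	// another. If the process exits after the first update, ng-2 never
	// reaches mapRoles and its nodes cannot join the cluster.
	for _, ng := range groups {
		waitForStack(ng)
		addRoleToAuthConfigMap(ng)
	}
}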
I created a new cluster using your ClusterConfig file with eksctl 0.5.2 and was able to have both node groups join the cluster (I let the command run to completion).
I am experiencing this also.
Just today I upgraded eksctl to 0.5.2, and I have a cluster YAML with three nodegroups defined. During eksctl create nodegroup, all three nodegroups are "created" and EC2 instances are launched, but only the first one has the "added to the ConfigMap" line in the logs, and it's the only one that shows up in the cluster (same as jfusterm's example above).
I'm not terminating the eksctl create nodegroup command early, but agree it does appear to be terminating early for some reason.
@cPu1 I see the same behaviour as @sponrad: I'm not terminating the command, it just stops there and I don't get any error whatsoever.
Executing eksctl create nodegroup -f test.yaml -v 4 doesn't show anything useful either:
...
2019-09-10T06:13:39+02:00 [▶]  event = watch.Event{Type:"MODIFIED", Object:(*v1.Node)(0xc0000d4dc0)}
2019-09-10T06:13:39+02:00 [▶]  node "ip-10-3-11-232.eu-west-1.compute.internal" is ready in "ng-1"
2019-09-10T06:13:40+02:00 [▶]  event = watch.Event{Type:"MODIFIED", Object:(*v1.Node)(0xc0000d5080)}
2019-09-10T06:13:40+02:00 [▶]  node "ip-10-3-37-47.eu-west-1.compute.internal" seen in "ng-1", but not ready yet
2019-09-10T06:13:40+02:00 [▶]  node = v1.Node{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ip-10-3-37-47.eu-west-1.compute.internal", GenerateName:"", Namespace:"", SelfLink:"/api/v1/nodes/ip-10-3-37-47.eu-west-1.compute.internal", UID:"558e710e-d381-11e9-97e1-0a1e0ff00f04", ResourceVersion:"3938", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63703685605, loc:(*time.Location)(0x5f166e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"alpha.eksctl.io/cluster-name":"development", "alpha.eksctl.io/instance-id":"i-07f5c2d4a22261c12", "alpha.eksctl.io/nodegroup-name":"ng-1", "beta.kubernetes.io/arch":"amd64", "beta.kubernetes.io/instance-type":"t3.small", "beta.kubernetes.io/os":"linux", "failure-domain.beta.kubernetes.io/region":"eu-west-1", "failure-domain.beta.kubernetes.io/zone":"eu-west-1c", "kubernetes.io/hostname":"ip-10-3-37-47.eu-west-1.compute.internal"}, Annotations:map[string]string{"node.alpha.kubernetes.io/ttl":"0", "volumes.kubernetes.io/controller-managed-attach-detach":"true"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.NodeSpec{PodCIDR:"", ProviderID:"aws:///eu-west-1c/i-07f5c2d4a22261c12", Unschedulable:false, Taints:[]v1.Taint{v1.Taint{Key:"node.kubernetes.io/not-ready", Value:"", Effect:"NoSchedule", TimeAdded:(*v1.Time)(nil)}, v1.Taint{Key:"node.kubernetes.io/not-ready", Value:"", Effect:"NoExecute", TimeAdded:(*v1.Time)(0xc0008d4040)}}, ConfigSource:(*v1.NodeConfigSource)(nil), DoNotUse_ExternalID:""}, Status:v1.NodeStatus{Capacity:v1.ResourceList{"attachable-volumes-aws-ebs":resource.Quantity{i:resource.int64Amount{value:25, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"25", Format:"DecimalSI"}, "cpu":resource.Quantity{i:resource.int64Amount{value:2, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"2", Format:"DecimalSI"}, "ephemeral-storage":resource.Quantity{i:resource.int64Amount{value:21462233088, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"20959212Ki", Format:"BinarySI"}, "hugepages-1Gi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "hugepages-2Mi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "memory":resource.Quantity{i:resource.int64Amount{value:2050457600, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"", Format:"BinarySI"}, "pods":resource.Quantity{i:resource.int64Amount{value:11, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"11", Format:"DecimalSI"}}, Allocatable:v1.ResourceList{"attachable-volumes-aws-ebs":resource.Quantity{i:resource.int64Amount{value:25, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"25", Format:"DecimalSI"}, "cpu":resource.Quantity{i:resource.int64Amount{value:2, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"2", Format:"DecimalSI"}, "ephemeral-storage":resource.Quantity{i:resource.int64Amount{value:19316009748, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"19316009748", Format:"DecimalSI"}, "hugepages-1Gi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "hugepages-2Mi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "memory":resource.Quantity{i:resource.int64Amount{value:1945600000, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"", Format:"BinarySI"}, "pods":resource.Quantity{i:resource.int64Amount{value:11, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"11", Format:"DecimalSI"}}, Phase:"", Conditions:[]v1.NodeCondition{v1.NodeCondition{Type:"MemoryPressure", Status:"False", LastHeartbeatTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685615, loc:(*time.Location)(0x5f166e0)}}, LastTransitionTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685605, loc:(*time.Location)(0x5f166e0)}}, Reason:"KubeletHasSufficientMemory", Message:"kubelet has sufficient memory available"}, v1.NodeCondition{Type:"DiskPressure", Status:"False", LastHeartbeatTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685615, loc:(*time.Location)(0x5f166e0)}}, LastTransitionTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685605, loc:(*time.Location)(0x5f166e0)}}, Reason:"KubeletHasNoDiskPressure", Message:"kubelet has no disk pressure"}, v1.NodeCondition{Type:"PIDPressure", Status:"False", LastHeartbeatTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685615, loc:(*time.Location)(0x5f166e0)}}, LastTransitionTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685605, loc:(*time.Location)(0x5f166e0)}}, Reason:"KubeletHasSufficientPID", Message:"kubelet has sufficient PID available"}, v1.NodeCondition{Type:"Ready", Status:"False", LastHeartbeatTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685615, loc:(*time.Location)(0x5f166e0)}}, LastTransitionTime:v1.Time{Time:time.Time{wall:0x0, ext:63703685605, loc:(*time.Location)(0x5f166e0)}}, Reason:"KubeletNotReady", Message:"runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"}}, Addresses:[]v1.NodeAddress{v1.NodeAddress{Type:"InternalIP", Address:"10.3.37.47"}, v1.NodeAddress{Type:"Hostname", Address:"ip-10-3-37-47.eu-west-1.compute.internal"}, v1.NodeAddress{Type:"InternalDNS", Address:"ip-10-3-37-47.eu-west-1.compute.internal"}}, DaemonEndpoints:v1.NodeDaemonEndpoints{KubeletEndpoint:v1.DaemonEndpoint{Port:10250}}, NodeInfo:v1.NodeSystemInfo{MachineID:"ec20412a0093f3b815b3e554a4de4f62", SystemUUID:"EC20412A-0093-F3B8-15B3-E554A4DE4F62", BootID:"16cae243-e079-41d6-8c3b-85c02e5b1108", KernelVersion:"4.14.133-113.112.amzn2.x86_64", OSImage:"Amazon Linux 2", ContainerRuntimeVersion:"docker://18.6.1", KubeletVersion:"v1.13.8-eks-cd3eb0", KubeProxyVersion:"v1.13.8-eks-cd3eb0", OperatingSystem:"linux", Architecture:"amd64"}, Images:[]v1.ContainerImage(nil), VolumesInUse:[]v1.UniqueVolumeName(nil), VolumesAttached:[]v1.AttachedVolume(nil), Config:(*v1.NodeConfigStatus)(nil)}}
2019-09-10T06:13:45+02:00 [▶]  event = watch.Event{Type:"MODIFIED", Object:(*v1.Node)(0xc000541600)}
2019-09-10T06:13:45+02:00 [▶]  node "ip-10-3-37-47.eu-west-1.compute.internal" is ready in "ng-1"
2019-09-10T06:13:45+02:00 [ℹ]  nodegroup "ng-1" has 2 node(s)
2019-09-10T06:13:45+02:00 [ℹ]  node "ip-10-3-11-232.eu-west-1.compute.internal" is ready
2019-09-10T06:13:45+02:00 [ℹ]  node "ip-10-3-37-47.eu-west-1.compute.internal" is ready
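In the meantime, a possible manual workaround (a sketch based on the mapRoles format shown earlier, untested here) is to add the stranded nodegroup's instance role to aws-auth by hand:

$ kubectl edit configmap -n kube-system aws-auth

and append a second entry under mapRoles, in the same format as the existing one, using the NodeInstanceRole ARN from the missing nodegroup's CloudFormation stack (the placeholder below is hypothetical):

- groups:
  - system:bootstrappers
  - system:nodes
  rolearn: <NodeInstanceRole ARN of the missing nodegroup>
  username: system:node:{{EC2PrivateDNSName}}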
@jfusterm @sponrad this has been fixed in 0.5.3: https://github.com/weaveworks/eksctl/releases/tag/0.5.3. Please try it out.
I initially tried your example config with eksctl create cluster and not eksctl create nodegroup, which is why I couldn't reproduce it, as this bug only affected eksctl create nodegroup.
@cPu1 that fixed it for me. eksctl create nodegroup created my three nodegroups and added them all to the cluster.
Thanks so much!
Thanks @cPu1 for the quick fix! It's working now.