Terraform-aws-eks: Worker nodes are not joining the cluster using version 2.3.0

Created on 21 Mar 2019 · 6 Comments · Source: terraform-aws-modules/terraform-aws-eks

I have issues

When creating a new cluster with version 2.3.0 of the module and the AMI ami-01e08d22b9439c15a for the worker nodes, not a single worker node joins the cluster.

I'm submitting a...

  • [x] bug report
  • [ ] feature request
  • [ ] support request
  • [ ] kudos, thank you, warm fuzzy

What is the current behaviour?

  • The EKS cluster, the autoscaling group, the launch configuration and the nodes are all created. However, kubectl get nodes returns no nodes at all.

  • Running cat /var/lib/kubelet/kubeconfig on any of the worker nodes shows a misconfiguration:

apiVersion: v1
clusters:
- cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://<Skipping this part>.sk1.eu-west-1.eks.amazonaws.com
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet
  name: kubelet
current-context: kubelet
kind: Config
preferences: {}
users:
- name: kubelet
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - token
      - -i
      - --enable-docker-bridge
      command: /usr/bin/aws-iam-authenticator
      env: null

The argument after the -i option should be the cluster name, but we are getting --enable-docker-bridge instead.
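
The same check can be scripted. This is a minimal sketch, not part of the module: the function names cluster_id_arg and check_cluster_id_arg are made up for illustration, and it assumes the kubeconfig lists the exec args one per line, as in the output above.

```shell
#!/bin/sh
# Print the list item that follows "- -i" in a kubeconfig's exec args.
# Prints nothing if "-i" is absent or is the last line.
cluster_id_arg() {
    awk '
        found { sub(/^[ \t]*-[ \t]+/, ""); print; exit }
        /^[ \t]*-[ \t]+-i[ \t]*$/ { found = 1 }
    ' "$1"
}

# Return 0 if the value after -i looks like a cluster name,
# 1 if it is missing or looks like a command-line flag.
check_cluster_id_arg() {
    value=$(cluster_id_arg "$1")
    case "$value" in
        ""|-*) echo "suspicious cluster id: '${value}'"; return 1 ;;
        *)     echo "cluster id: ${value}"; return 0 ;;
    esac
}
```

On a worker node this could be called as check_cluster_id_arg /var/lib/kubelet/kubeconfig. For the file shown above it reports the value --enable-docker-bridge as suspicious and returns a non-zero status.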

If this is a bug, how to reproduce? Please include a code sample if relevant.

Create a new cluster using the 2.3.0 version of the module and the AMI ami-01e08d22b9439c15a and run:

kubectl get nodes

SSH into one of the worker nodes and run:

cat /var/lib/kubelet/kubeconfig

What's the expected behaviour?

Worker nodes should join the cluster properly, just like they do with version 2.2.1 of the module.

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: 2.3.0
  • OS: Mac OS and Linux Alpine
  • Terraform version: 0.11.13.

Any other relevant info

Most helpful comment

I see.

Maybe we just replace the enable-docker-bridge parameter with just a general parameter like bootstrap_extra_args that defaults to "". This is safer for AMI changes, will solve this issue and means people can change the docker bridge setting also.

@max-rocket-internet Huge +1 on this. It opens up a lot more options and does some good future-proofing for stuff like this too.

All 6 comments

That's strange. I think it's related to https://github.com/terraform-aws-modules/terraform-aws-eks/pull/302

What is your AMI release? Perhaps in your AMI version the script /etc/eks/bootstrap.sh doesn't have support for --enable-docker-bridge?
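
One rough way to answer that question on a node is to search the bootstrap script for the flag. This is only a sketch: bootstrap_supports_flag is a hypothetical helper name, and it assumes the script mentions each flag it supports literally (for example in its option parsing), so treat the result as a heuristic.

```shell
#!/bin/sh
# Return 0 if the given flag appears literally in the bootstrap script.
# $1: flag to look for, e.g. --enable-docker-bridge
# $2: path to the script (defaults to /etc/eks/bootstrap.sh)
bootstrap_supports_flag() {
    # -F treats the flag as a fixed string; -e keeps grep from
    # parsing the leading dashes as one of its own options.
    grep -qF -e "$1" "${2:-/etc/eks/bootstrap.sh}"
}
```

For example, bootstrap_supports_flag --enable-docker-bridge && echo supported would print "supported" only when the flag is found in the script.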

I am working with the AMI amazon-eks-node-1.11-v20190109, and indeed it doesn't have the --enable-docker-bridge argument.
The thing is that with any newer version I run into this issue: https://github.com/awslabs/amazon-eks-ami/issues/193

I see.

Maybe we just replace the enable-docker-bridge parameter with just a general parameter like bootstrap_extra_args that defaults to "". This is safer for AMI changes, will solve this issue and means people can change the docker bridge setting also.

FYI @michaelmccord
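
To illustrate the idea, here is a rough shell sketch of how the node user data might assemble the bootstrap call with an optional extra-args string. The helper build_bootstrap_cmd, the position of the extra arguments, and the example flag value are assumptions for illustration, not the module's actual template.

```shell
#!/bin/sh
# Hypothetical helper: print the bootstrap command line for a cluster.
# $1: cluster name
# $2: extra bootstrap arguments (optional, defaults to "")
build_bootstrap_cmd() {
    cluster_name=$1
    extra_args=${2:-}
    if [ -n "$extra_args" ]; then
        # Extra arguments are passed through verbatim, so the caller
        # decides which flags the chosen AMI can handle.
        printf '/etc/eks/bootstrap.sh %s %s\n' "$extra_args" "$cluster_name"
    else
        # Default: no extra flags, so an older AMI never sees a flag
        # it does not know about.
        printf '/etc/eks/bootstrap.sh %s\n' "$cluster_name"
    fi
}
```

With the default, build_bootstrap_cmd my-cluster yields a call with only the cluster name. Someone on a newer AMI could opt in with something like build_bootstrap_cmd my-cluster '--enable-docker-bridge true'.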

@armandorvila v1.11-v20190211 doesn't have the ulimit bug. The ami id for us-east-1 is ami-0c5b63ec54dd3fc38. Not sure about the other regions.

@max-rocket-internet Huge +1 on this. It opens up a lot more options and does some good future-proofing for stuff like this too.
