Building the EKS cluster times out due to the script not finding 'wget'
module.my-cluster.null_resource.wait_for_cluster[0]: Still creating... [4m50s elapsed]
module.my-cluster.null_resource.wait_for_cluster[0] (local-exec): /bin/sh: wget: command not found
module.my-cluster.null_resource.wait_for_cluster[0] (local-exec): /bin/sh: wget: command not found
module.my-cluster.null_resource.wait_for_cluster[0]: Still creating... [5m0s elapsed]
module.my-cluster.null_resource.wait_for_cluster[0] (local-exec): TIMEOUT
Error: Error running command 'for i in `seq 1 60`; do wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1': exit status 1. Output: /bin/sh: wget: command not found
/bin/sh: wget: command not found
Run this script
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  name = "tennis-eks-vpc"
  cidr = "10.0.0.0/16"
  azs = ["us-east-2a", "us-east-2b", "us-east-2c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  enable_nat_gateway = true
  enable_vpn_gateway = true
  tags = {
    Terraform = "true"
    Environment = "dev"
  }
}
data "aws_eks_cluster" "cluster" {
  name = module.my-cluster.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
  name = module.my-cluster.cluster_id
}
provider "kubernetes" {
  host = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token = data.aws_eks_cluster_auth.cluster.token
  load_config_file = false
  version = "~> 1.9"
}
module "my-cluster" {
  source = "terraform-aws-modules/eks/aws"
  cluster_name = "tennis-eks-cluster"
  cluster_version = "1.15"
  subnets = module.vpc.private_subnets
  vpc_id = module.vpc.vpc_id
  worker_groups = [
    {
      instance_type = "m4.large"
      asg_max_size = 5
    }
  ]
}
The cluster should be built and the process should end gracefully.
Terraform v0.12.19
+ provider.aws v2.57.0
+ provider.kubernetes v1.11.1
+ provider.local v1.4.0
+ provider.null v2.1.2
+ provider.random v2.2.1
+ provider.template v2.1.2
Not a bug.
Due to the way the EKS service works, we have to ping the new Kubernetes endpoint until it responds correctly. The Terraform Kubernetes provider does not retry and instead dies instantly. The module therefore has a null_resource block that, by default, runs wget in a loop.
You have a few options here:
- Set manage_aws_auth = false
- Install wget into your deployment environment
- Override the wait_for_cluster_cmd and wait_for_cluster_interpreter variables.
It still seems like a bug. If the module is going to use wget, it should ensure it's there (or do something else). Otherwise the result is the same: the module breaks.
$0.02
@gamename what do you suggest? The module uses wget by default; it's written in the docs and the variable specs: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/variables.tf#L201-L211
@barryib Good question. Try adding something to the assumptions section that says wget must be supported in the EKS node. Because if it isn't, then timing out is the expected behavior.
Getting hit with this currently as well. Would it make sense to add a local-exec that checks if wget is installed? My stack is stuck creating and just hit the 40th minute mark for this check.
I added wget shortly after I saw what was happening, but it's been 37 minutes since then
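The preflight check suggested above could be sketched roughly as follows. This is only an illustration, not something the module provides: require_cmd is a hypothetical helper name, and the sketch assumes a POSIX shell on the machine that runs Terraform.

```shell
# Hypothetical preflight helper (not part of the module).
# Fails right away if a required tool is missing, instead of
# letting the wait loop retry a command that can never succeed.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "ERROR: '$1' not found in PATH; install it or override wait_for_cluster_cmd" >&2
    return 1
  fi
}
```

Calling something like `require_cmd wget || exit 1` before the loop would surface the missing tool in seconds rather than after the full timeout.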
Maybe. But how do we ensure that easily (without putting depends_on everywhere)? Would it be the first action to run in Terraform?
I think writing an explicit requirement for wget in the docs, or asking users to change wait_for_cluster_cmd to fit their needs, is enough.
But anyway, I'll be happy to review any PR to fix this.
using curl instead of wget:
wait_for_cluster_cmd = "for i in `seq 1 60`; do curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1"
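For readers who want a command that works with either tool, here is a minimal sketch of the same polling idea that uses curl when it is installed and falls back to wget. The names probe and wait_for are hypothetical, and the sketch assumes a POSIX shell. Unlike the one-liner above, it passes -f to curl so that HTTP error responses count as failures (matching wget's behavior), and it stops immediately when neither tool exists.

```shell
# Hypothetical helper: probe a URL with whichever HTTP client is installed.
# Returns 0 on a successful response, 127 if no client is available.
probe() {
  if command -v curl >/dev/null 2>&1; then
    curl -k -s -f -o /dev/null "$1"
  elif command -v wget >/dev/null 2>&1; then
    wget --no-check-certificate -q -O /dev/null "$1"
  else
    echo "neither curl nor wget found" >&2
    return 127
  fi
}

# Hypothetical polling loop: wait_for URL ATTEMPTS DELAY_SECONDS
wait_for() {
  i=0
  while [ "$i" -lt "$2" ]; do
    status=0
    probe "$1" || status=$?
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    # No HTTP client at all: retrying cannot help, so fail fast.
    if [ "$status" -eq 127 ]; then
      return 127
    fi
    i=$((i + 1))
    sleep "$3"
  done
  echo TIMEOUT
  return 1
}
```

With these definitions, `wait_for "$ENDPOINT/healthz" 60 5` would mirror the 60 attempts and 5 second delay of the default command.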
running into same issue
Please use curl, or whatever else you prefer, to call the HTTP endpoint. There is a variable that lets you set the command you want. See https://github.com/terraform-aws-modules/terraform-aws-eks/issues/829#issuecomment-626132066.
Keep this issue open until we improve the docs to highlight this "requirement".
Same error here; thanks to @twzhangyang it works. Edit your eks-cluster.tf and override the wait_for_cluster_cmd variable. Here is mine:
module "eks" {
  source = "terraform-aws-modules/eks/aws"
  cluster_name = local.cluster_name
  subnets = module.vpc.private_subnets
  tags = {
    Environment = "training"
    GithubRepo = "terraform-aws-eks"
    GithubOrg = "terraform-aws-modules"
  }
  vpc_id = module.vpc.vpc_id
  worker_groups = [
    {
      name = "worker-group-1"
      instance_type = "t2.small"
      additional_userdata = "echo foo bar"
      asg_desired_capacity = 2
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
    },
    {
      name = "worker-group-2"
      instance_type = "t2.medium"
      additional_userdata = "echo foo bar"
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_two.id]
      asg_desired_capacity = 1
    },
  ]
  wait_for_cluster_cmd = "for i in `seq 1 60`; do curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1"
  cluster_version = "1.17"
}
data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}
thanks @PierrickMartos and @twzhangyang
I tried to create the cluster as you mentioned, but it fails with the error below.
I added the following line to my EKS cluster configuration:
wait_for_cluster_cmd = "for i in seq 1 60; do curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1"
Error details:
Error: Error running command 'for i in seq 1 60; do curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1': exec: "/bin/sh": file does not exist. Output:
Version details: I use Windows 10 as my environment and connect to AWS for EKS provisioning.
Terraform v0.12.26
provider.aws v2.70.0
provider.kubernetes v1.12.0
provider.local v1.4.0
provider.null v2.1.2
provider.random v2.3.0
provider.template v2.1.2
@BalajiSivarajRajan set wait_for_cluster_interpreter to match your interpreter: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/variables.tf#L206.
By default, you don't have /bin/sh on Windows. See FAQ