Kops: panic: Terraform resource names cannot start with a digit. This is a bug in Kops, please report this in a GitHub Issue. Name: 1.etcd-events.k8s-cs.domain.net

Created on 24 Sep 2020 · 10 comments · Source: kubernetes/kops

Version 1.18.1 (git-453d7d96be)

AWS

kops create cluster \
--cloud=aws \
--zones=us-west-2a \
--name=k8s-cs.domain.net \
--state=s3://domaininfrastructurestate/kops/cs/ \
--dns-zone=k8s-cs.domain.net \
--node-size=t3a.small \
--node-tenancy=default \
--node-volume-size=50 \
--node-count=3 \
--master-size=t3a.small \
--master-volume-size=50 \
--master-zones=us-west-2a \
--master-count=3 \
--out=. \
--target=terraform \
--yes \
--image=ami-028e52edfeb33adb2
W0924 16:33:48.893149   30386 create_cluster.go:771] Running with masters in the same AZs; redundancy will be reduced
I0924 16:33:50.377267   30386 subnets.go:184] Assigned CIDR 172.20.32.0/19 to subnet us-west-2a
I0924 16:33:54.964514   30386 create_cluster.go:1537] Using SSH public key: /home/demi4/.ssh/id_rsa.pub
I0924 16:34:08.385316   30386 executor.go:103] Tasks: 0 done / 95 total; 47 can run
I0924 16:34:08.397555   30386 dnszone.go:242] Check for existing route53 zone to re-use with name "k8s-cs.domain.net"
I0924 16:34:09.249425   30386 dnszone.go:249] Existing zone "k8s-cs.domain.net." found; will configure TF to reuse
I0924 16:34:10.792010   30386 vfs_castore.go:590] Issuing new certificate: "apiserver-aggregator-ca"
I0924 16:34:10.831807   30386 vfs_castore.go:590] Issuing new certificate: "etcd-clients-ca"
I0924 16:34:10.900833   30386 vfs_castore.go:590] Issuing new certificate: "ca"
I0924 16:34:10.919639   30386 vfs_castore.go:590] Issuing new certificate: "etcd-manager-ca-main"
I0924 16:34:11.035820   30386 vfs_castore.go:590] Issuing new certificate: "etcd-peers-ca-events"
I0924 16:34:11.106013   30386 vfs_castore.go:590] Issuing new certificate: "etcd-peers-ca-main"
I0924 16:34:11.226926   30386 vfs_castore.go:590] Issuing new certificate: "etcd-manager-ca-events"
I0924 16:34:16.248941   30386 executor.go:103] Tasks: 47 done / 95 total; 26 can run
I0924 16:34:18.305484   30386 vfs_castore.go:590] Issuing new certificate: "master"
I0924 16:34:18.365115   30386 vfs_castore.go:590] Issuing new certificate: "apiserver-aggregator"
I0924 16:34:18.463993   30386 vfs_castore.go:590] Issuing new certificate: "kops"
I0924 16:34:18.493709   30386 vfs_castore.go:590] Issuing new certificate: "kubelet"
I0924 16:34:18.506919   30386 vfs_castore.go:590] Issuing new certificate: "kube-controller-manager"
I0924 16:34:18.517012   30386 vfs_castore.go:590] Issuing new certificate: "kube-scheduler"
I0924 16:34:18.523661   30386 vfs_castore.go:590] Issuing new certificate: "kubecfg"
I0924 16:34:18.533129   30386 vfs_castore.go:590] Issuing new certificate: "kubelet-api"
I0924 16:34:18.602225   30386 vfs_castore.go:590] Issuing new certificate: "apiserver-proxy-client"
I0924 16:34:18.682028   30386 vfs_castore.go:590] Issuing new certificate: "kube-proxy"
I0924 16:34:22.575969   30386 executor.go:103] Tasks: 73 done / 95 total; 18 can run
I0924 16:34:24.249151   30386 executor.go:103] Tasks: 91 done / 95 total; 4 can run
I0924 16:34:24.249770   30386 executor.go:103] Tasks: 95 done / 95 total; 0 can run
panic: Terraform resource names cannot start with a digit. This is a bug in Kops, please report this in a GitHub Issue. Name: 1.etcd-events.k8s-cs.domain.net

goroutine 1 [running]:
k8s.io/kops/upup/pkg/fi/cloudup/terraform.tfSanitize(0xc000bf2570, 0x23, 0x3d027e5, 0xe)
        /go/src/k8s.io/kops/upup/pkg/fi/cloudup/terraform/target.go:104 +0x302
k8s.io/kops/upup/pkg/fi/cloudup/terraform.(*TerraformTarget).finish012(0xc00056e500, 0xc000a077d0, 0x0, 0x34d2ae0)
        /go/src/k8s.io/kops/upup/pkg/fi/cloudup/terraform/target_0_12.go:61 +0x6b5
k8s.io/kops/upup/pkg/fi/cloudup/terraform.(*TerraformTarget).Finish(0xc00056e500, 0xc000a077d0, 0xa, 0xc000625700)
        /go/src/k8s.io/kops/upup/pkg/fi/cloudup/terraform/target.go:200 +0x52f
k8s.io/kops/upup/pkg/fi/cloudup.(*ApplyClusterCmd).Run(0xc000868000, 0x43b59a0, 0xc000052108, 0x0, 0x0)
        /go/src/k8s.io/kops/upup/pkg/fi/cloudup/apply_cluster.go:938 +0x26c1
main.RunUpdateCluster(0x43b59a0, 0xc000052108, 0xc0003e1b60, 0x7ffca7e78ee0, 0x15, 0x4356d20, 0xc00000e018, 0xc000a41e60, 0x0, 0x0, ...)
        /go/src/k8s.io/kops/cmd/kops/update_cluster.go:274 +0x9ba
main.RunCreateCluster(0x43b59a0, 0xc000052108, 0xc0003e1b60, 0x4356d20, 0xc00000e018, 0xc00036fc00, 0xc0000e3800, 0xc000845d20)
        /go/src/k8s.io/kops/cmd/kops/create_cluster.go:1357 +0x3720
main.NewCmdCreateCluster.func1(0xc00088b400, 0xc0001fbc20, 0x0, 0x11)
        /go/src/k8s.io/kops/cmd/kops/create_cluster.go:274 +0x188
github.com/spf13/cobra.(*Command).execute(0xc00088b400, 0xc0001fbb00, 0x11, 0x12, 0xc00088b400, 0xc0001fbb00)
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:830 +0x2aa
github.com/spf13/cobra.(*Command).ExecuteC(0x61e3f80, 0x6220128, 0x0, 0x0)
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:914 +0x2fb
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:864
main.Execute()
        /go/src/k8s.io/kops/cmd/kops/root.go:96 +0x8f
main.main()
        /go/src/k8s.io/kops/cmd/kops/main.go:25 +0x20

What did you expect to happen?

Please provide your cluster manifest. Execute
`kops get --name my.example.com -o yaml` to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2020-09-24T13:33:51Z"
  name: k8s-cs.domain.net
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://domaininfrastructurestate/kops/cs/k8s-cs.domain.net
  containerRuntime: docker
  dnsZone: k8s-cs.domain.net
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-us-west-2a-1
      name: "1"
    - instanceGroup: master-us-west-2a-2
      name: "2"
    - instanceGroup: master-us-west-2a-3
      name: "3"
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-us-west-2a-1
      name: "1"
    - instanceGroup: master-us-west-2a-2
      name: "2"
    - instanceGroup: master-us-west-2a-3
      name: "3"
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.18.8
  masterPublicName: api.k8s-cs.domain.net
  networkCIDR: 172.20.0.0/16
  networking:
    kubenet: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.20.32.0/19
    name: us-west-2a
    type: Public
    zone: us-west-2a
  topology:
    dns:
      type: Public
    masters: public
    nodes: public

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-24T13:33:52Z"
  labels:
    kops.k8s.io/cluster: k8s-cs.domain.net
  name: master-us-west-2a-1
spec:
  image: ami-028e52edfeb33adb2
  machineType: t3a.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2a-1
  role: Master
  rootVolumeSize: 50
  subnets:
  - us-west-2a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-24T13:33:52Z"
  labels:
    kops.k8s.io/cluster: k8s-cs.domain.net
  name: master-us-west-2a-2
spec:
  image: ami-028e52edfeb33adb2
  machineType: t3a.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2a-2
  role: Master
  rootVolumeSize: 50
  subnets:
  - us-west-2a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-24T13:33:53Z"
  labels:
    kops.k8s.io/cluster: k8s-cs.domain.net
  name: master-us-west-2a-3
spec:
  image: ami-028e52edfeb33adb2
  machineType: t3a.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2a-3
  role: Master
  rootVolumeSize: 50
  subnets:
  - us-west-2a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-24T13:33:53Z"
  labels:
    kops.k8s.io/cluster: k8s-cs.domain.net
  name: nodes
spec:
  image: ami-028e52edfeb33adb2
  machineType: t3a.small
  maxSize: 3
  minSize: 3
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  rootVolumeSize: 50
  subnets:
  - us-west-2a
  tenancy: default
```
Labels: kind/bug


All 10 comments

Thanks for reporting this! This is definitely a bug in Kops. The problem lies in the generated Terraform code for the EBS volumes used by the etcd cluster: the Terraform resource names for those volumes begin with the etcdMember name from the ClusterSpec.

Terraform 0.12 no longer allows resource names to begin with a digit, but in your case you have etcdMember names of 1, 2, and 3.

Kops could handle this in one of two ways:

  • if the etcdMember name starts with a digit, prefix the Terraform resource name with something like `vol`
  • disallow etcdMember names that start with a digit; this gets messy because it would be a breaking change, and enforcing it only for Terraform users isn't possible (the ClusterSpec is defined with a different kops command than the one that generates the Terraform code)

We could also print a warning during generation that the Terraform code will fail and that the user needs to change the etcdMember names.

As a side note, I don't know exactly what is involved in changing the member names or how seamless that is. If it's straightforward and incurs no downtime, maybe that's what we suggest to Terraform users.
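For illustration, the first option above (prefixing) could look roughly like this. `sanitizeName` is a hypothetical sketch, not the actual kops `tfSanitize` implementation; the replacement pattern and `vol-` prefix are assumptions for the example:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sanitizeName maps an etcd member volume name to a valid Terraform 0.12
// identifier. Terraform identifiers must not start with a digit, so instead
// of panicking, names that begin with a digit get a fixed "vol-" prefix.
func sanitizeName(name string) string {
	// Replace characters Terraform does not allow in resource names.
	cleaned := regexp.MustCompile(`[^0-9A-Za-z_-]`).ReplaceAllString(name, "-")
	// If the result starts with a digit, prepend a safe prefix.
	if cleaned != "" && strings.IndexByte("0123456789", cleaned[0]) >= 0 {
		cleaned = "vol-" + cleaned
	}
	return cleaned
}

func main() {
	fmt.Println(sanitizeName("1.etcd-events.k8s-cs.domain.net")) // vol-1-etcd-events-k8s-cs-domain-net
	fmt.Println(sanitizeName("a.etcd-main.k8s-cs.domain.net"))   // a-etcd-main-k8s-cs-domain-net
}
```

Existing clusters would keep their current resource names (no prefix), so the change would only affect names that previously caused the panic.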

/kind bug

> As a side note, I don't know exactly what is involved in changing the member names or how seamless that is. If it's straightforward and incurs no downtime, maybe that's what we suggest to Terraform users.

I had the same issue as @dimitrez with etcd member names starting with digits, and I tried renaming them.
"Straightforward" is not the term I would use: you get an etcd certificate name mismatch during the rollout, for example.

I ended up purging the master volumes and restoring from an etcd backup, but we're testing whether it could work by scaling up (adding new members with correct names) and then scaling down the old masters.

We also ran into this. Renaming the Terraform resource would be less of a risk than trying to rename etcd members. I think the suggestion of "if the etcdMember name starts with a digit, prefix the Terraform resource name with something like vol" is a good solution, and it is in line with other 1.18 changes for Terraform 0.12.

Is there a way to bypass this issue if I create a new cluster?

@jtbonhomme if you specify etcd members with a letter (or letter prefix) in a brand-new cluster, it will work.

Like this:

    etcdMembers:
    - instanceGroup: master-eu-west-3a-1
      name: "a1"

@jtbonhomme do you modify in any way the cluster config after creating it?

@hakman no, I don't
@nfillot I'm not sure I follow your point. I tried exporting a cluster in YAML format (`kops get --name sn-dev.k8s.local -o yaml > cluster.yaml`), then changed the etcdMember names to prefix them with a letter (the letter of the AZ), but how do I generate a Terraform manifest from a cluster description file?

@nfillot I tried creating a brand-new cluster with a new name, then updating it with the --target terraform flag.
It works, thank you!

Thanks @jtbonhomme. I think I found the issue; it should be fixed in the next 1.18 release.
In case it helps: this will work only if you are creating the cluster in multiple AZs.

OK @hakman, congrats on your investigation and on finding the root cause.
Thank you for your help!
