Following up on this guide: http://blog.spotinst.com/2017/08/17/elastigroup-kubernetes-operations-kops/
I'm setting up a new cluster using the spotinst provider, but I'm not able to connect to the API server.
Commands are failing with what looks like the common "placeholder IP" address error:
```
$ kops validate cluster
cannot get nodes for "example.xyz.co": Get https://api.example.xyz.co/api/v1/nodes: net/http: TLS handshake timeout

$ kubectl get nodes
Unable to connect to the server: net/http: TLS handshake timeout
```
I've checked that my 4 DNS NS entries match the 4 entries provided by my Route53 hosted zone.
It's been several hours, so DNS should have propagated.
Again, I can see the 3 nodes (Running) in the EC2 console (nodes.MY_CLUSTER), which correspond to my config.
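(For anyone checking the same thing, a quick sketch of verifying both the NS delegation and what the API record currently resolves to, using the example domain from the errors above.)

```
# NS records actually being served for the zone; these should match
# the 4 entries in the Route53 hosted zone
dig +short NS example.xyz.co

# What the API endpoint resolves to right now: 203.0.113.123 is the
# kops placeholder; a real public IP means dns-controller has updated it
dig +short api.example.xyz.co
```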
Are the DNS entries mapping to local IPs?
@huang-jy No, all the DNS entries are mapped to the placeholder IP, i.e. 203.0.113.123.
@vidurgupta All four entries are stuck at the placeholder IP? If so, you should check the worker nodes for errors in the logs. Also check that the worker nodes can talk to the master nodes (SSH onto a worker and try to curl the master, for starters).
Since you're using AWS like I am, also check which zones your masters and workers are in. In my case, I had the master in zone A and a worker in zone B, and they couldn't talk to each other because I had no route configured to allow the two subnets to communicate.
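(A minimal sketch of that routing check, assuming the AWS CLI is configured; the VPC ID below is hypothetical.)

```
# List the route tables for the cluster VPC and check that the
# us-west-1a and us-west-1c subnets can reach each other
aws ec2 describe-route-tables \
  --filters Name=vpc-id,Values=vpc-0abc123 \
  --query 'RouteTables[].{Id:RouteTableId,Routes:Routes}'
```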
@huang-jy Yeah, all four entries are stuck at the placeholder IP. I'm able to reach the master node from the worker nodes. One of my worker nodes is in us-west-1a and the other in us-west-1c; the master is also in us-west-1c.
There are no entries in the kube-proxy log on either worker node.
@vidurgupta There should be entries in the kube-proxy log on the worker even if it can't connect.
Can you curl from the worker to the master?
e.g. `curl -IL http://api.{clustername}`?
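(A sketch of how to check this on a kops-provisioned node, assuming Docker as the container runtime.)

```
# On the worker: confirm the kube-proxy container was pulled and started
sudo docker ps -a | grep kube-proxy

# kops nodes write kube-proxy output here
sudo tail -n 50 /var/log/kube-proxy.log

# If the container never started, kubelet's log usually says why
sudo journalctl -u kubelet | grep -i kube-proxy | tail -n 20
```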
@huang-jy There are no kube-proxy logs; I think kops skipped the step where it loads the Docker image for kube-proxy on the worker nodes.
I'm not able to curl the API server from the worker node.
These are the logs from when the cluster was created:
```
I0103 08:08:18.092416 25915 create_cluster.go:884] Using SSH public key: /home/vidur/.ssh/id_rsa.pub
I0103 08:08:25.462181 25915 subnets.go:183] Assigned CIDR 172.20.32.0/19 to subnet us-west-1a
I0103 08:08:25.462241 25915 subnets.go:183] Assigned CIDR 172.20.64.0/19 to subnet us-west-1c
W0103 08:08:53.942879 25915 urls.go:66] Using nodeup location from NODEUP_URL env var: "http://spotinst-public.s3.amazonaws.com/integrations/kubernetes/kops/v1.8.0-alpha.1/nodeup/linux/amd64/nodeup"
I0103 08:09:09.202299 25915 template_functions.go:150] watch-ingress=false set on DNSController
I0103 08:09:09.860207 25915 template_functions.go:150] watch-ingress=false set on DNSController
I0103 08:09:16.226512 25915 executor.go:91] Tasks: 0 done / 65 total; 36 can run
I0103 08:09:16.398369 25915 logging_retryer.go:59] Retryable error (RequestError: send request failed
caused by: Post https://ec2.us-west-1.amazonaws.com/: EOF) from ec2/DescribeDhcpOptions - will retry after delay of 59ms
I0103 08:09:21.397388 25915 vfs_castore.go:409] Issuing new certificate: "kube-proxy"
I0103 08:09:21.604885 25915 vfs_castore.go:409] Issuing new certificate: "kops"
I0103 08:09:21.844074 25915 vfs_castore.go:409] Issuing new certificate: "kubecfg"
I0103 08:09:21.876597 25915 vfs_castore.go:409] Issuing new certificate: "master"
I0103 08:09:22.101851 25915 vfs_castore.go:409] Issuing new certificate: "kubelet"
I0103 08:09:22.164208 25915 vfs_castore.go:409] Issuing new certificate: "apiserver-proxy-client"
I0103 08:09:22.180897 25915 vfs_castore.go:409] Issuing new certificate: "kube-scheduler"
I0103 08:09:22.446200 25915 vfs_castore.go:409] Issuing new certificate: "kubelet-api"
I0103 08:09:22.491788 25915 vfs_castore.go:409] Issuing new certificate: "kube-controller-manager"
I0103 08:09:31.840400 25915 executor.go:91] Tasks: 36 done / 65 total; 13 can run
I0103 08:09:38.413406 25915 executor.go:91] Tasks: 49 done / 65 total; 16 can run
W0103 08:09:44.653789 25915 urls.go:87] Using protokube location from PROTOKUBE_IMAGE env var: "http://spotinst-public.s3.amazonaws.com/integrations/kubernetes/kops/v1.8.0-alpha.1/protokube/images/protokube.tar.gz"
I0103 08:10:12.681337 25915 executor.go:91] Tasks: 65 done / 65 total; 0 can run
I0103 08:10:12.681364 25915 dns.go:152] Pre-creating DNS records
I0103 08:10:23.298580 25915 update_cluster.go:249] Exporting kubecfg for cluster
Kops has set your kubectl context to kubernetes.example.co
Cluster is starting. It should be ready in a few minutes.
Suggestions:
```
@huang-jy This is the command I'm using to create the cluster:
```
kops create cluster \
  --name $KOPS_CLUSTER_NAME \
  --zones $KOPS_CLUSTER_ZONES \
  --cloud $KOPS_CLOUD_PROVIDER \
  --node-size $KOPS_NODE_SIZE \
  --master-size $KOPS_MASTER_SIZE \
  --master-volume-size $KOPS_MASTER_VOLUME_SIZE \
  --node-volume-size $KOPS_NODE_VOLUME_SIZE \
  --spotinst-cloud-provider $SPOTINST_CLOUD_PROVIDER \
  --kubernetes-version $KUBERNETES_VERSION \
  --yes
```
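(For reference, a hypothetical set of values for those variables; the sizes, zones, and versions below are illustrative only, and `SPOTINST_CLOUD_PROVIDER` should be set per the spotinst fork's documentation.)

```
export KOPS_CLUSTER_NAME=example.xyz.co
export KOPS_CLUSTER_ZONES=us-west-1a,us-west-1c
export KOPS_CLOUD_PROVIDER=aws
export KOPS_NODE_SIZE=t2.medium
export KOPS_MASTER_SIZE=m3.medium
export KOPS_MASTER_VOLUME_SIZE=64
export KOPS_NODE_VOLUME_SIZE=128
export SPOTINST_CLOUD_PROVIDER=aws   # hypothetical; see the fork's docs for valid values
export KUBERNETES_VERSION=1.8.4
```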
@vidurgupta --spotinst-cloud-provider? Which version of kops are you using? (I'm on 1.8.0.)
Also, I'm guessing you're intending to use spot instances?
@huang-jy I'm using Version 1.7.1-beta.2 (git-1719d50).
It's a special build of kops that integrates with the spotinst service; you can get it from this link.
@vidurgupta Thanks for that; that's something I wasn't aware of. I've not used the spotinst provider, so I can't really say what might be causing the problem.
@huang-jy Thanks for all the help. Could you tag the right person for this problem?
@vidurgupta I'm not sure who would be the best person, I'm afraid. But if you do find the answer, I'm keen to know too.
@vidurgupta I believe you should create an issue in the fork repository, because the spotinst provider isn't supported by the official version yet: https://github.com/spotinst/kubernetes-kops
@spa-87 Issues are not enabled on the forked repo. I will try some other tool.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
I'm getting the below error while executing the kops validate cluster command. Also, the public IP is not updating in Route53. How do I fix this?

```
VALIDATION ERRORS
KIND  NAME       MESSAGE
dns   apiserver  Validation Failed

The dns-controller Kubernetes deployment has not updated the Kubernetes cluster's API DNS entry to the correct IP address. The API DNS IP address is the placeholder address that kops creates: 203.0.113.123. Please wait about 5-10 minutes for a master to start, dns-controller to launch, and DNS to propagate. The protokube container and dns-controller deployment logs may contain more diagnostic information. Etcd and the API DNS entries must be updated for a kops Kubernetes cluster to start.

Validation Failed
```
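(As that message suggests, the dns-controller and protokube logs are the first place to look. A sketch, run on the master over SSH and assuming Docker as the container runtime, since kubectl itself won't work while the API DNS entry is broken.)

```
# dns-controller runs as a deployment in kube-system; find its container
sudo docker ps | grep dns-controller
sudo docker logs $(sudo docker ps -q --filter name=dns-controller) 2>&1 | tail -n 50

# protokube runs as a container started directly on the master
sudo docker logs $(sudo docker ps -q --filter name=protokube) 2>&1 | tail -n 50
```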
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale
I am using kops 1.9.1 and I'm running into a similar issue when specifying networking=amazon-vpc-routed-eni. In that case all of the placeholder IPs are updated correctly except the public IP for the Kubernetes master. This is reproducible.
But if I specify another networking option, like calico or flannel-vxlan, the installation works as expected.
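(So the workaround is simply to pick one of the networking options that does work, e.g.:)

```
# Same create command, but with an overlay network instead of amazon-vpc-routed-eni
kops create cluster \
  --name $KOPS_CLUSTER_NAME \
  --zones $KOPS_CLUSTER_ZONES \
  --networking calico \
  --yes
```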
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
I'm getting the below error while executing the kops validate cluster command. Also, the public IP is not updating in Route53. How do I fix this?

```
VALIDATION ERRORS
KIND  NAME       MESSAGE
dns   apiserver  Validation Failed

The dns-controller Kubernetes deployment has not updated the Kubernetes cluster's API DNS entry to the correct IP address. The API DNS IP address is the placeholder address that kops creates: 203.0.113.123. Please wait about 5-10 minutes for a master to start, dns-controller to launch, and DNS to propagate. The protokube container and dns-controller deployment logs may contain more diagnostic information. Etcd and the API DNS entries must be updated for a kops Kubernetes cluster to start.

Validation Failed
```
I've got the same message after kops validate cluster, but after waiting about 5 minutes the master and nodes came up Ready (True).
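(Since the placeholder record can legitimately take 5-10 minutes to be replaced, a simple poll like this sketch avoids declaring failure too early.)

```
# Re-run validation every 30 seconds until the cluster reports healthy
until kops validate cluster; do
  echo "cluster not ready yet, retrying in 30s..."
  sleep 30
done
```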