kops version:
Version 1.9.1 (git-ba77c9ca2)
k8s version:
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider: AWS
I have been seeing the following kubelet log message for the last week (~190k occurrences):
E0719 12:47:11.180538 1287 file.go:76] Unable to read manifest path "/etc/kubernetes/manifests": path does not exist, ignoring
Is there a reason why kops doesn't set this directory up? Or is there something I need to do?
That is unexpected... I presume this is on a node? But every node should have a kube-proxy manifest in that directory. Are you disabling kube-proxy or doing something else out of the ordinary?
I do agree that we should create the directory, but I'm trying to figure out why it wouldn't be created by the kube-proxy manifest.
Of course, when we move kube-proxy to a daemonset, we will have to create the directory :-)
@justinsb We are seeing the same error messages. What I noticed is that we are not running kube-proxy since we chose to use kube-router. I wonder if that makes the difference?
Exactly the same case for us. We are running kube-router instead of kube-proxy.
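For reference, a minimal sketch of enabling kube-router in a kops cluster spec (field names and defaults should be verified against the kops networking docs for your version; kube-proxy is disabled explicitly here only for illustration):

# kops edit cluster
spec:
  networking:
    kuberouter: {}
  kubeProxy:
    enabled: false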
@justinsb apologies Justin, I completely missed this. Didn't mean to just raise it and run off.
Like the others in this thread, I was also using kube-router at the time. However, I couldn't reproduce it and haven't experienced it since, possibly because I am no longer using kube-router. Not sure.
Maybe @tianhe-oi or @CoufalJa could provide some further detail about when it occurs?
For us it happens randomly, even during scale-ups triggered by cluster-autoscaler. The bad part is that everything gets provisioned, including the kubelet, and Pods get scheduled onto the node, but they are unable to reach other pods in the cluster (via ClusterIP services), and Pods on other nodes cannot reach those on the affected node. Our workaround was to drain and terminate the node so it gets recreated. It almost looks like some kind of race condition during the bootstrap process of a new node. It has only happened to us twice; I will try to dig up more details if/when it happens again.
After re-reading this: https://github.com/kubernetes/kops/blob/master/docs/boot-sequence.md
I am pretty sure it's because the directory is not created when kube-proxy is not being used on the nodes. Cross-checking the masters, I cannot find this error in their logs because the directory is created there.
If I manually create the directory, the error goes away. My question is which component is responsible for creating this directory. And if kube-router does not require it, maybe we should remove --pod-manifest-path=/etc/kubernetes/manifests from the kubelet arguments on the nodes.
@justinsb @sdanbury
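A sketch of where that flag could be overridden, assuming kops exposes a podManifestPath field in the kubelet section of the cluster spec (an assumption; verify against the kops API docs for your version):

# kops edit cluster
spec:
  kubelet:
    # assumed field name; kops defaults the flag to /etc/kubernetes/manifests
    podManifestPath: /etc/kubernetes/manifests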
Hello.
The log message disappears when you create the directory on the nodes:
mkdir -p /etc/kubernetes/manifests
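If you would rather not do this by hand on every node, a kops hook that runs before the kubelet can create the directory at boot. A minimal sketch, assuming the hooks support in your kops version (the unit name is arbitrary):

# kops edit ig nodes (hooks can also go in the cluster spec)
spec:
  hooks:
  - name: ensure-manifests-dir.service
    before:
    - kubelet.service
    manifest: |
      Type=oneshot
      ExecStart=/bin/mkdir -p /etc/kubernetes/manifests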
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.