I had to nuke my single-master self-administered cluster because a node accidentally power cycled and I couldn't restore the API server.
Clearly this is understood as an issue in a one-master self-hosted design, but we should still persist the static YAML files somewhere for easier recovery in such situations.
(alternately, enabling kubeadm to take over an existing etcd install would also likely work...)
--brendan
(alternately, enabling kubeadm to take over an existing etcd install would also likely work...)
You should be able to use an external etcd but it is not documented anywhere as far as my search-fu could get me. Also, fwiw, folks have been playing around with the etcd operator and kubeadm; I think the operator is able to take automated backups of etcd but I haven't actually used it yet to know for sure.
cc @timothysc, who did the recovery work in v1.9 for this; we can now actually enable it in v1.10 to make the reboot case work ^
I've been thinking about doing something similar to what kubectl apply does... store the original manifest as a JSON blob in an annotation, but I don't know whether that's a good or bad idea in this case. It'd help in some situations. I realize this question is about storing them on the filesystem as well though; we maybe wanna do that too, e.g. in /etc/kubernetes/manifests/old, but idk
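To make the annotation idea concrete, here is a rough sketch in Python, not anything kubeadm actually does. It works on a manifest that has already been parsed into a dict. The annotation key and the function names `embed_original` and `recover_original` are made up for illustration; kubectl apply itself uses the `kubectl.kubernetes.io/last-applied-configuration` annotation.

```python
import copy
import json

# Hypothetical annotation key, chosen only for this sketch. kubectl apply
# stores its own copy under "kubectl.kubernetes.io/last-applied-configuration".
ORIGINAL_MANIFEST_ANNOTATION = "example.invalid/original-static-manifest"


def embed_original(manifest: dict) -> dict:
    """Return a copy of the manifest carrying its original form as a JSON annotation."""
    original = copy.deepcopy(manifest)
    # Drop any earlier copy so the blob never nests inside itself.
    original.get("metadata", {}).get("annotations", {}).pop(
        ORIGINAL_MANIFEST_ANNOTATION, None
    )
    annotated = copy.deepcopy(original)
    annotations = annotated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[ORIGINAL_MANIFEST_ANNOTATION] = json.dumps(original, sort_keys=True)
    return annotated


def recover_original(annotated: dict) -> dict:
    """Rebuild the original manifest from the annotation written by embed_original."""
    blob = annotated["metadata"]["annotations"][ORIGINAL_MANIFEST_ANNOTATION]
    return json.loads(blob)
```

This only helps while the API server is reachable, so it covers the round trip between static pods and self-hosted, not the dead-apiserver case.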
Annotations wouldn't work for my cluster because the trouble was that there's nothing to even make the apiserver start up, since the kubelet has nothing to talk to...
Yeah, I know, hence we did https://github.com/kubernetes/community/blob/master/contributors/design-proposals/cluster-lifecycle/draft-20171020-bootstrap-checkpointing.md which is my first note above.
The second point on the annotation would be useful in other contexts when roundtripping from Static Pods to self-hosted and back.
The third point would probably address the point raised in this issue with "backup" manifests on disk somewhere...
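For the "backup manifests on disk" point, a minimal sketch of what that could look like, written in Python purely for illustration. The function names and the timestamped layout are assumptions, not an existing kubeadm feature, and the caller picks the directories (for example /etc/kubernetes/manifests as the source and some backup location of your choosing).

```python
import shutil
import time
from pathlib import Path


def backup_manifests(manifest_dir: Path, backup_root: Path) -> Path:
    """Copy every *.yaml file in manifest_dir into a timestamped directory under backup_root."""
    dest = backup_root / time.strftime("%Y%m%dT%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(manifest_dir.glob("*.yaml")):
        shutil.copy2(src, dest / src.name)  # copy2 also preserves mtime and mode
    return dest


def restore_manifests(backup_dir: Path, manifest_dir: Path) -> list:
    """Copy backed-up manifests into manifest_dir and return the restored file names."""
    manifest_dir.mkdir(parents=True, exist_ok=True)
    restored = []
    for src in sorted(backup_dir.glob("*.yaml")):
        shutil.copy2(src, manifest_dir / src.name)
        restored.append(src.name)
    return restored
```

After a power cycle, restoring the files into the directory the kubelet reads static pods from would give it something to start the apiserver with, which is the gap described above.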
I have a plan I need to write up that's cleaner than checkpoints: have a pivoter pod that always runs to rebootstrap.
As a SIG we have decided to reduce the code inside of kubeadm to eliminate the self-hosting option and potentially have a separate tool that allows for pivoting.
ClusterAPI is also the preferred means of performing rolling updates.