Cluster-api: Add a ~Waiting phase for Machines to indicate that a Node isn't ready

Created on 25 Oct 2019 · 12 Comments · Source: kubernetes-sigs/cluster-api

User Story

As an operator, I would like to know if a Node linked to a Machine is not ready after it has been provisioned.

Detailed Description

This is a follow-up to https://github.com/kubernetes-sigs/cluster-api/issues/1622. During reconciliation, the Machine controller should check whether the Node specified in NodeRef is not in a Ready state and, if so, flip the Status.Phase field to a Waiting phase.
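To make the request concrete, here is a minimal sketch of the check, not the actual controller code. The types below are simplified local stand-ins for the Kubernetes Node types, and `PhaseNodeNotReady`, `isNodeReady`, and `phaseForNode` are hypothetical names chosen for illustration (the phase name is still under discussion).

```go
package main

import "fmt"

// ConditionStatus and NodeCondition are simplified stand-ins for the
// Kubernetes core types, declared locally so the sketch is self-contained.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

type NodeCondition struct {
	Type   string
	Status ConditionStatus
}

// Node is a stand-in that only carries the conditions we care about.
type Node struct {
	Conditions []NodeCondition
}

type MachinePhase string

const (
	PhaseRunning MachinePhase = "Running"
	// PhaseNodeNotReady is a hypothetical phase name, not an agreed-upon one.
	PhaseNodeNotReady MachinePhase = "NodeNotReady"
)

// isNodeReady reports whether the Node has a "Ready" condition set to True.
// A Node without a "Ready" condition is treated as not ready.
func isNodeReady(n *Node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == ConditionTrue
		}
	}
	return false
}

// phaseForNode returns the phase a Machine would report for the Node
// behind its NodeRef. A nil Node (not found) also counts as not ready.
func phaseForNode(n *Node) MachinePhase {
	if n == nil || !isNodeReady(n) {
		return PhaseNodeNotReady
	}
	return PhaseRunning
}

func main() {
	notReady := &Node{Conditions: []NodeCondition{{Type: "Ready", Status: ConditionFalse}}}
	ready := &Node{Conditions: []NodeCondition{{Type: "Ready", Status: ConditionTrue}}}
	fmt.Println(phaseForNode(notReady)) // prints "NodeNotReady"
	fmt.Println(phaseForNode(ready))    // prints "Running"
}
```

In a real reconciler the Node would be fetched from the workload cluster via NodeRef; that lookup is left out here.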

/kind feature
/priority important-longterm
/milestone v0.3.0


All 12 comments

Bike-shedding on the name: I think something like NotReady, Unavailable, or Offline is more indicative of the status than Waiting is.

Updated the title to make it clear that the name is something we need to figure out. I'm ok with anything meaningful :)

Maybe clarify that it's after the machine has successfully bootstrapped (instead of just provisioned)?

NodeNotReady?

This is potentially a stopgap until we have a better proposal around status conditions. In the meantime, looking for feedback!

Tagging a bunch of people for 👀. Would ❤️ comments, and feel free to tag more people (or ignore, that's ok too! 😄).

@CecileRobertMichon @juan-lee @justinsb @akutz @michaelgugino @pablochacin @detiber @liztio @chuckha @noamran @amy @rudoi @rsmitty @andrewsykim @moshloop

Status.Phase: chillin'

but seriously, offline was the one I liked the most and could not think of a more appropriate, succinct name. I was thinking about something like UnavailableButRecoverable...but that's terrible 😬

What would we consider Ready here? Would we consider the Node being fully Ready (which would require CNI to be deployed)? Would we want to exclude that since it is out of scope for our purposes? Would we want to be able to distinguish between the two and introduce 2 new phases?

I would prefer to have only one state that covers both, if possible. Once we settle on how to shape conditions, that would cover others as well

I make a distinction between the two because whatever is used to determine AlmostReady (defined as ready, just missing CNI to be fully Ready), would be similar to what we'd want to use for health checks related to the Node resource when it comes to initializing, scaling, or upgrading of a control plane machine. But this is less useful to end users who would be more interested in the full Ready state of a Node.

I'm generally against syncing status from node to machines. Machines are a mechanism for creating nodes, not much else.

Once a machine has a nodeRef, it's done. What happens from there is the node's problem. It doesn't make sense to re-invent node statuses or describe those conditions on a machine.

If for some reason you feel you absolutely have to indicate node status on a machine, then I would just do a blind copy of node conditions, and not invent any new ones.
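A blind copy could look roughly like the sketch below. This is an illustration only: `NodeCondition` is a simplified local stand-in for the Kubernetes type, and the `MachineStatus.NodeConditions` field and `syncNodeConditions` helper are hypothetical, not part of the existing Machine API.

```go
package main

import "fmt"

// NodeCondition is a simplified local stand-in for the Kubernetes type.
type NodeCondition struct {
	Type   string
	Status string
}

// MachineStatus is a stand-in; NodeConditions is a hypothetical field.
type MachineStatus struct {
	NodeConditions []NodeCondition
}

// syncNodeConditions copies the Node's conditions verbatim onto the
// Machine status without interpreting them or inventing new ones.
// The slice is copied so later changes to the source do not leak through.
func syncNodeConditions(status *MachineStatus, nodeConditions []NodeCondition) {
	if nodeConditions == nil {
		status.NodeConditions = nil
		return
	}
	copied := make([]NodeCondition, len(nodeConditions))
	copy(copied, nodeConditions)
	status.NodeConditions = copied
}

func main() {
	fromNode := []NodeCondition{
		{Type: "Ready", Status: "False"},
		{Type: "MemoryPressure", Status: "False"},
	}
	var status MachineStatus
	syncNodeConditions(&status, fromNode)
	fmt.Println(len(status.NodeConditions)) // prints 2
}
```

The point of copying rather than translating is that the Machine never has to define what "ready" means; consumers read the Node's own conditions.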

Closing this in favor of #1658
/close

@ncdc: Closing this issue.

