Cluster-api: Status of a machine deployment

Created on 30 May 2019 · 23 comments · Source: kubernetes-sigs/cluster-api

/kind feature

Describe the solution you'd like
When I create a machine deployment, it is hard to see its state: whether it is still resizing, has reached the expected replica count, or has run into an error while reconciling. It would be nice to associate a machine deployment with some sort of state such as "RESIZING", "READY", "ERROR", etc. Also, it is currently not easy to get the reason why reconciling failed. Can we add an errorReason field to indicate why scaling failed?

Anything else you would like to add:
This is similar to the request to add a status for clusters: https://github.com/kubernetes-sigs/cluster-api/issues/820

help wanted kind/feature priority/backlog

All 23 comments

A MachineDeployment is ready if spec.replicas == status.updatedReplicas == status.availableReplicas - does that sound accurate?

It's resizing if spec.replicas != status.updatedReplicas or spec.replicas != status.availableReplicas?

Can you use the two calculations above to determine ready & resizing?
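A minimal Go sketch of the two checks above (the struct and field names here are illustrative, not the actual controller code; they mirror spec.replicas, status.updatedReplicas, and status.availableReplicas):

```go
package main

import "fmt"

// replicaCounts is a hypothetical view of a MachineDeployment's replica fields.
type replicaCounts struct {
	Desired   int32 // spec.replicas
	Updated   int32 // status.updatedReplicas
	Available int32 // status.availableReplicas
}

// ready mirrors the check above: every desired replica is updated and available.
func ready(c replicaCounts) bool {
	return c.Desired == c.Updated && c.Desired == c.Available
}

// resizing is the negation: some replica is not yet updated or available.
func resizing(c replicaCounts) bool {
	return c.Desired != c.Updated || c.Desired != c.Available
}

func main() {
	steady := replicaCounts{Desired: 3, Updated: 3, Available: 3}
	scaling := replicaCounts{Desired: 5, Updated: 3, Available: 3}
	fmt.Println(ready(steady), resizing(steady))   // true false
	fmt.Println(ready(scaling), resizing(scaling)) // false true
}
```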

It's in an error state if something is going wrong, such as:

  • error updating a MachineSet
  • error creating a MachineSet
  • error deleting old MachineSet (maybe not as bad, if it's scaled to 0)
  • error creating a Machine (when scaling up)
  • error deleting a Machine (when scaling down)

I think distinguishing between resizing-all-is-well and resizing-but-encountering-persistent-errors is the hard part. I'm not sure of the best way to 1) detect this, and 2) store & display this information. Does anyone have any suggestions?

/priority important-soon

Note to self: check out how k/k Deployments handle resize failures & status

/cc @rudoi – you've spent some time staring at MachineDeployments. Want to join us?

We haven't done any work on this and if it's still needed, we can consider it for the next minor release.

/unassign
/milestone Next

@ncdc would it make sense to do something similar to the cluster's phase status field, where the MachineDeployment controller updates that field based on the replica numbers as you explained above, and then shows it as additional columns in the kubectl get output?

@nader-ziada we could consider doing something like that. Would you be willing to write up the details in a comment here?

I think the approach could be the following:

I would need to investigate further to get more details; I could work on putting together a PR if the general approach makes sense. Thanks

This seems reasonable. Before you open a PR, could you flesh out the details of how, specifically, you would calculate the phase?

Yeah, sure, I will update with a more detailed comment explaining how to calculate the phase. I'll do a quick spike to make sure what I'm saying makes sense, but will update soon.

/assign

You can find an example of how we defined phases and their requirements in the proposals here: https://github.com/kubernetes-sigs/cluster-api/tree/master/docs/proposals

Thanks @vincepri

Add a new field in the MachineDeploymentStatus called Phase with the following values:

  • Provisioning: if status.Replicas > status.ReadyReplicas
  • Running: if status.Replicas == status.ReadyReplicas
  • Failed: if inspecting the list of machines belonging to this deployment shows a machine with a phase of Failed or Unknown

Questions:

  • Resizing would be covered by Provisioning; do we need to differentiate between these two cases?
  • Do we need a Pending phase when the MachineDeployment is first created?
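As a sketch, the proposed rules could look like the following in Go. This is a hypothetical helper, not the actual controller code: machine phases are simplified to strings, and Failed is assumed to take precedence over the replica comparison.

```go
package main

import "fmt"

// deploymentView is a simplified, hypothetical view of a MachineDeployment's
// status plus the phases of its owned Machines.
type deploymentView struct {
	Replicas      int32    // status.replicas
	ReadyReplicas int32    // status.readyReplicas
	MachinePhases []string // phase of each Machine in the deployment
}

// phase applies the proposed rules: Failed wins, then Provisioning, then Running.
func phase(d deploymentView) string {
	for _, p := range d.MachinePhases {
		if p == "Failed" || p == "Unknown" {
			return "Failed"
		}
	}
	if d.Replicas > d.ReadyReplicas {
		return "Provisioning"
	}
	return "Running"
}

func main() {
	fmt.Println(phase(deploymentView{Replicas: 3, ReadyReplicas: 3,
		MachinePhases: []string{"Running", "Running", "Running"}})) // Running
	fmt.Println(phase(deploymentView{Replicas: 3, ReadyReplicas: 2,
		MachinePhases: []string{"Running", "Running", "Provisioning"}})) // Provisioning
	fmt.Println(phase(deploymentView{Replicas: 3, ReadyReplicas: 2,
		MachinePhases: []string{"Running", "Failed", "Running"}})) // Failed
}
```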

Would ScalingUp and ScalingDown be better than Provisioning?

I don't feel strongly on having Pending vs not having it.

I would prefer not to have Pending. Doesn’t really add much value.

We can do scaling up and down by also checking UpdatedReplicas. If Replicas < UpdatedReplicas, then ScalingUp; otherwise, ScalingDown.

In addition to checking the ready replicas as well
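Taking the two comments above together, one illustrative way to combine the checks (again a hypothetical sketch, not the actual implementation; it applies the comparison exactly as stated above after first ruling out the steady Running state):

```go
package main

import "fmt"

// scalingDirection follows the suggestion above: if all replicas are updated
// and ready the deployment is Running; otherwise compare status.replicas with
// status.updatedReplicas to pick ScalingUp or ScalingDown.
func scalingDirection(replicas, updatedReplicas, readyReplicas int32) string {
	if replicas == updatedReplicas && replicas == readyReplicas {
		return "Running"
	}
	if replicas < updatedReplicas {
		return "ScalingUp"
	}
	return "ScalingDown"
}

func main() {
	fmt.Println(scalingDirection(3, 3, 3)) // Running
	fmt.Println(scalingDirection(2, 3, 2)) // ScalingUp
	fmt.Println(scalingDirection(3, 2, 2)) // ScalingDown
}
```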

Some of this is already handled by kubectl and kubectl describe for resources that have a scale subresource.

I'm not sure we need dedicated "states" or "phases" to show scaling up or down. I think the bigger concern is around bubbling up any anomalies encountered when scaling up or down.

My fear is that we introduce a field that is intended to just be used for friendly display and not meant to be used by external tooling for accurate "state", but ends up being used that way anyway.

The kubectl describe machinedeployment output will show all the replica information

  ...
Status:
  Observed Generation:   2
  Replicas:              2
  Unavailable Replicas:  2
  Updated Replicas:      2

but having a status field that says when the deployment is done scaling might be easier to automate against. Is there more context on why this would not be accurate?

@detiber we have status.phase for machines and clusters, and we have agreed these fields are for human consumption. People certainly could write automation against these fields (we can't stop that from happening), but we can augment our documentation on these fields to indicate they exist to provide a user-friendly visual status and nothing more. Couldn't we do the same thing here?

Should I go ahead with this? Or is it still under consideration?

Let's give @detiber a chance to reply to my last comment, then we'll see 😄

@ncdc I'm willing to give it a go.

Great, thanks Jason!

Thanks all, I will work on that and hopefully have something to show soon
