_From @rsdcastro on February 28, 2018 20:52_
This issue tracks how fault zones (or availability zones) should be represented in the Cluster API. Should it be a top-level field? Should we be able to specify provider-specific config for them (separately from the current provider config)? This applies both to masters and to nodes that might have some affinity to resources specific to a zone.
@krousey @maisem @justinsb
_Copied from original issue: kubernetes/kube-deploy#627_
While not explicit above, this could also have pod scheduling implications that need to be considered.
_From @jessicaochen on February 28, 2018 22:16_
Something to keep in mind is that fault zone concepts exist even on-prem on bare metal. For example, I would assume that even if one owns a datacenter, one would want workloads spread across racks, across networking hardware, etc., so that no single failure takes out the whole workload. However fault zones are represented, it would be good if the representation allows expressing multiple types of fault zones to spread across.
_From @jhorwit2 on March 2, 2018 19:16_
> However fault zones are represented, it would be good if the representation allows expressing multiple types of fault zones to spread across.
@jessicaochen The scheduler's spread predicate by default only takes `failure-domain.beta.kubernetes.io/region` and `failure-domain.beta.kubernetes.io/zone` into account, so supporting more than that out of the box would require custom affinity/anti-affinity rules on the pod itself. We use those two labels for our on-prem clusters as well.
That also matches the pattern of other core components, like the cloud provider interface, which likewise only support zone and region.
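For illustration, a minimal sketch of the kind of custom affinity this implies: spreading a workload across a fault zone type the scheduler does not know about (racks) by using pod anti-affinity with a custom topology key. The `example.com/rack` label is hypothetical, and nodes would need to be labeled with it out of band:

```yaml
# Hedged sketch: spread replicas across racks via pod anti-affinity.
# The topology key "example.com/rack" is an assumed custom node label,
# not one of the built-in failure-domain labels.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web
              topologyKey: example.com/rack  # custom fault zone, one value per rack
      containers:
      - name: web
        image: nginx:1.25
```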
_From @roberthbailey on March 6, 2018 21:57_
Strawman: leave it inside the existing provider config for now, and see if common patterns emerge that would cause us to promote it to a top-level field.
Looking at GCP (this should be the same for any cloud), you already specify how you want machines spread across availability zones by putting the zone for each machine (or set) in the provider config. This works for masters just as well as for workers.
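As a rough sketch of what that looks like today (the kind and field names are approximate, loosely based on the GCE provider), the zone lives inside each Machine's embedded provider config:

```yaml
# Approximate sketch: the zone is a provider-specific field inside the
# Machine's providerConfig, so spreading masters means creating Machines
# with different zones. Kind/field names are illustrative, not exact.
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: master-us-central1-a
spec:
  providerConfig:
    value:
      apiVersion: gceproviderconfig/v1alpha1
      kind: GCEProviderConfig
      zone: us-central1-a        # the fault zone, chosen per machine
      machineType: n1-standard-2
```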
What is the advantage of duplicating the data at the top level (and then needing to ensure it stays in sync)? What environment-independent form do you imagine this would take? As @jessicaochen points out, on-prem you would likely want to specify failure domains based on racks (ToR redundancy), power domains, or maybe "clusters" in a datacenter, whereas on clouds you are really only given the "cluster"-level selection in the APIs.
_From @krousey on March 6, 2018 22:04_
@roberthbailey Just to clarify your strawman: you're saying that if I want 12 nodes across 3 availability zones, I should have 3 MachineSets of count 4, with the proper zone in each template's provider config?
_From @roberthbailey on March 6, 2018 22:14_
Yes, I think that way makes the most sense. I like having it explicitly in the MachineSet / deployment instead of a hidden internal multiplier that you have to change indirectly by tweaking availability zones in the MachineSet.
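To make that concrete, a hedged sketch of one of the three MachineSets from the example above (names and provider-config fields are illustrative); the other two would differ only in name, labels, and zone:

```yaml
# Illustrative sketch of the strawman: 12 nodes across 3 zones becomes
# 3 MachineSets of 4 replicas each, with the zone pinned in the
# template's provider config. Kind/field names are approximate.
apiVersion: cluster.k8s.io/v1alpha1
kind: MachineSet
metadata:
  name: workers-us-central1-a
spec:
  replicas: 4
  selector:
    matchLabels:
      machineset: workers-us-central1-a
  template:
    metadata:
      labels:
        machineset: workers-us-central1-a
    spec:
      providerConfig:
        value:
          apiVersion: gceproviderconfig/v1alpha1
          kind: GCEProviderConfig
          zone: us-central1-a        # one zone per MachineSet
          machineType: n1-standard-2
```

Scaling into a new zone is then an explicit operation (add a fourth MachineSet) rather than an implicit multiplication of a single count.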
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
/area api
cc @juan-lee @cecilerobertmichon
@timothysc @detiber @ncdc are you aware of anyone working on this?
I'm not aware of any activity
I have a draft proposal for this. Will share before Wednesday next week.
Since so much is in flux with the data model and implementation, I'm choosing to define an approach without taking any stance on specific data model changes. Naturally, that means the proposal is incomplete until we expose the knob(s) for users to tune.
Still, I think I've managed to capture a nugget of value in sketching out the story. Will follow up with the proposal.
@ncdc @vincepri @detiber collected some of my thoughts, looking for a gut check
/assign @alexeldeib @timothysc
We may not get to this in v1alpha2, but we'll try to make a decision during this timeframe.
/milestone Next
/unassign
There is a tentative proposal for how to do this with respect to control plane management here: https://github.com/kubernetes-sigs/cluster-api/issues/1647
Should we close this one in favor of the proposal?
@vincepri I don't think so; this issue is about deciding how to handle failure domains, and the linked issue is a concrete proposal. If we accept the proposal, then I think that would qualify as meeting the requirements to close this issue.
AFAIK, the proposal is in the milestone and waiting to be implemented; there wasn't much pushback on it.
/close
Closing, since we accepted the above-mentioned failure domain proposal.
@detiber: Closing this issue.
In response to this:
> /close
>
> Closing, since we accepted the above-mentioned failure domain proposal.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.