Enhancements: Ability to create dynamic HA clusters with kubeadm

Created on 26 Jul 2017  ·  68 Comments  ·  Source: kubernetes/enhancements

Feature Description


Edit by @spiffxp:

Per https://github.com/kubernetes/enhancements/issues/357#issuecomment-460658341 we should consider kicking this out of v1.14 and using a separate tracking issue for the actual KEP being implemented this cycle: 20190122-Certificates-copy-for-kubeadm-join--control-plane.md

kind/feature sig/cluster-lifecycle stage/beta tracked/no

All 68 comments

Hoping to get an alpha implementation of this in v1.8. Might or might not work out. At the very least, a design doc should be ready by the end of the v1.8 cycle.

@kubernetes/sig-cluster-lifecycle-feature-requests I moved this to next-milestone as no code for this is shipping in v1.8, only design

Targeting alpha in v1.9

@luxas :wave: Please indicate in the 1.9 feature tracking board whether this feature needs documentation. If yes, please open a PR and add a link to the tracking spreadsheet. Thanks in advance!

This feature didn't make it into v1.9

@luxas @kubernetes/sig-cluster-lifecycle-feature-requests still on track for 1.10?

This has been de-prioritized for 1.10 in favor of trying to get the existing kubeadm functionality promoted from beta to GA.

@roberthbailey thanks

@luxas / @timothysc -- Any plans for this in 1.11?

If so, can you please ensure the feature is up-to-date with the appropriate:

  • Description
  • Milestone
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

cc @idvoretskyi

@justaugustus No plans to move feature state this release, incremental improvements.

/lifecycle frozen

/remove-lifecycle frozen

This feature currently has no milestone, so we'd like to check in and see if there are any plans for this in Kubernetes 1.12.

If so, please ensure that this issue is up-to-date with ALL of the following information:

  • One-line feature description (can be used as a release note):
  • Primary contact (assignee):
  • Responsible SIGs:
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):

    • Alpha release target (x.y)

    • Beta release target (x.y)

    • Stable release target (x.y)

Set the following:

  • Description
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

Once this feature is appropriately updated, please explicitly ping @justaugustus, @kacole2, @robertsandoval, @rajendar38 to note that it is ready to be included in the Features Tracking Spreadsheet for Kubernetes 1.12.


Please note that Features Freeze is tomorrow, July 31st, after which any incomplete Feature issues will require an Exception request to be accepted into the milestone.

In addition, please be aware of the following relevant deadlines:

  • Docs deadline (open placeholder PRs): 8/21
  • Test case freeze: 8/28

Please make sure all PRs for features have relevant release notes included as well.

Happy shipping!

P.S. This was sent via automation

Hi
This enhancement has been tracked before, so we'd like to check in and see if there are any plans for this to graduate stages in Kubernetes 1.13. This release is targeted to be more ‘stable’ and will have an aggressive timeline. Please only include this enhancement if there is a high level of confidence it will meet the following deadlines:

  • Docs (open placeholder PRs): 11/8
  • Code Slush: 11/9
  • Code Freeze Begins: 11/15
  • Docs Complete and Reviewed: 11/27

Please take a moment to update the milestones on your original post for future tracking and ping @kacole2 if it needs to be included in the 1.13 Enhancements Tracking Sheet

Thanks!

@fabriziopandini @timothysc
WDYT now that there is join --experimental-control-plane?
could we promote this to beta in 1.13?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
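
The staleness rule quoted in the bot message above can be sketched as a small function. This is only an illustration of the stated thresholds (90 days, then an additional 30 days), not the bot's actual implementation. The name `lifecycle_state`, the use of calendar dates, and the choice to treat the thresholds as inclusive are all assumptions. The message does not say how long a rotten issue stays open before it closes, so that step is left out.

```python
from datetime import date, timedelta

# Thresholds taken from the bot message; everything else is a hypothetical sketch.
STALE_AFTER = timedelta(days=90)  # "Issues go stale after 90d of inactivity"
ROT_AFTER = timedelta(days=30)    # "rot after an additional 30d of inactivity"


def lifecycle_state(last_activity: date, today: date) -> str:
    """Return 'active', 'stale', or 'rotten' for an issue.

    Assumption: an issue becomes stale on exactly the 90th idle day and
    rotten on exactly the 120th. The real automation may differ.
    """
    idle = today - last_activity
    if idle >= STALE_AFTER + ROT_AFTER:
        return "rotten"
    if idle >= STALE_AFTER:
        return "stale"
    return "active"
```

In this model, a new comment such as `/remove-lifecycle stale` counts as activity: it resets `last_activity`, so the issue reads as active again.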

/remove-lifecycle stale
We are currently working on this with the objective to graduate join --control-plane workflow to beta in v1.14

@luxas @timothysc Hello - I'm the enhancements lead for 1.14 and I'm checking in on this issue to see what work (if any) is being planned for the 1.14 release. Enhancements freeze is Jan 29th and I want to remind you that all enhancements must have a KEP

@claurence
as mentioned one message above we are targeting beta for this feature in this cycle:

our KEP:
https://github.com/kubernetes/enhancements/pull/681

a tracking issue in k/kubeadm:
https://github.com/kubernetes/kubeadm/issues/1200

@fabriziopandini can elaborate further.

Now that we're all in on KEPs and we have tracking KEPs for some of this work, I'd move to close this issue.

we had this discussion where the release team joined the sig-cluster-lifecycle meeting and the overall impression was that KEPs and issues should exist together.

my vote is to track progress using KEPs only.
problem with KEPs is that there is no way to filter them by labels like tracked, so this will make the release team job difficult.

cc @BenTheElder @spiffxp

For now, I'd rather not have some issues open and some closed, and I don't think we're at a point where every tracking issue can be closed and we go solely off of KEPs.

I'm fine with all updates landing in the KEP. We'll get there from here. Just leave this open for us, so we can have a list of everything in/out tracked/not. There are many more KEPs than there are enhancements planned to land in 1.14

@neolit123 - the KEP PR I see is not merged yet - do you all anticipate it will be merged before enhancements freeze (EOD today, 1/29)?

@luxas @neolit123 since the KEP for this issue hasn't been merged yet we will be removing it from the 1.14 milestone. To have it added back in please file an exception - information on the exception process can be found here: https://github.com/kubernetes/sig-release/blob/master/releases/EXCEPTIONS.md

@claurence i'm sorry for the delayed reply.
my understanding is that this is the KEP we are missing:
https://github.com/kubernetes/enhancements/pull/681

the KEP is maintained by @fabriziopandini (ping).

@neolit123 @claurence #681 is the umbrella KEP for kubeadm HA, but the part of this KEP in scope for v1.14 was detailed into https://github.com/kubernetes/enhancements/pull/713, which merged before deadline.

Please add back tracked/yes label

/milestone v1.14
I see https://github.com/kubernetes/enhancements/pull/804 and would merge if I could fix the typo myself

I leave the /tracked label to @claurence

It's a bit confusing to me because this enhancement talks about HA clusters, but the two keps I see seem to talk about specifics, not the overall plan for HA clusters:

So, I can tell you're landing work, I think it's just confusing whether it's one or both KEPs that are involved here

@spiffxp
If the goal is v1.14 enhancement tracking only, you can focus only on Certificates-copy-for-kubeadm-join--control-plane (and leave the umbrella KEP off the radar)

Some more context:

KEP0015 is the umbrella KEP for HA in kubeadm (this effort started in 1.12)

in v1.14 we are addressing a problem that was stated in the umbrella KEP, but due to the fact that its solution required some careful design around security, it was detailed in another KEP (Certificates-copy-for-kubeadm-join--control-plane)

> in v1.14 we are addressing a problem that was stated in the umbrella KEP, but due to the fact that its solution required some careful design around security, it was detailed in another KEP (Certificates-copy-for-kubeadm-join--control-plane)

i can confirm this.

https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/20190122-Certificates-copy-for-kubeadm-join--control-plane.md
is a separate KEP related to this tracking issue and it's for this cycle.

so we have a bit of a sub-KEP process going on here.
once the Certificates-copy-for-kubeadm-join--control-plane.md work is done we should add what happened under the Implementation history for 015.

Hi @fabianofranz! I'm one of the v1.14 docs release shadows.

Does this enhancement require any new docs (or modifications)?

Just a friendly reminder we're looking for a PR against k/website (branch dev-1.14) due by Friday, March 1. It would be great if it's the start of the full documentation, but even a placeholder PR is acceptable. Let me know if you have any questions!

hi, @cody-clark
we just have to add a couple of new CLI flags in the kubeadm reference docs for this cycle:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/

the feature does not need any major documentation changes. also, contrary to what the OP claims, our support will still remain in alpha due to the inability to set up a good HA e2e testing setup for this cycle.

so given the support is alpha we are also reserved in documenting in more detail for now.
cc @fabriziopandini

Hello, 1.14 enhancement shadow here. Code Freeze is March 7th and all PRs must be merged by then to your issue to make the 1.14 release. What open K/K PRs do you still have that need to merge? Thanks

our tracking issue for this work in the k/kubeadm repo is here: https://github.com/kubernetes/kubeadm/issues/1373

i will let you know if we are behind schedule.

thanks @neolit123, the only k/k PR open i found in your tracking issue is: kubernetes/kubernetes#72886, correct ? TY

sorry i didn't clarify @lledru

the only critical remaining item in the whole list for this cycle is "Allow user to provide the certificate key on upload-certs":
https://github.com/kubernetes/kubeadm/issues/1408 under Certificates copy workflow

the rest are nice-to-haves for this cycle, but most importantly - due to the fact we are not going to be able to provide e2e signal for the new feature, we are still keeping it in ALPHA state for 1.14.

the first post needs to be changed, as the state for this feature is still ALPHA:
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#kubeadm-maturity

@neolit123 that issue linked is closed - does that mean all 1.14 related issues and PRs are merged for this? And https://github.com/kubernetes/kubeadm/issues/1373 isn't a must have in 1.14?

@claurence our code work has finished for this feature for 1.14.
bug fixes only at this point.

And kubernetes/kubeadm#1373 isn't a must have in 1.14

it's a tracking issue with nice to haves at this point, will continue for 1.15.

docs PRs for HA are tracked here
https://github.com/kubernetes/kubeadm/issues/1422

overall state of the feature: still ALPHA.

Hello @neolit123 , I'm the Enhancement Lead for 1.15. Is this feature going to be graduating alpha/beta/stable stages in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

@kacole2 we are targeting BETA.

> Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

will do. sadly my proposal to SIG Release of how to do this better is collecting dust.

/stage beta
/milestone v1.15

https://github.com/kubernetes/enhancements/pull/973 is updating the KEPs detailing this effort with v1.14 achievements and v1.15 goals:

Hey, @timothysc @luxas 👋 I'm the v1.15 docs Lead.
Does this enhancement require any new docs (or modifications)?

Just a friendly reminder we're looking for a PR against k/website (branch dev-1.15) due by Thursday, May 30th. It would be great if it's the start of the full documentation, but even a placeholder PR is acceptable. Let me know if you have any questions

@MAKOSCAFEE
hi, the work in 1.15 will not require doc changes.

> Just a friendly reminder we're looking for a PR against k/website (branch dev-1.15) due by Thursday, May 30th.

thanks for the heads up!

Thank you @neolit123 for the information.

Hi @neolit123 @luxas @timothysc. Code Freeze is Thursday, May 30th 2019 @ EOD PST. All enhancements going into the release must be code-complete, including tests, and have docs PRs open.

Please list all current k/k PRs so they can be tracked going into freeze. If the PRs aren't merged by freeze, this feature will slip for the 1.15 release cycle. Only release-blocking issues and PRs will be allowed in the milestone.

If you know this will slip, please reply back and let us know. Thanks!

@kacole2 thanks for the heads up. We are still confident we can get this into the release

There are also PRs in flight on test-infra for setting up the planned E2E jobs

Hi @neolit123 @luxas @timothysc @fabriziopandini , I'm the 1.16 Enhancement Lead. Is this feature going to be graduating alpha/beta/stable stages in 1.16? Please let me know so it can be added to the 1.16 Tracking Spreadsheet. If it's not graduating, I will remove it from the milestone and change the tracked label.

Once coding begins or if it already has, please list all relevant k/k PRs in this issue so they can be tracked properly.

Milestone dates are Enhancement Freeze 7/30 and Code Freeze 8/29.

Thank you.

hi @kacole2 . with kubeadm-HA graduating to beta in 1.15, we decided to give this work a rest and focus on some other priorities for 1.16.
thanks.

@fabriziopandini please comment if you have something else in mind.

+1 let's wait for user feedback before taking further steps

Hey there @fabriziopandini @luxas @timothysc -- 1.17 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to alpha/beta/stable in 1.17?

The current release schedule is:

  • Monday, September 23 - Release Cycle Begins
  • Tuesday, October 15, EOD PST - Enhancements Freeze
  • Thursday, November 14, EOD PST - Code Freeze
  • Tuesday, November 19 - Docs must be completed and reviewed
  • Monday, December 9 - Kubernetes 1.17.0 Released

If you do, I'll add it to the 1.17 tracking sheet (https://bit.ly/k8s117-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

Thanks!

@fabriziopandini should correct me if needed but i think we are not going to graduate this to GA in 1.17.

Thanks @neolit123. I'll wait for @fabriziopandini to confirm. Also let us know if there are any major changes that will occur during this release and we'll track that.

Thanks!

@jeremyrickard, I confirm that we are NOT going to graduate this to GA in the 1.17 cycle

Thanks for the update @fabriziopandini. I'll remove it from the tracking sheet!

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

As per discussion in kubeadm office hours, there is still some work to do before GA:

  1. improve management of cluster status in the kubeadm config map
  2. Use etcd learner mode
  3. Get E2E signal on parallel join

Only the first point will be addressed in 1.18, so also for this cycle we will keep this feature in beta

Hey there @fabriziopandini -- 1.18 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to alpha|beta|stable in 1.18?

The current release schedule is:

  • Monday, January 6th - Release Cycle Begins
  • Tuesday, January 28th EOD PST - Enhancements Freeze
  • Thursday, March 5th, EOD PST - Code Freeze
  • Monday, March 16th - Docs must be completed and reviewed
  • Tuesday, March 24th - Kubernetes 1.18.0 Released

To be included in the release, this enhancement must have a merged KEP in the implementable status. The KEP must also have graduation criteria and a Test Plan defined.

If you would like to include this enhancement, once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

We'll be tracking enhancements here: http://bit.ly/k8s-1-18-enhancements

Thanks!

@kikisdeliveryservice ,

as @fabriziopandini pointed out:

> Only the first point will be addressed in 1.18, so also for this cycle we will keep this feature in beta

the feature will remain BETA in 1.18.

@neolit123 thank you for confirming, appreciate your response! :)

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Hi @fabriziopandini @neolit123 ,

1.19 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating in 1.19?

In order to have this part of the release:

  1. The KEP PR must be merged in an implementable state
  2. The KEP must have test plans
  3. The KEP must have graduation criteria.

The current release schedule is:

  • Monday, April 13: Week 1 - Release cycle begins
  • Tuesday, May 19: Week 6 - Enhancements Freeze
  • Thursday, June 25: Week 11 - Code Freeze
  • Thursday, July 9: Week 14 - Docs must be completed and reviewed
  • Tuesday, August 4: Week 17 - Kubernetes v1.19.0 released

Please let me know and I'll add it to the 1.19 tracking sheet (http://bit.ly/k8s-1-19-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

Thanks!

hi, we are not going to work on this for 1.19.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Hi @neolit123

Enhancements Lead here. Any plans for this to graduate in 1.20?

Thanks,
Kirsten

Hi, we haven't decided yet, but more likely not for 1.20.

https://github.com/kubernetes/kubeadm/issues/2081

Thanks, I will keep this as untracked, please let us know if anything changes so we can properly track & milestone this KEP.
