Enhancements: Cluster Autoscaler / Cluster API Integration

Created on 1 Aug 2018 · 43 Comments · Source: kubernetes/enhancements

Feature Description

  • One-line feature description (can be used as a release note): Convert the cluster autoscaler to make use of the cluster API for controlling node creation/deletion.
  • Primary contact (assignee): @enxebre
  • Responsible SIGs: SIG Autoscaling
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:

    • @mwielgus

    • @MaciekPytel

    • [TBD Cluster API]

  • Approver (likely from SIG/area to which feature belongs):

    • @mwielgus

    • [TBD Cluster API]

  • Feature target (which target equals to which milestone):

    • Alpha release target (x.y)

    • Beta release target (x.y)

    • Stable release target (x.y)

kind/feature sig/autoscaling stage/alpha tracked/no

Most helpful comment

/remove-lifecycle stale

The Cluster Autoscaler / Cluster API integration still needs to be completed. It seems like this was initially looked at with a sponsor, but then dropped off. Maybe this needs a bit more planning, since it has been skipped over each release.

All 43 comments

@kubernetes/sig-autoscaling-feature-requests it's not clear to me whether or not we need to precisely track this with the Kubernetes release -- Cluster Autoscaler doesn't always release at the exact same cadence as Kubernetes proper (EDIT: @MaciekPytel correctly pointed out that the releases match -- I think I was confused about some minor releases), but I wanted to get this filed just in case, and so that we can easily track it.

cc @derekwaynecarr

See also https://github.com/kubernetes/autoscaler/releases

I'm ultimately willing to sponsor this, but I'd like to get exact names from the CA subproject for reviewers/approvers (we discussed a bit in the past meeting and there was approval to begin looking in this direction).

Actually, Cluster Autoscaler minor releases match Kubernetes releases 1-1. Cluster Autoscaler 1.4 will go out with k8s 1.12, 1.5 with 1.13, etc.

From Cluster Autoscaler side the approver should probably be @mwielgus and reviewers @mwielgus and me.

We probably need an approver/reviewer from the Cluster API side as well. As discussed at the SIG meeting, integration with CA would require changes to Cluster API (most importantly, the ability to delete a specific node, as opposed to just resizing a MachineSet; more changes would be required for additional features like scale-to-0, but those are not critical for the initial implementation).
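To make the scale-down concern concrete, here is a minimal sketch of targeted node removal. This is not the actual Cluster Autoscaler or Cluster API code: the annotation key, the use of unstructured objects, and the controller-runtime client are all assumptions for illustration. The point it shows is that simply decrementing replicas lets the MachineSet controller pick an arbitrary Machine, while the autoscaler needs the specific node it drained to be the one removed.

```go
// Illustrative sketch only; not the real Cluster API / autoscaler contract.
package capiscale

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Assumed annotation honoured by the MachineSet controller when choosing a
// victim during scale-down (hypothetical key; varies by Cluster API version).
const deleteMachineAnnotation = "cluster.k8s.io/delete-machine"

// ScaleDownNode removes one specific node rather than letting the
// MachineSet controller pick an arbitrary Machine.
func ScaleDownNode(ctx context.Context, c client.Client,
	machine, machineSet *unstructured.Unstructured) error {

	// 1. Tag the Machine backing the node the autoscaler has drained.
	anns := machine.GetAnnotations()
	if anns == nil {
		anns = map[string]string{}
	}
	anns[deleteMachineAnnotation] = "yes"
	machine.SetAnnotations(anns)
	if err := c.Update(ctx, machine); err != nil {
		return err
	}

	// 2. Decrement spec.replicas; resizing alone (without step 1) could
	//    remove a different, possibly still busy, node.
	replicas, _, err := unstructured.NestedInt64(machineSet.Object, "spec", "replicas")
	if err != nil {
		return err
	}
	if err := unstructured.SetNestedField(machineSet.Object, replicas-1, "spec", "replicas"); err != nil {
		return err
	}
	return c.Update(ctx, machineSet)
}
```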

Thanks for the update. I've added this to the 1.12 tracking sheet.

/assign @enxebre @DirectXMan12
/stage alpha
cc: @kacole2 @wadadli @robertsandoval @rajendar38

@justaugustus: GitHub didn't allow me to assign the following users: enxebre.

Note that only kubernetes members and repo collaborators can be assigned.
For more information please see the contributor guide

In response to this:

Thanks for the update. I've added this to the 1.12 tracking sheet.

/assign @enxebre @DirectXMan12
/stage alpha
cc: @kacole2 @wadadli @robertsandoval @rajendar38

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Hey there! @DirectXMan12 I'm the wrangler for the Docs this release. Is there any chance I could have you open up a docs PR against the release-1.12 branch as a placeholder? That gives us more confidence in the feature shipping in this release and gives me something to work with when we start doing reviews/edits. Thanks! If this feature does not require docs, could you please update the features tracking spreadsheet to reflect it?

@enxebre is the right person to ask, as the primary contact

Thanks Solly! @enxebre Could you let me know what the docs status is?

The feature is still under design/development and will have to track post 1.12.

@derekwaynecarr -- thanks for the update. Pulling this from the 1.12 milestone.

WIP proposal: https://github.com/kubernetes/community/pull/2653
Definitely not the final version; still collecting and documenting all the findings and ideas.

cc @vikaschoudhary16

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

I'm the Enhancement Lead for 1.15. Is this feature going to be graduating to alpha/beta/stable in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet. This also needs a formal KEP to be included.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

/unassign @DirectXMan12

(not sure how I'm still assigned here)

Hi @enxebre, I'm a 1.16 Enhancement Shadow. Is this feature going to be graduating to alpha/beta/stable in 1.16? Please let me know so it can be added to the 1.16 Tracking Spreadsheet. If it's not graduating, I will remove it from the milestone and change the tracked label.

Once coding begins or if it already has, please list all relevant k/k PRs in this issue so they can be tracked properly.

As a reminder, every enhancement requires a KEP in an implementable state, with Graduation Criteria explaining the requirements for each of the alpha/beta/stable stages.

Milestone dates are Enhancement Freeze 7/30 and Code Freeze 8/29.

Thank you.

This feature doesn't involve any changes in k/k. Cluster Autoscaler lives in kubernetes/autoscaler repo - not sure how this impacts the process here? Also - there is an ongoing discussion, it's not clear yet if this will be included in 1.16.

Hi @MaciekPytel. We want to see if there is any work being done that will coincide with the 1.16 release. If so, we can make sure it's included as a part of news announcements for major changes the community can look forward to hearing more about.

Hi @MaciekPytel - Enhancement Shadow for 1.17 here -- we want to see if there is any work being done that will coincide with the 1.17 release.

🔔 Friendly Reminder

The current release schedule is

Monday, September 23 - Release Cycle Begins
Tuesday, October 15, EOD PST - Enhancements Freeze
Thursday, November 14, EOD PST - Code Freeze
Tuesday, November 19 - Docs must be completed and reviewed
Monday, December 9 - Kubernetes 1.17.0 Released

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

The Cluster Autoscaler / Cluster API integration still needs to be completed. It seems like this was initially looked at with a sponsor, but then dropped off. Maybe this needs a bit more planning, since it has been skipped over each release.

Hey there @mitchellmaler @MaciekPytel -- 1.18 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to [alpha|beta|stable] in 1.18?
The current release schedule is:
Monday, January 6th - Release Cycle Begins
Tuesday, January 28th EOD PST - Enhancements Freeze
Thursday, March 5th, EOD PST - Code Freeze
Monday, March 16th - Docs must be completed and reviewed
Tuesday, March 24th - Kubernetes 1.18.0 Released
To be included in the release, this enhancement must have a merged KEP in the implementable status. The KEP must also have graduation criteria and a Test Plan defined.
If you would like to include this enhancement, once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍
We'll be tracking enhancements here: http://bit.ly/k8s-1-18-enhancements
Thanks!

As a reminder @mitchellmaler @MaciekPytel :

Tuesday, January 28th EOD PST - Enhancements Freeze

Enhancements Freeze is in 7 days. If you seek inclusion in 1.18 please update as requested above.

Thanks!

hi, i brought up this issue at the last cluster-api meeting as i would like to help drive it forward. i am still coming up to speed on the progress, but my understanding is there is a documentation effort that still needs to be addressed?

any updates are greatly appreciated =)

+1

just wanted to add an update here: the initial work to integrate cluster-api into the autoscaler has been completed. see https://github.com/kubernetes/autoscaler/pull/1866

we are now working to add more unit test coverage and increase the end-to-end tests. we also have plans for code improvements and clean-ups, as well as landing a few early bug fixes.

Hi @DirectXMan12 @mitchellmaler @MaciekPytel,

1.19 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating in 1.19?

In order to have this part of the release:

The KEP PR must be merged in an implementable state
The KEP must have test plans
The KEP must have graduation criteria.

The current release schedule is:

Monday, April 13: Week 1 - Release cycle begins
Tuesday, May 19: Week 6 - Enhancements Freeze
Thursday, June 25: Week 11 - Code Freeze
Thursday, July 9: Week 14 - Docs must be completed and reviewed
Tuesday, August 4: Week 17 - Kubernetes v1.19.0 released

Please let me know and I'll add it to the 1.19 tracking sheet (http://bit.ly/k8s-1-19-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

Thanks!

As a reminder, enhancements freeze is tomorrow, May 19th EOD PST. In order to be included in 1.19, all KEPs must be implementable with graduation criteria and a test plan.

Thanks.

@mwielgus @MaciekPytel i'm curious if there is anything i need to do here?

i'm not sure if we have a written plan for the testing portion of this, we do have unit tests in place and i have a plan to improve e2e around this. should i record this information somewhere?

Unfortunately the deadline for the 1.19 Enhancement freeze has passed. For now this is being removed from the milestone and 1.19 tracking sheet. If there is a need to get this in, please file an enhancement exception.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

the initial integration has happened and we are now working on improvements. happy to take any action to help close this issue.
/remove-lifecycle stale

Hi @mwielgus @MaciekPytel

Enhancements Lead here. Any plans for this in 1.20?

Thanks,
Kirsten

Hi @mwielgus @MaciekPytel

Following up: 1.20 Enhancements Freeze is October 6th. Could you let us know if you have plans for 1.20? I don't see a KEP linked.

To be included in the milestone:
The KEP must be merged in an implementable state
The KEP must have test plans
The KEP must have graduation criteria

Best,
Kirsten

@elmiko is there a KEP written up for this?

@rptaylor i don't think there is. i was introduced to this topic from this issue and the work we've done to integrate capi/autoscaler. should we write a kep to help close out this issue?

I don't think we should keep track of this effort here at all. This is a feature of Cluster Autoscaler and not Kubernetes. While Cluster Autoscaler generally tracks Kubernetes releases, it doesn't follow the same release process or release schedule. All past proposals/designs for Cluster Autoscaler were discussed via issues in the kubernetes/autoscaler repo, and the designs were merged to https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/proposals.

Finally, provider integration with CA is just a matter of implementing an interface defined by Cluster Autoscaler. We haven't required any KEP-like document prior to implementing any other provider integration, and I don't think there is any particular need to do so (unless the integration would require some changes in Cluster Autoscaler itself).

I think we should just close this issue.
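For readers unfamiliar with what "implementing an interface defined by Cluster Autoscaler" involves, here is a heavily simplified, hypothetical sketch. The real interfaces live under cluster-autoscaler/cloudprovider in the kubernetes/autoscaler repo and have considerably more methods; the names and signatures below are illustrative assumptions, not the actual API.

```go
// Hypothetical, simplified sketch of a Cluster Autoscaler provider
// integration; everything below is illustrative only.
package sketch

import "errors"

// NodeGroup is a scalable set of nodes; for Cluster API it would be backed
// by a MachineSet (or MachineDeployment).
type NodeGroup interface {
	ID() string
	MinSize() int
	MaxSize() int
	TargetSize() (int, error)
	IncreaseSize(delta int) error
	DeleteNodes(nodeNames []string) error
}

// CloudProvider lets the autoscaler discover node groups and map nodes to them.
type CloudProvider interface {
	Name() string
	NodeGroups() []NodeGroup
	NodeGroupForNode(nodeName string) (NodeGroup, error)
}

// clusterAPIProvider would wrap a Cluster API client and expose MachineSets
// as node groups; the method bodies are stubs in this sketch.
type clusterAPIProvider struct{}

var _ CloudProvider = (*clusterAPIProvider)(nil)

func (p *clusterAPIProvider) Name() string            { return "clusterapi" }
func (p *clusterAPIProvider) NodeGroups() []NodeGroup { return nil }
func (p *clusterAPIProvider) NodeGroupForNode(nodeName string) (NodeGroup, error) {
	return nil, errors.New("not implemented in this sketch")
}
```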

Finally, provider integration with CA is just a matter of implementing an interface defined by Cluster Autoscaler. We haven't required any KEP-like document prior to implementing any other provider integration, and I don't think there is any particular need to do so (unless the integration would require some changes in Cluster Autoscaler itself).

I think we should just close this issue.

+1

Any update on closing this issue?

it seems like we have agreement about closing this issue.
/close

@elmiko: Closing this issue.

In response to this:

it seems like we have agreement about closing this issue.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
