SIGs such as the cloud providers operate like user groups. These SIGs are responsible for tight integration of provider APIs with Kubernetes APIs and interfaces, such as the out-of-tree cloud provider interfaces and the CSI, CNI, and CRI specifications. These integrations serve the needs of Kubernetes users running on a cloud provider (they are not necessarily motivated by enhancing a specific implementation of Kubernetes such as EKS, AKS, GKE, or VKE). They contribute to the richness of the Kubernetes ecosystem and aim to drive consistent behavior across providers through the interface implementations. It is therefore in the best interest of Kubernetes users that such implementations/integrations be pegged to a Kubernetes major version release, and that the testing and documentation discipline enforced on in-tree Kubernetes features be adopted/recommended for out-of-tree feature releases as well.
This leads to a couple of open questions that need to be discussed and resolved, since no referenceable guidelines exist today. Referring specifically to this issue in v1.13
Tracking out-of-tree Kubernetes features - should this be out of scope for the SIG-release team and in scope for the responsible SIG?
Cadence of out-of-tree feature releases - should SIGs continue to adhere to release best practices that Kubernetes in-tree features follow aka follow the requirements necessary with testing and documentation to move the feature from alpha to beta to GA?
Reporting out-of-tree feature updates - should a summary for out-of-tree features continue to be added to the Major Theme section of the release notes as described here?
Publishing test results in testgrid - should the integration test results be post-submit, non-blocking and be visible in testgrid for an out-of-tree feature to qualify as beta/GA?
Documentation - should the documentation be regularly updated for a feature to move through the release cadence?
KEPs - should KEPs be necessarily updated for a feature to move through the release cadence?
Release cadence ownership - should the SIG Chairs be responsible for the quality of the releases and the systematic follow through on the release cadence?
/cc @saad-ali @justaugustus @thockin @justinsb @spiffxp @tpepper @dims @andrewsykim
Created the issue as discussed!
/assign @justaugustus
Pinging @kubernetes/sig-release for input as well...
Just speaking out loud here. Maybe features that touch the core tree should follow the same enhancement processes until they are fully removed? Anything fully "out-of-tree" can have its own cadence? Could be an added incentive to move more things out-of-tree.
I've written some thoughts below. Please keep in mind that they are my own opinions only and that they might come with a somewhat incomplete context :)
tl;dr: My instinct would be to move as much of the ownership as possible to the owning SIGs/working groups, while making sure that the Kubernetes thing (previously built off core, now core+out-of-tree) that end users deploy and run is of high quality.
1. Tracking out-of-tree Kubernetes features - should this be out of scope for the SIG-release team and in scope for the responsible SIG?
I could see SIGs being more effective at owning out-of-tree features than SIG-release. They know the complexity and work-to-be-done best, and so can speak more effectively to milestones and maturity. At the same time, it probably makes sense to have some sort of standardization/shared understanding, among SIGs, of what tracking looks like. This could be through process, through a central coordinating team (sig-release/sig-pm?) or something else...
2. Cadence of out-of-tree feature releases - should SIGs continue to adhere to release best practices that Kubernetes in-tree features follow aka follow the requirements necessary with testing and documentation to move the feature from alpha to beta to GA?
If I read this correctly, there are 3 things bundled into this prompt.
a. Should out-of-tree features follow the same requirements as in-tree ones, in terms of testing:
This one is a strong "yes" for me, if not even higher standards. My reasoning is that for end users, whether features are developed in-tree or out-of-tree is an implementation detail. The quality of the software they run should be the same, regardless of how the code and development process are structured.
b. Should out-of-tree features follow the same requirements as in-tree ones, in terms of documentation:
c. Cadence of out-of-tree feature releases (i.e. what should the cadence of out-of-tree releases be relative to in-tree)
3. Reporting out-of-tree feature updates - should a summary for out-of-tree features continue to be added to the Major Theme section of the release notes
_Caveat: I'm not a docs expert[0]! Extra grain of salt here!_
My view is that:
Major Themes section in the out-of-tree components' Release Notes
4. Publishing test results in testgrid - should the integration test results be post-submit, non-blocking and be visible in testgrid for an out-of-tree feature to qualify as beta/GA?
non-blocking: I understand this to mean non-Kubernetes-release-blocking (as opposed to non-out-of-tree-feature-blocking), is that right? If yes, non-blocking and visible in testgrid makes sense. In my head, an out-of-tree developed component is sort of a consumer of in-tree features and APIs, so at the very least test results for out-of-tree components are kubernetes-release-informing, and occasionally even blocking (if they unearth unexpected incompatibilities, for example).
5. Documentation - should the documentation be regularly updated for a feature to move through the release cadence?
6. KEPs - should KEPs be necessarily updated for a feature to move through the release cadence?
For me, strong "yes" to both of these points. The fact that features are developed out-of-tree does not by itself make them any less important than in-tree developed features. So it makes sense to capture the reasoning/intention (KEP) and the end functionality (user-facing documentation) with a similar amount of detail.
7. Release cadence ownership - should the SIG Chairs be responsible for the quality of the releases and the systematic follow through on the release cadence?
Yes; at the same time
responsible and quality cover in this context
[0] Or an expert of any kind now that I think of it, so just throw salt all over :)
cc @hoegaarden
@mariantalla - Thanks for the well thought out response.
@tpepper - can you please share your thoughts on this issue? It's about release cadence for out-of-tree features, which is relevant to all cloud provider features per se.
FYI in this week's SIG Release meeting we had a lot of discussion on this topic.
Meeting notes and recording are online.
Some personal thoughts and observations:
/sig pm release
/milestone v1.15
/priority critical-urgent
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle rotten
/priority important-soon
/lifecycle frozen
/priority important-longterm
/remove-priority critical-urgent
/sig architecture
@neolit123 is picking this topic up in the context of kubeadm
As an incremental step forward, I've added a tracked/out-of-tree label to k/enhancements, which can be used by the @kubernetes/release-team when an enhancement is out-of-tree and doesn't need to be tracked by the team.
Heya, lots of great points and perspectives made here. With that in mind, is it possible to identify what the general goals and next action items might be?
Worth noting that we have a current proposal for this topic w.r.t external cloud providers https://github.com/kubernetes/enhancements/pull/1727