Enhancements: Redesign Event API

Created on 4 Aug 2017  ·  110 Comments  ·  Source: kubernetes/enhancements

Feature Description

  • One-line feature description (can be used as a release note): Add more structure to Event API and change deduplication logic so Events won't overload the cluster
  • Primary contact (assignee): @gmarek
  • Responsible SIGs: instrumentation
  • KEP: new-event-api-ga-graduation
  • Design proposal link (community repo): design Google doc - design discussions in GitHub are too painful for me
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred: @timothysc @wojtek-t
  • Approver (likely from SIG/area to which feature belongs): @bgrant0607 @thockin @countspongebob
  • Feature target (which target equals to which milestone):

    • Beta: 1.8 [done]

    • GA: 1.19 [done]
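
To make the deduplication idea from the one-line description concrete, here is a minimal sketch. It is not the Kubernetes implementation: the class names, the `writes` counter, and the choice of `(regarding, reason, action)` as the key are all assumptions made for illustration. It only shows the general technique of collapsing repeated events into a single record with a counter, so that a noisy component produces one stored object instead of many.

```python
from dataclasses import dataclass


@dataclass
class EventSeries:
    """Hypothetical aggregate record standing in for a stored Event."""
    count: int = 1


class EventDeduplicator:
    """Collapses repeated events sharing a key into one record plus a counter."""

    def __init__(self):
        self._series = {}
        self.writes = 0  # number of new records that would be created

    def record(self, regarding, reason, action):
        # Assumed dedup key, chosen for illustration only.
        key = (regarding, reason, action)
        series = self._series.get(key)
        if series is None:
            # First occurrence: create a new record.
            series = EventSeries()
            self._series[key] = series
            self.writes += 1
        else:
            # Repeat: bump the counter instead of creating another record.
            series.count += 1
        return series
```

With this sketch, recording the same `("pod/a", "FailedScheduling", "Scheduling")` event five times creates one record whose `count` is 5, while an event with a different reason gets its own record.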

kind/feature sig/instrumentation stage/stable

All 110 comments

@gmarek the feature submission deadline has passed (Aug 1). Please, submit a feature exception (https://github.com/kubernetes/features/blob/master/EXCEPTIONS.md) to have this feature present in 1.8 release.

@gmarek as you probably saw in the other feature comments, I am trying to understand how some features didn't get into the features repo before the deadline. This is only for the purpose of improving our release process and notifications for next time, not for blaming or pointing fingers. We're also trying to understand if there was prior work done on the feature, or if it was created after the freeze date.

Yup, there's quite some work done on this, with (big) design doc shared with kubernetes-dev and in-depth discussion on SIG scale. For this we'll probably make it disabled by default, as there's not enough time to let it soak. Is it possible to have a 'quiet' release for things like that? @jdumars

@gmarek that's an interesting question. My personal opinion is to provide as much transparency as possible, so we maintain a bond of trust with our user community. Being as you get to write the release notes, you can add something short there about it. And, thanks for clarifying the feature itself.

Personal perspective on this, largely repeating comments I've made before. But, as this is a case in point...

  • SIG PM involvement and feature submissions have been functionally optional, with SIG PM not empowered to actually keep things out of a release.
  • There is continued confusion over what is a feature. I echo @jbeda in calling for these to be renamed "efforts". The implication would be 100% coverage, but see my first point.

We had a discussion in SIG Scalability, especially about point #2, with no clear resolution. A few of us lobbied @gmarek to do the feature submission notwithstanding the points above, and he agreed to do so.

@jdumars @countspongebob @gmarek the main point to discuss here - is about the formal dates and deadlines, and what will happen if one will avoid them. We have agreed that the feature freeze for 1.8 (https://github.com/kubernetes/features/blob/master/release-1.8/release-1.8.md) is August 1, so all the features have to be submitted to the features repo before this date.

If people, responsible for the release and the overall community feel that this deadline is not mandatory, it can be discussed and removed. From our (PM group) standpoint, the feature freeze is necessary from the high-level point of view (including planning of the roadmap, marketing activities, etc.). But if there are some reasons why we shouldn't have a feature freeze, again, let's discuss them.

PS. It has been a long-discussed question in the community, even before SIG-PM was established. Now it might be a good time to solve it.

@countspongebob

SIG PM involvement and feature submissions have been functionally optional, with SIG PM not empowered to actually keep things out of a release.

SIG PM is not empowered, but release team is. SIG PM is responsible for managing the features on the high level, so we would be able to provide release team with the clearest and transparent information about the feature.

@idvoretskyi - IIUC the exception process is a SIG-PM thing. I haven't heard complaints from the release team about developing features that are not enabled and don't impact current behavior (plus it's highly unlikely it will be finished in the 1.8 timeframe). I'm happy to discuss it as soon as any doubts appear.

Please correct me if I'm wrong - the goal is to track features that will ship in a current release, not the development process that may span across multiple releases. If I'm not mistaken this means that "features" (for lack of a better word) that are disabled and not ready to be enabled don't need to be tracked, right?

Also note that it's not clear what constitutes a 'feature' and where the border lies between a new feature and an 'improvement' that doesn't need a feature repo issue.

Slight OT, but related to shipping features - it was widely acknowledged that @kubernetes/sig-scalability-misc have power to block features which cause performance degradation bad enough to make Kubernetes clusters break our performance SLOs (this is of course decided together with the release team). This is decided close to the release dates, when scale tests on a given release are finished. I'm saying this to make clear that feature repo can't be treated as source of truth about features that will ship in a given release.

@gmarek any plans to continue development of this item for 1.9?

@bgrant0607 perfect. Updating the milestone.

PR is also ready for review (not started because of 1.8): kubernetes/kubernetes#49112

@gmarek can you confirm that it's alpha for 1.9?

@gmarek :wave: Please indicate in the 1.9 feature tracking board whether this feature needs documentation. If yes, please open a PR and add a link to the tracking spreadsheet. Thanks in advance!

Gah, Github close links :(

The reference links are still ugly as hell (mainly caused by repeated runs due to bot development). But I switched the bot to run under the k8s-publishing-bot user now which cannot close issues anymore. @github please fix your reference mechanism 🙏

/cc @mhagger

Has anyone shown this issue to GitHub?? The number of spurious closures is ridiculous. 😕
xref: https://github.com/kubernetes/test-infra/issues/5032
Edit: I see stts has above... 😕

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

What's the status for 1.10 and following?

@gmarek
Any plans for this in 1.11?

If so, can you please ensure the feature is up-to-date with the appropriate:

  • Description
  • Milestone
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

cc @idvoretskyi

cc @wojtek-t

This feature currently has no milestone, so we'd like to check in and see if there are any plans for this in Kubernetes 1.12.

If so, please ensure that this issue is up-to-date with ALL of the following information:

  • One-line feature description (can be used as a release note):
  • Primary contact (assignee):
  • Responsible SIGs:
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):

    • Alpha release target (x.y)

    • Beta release target (x.y)

    • Stable release target (x.y)

Set the following:

  • Description
  • Assignee(s)
  • Labels:

    • stage/{alpha,beta,stable}

    • sig/*

    • kind/feature

Once this feature is appropriately updated, please explicitly ping @justaugustus, @kacole2, @robertsandoval, @rajendar38 to note that it is ready to be included in the Features Tracking Spreadsheet for Kubernetes 1.12.


Please note that Features Freeze is tomorrow, July 31st, after which any incomplete Feature issues will require an Exception request to be accepted into the milestone.

In addition, please be aware of the following relevant deadlines:

  • Docs deadline (open placeholder PRs): 8/21
  • Test case freeze: 8/28

Please make sure all PRs for features have relevant release notes included as well.

Happy shipping!

P.S. This was sent via automation

Hi
This enhancement has been tracked before, so we'd like to check in and see if there are any plans for this to graduate stages in Kubernetes 1.13. This release is targeted to be more ‘stable’ and will have an aggressive timeline. Please only include this enhancement if there is a high level of confidence it will meet the following deadlines:

  • Docs (open placeholder PRs): 11/8
  • Code Slush: 11/9
  • Code Freeze Begins: 11/15
  • Docs Complete and Reviewed: 11/27

Please take a moment to update the milestones on your original post for future tracking and ping @kacole2 if it needs to be included in the 1.13 Enhancements Tracking Sheet

Thanks!

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@gmarek Hello - I’m the Enhancements Lead for 1.14 and I’m checking in on this issue to see what work (if any) is being planned for the 1.14 release. Enhancements freeze is Jan 29th and I want to remind you that all enhancements must have a KEP.

cc @yastij

We are trying to come up with a solution to mitigate the potential performance issue. Depending on the result of that I can imagine this hitting Alpha or not in 1.14 timeframe.

@wojtek-t checking in as enhancements freeze is next week - are there plans to try to get this in 1.14 as alpha? If so is there a KEP for this issue?

Thanks @bgrant0607 - for 1.14 we want everything to have a KEP. For enhancements freeze, if this KEP is mostly linking to the design proposal, that is fine as long as things like the test plan and graduation criteria are spelled out.

Additionally are there any open PRs that should be tracked for 1.14? Thanks

@claurence ref: kubernetes/kubernetes#65782

Checking in to see if there is a KEP for this or an open PR for a KEP? I know there is an existing design proposal but we would like it converted to a KEP for 1.14.

@gmarek @bgrant0607 since there is no KEP for this issue yet we will be removing it from the 1.14 milestone. To have it added back in please file an exception - information on the exception process can be found here: https://github.com/kubernetes/sig-release/blob/master/releases/EXCEPTIONS.md

@claurence:
@yastij is now working on this

@claurence - a KEP is out, ref: #796

@claurence - can you add this to the 1.14 milestone, as the KEP is implementable?

Hi @yastij , Enhancement Shadow 1.14 here, I'll take care of that, no problem

@yastij Thanks! Added back to 1.14

Hi @yastij I'm one of the v1.14 docs release shadows.

Does this enhancement require any new docs (or modifications)?

Just a friendly reminder we're looking for a PR against k/website (branch dev-1.14) due by Friday, March 1. It would be great if it's the start of the full documentation, but even a placeholder PR is acceptable. Let me know if you have any questions!

@yastij looking over the KEP I don't see any testing plans - can someone help PR in testing plans for this enhancement? This information is helpful for knowing readiness of this feature for the release and is specifically useful for CI Signal.

If we don't have testing plans, this enhancement will be at risk of not being included in the 1.14 release

@claurence - Sure, I'll add some of the testing plans for this on the KEP.

Hello @gmarek and @yastij , Code Freeze is March 7th and all PRs must be merged by then to your issue to make the 1.14 release. What open K/K PRs do you still have that need to merge? Thanks

I think we need to move it out of 1.14 - we won't be ready for it.

I agree with @wojtek-t we won’t be ready for it.

thanks for the update @yastij and @wojtek-t, good luck, see you on 1.15 :)

Hello @wojtek-t @yastij , I'm the Enhancement Lead for 1.15. Is this feature going to be graduating alpha/beta/stable stages in 1.15? Please let me know so it can be tracked properly and added to the spreadsheet.

Once coding begins, please list all relevant k/k PRs in this issue so they can be tracked properly.

Yes - we would like to push it in 1.15 - we will keep it up-to-date.

/assign @gmarek

/assign @wojtek-t @yastij

As we’re working on this

Hey, @gmarek @yastij @wojtek-t 👋 I'm the v1.15 docs Lead.
Does this enhancement require any new docs (or modifications)?

Just a friendly reminder we're looking for a PR against k/website (branch dev-1.15) due by Thursday, May 30th. It would be great if it's the start of the full documentation, but even a placeholder PR is acceptable. Let me know if you have any questions

Hi @gmarek @yastij @wojtek-t. Code Freeze is Thursday, May 30th 2019 @ EOD PST. All enhancements going into the release must be code-complete, including tests, and have docs PRs open.

Please list all current k/k PRs so they can be tracked going into freeze. If the PRs aren't merged by freeze, this feature will slip for the 1.15 release cycle. Only release-blocking issues and PRs will be allowed in the milestone.

If you know this will slip, please reply back and let us know. Thanks!

@kacole2 - yes, we're aware of the deadline.
The code (including tests) is almost ready. From not-yet merged PRs we need:
https://github.com/kubernetes/kubernetes/pull/78482 [already approved]
https://github.com/kubernetes/kubernetes/pull/78037 [already approved]
https://github.com/kubernetes/kubernetes/pull/78486
https://github.com/kubernetes/kubernetes/pull/78447

@yastij is working on the last 2 to address comments, we hope to have the remaining two approved pretty soon too

Hi @wojtek-t will this need any docs modification? if yes would be nice to have a placeholder PR before code freeze too.

@makoscafee - This enhancement doesn't need docs

@wojtek-t I see that https://github.com/kubernetes/kubernetes/pull/78447 wasn't merged before freeze. Is this going to prohibit going Alpha in 1.15?

@kacole2 - kubernetes/kubernetes#78447 uses the feature and isn't part of the implementation.

The feature has tests and integration tests, so this shouldn't be a blocker.

Hi @yastij , I'm the 1.16 Enhancement Lead. Is this feature going to be graduating alpha/beta/stable stages in 1.16? Please let me know so it can be added to the 1.16 Tracking Spreadsheet. If it's not graduating, I will remove it from the milestone and change the tracked label.

Once coding begins or if it already has, please list all relevant k/k PRs in this issue so they can be tracked properly.

Milestone dates are Enhancement Freeze 7/30 and Code Freeze 8/29.

Thank you.

No - we don't assume any graduation this cycle.
We would like to use the new events API (in beta) in a couple of places in the codebase to prove it before we graduate to GA.
Tentative plan is to graduate in 1.17.

@deads2k - I don't think this was supposed to be closed :)

Hello @wojtek-t / @yastij -- 1.17 Enhancement Shadow here! 🙂

I wanted to reach out to see if this enhancement will be graduating to beta/stable in 1.17?


Please let me know so that this enhancement can be added to 1.17 tracking sheet.

Thank you!

🔔Friendly Reminder

The current release schedule is

  • Monday, September 23 - Release Cycle Begins
  • Tuesday, October 15, EOD PST - Enhancements Freeze
  • Thursday, November 14, EOD PST - Code Freeze
  • Tuesday, November 19 - Docs must be completed and reviewed
  • Monday, December 9 - Kubernetes 1.17.0 Released

Yes - we're targeting Beta for 1.17.

@wojtek-t - Enhancement lead here. I hate to be a stickler, but the design proposal needs to be converted to a KEP. The current KEP has graduation criteria but lacks any other details other than linking back to the design proposal =/

@yastij - can you please take a look ^^

@wojtek-t - I'll convert it to a proper KEP

Thanks @yastij , Looking forward to the KEP!

Hi @yastij -- We're only 5 days away from the Enhancements Freeze (Tuesday, October 15, EOD PST). Another friendly reminder that to be able to graduate this in the 1.17 release, KEP must have graduation criteria defined, in an implementable state, and merged in.

@wojtek-t - I'm drafting the PR today, it'll mostly reformat the design proposal + add the graduation criteria. can you queue it for review once opened ?

Hey @yastij / @wojtek-t , unfortunately deadline for 1.17 enhancement freeze has passed and looks like the KEP has not been filed. I will be removing this enhancement from the 1.17 milestone.

Please note that you can file an enhancement exception if you need to get this in for 1.17

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

@yastij are you targeting this for 1.18? The KEP will need to be approved and in implementable state by the enhancements freeze date. The release schedule is:

Monday, January 6th - Release Cycle Begins
Tuesday, January 28th EOD PST - Enhancements Freeze
Thursday, March 5th, EOD PST - Code Freeze
Monday, March 16th - Docs must be completed and reviewed
Tuesday, March 24th - Kubernetes 1.18.0 Released

/milestone v1.19

You might be interested in https://github.com/kubernetes/kubernetes/issues/85544 as well.

Would be good if CloudEvents could be taken into consideration for this.

I've gone through the proposal and think it certainly makes sense.

My only additional input would be :

  1. Make it CloudEvents compliant ( kubernetes/kubernetes#85544 )
  2. Ensure the events have the concept of an "event type" on which can be filtered, ie. DeploymentScaledOut, which is a unique type for that event and has a payload schema tied to it so users know how to process it.
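
One way to picture the second point is sketched below. This is purely illustrative and not part of the proposal or of any Kubernetes API: the registry `PAYLOAD_SCHEMAS`, the payload field names, and both helper functions are hypothetical. The sketch only shows how a unique event type such as `DeploymentScaledOut` could be tied to a known payload shape, so consumers can filter on the type and rely on the fields being present.

```python
# Hypothetical registry: event type -> payload fields that type promises.
PAYLOAD_SCHEMAS = {
    "DeploymentScaledOut": {"deployment", "old_replicas", "new_replicas"},
}


def matches_schema(event):
    """Return True if the event's type is known and its payload has the promised fields."""
    required = PAYLOAD_SCHEMAS.get(event["type"])
    return required is not None and required.issubset(event["payload"])


def filter_by_type(events, event_type):
    """Keep only well-formed events of the requested type."""
    return [e for e in events if e["type"] == event_type and matches_schema(e)]
```

A consumer interested only in scale-out activity would call `filter_by_type(events, "DeploymentScaledOut")` and could then read `new_replicas` from each result without further checks.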

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Actually, I just realized that we have two KEPs (and issues) opened for the same thing.

https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1661-new-event-api-ga-graduation is a duplication.

@chelseychen - we should deduplicate, probably leaving this one as it has more history.

Hi @wojtek-t -- 1.19 Enhancements Lead here, I wanted to check in if you think this enhancement would graduate in 1.19?


The current release schedule is:

  • Monday, April 13: Week 1 - Release cycle begins
  • Tuesday, May 19: Week 6 - Enhancements Freeze
  • Thursday, June 25: Week 11 - Code Freeze
  • Thursday, July 9: Week 14 - Docs must be completed and reviewed
  • Tuesday, August 4: Week 17 - Kubernetes v1.19.0 released
  • Thursday, August 20: Week 19 - Release Retrospective

@palnabarun - yes, we're targeting GA/Stable in 1.19. Please add to the tracking sheet.
[I also updated the link to KEP, the KEP was updated recently.]

Thank you @wojtek-t for the update. I will update the tracking sheet accordingly. :+1:

Also, thanks for updating the KEP according to the new template. :slightly_smiling_face:

/stage stable
/milestone v1.19

Hi @wojtek-t 👋 1.19 docs shadow here! Does the enhancement work planned for 1.19 require new docs or modifications to existing docs?

Friendly reminder that if new docs or modifications are required, a placeholder PR against k/website (branch dev-1.19) is needed by Friday, June 12.

Hey @gmarek, I am with the enhancements team for the v1.19 release cycle as a shadow.

The code freeze deadline for the Enhancement is Thursday, June 25. I am checking in to see if there is any k/k PR that you have already opened for this enhancement and, if so, whether you could point me to it so that it can be added to the tracking sheet.

Have a wonderful day. 🖖

Hi @wojtek-t hope you're doing well, checking in again to see if docs are required for this or not. Could you confirm?

@annajung - given we will only promote the API itself without migrating new components to use it, nothing really changes for users or developers yet compared to where we are now.
So "no doc updates are needed."

Great, thank you for the update. I'll update the tracking sheet accordingly 👍

Hey @gmarek, @wojtek-t, Hope things are good.

The code freeze deadline for the Enhancement is Thursday, June 25. So, I am following up on my previous updates about the k/k PRs that need to be tracked. Is there a PR against k/k that needs to be tracked for this enhancement?

Have a wonderful day. 🖖

@chelseychen ^^

Hey @wojtek-t, Thanks for following up with the links. I've updated these PRs in the tracking sheet.

Have a wonderful day. 🖖

Hi, @wojtek-t

This is a follow-up to the communication that went out to k-dev today. There has been a revision to the release schedule of v1.19 as follows.

Thursday, July 9th: Week 13 - Code Freeze
Thursday, July 16th: Week 14 - Docs must be completed and reviewed
Tuesday, August 25th: Week 20 - Kubernetes v1.19.0 released
Thursday, August 27th: Week 20 - Release Retrospective

You can find the revised Schedule in the sig-release Repo

Please let me know if you have any questions. 🖖

To update before upcoming code-freeze:
https://github.com/kubernetes/kubernetes/pull/91798
https://github.com/kubernetes/kubernetes/pull/92082
and
e2e test: https://github.com/kubernetes/kubernetes/pull/92607
are already merged

what is missing are:
https://github.com/kubernetes/kubernetes/pull/92662 (some failing tests to debug/fix)
https://github.com/kubernetes/kubernetes/pull/92724 (migrating e2e test to use the v1 api instead of v1beta1)
API PR: https://github.com/kubernetes/kubernetes/pull/91645 (already approved, waiting for the previous two to be ready)

Hi @wojtek-t !

Enhancements lead here. Can you please confirm the status of this KEP? Did it graduate in 1.19 or will there be work in 1.20? If it did indeed graduate, please update the KEP to reflect a status of implemented. After that is done and merged, we can then close this issue as we no longer need to track it.

Thanks!
Kirsten

I'm actually wondering what to do with this one. Technically the API has been graduated to GA, but we were also tracking the migration as part of it, so it makes sense to keep it open I think.

@chelseychen - will be looking into continuing the migration, so let's track it for 1.20

/milestone v1.20

@wojtek-t Can you clarify? Should this then be tracked as graduating to stable in 1.20 since there is an outstanding migration?

I'm on the fence. But reading through the KEP again, it's pretty much about the API not the migration itself.
So maybe we should actually close that as implemented.

Hi @wojtek-t

Should we be tracking this or will you mark as implemented in 1.19?

Thanks!
Kirsten

(For some context, this was discussed in the 09/03 SIG Instrumentation meeting: https://youtu.be/AzmExKmbQCM)

Hi all!

Thanks @ehashman for the link. It seems to imply that there is more work to be done? But I'm confused...

Can we please clarify the state of this KEP? It was tracked as stable for 1.19:
https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/383-new-event-api-ga-graduation/kep.yaml

The existing KEP has graduation criteria to GA listed:
https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/383-new-event-api-ga-graduation/README.md#beta-to-ga-graduation

Were all of the above criteria met?

I'm asking bc I'm not sure how to track a post-GA KEP or if a KEP that is GA should remain open as it wouldn't have graduation criteria..

Or should the migration be tracked via a new KEP just to clarify things?

Thanks!
Kirsten

The KEP has been basically implemented. But there is also a migration of existing components that needs to happen (which is kind of mechanical and doesn't require any design, but has to happen).

So the recommendation, I think was (and I actually like it):

  • mark KEP as implemented
  • keep this issue opened to track the migration of existing components (i.e. add a list of components here and use it to track it in this issue).

@chelseychen - can you please create such a list (and mark KEP as implemented)?

Hi there,

I've created a PR #2014 to update the KEP.

The remaining work is to do a load test and migrate all existing components and tests to use new Events API.

The list of components is shown as follows:

  • kubelet
  • kube-controller-manager
  • leader election
  • node problem detector
  • gce ingress controller
  • event exporter

Please feel free to add items in case I miss something :)

Thanks!

We should probably explicitly split them into in-tree and out-of-tree (I'm sure there are way more out-of-tree).
Also cloud-controller-manager is missing from the list for in-tree.

Since the KEP is implemented, we are essentially using this issue as a tracking issue now.

The Enhancements team doesn't have anything to track, so I'm going to remove it from our sheet (as we do with all implemented) and I'll check in at the end of the release cycle to see how this is going. When you're finished with all of your migrations, you can close this issue.

One request: Can we update the description to add milestones and clearly indicate this went GA in 1.19?

Sound good to everyone?

Yes - that sounds good - there doesn't seem to be anything remaining for you to track. We will close this issue once done (we will try to target 1.20).
