What would you like to be added:
A way to query and track how often /retest (and possibly other triggers) is called on k8s PRs.
Why is this needed:
As part of the effort to extend useful metrics around flake impact and to shed some light on the resource use caused by /retest, it would be nice to have some tracking.
/assign
/sig contributor-experience
/cc @nikhita
@MushuEE Did you already have an approach in mind for this (I see that you assigned yourself)? This looks like it could be a good candidate for devstats as well - https://github.com/cncf/devstats - and should be easier to handle there.
cc @LappleApple @mrbobbytables
Hi @nikhita, I've added this issue to my DevStats notes. Currently have a next-step in mind around devstats and am discussing it with a few folks.
@LappleApple @nikhita I was looking at devstats and reached out asking if we already have those metrics. I think devstats is the best place for webhook data -- keep me in the loop! I am working on flake reporting right now and would be happy to add whatever we need to start tracking retest metrics. I am not sure how to do this yet, however; I need a little bit of time to dig deeper. @cjwagner pointed me to devstats originally. I will start looking into how to push metrics there. The list of requested metrics looks long.
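As a rough starting point (purely a sketch -- the repo and PR number below are placeholders, and a real collector would need to paginate over all PRs and authenticate), counting /retest comments on a single PR via the GitHub REST API could look like:

```python
# Hypothetical sketch: count /retest comments on one PR via the GitHub REST
# API (PR conversation comments live on the issues endpoint). OWNER/REPO and
# PR_NUMBER are placeholders, not a real target.
import requests

OWNER, REPO, PR_NUMBER = "kubernetes", "kubernetes", 12345  # placeholder PR

def count_retest_comments(owner, repo, pr_number, token=None):
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # unauthenticated requests are heavily rate-limited
        headers["Authorization"] = f"token {token}"
    count, page = 0, 1
    while True:
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}/issues/{pr_number}/comments",
            headers=headers,
            params={"per_page": 100, "page": page},
        )
        resp.raise_for_status()
        comments = resp.json()
        if not comments:
            break
        # Prow trigger commands appear at the start of a comment line.
        count += sum(
            1 for c in comments
            if any(line.strip().startswith("/retest") for line in c["body"].splitlines())
        )
        page += 1
    return count

if __name__ == "__main__":
    print(count_retest_comments(OWNER, REPO, PR_NUMBER))
```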
> would be happy to add whatever we need to start tracking retest metrics. I am not sure how to do this yet, however; I need a little bit of time to dig deeper.
In the past, we've created an issue in the devstats repo asking for new dashboards/metrics and the maintainer has created them for us. @LappleApple has been representing kubernetes' feature requests and concerns to devstats these days, so I think she can probably point to next steps better. :)
Thank you @nikhita. Hi @MushuEE -- I've actually got a series of sessions on DevStats set up; the first one was yesterday (three more are scheduled). I'll send you the info (hopefully it's time-zone-compatible) and you might consider joining us. :)
@LappleApple Thank you! After speaking with @BenTheElder, it seems that raw /retest counts might be quite a noisy metric -- apparently a lot of the /retests are triggered automatically. A better metric might be whether tests change state after /retest is run on the same hash, which is a more direct indication of a flake.
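For illustration, a minimal sketch of that signal (with made-up job results and placeholder job names) might look like:

```python
# Hypothetical sketch of the "state change on the same hash" signal: a job
# that both failed and passed on the identical commit SHA changed state
# without any code change, which is a more direct flake indicator than a raw
# /retest count. The records below are made up for illustration.
from collections import defaultdict

results = [
    # (pr, sha, job, outcome) -- placeholder data
    (100, "abc123", "pull-kubernetes-e2e", "failure"),
    (100, "abc123", "pull-kubernetes-e2e", "success"),   # flipped after /retest
    (100, "abc123", "pull-kubernetes-unit", "success"),
    (101, "def456", "pull-kubernetes-e2e", "failure"),
    (101, "def456", "pull-kubernetes-e2e", "failure"),   # consistently failing, not a flake
]

def likely_flakes(records):
    outcomes = defaultdict(set)
    for pr, sha, job, outcome in records:
        outcomes[(pr, sha, job)].add(outcome)
    # Both a failure and a success on the same SHA = likely flake.
    return [key for key, seen in outcomes.items() if {"failure", "success"} <= seen]

print(likely_flakes(results))  # [(100, 'abc123', 'pull-kubernetes-e2e')]
```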
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Going to close this in favor of a future metrics effort.