Test-infra: Auto-close flakes/issues that haven't been seen for > 1 release cycle, reopen on re-occurrences

Created on 6 Mar 2017 · 24 Comments · Source: kubernetes/test-infra

Right now I'm simply closing down a bunch of old flakes that haven't been seen in months, or are a set of false positives triggered by some massive infra-flake.

Ideally this should be a bot's job: closing flakes that haven't been seen for x months.

This is a massive waste of community time that should be automated.

As an escape hatch, maintainers could have a flag that would allow a bot to skip over items that they don't want auto-closed.

/cc @derekwaynecarr @kubernetes/test-infra-maintainers @kubernetes/sig-testing-bugs @kubernetes/sig-contributor-experience-bugs @brendanburns

help wanted

All 24 comments

@apelisse wrote a munger to do this for old PRs: https://github.com/kubernetes/test-infra/blob/master/mungegithub/mungers/close-stale-pr.go. An interested person could adapt it to include issues.

FWIW this does not need to waste a lot of anyone's time: the sort button on the issues search includes a "least recently modified" (here is an example) and you can add whatever additional criteria you want. The top left box allows selecting the entire page in one click.

I think the more problematic part here is that our org and its SIGs need to develop the habit of doing this work on an ongoing basis, as well as a process for reviewing flakes continually instead of only before a release.

I tried doing that, but "last updated" gets bumped on any change (like the 1.6 milestone being added, for example), not just on mentions from actual flake occurrences.

Plus there have been some widespread infra-flakes, e.g. kops, that posted false positives to a number of flakes.

Good point about corrupting GH's last updated time, which is something the munger handles by ignoring comments from specific users and/or types of updates.

And yes LOTS of useless issues are created

@rmmh @fejta - where does the bot that auto-closes live?
Where exactly can we help with this...?

Why do you need a bot for that? Seems like it will be more efficient to just close them.

Suite failures: https://github.com/kubernetes/kubernetes/search?q=%22broken+test+run%22&state=open&type=Issues&utf8=%E2%9C%93
Test failures: https://github.com/kubernetes/kubernetes/search?q=author%3Ak8s-merge-robot&state=open&type=Issues&utf8=%E2%9C%93

Here's munger code that closes stale PRs: https://github.com/kubernetes/test-infra/blob/master/mungegithub/mungers/close-stale-pr.go
@cjwagner is already refactoring the munger that files issues on test failures

Because then it would auto-close without me doing anything.

So auto close any flake issue that has been unmodified for 3 months? Sounds great to me.

Is anyone working on this? And do we still agree that it's a good idea? @crimsonfaith91 is looking for a task.

No, it should be a good starter task!

The munger that auto-closes stale PRs should work as a template.

Noted! I will work on this as well.
/assign @crimsonfaith91

Actually, we could just extend the stale PR munger without too much work: https://github.com/kubernetes/test-infra/blob/master/mungegithub/mungers/close-stale-pr.go#L283

@rmmh great point! many thanks! :)

@rmmh A quick extension is to remove all checks on whether the object is a PR or an issue. I am also thinking of renaming identifiers and messages such as "CloseStalePR" to "CloseStaleObj". Should I do so? If so, the file name has to change too.

That sounds good. You'll want to skip findLastHumanPullRequestUpdate too, since that looks for review comments.

Yup! There is findLastHumanIssueUpdate, which will go over all the comments, including review comments.

ListComments doesn't get review comments; GitHub has an entirely separate API for that.

I see. I will change the code comments above the findLastHumanIssueUpdate function. In that case, why do we want to skip findLastHumanPullRequestUpdate?

Because only PRs have review comments, which is what that function looks for.

I would rename all the functions to be clearer :-)

The only thing that needs agreement is a label name, e.g. dont-close, that prevents the bot from closing based on staleness.

We can use the keep-open label.

One concern I have is that there are a lot of old issues that still seem interesting, including long-standing design issues.

We might want to start this by only closing stale issues with a kind/flake label. I'd want a wider discussion with the community before closing all the other issues.

If we post a "warning: this will be closed in 20 days" comment, I think that is sufficient.

If you do not want this auto-closed, apply the keep-open label.

I want to see a _lot_ more buy-in before commenting on and closing 3000 issues. That doesn't sound helpful.

Most issues don't have owners like PRs do. Closing a PR for staleness makes sense, since the submitter needs to rebase it frequently, but the same rules don't apply to issues.

Edit: Right, that's the description in the first comment. Closing flakes. :smile:

Disagree. If the person cares, they can reopen or apply the label. If the bot gives instructions on how to apply the label so non-maintainers can do it, then I think we're good.

Right now we have a sustainability problem.
