I got, I think, 200+ GitHub notification emails (split into 3 Gmail threads) containing Prow updates for https://github.com/kubernetes/kubernetes/pull/77341, all at Sat, May 4, 12:43 AM. I'm not sure whether it's a GitHub problem or a Prow problem. I didn't see this on any other PRs.
What sort of updates were they?
It looks like the PR was updated ~25 times with force-pushes around that time, which would have kicked off a new round of testing each time and may have resulted in job results being posted back for each of those. If we can narrow down which actions the bot was taking at the time, we might be able to see whether there was some errant behavior on top of that.
They all appeared to be the same, reporting that all the tests had failed.
e.g.
Test name | Commit | Details | Rerun command
-- | -- | -- | --
pull-kubernetes-e2e-gce-csi-serial | 094de85 | link | /test pull-kubernetes-e2e-gce-csi-serial
pull-kubernetes-typecheck | 094de85 | link | /test pull-kubernetes-typecheck
pull-kubernetes-bazel-test | 094de85 | link | /test pull-kubernetes-bazel-test
pull-kubernetes-dependencies | 094de85 | link | /test pull-kubernetes-dependencies
pull-kubernetes-kubemark-e2e-gce-big | 094de85 | link | /test pull-kubernetes-kubemark-e2e-gce-big
pull-kubernetes-verify | 094de85 | link | /test pull-kubernetes-verify
pull-kubernetes-e2e-gce | e93bba7 | link | /test pull-kubernetes-e2e-gce
pull-kubernetes-e2e-gce-device-plugin-gpu | e93bba7 | link | /test pull-kubernetes-e2e-gce-device-plugin-gpu
pull-kubernetes-node-e2e | e93bba7 | link | /test pull-kubernetes-node-e2e
pull-kubernetes-e2e-gce-100-performance | e93bba7 | link | /test pull-kubernetes-e2e-gce-100-performance
pull-kubernetes-e2e-gce-storage-slow | e93bba7 | link | /test pull-kubernetes-e2e-gce-storage-slow
pull-kubernetes-integration | e93bba7 | link | /test pull-kubernetes-integration
pull-kubernetes-bazel-build | e93bba7 | link | /test pull-kubernetes-bazel-build
I do notice now that the hash is different on some of them. If we sent ~8 updates per force-push, then 25 pushes could easily turn into 200+ emails. Maybe this is WAI (working as intended)?
It would be nice if I had a way to automatically filter these notification messages, e.g. if some unique string appeared that I could write a filter for.
Yeah, I think this is all of the jobs failing as they try to clone commits that no longer exist because they were force-pushed over; the flood of emails is a symptom of Prow trying to be very clear about reporting test results.
I think it's yet another question about notification sending.
/cc @cjwagner @fejta @Katharine @BenTheElder
> It would be nice if I had a way to automatically filter these notification messages, e.g. if some unique string appeared that I could write a filter for.
This is a brilliant idea.
If we put specific fixed strings within different classes of prowbot comments, then users could filter them out client-side.
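A minimal sketch of how that could work (all names and marker values here are hypothetical, nothing Prow actually emits today): each class of bot comment appends a fixed, documented marker string, and users filter on the marker instead of on wording that may change over time.

```go
package main

import "fmt"

// Hypothetical marker strings, one per class of prow bot comment. These
// values are illustrative only; the real markers would need to be chosen
// and documented in Prow itself.
const (
	markerTestFailure = "[prow:test-failure]"
	markerLifecycle   = "[prow:lifecycle]"
)

// withMarker appends a fixed marker to a comment body, here inside an HTML
// comment so it stays invisible in the rendered view. Whether an HTML
// comment survives into the notification email (and is therefore
// filterable) would need to be verified; a visible footer line is the
// simple alternative.
func withMarker(body, marker string) string {
	return fmt.Sprintf("%s\n\n<!-- %s -->", body, marker)
}

func main() {
	comment := withMarker("Prow test report body goes here.", markerTestFailure)
	fmt.Println(comment)
	// A mail filter could then match on "[prow:test-failure]" rather than
	// on the human-readable sentence, which is free to change.
}
```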
FWIW this is one of my Gmail filters now:
Matches: from:(Kubernetes Prow Robot) The following test failed, say /retest to rerun them all:
Do this: Mark as read, Never mark it as important
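For anyone who wants the same filter, roughly equivalent criteria can be pasted into the Gmail search box and then saved as a filter (this relies on Gmail's `from:` operator matching the sender's display name):

```
from:(Kubernetes Prow Robot) "The following test failed, say /retest to rerun them all:"
```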
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Closing in favour of kubernetes/community#3621