Test-infra: Prow "Job History" Spammed with 0s pending runs

Created on 27 Oct 2020  ·  4 Comments  ·  Source: kubernetes/test-infra

What happened:
Deck Job History is flooded with jobs that look like

0001-01-01 00:00:00 +0000 UTC 0s

for pull-kubernetes-e2e-gce

What you expected to happen:
Only true runs show in job history.

How to reproduce it (as minimally and precisely as possible):
View Job History link above.

Please provide links to example occurrences, if any:
https://prow.k8s.io/job-history/gs/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-e2e-gce

Anything else we need to know?:

/cc @chaodaiG

area/prow area/prow/deck kind/bug

All 4 comments

/assign

Quick update
It seems to me that some logs are not stored properly. When Deck tries to retrieve the data for these logs it gets an error (which is caught but not handled), and the builds that raise this error are the ones shown as 0s and pending in Job History. Notice the "failed to resolve sym link" part:
{"component":"unset","error":"failed to resolve sym link: failed to read pr-logs/directory/pull-kubernetes-e2e-gce/1323213047505883136.txt: creating reader for object pr-logs/directory/pull-kubernetes-e2e-gce/1323213047505883136.txt: storage: object doesn't exist","file":"prow/cmd/deck/job_history.go:479","func":"main.getJobHistory.func1","level":"error","msg":"Other error","severity":"error","time":"2020-11-02T15:09:18+01:00"}

Trying to access the object directly confirms that it doesn't exist: https://storage.googleapis.com/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-e2e-gce/1323213047505883136.txt

For other objects, a sym link to a build.text is shown and thus no error is raised: https://storage.googleapis.com/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-e2e-gce/1322998475507372032.txt

Should we not be pushing empty build data? I presume that is what is rendering the skips

Actually that empty data is meant to be there so the template can render something if there's a failure.
The template variable tmpl.Builds is created using the length of shownIDs, which is computed before any build data is fetched, and the data is then appended from a channel. Therefore, if we do not push empty values as @MushuEE suggests, the template will try to render nil values and crash (I've tested this).
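The rows reported in the issue are consistent with what Go prints for zero-valued time.Time and time.Duration fields. The following is a minimal sketch of that effect, not Deck's code: the `build` type, its field names, and the `describe` helper are hypothetical stand-ins used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// build is a hypothetical, simplified record; Deck's real struct differs.
type build struct {
	ID       string
	Started  time.Time
	Duration time.Duration
}

// describe formats the start time and duration with Go's default formatting.
func describe(b build) string {
	return fmt.Sprintf("%v %v", b.Started, b.Duration)
}

func main() {
	// A placeholder pushed when loading a build fails has only zero values.
	var placeholder build
	fmt.Println(describe(placeholder))
}
```

Any record that never receives real data would render this way, which is why the failed builds all look identical in the history page.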

The problem is that shownIDs is a list cropped from the IDs of all job runs, and this slice doesn't take into account that retrieving build data might fail, which is what's happening.

Filtering out empty build data in the Go template solves the problem of showing empty builds. However, because the list to be shown is chosen before checking whether the data is valid, we may get history pages with only 1 or 2 jobs, or even none, which is not useful at all.

I'd say that the approach here should be to iterate over the complete list of IDs until the tmpl.Builds list is filled only with non-empty data.
