Test-infra: Interval-based Periodic Jobs Triggered Constantly

Created on 6 Feb 2019 · 4 Comments · Source: kubernetes/test-infra

We are seeing interval-based periodic jobs triggered at a rate of around 5 jobs per second. This is confusing, since a single horologium sync typically takes ~17 seconds now that we have ~25k ProwJobs on the cluster:

{"component":"horologium","level":"info","msg":"Sync time: 34.897959336s","time":"2019-02-06T03:51:40Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.457294507s","time":"2019-02-06T03:52:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 16.851891511s","time":"2019-02-06T03:53:22Z"}
{"component":"horologium","level":"info","msg":"Sync time: 18.421960158s","time":"2019-02-06T03:54:24Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.254094248s","time":"2019-02-06T03:55:22Z"}
{"component":"horologium","level":"info","msg":"Sync time: 18.673638916s","time":"2019-02-06T03:56:24Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.401102651s","time":"2019-02-06T03:57:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 18.852934741s","time":"2019-02-06T03:58:24Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.410629213s","time":"2019-02-06T03:59:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 21.902882468s","time":"2019-02-06T04:00:27Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.885607917s","time":"2019-02-06T04:01:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.832846651s","time":"2019-02-06T04:02:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.661795372s","time":"2019-02-06T04:03:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.763673445s","time":"2019-02-06T04:04:23Z"}
{"component":"horologium","level":"info","msg":"Sync time: 17.945887019s","time":"2019-02-06T04:05:23Z"}

However, for one job, here are the creation timestamps on the ProwJobs:

2019-02-05T20:56:37Z
2019-02-05T20:56:37Z
2019-02-05T20:56:37Z
2019-02-05T20:56:37Z
2019-02-05T20:56:38Z
2019-02-05T20:56:38Z
2019-02-05T20:56:38Z
2019-02-05T20:56:38Z
2019-02-05T20:56:39Z
2019-02-05T20:56:39Z
2019-02-05T20:56:39Z
2019-02-05T20:56:39Z
2019-02-05T20:56:40Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:35Z
2019-02-05T20:57:36Z
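
To put a number on the bursts in the creation timestamps above, a small helper can bucket them by second. This is only a sketch: `countPerSecond` is a hypothetical name, not part of Prow, and it assumes the timestamps are RFC 3339 strings like the ones listed.

```go
package main

import "time"

// countPerSecond is a hypothetical helper (not part of Prow) that buckets
// RFC 3339 creation timestamps by second, so bursts of ProwJob creation
// become visible as counts greater than one.
func countPerSecond(stamps []string) (map[string]int, error) {
	counts := make(map[string]int)
	for _, s := range stamps {
		t, err := time.Parse(time.RFC3339, s)
		if err != nil {
			return nil, err
		}
		// Normalize to UTC and drop any sub-second part before counting.
		key := t.UTC().Truncate(time.Second).Format(time.RFC3339)
		counts[key]++
	}
	return counts, nil
}
```

Several creations of the same job within one second suggest that more than one ProwJob is created for it inside a single sync, rather than one per sync.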

We need better logging in the component to determine what is actually going on. One thought: if horologium is not listing the ProwJobs correctly and never sees the previous run of a job, it will always trigger a new one.

/area prow/horologium
/kind bug
/cc @fejta @cjwagner @BenTheElder @Katharine @krzyzacy

area/prow/horologium kind/bug

stevekuznetsov

All 4 comments

/cc @smarterclayton

stevekuznetsov on 6 Feb 2019

This may have been an issue with our API server and how we had the ProwJob CRD configured on it, rather than a bug in horologium.

stevekuznetsov on 6 Feb 2019

This was a bug with how we had the CRD set up and nothing with Prow

/close

stevekuznetsov on 13 Feb 2019

@stevekuznetsov: Closing this issue.

In response to this:

This was a bug with how we had the CRD set up and nothing with Prow

/close