Test-infra: Make cloneref expose baseSHA if not specified in config

Created on 6 Dec 2018  ยท  30Comments  ยท  Source: kubernetes/test-infra

Currently we can have baseRef: master with an empty baseSHA, for example, for periodic jobs specifies an extra_ref points to k/k master.

Cloneref could potentially expose the actual baseSHA and make sidecar aware of the actual SHA so we can use the SHA instead of ref when record revision in finished.json.

/area prow
cc @cjwagner @BenTheElder @stevekuznetsov

areprow

Most helpful comment

$ git ls-remote https://github.com/kubernetes/test-infra master
d3cef13acfc53806ee51906f00f724405ba36187 refs/heads/master

https://git-scm.com/docs/git-ls-remote.html

On Wed, Dec 5, 2018 at 7:55 PM Cole Wagner notifications@github.com wrote:

Periodics that use ExtraRefs are ref-driven periodics. We could make Prow
components populate the baseSHA for ExtraRefs like trigger does for
presubmits' Refs. Currently we determine the baseSHA via the GitHub API
so unless we switch to using the git client, this would require components
like horologium to use a GitHub token ๐Ÿ‘Ž .
Is there a git command to look up the SHA for a remote refname without
cloning the repo?

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/kubernetes/test-infra/issues/10359#issuecomment-444739970,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AA4Bq6goDJY9aY-gDNZwwwPOic6gKiYpks5u2JUcgaJpZM4ZFmts
.

All 30 comments

Nooooo!!!! We want a new type of job -- a ref-driven periodic. We should stop having two types of ref jobs. It breaks all sorts of mental models when half of the refs are declarative from spec and half are resolved at run-time -.-

We probably want one type of job instead of more types of jobs :joy:

Periodics that use ExtraRefs are ref-driven periodics. We could make Prow components populate the baseSHA for ExtraRefs like trigger does for presubmits' Refs. Currently we determine the baseSHA via the GitHub API so unless we switch to using the git client, this would require components like horologium to use a GitHub token ๐Ÿ‘Ž .
Is there a git command to look up the SHA for a remote refname without cloning the repo?

$ git ls-remote https://github.com/kubernetes/test-infra master
d3cef13acfc53806ee51906f00f724405ba36187 refs/heads/master

https://git-scm.com/docs/git-ls-remote.html

On Wed, Dec 5, 2018 at 7:55 PM Cole Wagner notifications@github.com wrote:

Periodics that use ExtraRefs are ref-driven periodics. We could make Prow
components populate the baseSHA for ExtraRefs like trigger does for
presubmits' Refs. Currently we determine the baseSHA via the GitHub API
so unless we switch to using the git client, this would require components
like horologium to use a GitHub token ๐Ÿ‘Ž .
Is there a git command to look up the SHA for a remote refname without
cloning the repo?

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/kubernetes/test-infra/issues/10359#issuecomment-444739970,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AA4Bq6goDJY9aY-gDNZwwwPOic6gKiYpks5u2JUcgaJpZM4ZFmts
.

good to know!

we probably also need to handle private repos on top of that.

/assign
will poke tomorrow

@cjwagner

Periodics that use ExtraRefs are ref-driven periodics.

Not technically true. They're driven but not at the specific commit level. The ref is still resolved at run-time tomorrow.

hummm, after a closer look, it's too late to resolve the remote refs in sidecar if we use ls-remote... wonder if there's some data can be passed along from cloneref itself instead?

We should implement a new type of job. We should not hack this backwards.

@stevekuznetsov ExtraRefs allows a specific commit to be specified. It is just a slice of Refs structs that each can provide a baseSHA: https://github.com/kubernetes/test-infra/blob/27184a2e3c70c1ba43d7a65e93f29583303c70e2/prow/apis/prowjobs/v1/types.go#L453

We just don't have logic in horologium to resolve the SHA before creating the job.
I'm hesitant to create a new whole new type of job for this because it seems like we could get the same behavior by just adding an option to have the field populated by Prow when the job is created.
It might make sense to have Prow always provide a SHA for every ExtraRef that doesn't specify a SHA (for all job types). Is there any benefit to resolving the baseSHA from the job itself?

It might make sense to have Prow always provide a SHA for every ExtraRef that doesn't specify a SHA

in pjutils? that's the only unified place I can think of..

so, just pre-check: do all of the controller images have git?

No, all controllers that create jobs now need to use the git-base image like hook and tide do: https://github.com/kubernetes/test-infra/blob/43dc577de75e9f506d4673d76c6d2002868a10b3/prow/cmd/hook/BUILD.bazel#L8

Lets get @stevekuznetsov's thoughts before we go ahead with this though.

yeah, implementation itself sounds pretty straight forward.

I meant that we don't set SHA today, not that we can't. I agree that getting horologium to resolve it would be good, but I worry about downstream logic that needs to do:

var refs prow.Refs
switch job.Type:
case prow.Presubmit:
    refs = job.Refs
case prow.Periodic:
    refs = job.ExtraRefs[0]

Seems like we could do better and have a type of job that _always_ has job.Refs set and runs periodically. We can keep periodic jobs periodic if they don't touch repos like commenter etc. Using the test-infra git client inside of horologium to resolve the SHA SGTM as long as you don't find that breaking to the API for periodics.

If it's breaking it might be safer to create a new job type

I don't know of any jobs that this would break and I would think it is ok as we are just populating a previously empty field. I don't think we have anything checking that today.
I agree that we should minimize the kind of job logic you detailed above, but making a new job type wouldn't get rid of the switch statement you shared, it would just move the check to a separate case for the new job type?

I think the ideal path forward would be to change the ProwJobs Refs field to a slice of Refs and combine it with ExtraRefs so that jobs don't need to consider both. Distinguishing the implicit refs from the extra refs probably isn't useful for jobs and it just makes it more likely that a job will handle only Refs and accidentally ignore ExtraRefs. That would be a breaking change to both the downward API and the ProwJob CRD though so until we version or CRDs this isn't even an option.

No, wouldn't we just always honor Refs if they are present?

Always having a slice would also work

Ah I see what you are saying. I still don't really like that pattern because we still have two fields that define refs. Some logic to handle both Refs and ExtraRefs will be needed as long as both fields exist so the switch statement might disappear, but we'll still need to consider both anyways.
Since it probably isn't feasible to switch the CRD Refs field to be a slice of all the refs right now, all jobs should really be starting with something like this if they need to know what refs were cloned:

var allRefs []kube.Refs
if spec.Refs != nil {
  allRefs = append(allRefs, *spec.Refs)
}
allRefs = append(allRefs, spec.ExtraRefs...)

In any case it sounds like we can go forward with having Prow controllers populate the ExtraRefs' baseSHAs if they are unspecified? I think we should probably be doing that either way and that will unblock Sen. We can watch to see if any periodics are affected in case some job is doing something strange.

cc @BenTheElder

We need to solve this as we are starting to migrate verify/integration job onto podutils - technically we don't care extra_refs for presubmits (the short term goal is to show the commit number to testgrid), we can just let horologium to solve it for now. add the key to clone-record.json and let testgrid reads that

So now we are moving towards backfill the data into started/finished.json now, @stevekuznetsov wonder if the downstream spec is mutable when we are cloning? Or otherwise I'm thinking of we can make initupload understand clone-records.json?

wonder if the downstream spec is mutable when we are cloning?

I don't think we should mutate the downstream spec from the job, even during the initial cloning step. If the downstream spec needs additional data I think it should be populated before the pod is scheduled.

I don't think we should mutate the downstream spec from the job, even during the initial cloning step. If the downstream spec needs additional data I think it should be populated before the pod is scheduled.

Unless we want to resolve master -> some sha within prow, that's unlikely. It can be resolved during cloning though.

Unless we want to resolve master -> some sha within prow, that's unlikely. It can be resolved during cloning though.

Yeah, I'm saying if we need the downstream spec to include the SHA then all Prow components that trigger jobs should resolve the SHA when creating the job. That sounds like a potentially annoying constraint to add to PJ creation though.
It is unclear to me why the downstream spec needs to be updated at all though. Having initupload determine the SHA itself or from the clone-records.json seems preferable if we don't need the downstream spec to include the SHA for some other reason.

yeah I can add some logic to make inituploader to understand fields from clone-records.json

Yeah, I'm saying if we need the downstream spec to include the SHA then all Prow components that trigger jobs should resolve the SHA when creating the job. That sounds like a potentially annoying constraint to add to PJ creation though.
It is unclear to me why the downstream spec needs to be updated at all though. Having initupload determine the SHA itself or from the clone-records.json seems preferable if we don't need the downstream spec to include the SHA for some other reason.

๐Ÿ’ฏ
which object it is in isn't important, putting it in the pj spec is probably not a good avenue though, and prow probably shouldn't be trying to resolve these.
resolving it at clone time and uploading it in clone-records.json SGTM

๐Ÿคž
Our metadata should be good to go now - just need to have testgrid/spyglass/gubernator treat them properly

Was this page helpful?
0 / 5 - 0 ratings

Related issues

sjenning picture sjenning  ยท  4Comments

zacharysarah picture zacharysarah  ยท  3Comments

BenTheElder picture BenTheElder  ยท  4Comments

spiffxp picture spiffxp  ยท  3Comments

xiangpengzhao picture xiangpengzhao  ยท  3Comments