Test-infra: Add a rerun button to Spyglass.

Created on 7 Sep 2019 · 23 comments · Source: kubernetes/test-infra

What would you like to be added:

We can now directly rerun ProwJobs from Prow's front page using GitHub OAuth. This feature is accessed through the rerun dialog that is exposed by clicking one of the circular arrows on the left side of the page. See for yourself at https://prow.k8s.io

It would be super useful to expose this feature from the Spyglass page, since that is where job results are viewed and it is almost certainly the most common way that devs use Deck.

Why is this needed:

Currently there isn't an easy way to navigate from a Spyglass page to the same job run on Deck's front page. (This is a problem of its own, but it really only impacts the usability of the rerun button at this point.) Devs need to use the filters on the front page to find matching jobs, and even then there may be multiple runs of the job on the same commit that are not distinguishable from the UI.

@Katharine I think you are probably most familiar with this code. Do you think there are any 'gotchas' here regarding the oauth redirection?

cc @stevekuznetsov @fejta @alvaroaleman
/help

area/prow help wanted kind/feature

Most helpful comment

I've been advocating for an upload of the full ProwJob YAML (and the Pod to boot) for a long time. I understand why there is this {started,finished}.json compatibility layer for jobs not using pod-utils and other systems like testgrid, but I think Prow inside of its own bubble should be uploading that data to better serve users in deck.

All 23 comments

The other gotcha here is that a "_re_run" button requires us to know the ProwJobSpec as resolved at run-time the first time around. We would need to upload those specs to GCS to get a _re_run button. A _run_ button, though, should be fine.

The other gotcha here is that a "_re_run" button requires us to know the ProwJobSpec as resolved at run-time the first time around

Isn't there the clonerefs JSON that can tell us? We could, by the way, just start uploading the serialized ProwJob to avoid having different data structures on GCS that may or may not contain certain information.
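
A rough sketch of what uploading the serialized ProwJob could look like, in Go. The file name prowjob.json and the idea of writing it next to the other job artifacts are illustrative assumptions here, not an existing contract.

```go
// Minimal sketch: serialize the resolved ProwJob (spec and status) so it can
// be uploaded alongside the other job artifacts and read back later.
package serialize

import (
	"encoding/json"
	"fmt"
	"os"

	prowapi "k8s.io/test-infra/prow/apis/prowjobs/v1"
)

// writeProwJob marshals the full ProwJob to JSON so a later rerun can see
// exactly what was resolved the first time around. The destination path
// (e.g. ".../prowjob.json") is an assumption for illustration.
func writeProwJob(pj *prowapi.ProwJob, path string) error {
	raw, err := json.MarshalIndent(pj, "", "  ")
	if err != nil {
		return fmt.Errorf("marshal ProwJob: %w", err)
	}
	return os.WriteFile(path, raw, 0o644)
}
```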

I've been advocating for an upload of the full ProwJob YAML (and the Pod to boot) for a long time. I understand why there is this {started,finished}.json compatibility layer for jobs not using pod-utils and other systems like testgrid, but I think Prow inside of its own bubble should be uploading that data to better serve users in deck.

The other gotcha here is that a "_re_run" button requires us to know the ProwJobSpec as resolved at run-time the first time around. We would need to upload those specs to GCS to get a _re_run button. A _run_ button, though, should be fine.

I agree that reading the ProwJob spec from GCS would be nice, but we don't need that initially. We can do the same thing we are doing on the front page if the ProwJob resource is still available, and just hide the link if it isn't, like we do for the ProwJob YAML button on Spyglass.
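
A loose sketch of that front-page-style approach, assuming pjutil.NewProwJob keeps its (spec, extraLabels, extraAnnotations) shape; getProwJob and createProwJob are hypothetical stand-ins for whatever ProwJob client deck already uses.

```go
// Illustrative only: if the original ProwJob resource still exists in the
// cluster, stamp out a fresh run from its spec; if it has been GCed, the
// caller hides the rerun link instead.
package rerun

import (
	"context"
	"fmt"

	prowapi "k8s.io/test-infra/prow/apis/prowjobs/v1"
	"k8s.io/test-infra/prow/pjutil"
)

// prowJobClient is a hypothetical abstraction over deck's existing client.
type prowJobClient interface {
	getProwJob(ctx context.Context, name string) (*prowapi.ProwJob, error)
	createProwJob(ctx context.Context, pj *prowapi.ProwJob) error
}

// rerunFromCluster mirrors what the front-page rerun dialog does today.
func rerunFromCluster(ctx context.Context, c prowJobClient, name string) error {
	original, err := c.getProwJob(ctx, name)
	if err != nil {
		// No in-cluster resource: Spyglass would hide the button, like it
		// already does for the ProwJob YAML link.
		return fmt.Errorf("original ProwJob %q not available: %w", name, err)
	}
	fresh := pjutil.NewProwJob(original.Spec, original.Labels, original.Annotations)
	return c.createProwJob(ctx, &fresh)
}
```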

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@cjwagner do you still need help on this issue?

Sure, that would be great. We also now upload the ProwJob JSON so we can use that to rerun even after the ProwJob CRD has been GCed.
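
For illustration, a sketch of reading that uploaded ProwJob back out of GCS; the object name prowjob.json and the bucket/path layout are assumptions about the pod-utils upload rather than a checked contract.

```go
// Rough sketch: fetch and decode the serialized ProwJob from GCS after the
// in-cluster resource has been garbage collected.
package rerun

import (
	"context"
	"encoding/json"
	"fmt"
	"io"

	"cloud.google.com/go/storage"
	prowapi "k8s.io/test-infra/prow/apis/prowjobs/v1"
)

func prowJobFromGCS(ctx context.Context, bucket, jobPath string) (*prowapi.ProwJob, error) {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return nil, err
	}
	defer client.Close()

	// Assumed layout: <jobPath>/prowjob.json next to started/finished.json.
	r, err := client.Bucket(bucket).Object(jobPath + "/prowjob.json").NewReader(ctx)
	if err != nil {
		return nil, fmt.Errorf("open prowjob.json: %w", err)
	}
	defer r.Close()

	raw, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	pj := &prowapi.ProwJob{}
	if err := json.Unmarshal(raw, pj); err != nil {
		return nil, fmt.Errorf("unmarshal ProwJob: %w", err)
	}
	return pj, nil
}
```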

We should not use the prowjobs stored in GCS for this, for security reasons.

In particular, I would have the following two concerns:

  • Spyglass does not restrict itself to any given set of buckets or otherwise in GCS. You could trivially point it at a different location and ask it to rerun that job, effectively giving you the ability to run any arbitrary code however you like.
  • Jobs have the freedom to upload anything they like to anywhere in the storage bucket. This includes the ability to upload their own prowjobs, which again grants them the ability to run arbitrary prowjobs (and which would therefore effectively constitute privilege escalation).

Both of these issues are resolvable, but not trivially. We could also plausibly restrict which clusters a rerun sourced from GCS is permitted to target, which could help some (at the cost of some convenience).
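
One possible shape for such a restriction, sketched very loosely: only accept GCS-sourced reruns whose bucket and target cluster appear in an operator-configured allowlist. The structure below is invented for illustration and is not an existing Prow config field.

```go
// Hypothetical allowlist check for GCS-sourced reruns.
package rerun

import prowapi "k8s.io/test-infra/prow/apis/prowjobs/v1"

type rerunAllowlist struct {
	buckets  map[string]bool // buckets deck is willing to trust as a spec source
	clusters map[string]bool // build clusters a GCS-sourced rerun may target
}

// allowed rejects reruns that reference an unknown bucket or would schedule
// onto a cluster the operator has not explicitly permitted.
func (a rerunAllowlist) allowed(bucket string, pj *prowapi.ProwJob) bool {
	return a.buckets[bucket] && a.clusters[pj.Spec.Cluster]
}
```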

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/reopen

@matthyx: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/remove-lifecycle rotten

/remove-help
/assign @clarketm
Please /unassign if I'm incorrect that you're planning on tackling this.

Sure, I can work on this. How do we want to handle the security concerns @Katharine mentioned in https://github.com/kubernetes/test-infra/issues/14226#issuecomment-578938142? Would granular ACLs for each repo be enough? For example, in Istio only _trusted_ maintainers are authorized to rerun jobs. Or, if the concern is tampering, we could write the serialized ProwJob YAML to a _secure_ bucket/directory for the purpose of rerunning.
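
To make the ACL idea concrete, a hypothetical per-repo check along those lines; the config shape is invented here and is not Prow's actual rerun authorization configuration.

```go
// Hypothetical per-repo rerun authorization: only allow a rerun when the
// OAuth'd GitHub login is listed as a trusted maintainer for that repo.
package rerun

type repoRerunAuth struct {
	// authorizedUsers maps "org/repo" to the GitHub logins allowed to rerun
	// jobs for that repo.
	authorizedUsers map[string][]string
}

func (a repoRerunAuth) canRerun(orgRepo, login string) bool {
	for _, user := range a.authorizedUsers[orgRepo] {
		if user == login {
			return true
		}
	}
	return false
}
```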

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale
What's up with that one?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

I don't think there are any blockers for this. Essentially, the Spyglass page should show a re-run button just like the Deck front page does if we still have the ProwJob resource in the cluster. The security concerns were about using the ProwJob from GCS, where I agree this might be problematic.

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

I am not familiar enough with web technologies to make that change...
/remove-lifecycle rotten
/help

@matthyx:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

I am not familiar enough with web technologies to make that change...
/remove-lifecycle rotten
/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
