Test-infra: Tide fails to merge batch

Created on 25 Oct 2017  路  5Comments  路  Source: kubernetes/test-infra

Tide failed to merge a batch because the base branch was modified.
Here are the logs for the sync loop of interest and the loops preceeding and following that loop:

{"controller":"tide","level":"info","msg":"Building tide pool.","time":"2017-10-25T20:17:34Z"}
{"controller":"tide","level":"info","msg":"Search for query \"type:pr state:open repo:kubernetes/test-infra label:lgtm label:approved -label:needs-ok-to-test -label:do-not-merge/work-in-progress -label:do-not-merge/hold label:\"cncf-cla: yes\"\" cost 1 point(s). 4944 remaining.","time":"2017-10-25T20:17:35Z"}
{"controller":"tide","level":"info","msg":"kubernetes/test-infra master: 5 PRs, 12 PJs.","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Passing PRs: [5168]","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Pending PRs: []","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Missing PRs: [5169 5147 5010 3883]","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Passing batch: []","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Pending batch: true","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Action: WAIT, Targets: []","time":"2017-10-25T20:17:41Z"}
{"level":"info","msg":"Sync time: 6.118784868s","time":"2017-10-25T20:17:41Z"}
{"controller":"tide","level":"info","msg":"Building tide pool.","time":"2017-10-25T20:18:34Z"}
{"controller":"tide","level":"info","msg":"Search for query \"type:pr state:open repo:kubernetes/test-infra label:lgtm label:approved -label:needs-ok-to-test -label:do-not-merge/work-in-progress -label:do-not-merge/hold label:\"cncf-cla: yes\"\" cost 1 point(s). 4943 remaining.","time":"2017-10-25T20:18:35Z"}
{"controller":"tide","level":"info","msg":"kubernetes/test-infra master: 5 PRs, 12 PJs.","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","level":"info","msg":"Passing PRs: [5168]","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","level":"info","msg":"Pending PRs: []","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","level":"info","msg":"Missing PRs: [5169 5147 5010 3883]","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","level":"info","msg":"Passing batch: [5169 5147 5010]","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","level":"info","msg":"Pending batch: false","time":"2017-10-25T20:18:41Z"}
{"controller":"tide","error":"Base branch was modified. Review and try the merge again.","level":"warning","msg":"Merge failed: PR is unmergable. How did it pass tests?!","time":"2017-10-25T20:18:43Z"}
{"controller":"tide","error":"Base branch was modified. Review and try the merge again.","level":"warning","msg":"Merge failed: PR is unmergable. How did it pass tests?!","time":"2017-10-25T20:18:43Z"}
{"controller":"tide","level":"info","msg":"Action: MERGE_BATCH, Targets: [{5169 {jlewi} {master refs/heads/} {test-infra kubernetes/test-infra {kubernetes}} {{5a29a5f4e6e83a1a35218cbcc193ccdc82ed3086}} {[{{{SUCCESS}}}]}} {5147 {marun} {master refs/heads/} {test-infra kubernetes/test-infra {kubernetes}} {{e11ff2305cb472084ec3b20f35625a5e363ff282}} {[{{{SUCCESS}}}]}} {5010 {kargakis} {master refs/heads/} {test-infra kubernetes/test-infra {kubernetes}} {{248c288ae0f0f26fe831f5a3637a8fb5953b7058}} {[{{{SUCCESS}}}]}}]","time":"2017-10-25T20:18:43Z"}
{"level":"info","msg":"Sync time: 8.281722494s","time":"2017-10-25T20:18:43Z"}
{"controller":"tide","level":"info","msg":"Building tide pool.","time":"2017-10-25T20:19:34Z"}
{"controller":"tide","level":"info","msg":"Search for query \"type:pr state:open repo:kubernetes/test-infra label:lgtm label:approved -label:needs-ok-to-test -label:do-not-merge/work-in-progress -label:do-not-merge/hold label:\"cncf-cla: yes\"\" cost 1 point(s). 4942 remaining.","time":"2017-10-25T20:19:35Z"}
{"controller":"tide","level":"info","msg":"kubernetes/test-infra master: 4 PRs, 0 PJs.","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Passing PRs: []","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Pending PRs: []","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Missing PRs: [5168 5147 5010 3883]","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Passing batch: []","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Pending batch: false","time":"2017-10-25T20:19:40Z"}
{"controller":"tide","level":"info","msg":"Action: TRIGGER, Targets: [{5010 {kargakis} {master refs/heads/} {test-infra kubernetes/test-infra {kubernetes}} {{248c288ae0f0f26fe831f5a3637a8fb5953b7058}} {[{{{SUCCESS}}}]}}]","time":"2017-10-25T20:19:40Z"}
{"level":"info","msg":"Sync time: 5.85054185s","time":"2017-10-25T20:19:40Z"}

cc @spxtr
/assign

areprow kinbug

Most helpful comment

Mungegithub doesn't need to sleep to be slow :smile:
Before mergePullRequest returns it creates a comment, lists commits, creates a status and sometimes does more so that could be enough delay.

I also just noticed this though:
https://github.com/kubernetes/test-infra/blob/9d302ebf7e2c298f48437df3e6e8ee151f28953a/mungegithub/github/github.go#L2122-L2145
given that we don't have this problem with the submit queue I think the comment is correct. We probably just need to add this behavior to Tide. I already have an IsMergeable function written for Prow as part of the needs-rebase plugin so I can add that to Tide's merge loop.

All 5 comments

The batch was running [5169, 5147, 5010].
After merging the first PR successfully (5169), tide failed to merge the remaining two.
5010 was started as the next serial PR and passed tests without the author updating the PR which suggests that there was no conflict between 5169 and 5010.

It looks like this error has show up before: https://github.com/kubernetes/contrib/issues/140

Its strange that github complains that the base branch was modified because the API doesn't even allow the request to specify the base branch sha, only the head sha.

Based on all this I'm guessing that github is complaining because we are making the merge requests too close together and it cannot recompute mergability in time. The fix might be to distinguish "modified head" errors from "modified base" errors and retry after a short delay in the case of "modified base" errors (currently both errors have the type ModifiedHeadError).
I'm not sure how to test to see if I'm right about this since its appears to be a race condition.
WDYT? @spxtr @kargakis

This just struck again us again on a batch for #6019 and #6041 that should have merged at 12:20 PM PST today.

Whoops, didn't see this. A single retry on a short timer seems reasonable, and it also probably wouldn't hurt to add a short sleep in this loop. I think mungegithub does that, actually. @rmmh might know.

Mungegithub doesn't need to sleep to be slow :smile:
Before mergePullRequest returns it creates a comment, lists commits, creates a status and sometimes does more so that could be enough delay.

I also just noticed this though:
https://github.com/kubernetes/test-infra/blob/9d302ebf7e2c298f48437df3e6e8ee151f28953a/mungegithub/github/github.go#L2122-L2145
given that we don't have this problem with the submit queue I think the comment is correct. We probably just need to add this behavior to Tide. I already have an IsMergeable function written for Prow as part of the needs-rebase plugin so I can add that to Tide's merge loop.

Was this page helpful?
0 / 5 - 0 ratings