Openjdk-infrastructure: build-osuosl-centos74-ppc64le-2 fails to fetch from openjdk-build

Created on 18 Sep 2020 · 13 Comments · Source: AdoptOpenJDK/openjdk-infrastructure

Normally I'd re-run the job and assume the issue to be a one-off, but this has happened twice this week on this specific machine. Two points are a line, not a pattern, but I'm still requesting a machine inspection.

If nothing obvious pops up, can this machine be restarted and refreshed?

04:41:36  ERROR: Error fetching remote repo 'origin'
04:41:36  hudson.plugins.git.GitException: Failed to fetch from https://github.com/AdoptOpenJDK/openjdk-build.git
04:41:36    at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:915)
04:41:36    at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1141)
04:41:36    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1177)
04:41:36    at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
04:41:36    at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
04:41:36    at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
04:41:36    at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
04:41:36    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
04:41:36    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
04:41:36    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
04:41:36    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
04:41:36    at java.lang.Thread.run(Thread.java:748)
04:41:36  Caused by: hudson.plugins.git.GitException: Command "git clean -fdx" returned status code 143:
04:41:36  stdout: 
04:41:36  stderr: 
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2436)
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2366)
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2362)
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1922)
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1934)
04:41:36    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.clean(CliGitAPIImpl.java:1016)
04:41:36    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
04:41:36    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
04:41:36    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
04:41:36    at java.lang.reflect.Method.invoke(Method.java:498)
04:41:36    at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:931)
04:41:36    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:905)
04:41:36    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:857)
04:41:36    at hudson.remoting.UserRequest.perform(UserRequest.java:211)
04:41:36    at hudson.remoting.UserRequest.perform(UserRequest.java:54)
04:41:36    at hudson.remoting.Request$2.run(Request.java:369)
04:41:36    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
04:41:36    ... 4 more
04:41:36    Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to build-osuosl-centos74-ppc64le-2
04:41:36        at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1788)
04:41:36        at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
04:41:36        at hudson.remoting.Channel.call(Channel.java:998)
04:41:36        at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:285)
04:41:36        at com.sun.proxy.$Proxy381.clean(Unknown Source)
04:41:36        at org.jenkinsci.plugins.gitclient.RemoteGitImpl.clean(RemoteGitImpl.java:455)
04:41:36        at hudson.plugins.git.extensions.impl.CleanBeforeCheckout.decorateFetchCommand(CleanBeforeCheckout.java:45)
04:41:36        at hudson.plugins.git.extensions.GitSCMExtension.decorateFetchCommand(GitSCMExtension.java:288)
04:41:36        at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:911)
04:41:36        at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1141)
04:41:36        at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1177)
04:41:36        at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
04:41:36        at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
04:41:36        at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
04:41:36        at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
04:41:36        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
04:41:36        ... 4 more
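
A note on the status code in the trace: 143 matches the common convention of 128 plus a signal number, which would make it SIGTERM (15). That suggests the git process was killed from outside rather than failing on its own. A minimal Python sketch of that decoding follows; the function name is made up for illustration and is not part of Jenkins or git.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status, decoding the 128 + N signal convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name} (128 + {signum})"
    return f"exited normally with status {status}"


print(describe_exit_status(143))
```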

https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk/job/jdk-linux-ppc64le-openj9/121/

bug

All 13 comments

Alright, so I had a look: the build errored because git clean -fdx took longer than 10 minutes.
git clean -fdx removes the untracked files in a git repo (in this instance, only jdk-15). I've cleared those manually, but the command still seems to hang, even though git clean with any other combination of the options finishes instantly.
git clean -fdx works on the hotspot equivalent of the git repo, so it may just be a bug with the jdk-linux-ppc64le-openj9 one, so I'll remove that.
Reopen if the issue recurs :-)

Thanks @aahlenst :)

So, git clean -fdx is returning 143 again, which only says that it timed out and isn't particularly useful on its own. I ran it manually on the machine:

[root@140 jdk15-linux-ppc64le-hotspot]# time git clean -fdx
Removing pipelines/.gradle/
Removing pipelines/target/
Skipping repository workspace/build/src
Removing workspace/build/installedfreetype
Removing workspace/build/cacerts_area
Removing workspace/build/public_key.gpg
Removing workspace/build/installedalsa
Removing workspace/config
Removing workspace/libs
Removing workspace/.gradle
Removing workspace/target

real    8m41.471s
user    0m0.633s
sys     0m1.764s
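
One thing the timing output hints at: user plus sys time is only a couple of seconds against almost nine minutes of wall-clock time, so the process seems to have spent nearly all of its time waiting rather than computing. A rough sketch of that arithmetic, using the three values printed by time (the helper names are illustrative only):

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '8m41.471s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_fraction(real: str, user: str, sys_: str) -> float:
    """Fraction of wall-clock time the process actually spent on the CPU."""
    return (to_seconds(user) + to_seconds(sys_)) / to_seconds(real)


# Values copied from the `time git clean -fdx` output.
fraction = cpu_fraction("8m41.471s", "0m0.633s", "0m1.764s")
print(f"CPU time was {fraction:.2%} of wall-clock time")
```

A figure well under 1% would be consistent with the command being blocked on I/O of some kind, although it does not say which kind.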

I also tried this on the jdk11-openj9 repo on the same machine, where it took 9 min 55 s. In this run, the -1 machine took 8 min 9 s (judging from the timestamps).

On these machines, git clean appears to take much longer than on any other machine. I don't think it's necessarily down to the size of the repo (see this s390x JDK15 run, where the same command took only 8 seconds), so it could be the network speed of those machines.
So our options are to increase the timeout (which I assume would be in the build repo? ping @M-Davies) or to make the machines' network faster, somehow...
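
To put the timings in this comment next to the 10-minute limit mentioned earlier in the thread, here is a small sketch of how much headroom each run had. The labels are shorthand for the three runs described above, and the durations are rounded down to whole seconds.

```python
TIMEOUT_SECONDS = 10 * 60  # the 10-minute limit mentioned earlier in the thread

# Durations reported in this comment, in seconds.
observed = {
    "this machine, jdk15 hotspot": 8 * 60 + 41,
    "this machine, jdk11 openj9": 9 * 60 + 55,
    "-1 machine": 8 * 60 + 9,
}


def headroom(seconds: int, limit: int = TIMEOUT_SECONDS) -> int:
    """Seconds left before the limit would have been hit."""
    return limit - seconds


for label, seconds in observed.items():
    print(f"{label}: {headroom(seconds)}s to spare")
```

With margins this thin, small variations in run time would be enough to tip a run over the limit, which would fit the intermittent failures.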

For now, I've renamed all the JDK15 directories on the machine to <name>-infra1553, as removing the directories last time seemed to sidestep the issue for a while.

The timeout for the whole checkout is set to 1 hour, so changing that obviously wouldn't make a difference. According to the documentation at https://www.jenkins.io/doc/pipeline/steps/workflow-scm-step/, there appears to be an option to set a timeout value manually in GitSCM -> extensions -> CheckoutOption. I'll have a play and see if I can get a PR in before the nightlies.

So I've managed to increase the timeout for checking out, cloning and any submodules in https://github.com/AdoptOpenJDK/openjdk-build/pull/2266, but not for the clean option that is causing the error here, because the clean module does not have this support yet (see https://issues.jenkins.io/browse/JENKINS-22400):

16:45:41  The recommended git tool is: git
16:45:41  No credentials specified
16:45:42  Fetching changes from the remote Git repository
16:45:57   > git rev-parse --is-inside-work-tree # timeout=10
16:45:58   > git config remote.origin.url https://github.com/AdoptOpenJDK/openjdk-build # timeout=10
16:45:58  Fetching upstream changes from https://github.com/AdoptOpenJDK/openjdk-build
16:45:58   > git --version # timeout=10
16:45:58   > git --version # 'git version 2.28.0'
16:45:58   > git fetch --tags --force --progress -- https://github.com/AdoptOpenJDK/openjdk-build +refs/heads/*:refs/remotes/origin/* # timeout=20
16:45:47  Checking out Revision 488067fd07b741f1f68c8ab706a4a4f9e88ee12b (refs/remotes/origin/master)
16:46:03   > git rev-parse "refs/remotes/origin/master^{commit}" # timeout=10
16:46:03   > git config core.sparsecheckout # timeout=10
16:46:03   > git checkout -f 488067fd07b741f1f68c8ab706a4a4f9e88ee12b # timeout=20
16:45:48  Commit message: "Delete codeql-analysis.yml"
16:45:48  First time build. Skipping changelog.
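
To see at a glance which git commands picked up the raised timeout, the "# timeout=N" annotations in a console log like the one above can be pulled out with a short script. This is only a sketch: the regular expression is an assumption based on the line format shown here, and the function name is made up.

```python
import re

# Matches lines such as "16:45:57   > git rev-parse ... # timeout=10".
TIMEOUT_RE = re.compile(r">\s+(git .+?)\s+#\s+timeout=(\d+)\s*$")


def command_timeouts(log_lines):
    """Return (git subcommand, timeout) pairs found in Jenkins git console lines."""
    results = []
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match:
            command, timeout = match.groups()
            subcommand = command.split()[1]  # the word right after "git"
            results.append((subcommand, int(timeout)))
    return results


sample = [
    "16:45:57   > git rev-parse --is-inside-work-tree # timeout=10",
    "16:45:58   > git --version # 'git version 2.28.0'",
    "16:45:58   > git fetch --tags --force --progress -- https://github.com/AdoptOpenJDK/openjdk-build +refs/heads/*:refs/remotes/origin/* # timeout=20",
    "16:46:03   > git checkout -f 488067fd07b741f1f68c8ab706a4a4f9e88ee12b # timeout=20",
]
for subcommand, timeout in command_timeouts(sample):
    print(subcommand, timeout)
```

On the lines above, only fetch and checkout show the new value, while rev-parse keeps 10.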

I think we would have to go with speeding up the network (somehow). I'm going to try changing the downstream job configuration itself instead of the individual checkout, but that will probably leave us with the same result: no altered timeout on the clean function.

I wouldn't completely dismiss the possibility that this is disk I/O related rather than network related.
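
Since git clean only touches the local working tree, one hedged way to test the disk I/O theory without involving git or the network is to time the creation and deletion of many small files on the affected filesystem. The sketch below is a rough probe, not a benchmark: the function name and the file count are arbitrary choices, and the system temp directory is only a stand-in for the real workspace path.

```python
import os
import tempfile
import time


def time_small_file_churn(parent_dir: str, count: int = 1000) -> float:
    """Create and delete `count` small files under parent_dir; return seconds taken."""
    start = time.perf_counter()
    with tempfile.TemporaryDirectory(dir=parent_dir) as scratch:
        paths = []
        for i in range(count):
            path = os.path.join(scratch, f"probe-{i}.tmp")
            with open(path, "wb") as handle:
                handle.write(b"x" * 128)
            paths.append(path)
        for path in paths:
            os.unlink(path)
    return time.perf_counter() - start


# Point this at a directory on the same filesystem as the Jenkins workspace.
elapsed = time_small_file_churn(tempfile.gettempdir())
print(f"created and removed 1000 files in {elapsed:.3f}s")
```

Comparing the result between one of the ppc64le machines and a machine where git clean is fast would hint at whether file operations themselves are slow.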

Is there anything we can do about either the disk I/O or the network?

Occurred again with same circumstances => https://github.com/AdoptOpenJDK/openjdk-build/issues/2273

@sxa another way to resolve this would be to increase the git clean -fdx timeout. That is only possible by raising the default git operation timeout globally, by setting -Dorg.jenkinsci.plugins.gitclient.Git.timeOut= on all the Jenkins JVMs.

Considering that the build PR was merged and this doesn't appear to have occurred for a while (and another network/timeout issue, which I assume was on the same machine, has since been fixed by @sxa: https://github.com/AdoptOpenJDK/openjdk-build/issues/2286#issuecomment-738874757), I'm going to close this :-) Reopen if it recurs or if I closed it prematurely.
