TravisCI recently updated their default image for macOS, which causes the build to hang and time out.
The timeout originates in gps and is most likely related to handling external VCS tools. My initial tests suggest that either gps/source_manager_test.go or gps/vcs_source_test.go is responsible when a VCS binary cannot be called. It could be that we're not using requireBins() everywhere we need it.
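For illustration, a binary check of this kind can be sketched as below. This is not the project's requireBins(); missingBins is a hypothetical helper that only reports which tools are absent from PATH. In a real test, the result would typically feed a t.Skipf call.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingBins is a hypothetical helper (not dep's requireBins): it returns
// the names that cannot be found on PATH.
func missingBins(names ...string) []string {
	var missing []string
	for _, name := range names {
		if _, err := exec.LookPath(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// In a test this would be: if m := missingBins(...); len(m) > 0 { t.Skipf(...) }
	if m := missingBins("git", "hg", "bzr"); len(m) > 0 {
		fmt.Println("would skip VCS tests; missing:", m)
		return
	}
	fmt.Println("all VCS binaries found")
}
```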
I was able to recreate a similar timeout by removing hg from my Mac running 10.13.1 (although this can't be the direct cause in Travis, as the image has both hg and bzr).
So, after digging into this today, I have some insight (but not much).
Here are some of my attempts to understand this problem with Travis:
Added a call to requireBins() in gps/source_manager_test.go to try to fix my local timeout when hg is missing.
Still locks.
Decided to turn off parallelism locally and run the test with and without the new calls to requireBins().
It worked. Figured it was parallelism.
Disabled parallelism in Travis for all tests.
Problem still occurs.
Discovered the test causing the problem in Travis (with and without parallelism):
github.com/golang/dep/gps/manager_test.go:939
Disabled the test TestUnreachableSource
Passes for all Xcodes
I ran the test TestUnreachableSource by itself to see if it was the cause:
Locks for Xcode 8.3 and 9.1
So, in Travis, the issue is TestUnreachableSource locking. But locally, I have a completely separate (but similar) locking issue when hg isn't installed. Instead of TestUnreachableSource, the problem comes from TestSourceManager_InferConstraint even with my changes to add explicit bin checking.
I've uploaded the stack traces for both the local and Travis panic here: https://gist.github.com/arbourd/e0d585d997c2a42358cb5a61f73feb14
Both panics have a similar call stack:
gps/source_manager.go:501
gps/source.go:205
gps/source.go:539
gps/source.go:571
gps/maybe_source.go:45
gps/maybe_source.go:95
gps/source_manager.go:657
gps/maybe_source.go:97
gps/vcs_source.go:266
gps/cmd_unix.go:80
So, looks like the Cmd waits forever in gps/cmd_unix.go:80. Hmm...
Okay, I think I understand why the commands are hanging.
My local issue appears to be attempting to call:
git ls-remote ssh://[email protected]/golang-dep/dep-test in TestSourceManager_InferConstraint/hg-semver. This would hang forever because of:
$ git ls-remote ssh://[email protected]/golang-dep/dep-test
The authenticity of host 'bitbucket.org (104.192.143.3)' can't be established.
RSA key fingerprint is SHA256:zzXQOXSRBEiUtuE8AikJYKwbHaxvSc0ojez9YXaGp1A.
Are you sure you want to continue connecting (yes/no)?
On Travis it's similar: git ls-remote ssh://[email protected]/golang/notexist is being called and, I assume, hanging on SSH protocol stuff.
Looks like we missed a spot on our latest passthrough to turn off interactive prompts: https://github.com/golang/dep/pull/1357.
p.s. Nice sleuthing! 👍
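One way to keep git and ssh from waiting on interactive input can be sketched as follows. This is an assumption-laden example and not necessarily what the linked PR does: nonInteractiveGit is a hypothetical helper, GIT_TERMINAL_PROMPT=0 asks git not to prompt on the terminal, and GIT_SSH_COMMAND with BatchMode=yes asks ssh to fail rather than ask for input.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// nonInteractiveGit is a hypothetical helper: it builds a git command whose
// environment discourages git and ssh from prompting for input.
func nonInteractiveGit(args ...string) *exec.Cmd {
	cmd := exec.Command("git", args...)
	cmd.Env = append(os.Environ(),
		"GIT_TERMINAL_PROMPT=0",                // git: do not prompt on the terminal
		"GIT_SSH_COMMAND=ssh -o BatchMode=yes", // ssh: fail instead of asking
	)
	cmd.Stdin = nil // the child reads from the null device
	return cmd
}

func main() {
	cmd := nonInteractiveGit("ls-remote", "https://github.com/golang/dep")
	// Print only the two variables added by the helper.
	for _, kv := range cmd.Env[len(cmd.Env)-2:] {
		fmt.Println(kv)
	}
}
```

With these settings, an unknown host key should produce an error instead of a prompt, so the command fails fast rather than hanging.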
The ssh authenticity warning cannot be suppressed, but [email protected] should be added to known_hosts in Travis already. There might be an issue with their newer Mac VMs where they accidentally removed it.