Qtox: TravisCI times out on OSX compilation

Created on 24 Jan 2019 · 8 Comments · Source: qTox/qTox

When TravisCI brew installs qt5, qt5 needs to be compiled if it is not cached. TravisCI kills our job because there has been no terminal output for 10 minutes, since brew by default does not show compilation progress, and qt5 takes over 10 minutes to build.

https://github.com/qTox/qTox/pull/5490/files used brew's --verbose option to show compilation progress, which stops TravisCI from killing us for producing no output, but now we get killed for outputting _too_ much, with non-stop compilation spam for the duration of the Qt build.

The hackiest solution is probably to have no build output, then output . once every 10s in parallel, to keep us from being killed by TravisCI. It would be nice if there's some way to get some relevant progress without too much spam, or to let TravisCI know that we're going to be outputting nothing for a while and not to worry about it.
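A minimal sketch of that "hacky" keep-alive idea, assuming a POSIX shell. The helper name with_heartbeat and the log file name are hypothetical, not something from the qTox scripts: the command's output goes to a log file, a background loop prints a dot at a fixed interval, and the tail of the log is shown only if the command fails.

```shell
#!/bin/sh
# Hypothetical helper: run a command quietly while printing "." periodically,
# so the CI job is not killed for being silent.
# Usage: with_heartbeat <interval-seconds> <logfile> <command> [args...]
with_heartbeat() {
    interval=$1
    logfile=$2
    shift 2

    # Background loop that prints one dot every $interval seconds.
    ( while :; do sleep "$interval"; printf '.'; done ) &
    heartbeat_pid=$!

    # Run the real command with all output captured in the log file,
    # remembering its exit status.
    status=0
    "$@" >"$logfile" 2>&1 || status=$?

    # Stop the heartbeat and end the line of dots.
    kill "$heartbeat_pid" 2>/dev/null || true
    wait "$heartbeat_pid" 2>/dev/null || true
    echo

    # On failure, show the end of the log so the error is still visible.
    if [ "$status" -ne 0 ]; then
        tail -n 100 "$logfile"
    fi
    return "$status"
}
```

It could be used as with_heartbeat 10 brew-qt5.log brew install qt5. For the "let TravisCI know" option, Travis also documents a travis_wait helper (e.g. travis_wait 30 <command>) that extends the no-output timeout, though neither approach helps with the overall job time limit.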

P-high

All 8 comments

Not sure, but you can try piping the --verbose output to the pv utility, which counts the size of the output text. If you can estimate the total output size, you can even use pv -s <size of output> to show progress as a percentage. If travis still kills it, you can increase the print interval using -i <interval>.
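A rough sketch of the pv suggestion, assuming pv is installed on the build machine. The function name run_through_pv and the 60-second interval are illustrative choices, and the expected size would have to be measured from an earlier log.

```shell
#!/bin/sh
# Hypothetical helper: discard a command's verbose output but let pv report
# how many bytes went by, as a percentage of an estimated total.
# Usage: run_through_pv <expected-output-bytes> <command> [args...]
run_through_pv() {
    expected_bytes=$1
    shift
    # -f forces progress output even when stderr is not a terminal (as in CI),
    # -i 60 prints an update once a minute, -s sets the size used for percent.
    # pv writes its progress to stderr; the command's own text is discarded.
    # Note: the pipeline's exit status is pv's, not the command's.
    "$@" 2>&1 | pv -f -i 60 -s "$expected_bytes" >/dev/null
}
```

For example: run_through_pv 50000000 brew install --verbose qt5, where 50000000 is a made-up estimate. Because the exit status is lost in the pipe, a real script would need to check the install result separately.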

For the record, the Windows build also just failed because the log was too long: https://travis-ci.org/qTox/qTox/builds/486111882 : https://api.travis-ci.org/v3/job/486111892/log.txt

That sounds like a good idea, Diadlo. We should probably also check whether we're running in travis when choosing between suppressed and verbose logging, so that users' local builds aren't affected.
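One way the travis check could look, as a sketch only: Travis sets the TRAVIS environment variable to "true" in its build environment, so the verbose flag can be limited to CI. The function name brew_output_flag is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the extra brew flag to use on Travis,
# and nothing at all for local builds.
brew_output_flag() {
    if [ "${TRAVIS:-}" = "true" ]; then
        printf '%s\n' '--verbose'
    fi
}
```

It would be used unquoted, e.g. brew install $(brew_output_flag) qt5, so that local builds get plain brew install qt5.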

This doesn't happen anymore.

This is being hit again, e.g. on https://travis-ci.org/qTox/qTox/jobs/593414591. This time we're being killed for _too much_ output, instead of for no output.

I created https://github.com/qTox/qTox/pull/5861 to provide some output, but not too much, but the travis run was still killed when it exceeded the max travis run time of 50 minutes: https://travis-ci.org/qTox/qTox/jobs/593869315

I guess this means that OSX deps need to be split into their own job, or maybe just qt5 itself as a job. There are homebrew users that have qt5 take them over an hour to install: https://stackoverflow.com/questions/17415077/installing-qt5-taking-a-long-time-with-brew, so if even qt5 by itself still is over 50 minutes I'm not sure how we could proceed - I guess compiling individual libraries from qt5. This was working previously on travis though, so probably qt5 by itself in a separate job would be enough of a time savings.

@sudden6 are you familiar with travis? If this is something you could easily do, do you mind taking that over? :grimacing:

sudden6's busy so I dug into this some more. Doing a brew update and brew install ffmpeg alone exceeds the 50-minute timeout, even with our caching of homebrew packages at $HOME/Library/Caches/Homebrew. https://stackoverflow.com/a/53331571 offers some improved caching options, and $HOME/Library/Caches/Homebrew seems to only include our downloaded packages, not our locally compiled packages, which take the bulk of our dependency time (e.g. compiling cmake alone takes 13 minutes).

I guess we can keep going down the caching route - but even with all our compiled packages cached as well, we need to have enough serial OSX jobs in case we get cache misses (e.g. if dependency versions change), which means at least 2 stages just for ffmpeg, before even starting on other dependencies that could be uncached. This seems kind of crazy, but the alternative of getting good caching and keeping OSX at a single job means it's just a matter of time until we get a cache miss on a dependency and our CI sporadically fails in the future.

Maybe using brew just isn't the right option, or at least using whatever the latest available version is isn't the right option. If we had known versions of dependencies, we could compile each dep exactly once, and provide smarter caching of deps (maybe not through travis?) in a way that wouldn't sporadically fail on version changes (we could provide the binary package of the new version as part of the version change). Having known dep versions would also help with our eventual goal of reproducible builds. Or, maybe as well as known dep versions, we could provide a docker-style image with all the deps pre-installed, which wouldn't have to be generated inside the 50-minute window provided by travis.

For the purpose of just unblocking our CI, I can't even get that many gains with two serial OSX jobs, because a lot of the work is re-done in the second job due to our caching not including the locally compiled packages.

@nurupo I think you're more familiar with travis than either sudden6 or I.. do you have any insight on this?

The trick is to split dep building into multiple jobs, each running under 50 minutes. That's how I have set up Windows building on Travis.

However, if you say that ffmpeg can't compile in 50 minutes on OSX, then I'm not sure what to suggest. You can't just stop the compilation half-way through and continue it in a new job. Are OSX machines really so weak that they can't compile ffmpeg in 50 minutes? We build ffmpeg and maybe 5 or 7 other deps all in a single job in Windows Stage 2. It's surprising that OSX is so much slower.

Maybe it takes so long because brew itself is slow to download things from, and also because the ffmpeg formula it has builds ffmpeg with support for everything, when we need just a few specific build flags enabled? You could try avoiding brew and building it from source with our own flags instead and see if that's fast enough. We will be missing out on any OSX-specific patches brew applies though, but that's probably fine. Depending on how long it takes to build, you can see if it makes sense to split it into multiple jobs.

As a side note, GitHub now has its own CI called GitHub Actions that has a 6-hour limit instead of 50 minutes. It's still in beta though and I don't suggest using it, I'm just mentioning that it's a thing now. I'm also not familiar with it and haven't used it for anything, so I can't say anything about it except what I remember from reading the GitHub help pages on it. Also note that the ci-release-publisher script we use for publishing nightly builds is written specifically for Travis-CI, so it would need to be re-written for a new CI platform if you want to switch CIs.

Wew, that was a lot of wasted time. Brew stopped supporting macos 10.12 Sierra a couple days ago: https://github.com/Homebrew/brew/pull/6500/files, bumping their minimum officially supported version to 10.13 High Sierra. This caused fewer packages to be available in binary form than before, and updating to a newer macos version for travis fixes this issue. We can now fail because of network problems instead of a timeout, yay!

This is very similar to a problem it looks like we hit around a year ago: https://github.com/qTox/qTox/pull/5405

I should have read the build log more carefully =/

Warning: You are using macOS 10.12.
We (and Apple) do not provide support for this old version.
You will encounter build failures with some formulae.
Please create pull requests instead of asking for help on Homebrew's GitHub,
Discourse, Twitter or IRC. You are responsible for resolving any issues you
experience while you are running this old version.

Upgrading to travis' xcode 9.3 image instead of 9.2 changes the OS from 10.12 to 10.13, giving us binary packages (bottles) for ffmpeg and all of its dependencies, making our mac build speedy once more. No overhaul to our mac CI system needed. Sorry for bothering both of you :S

Btw, if brew update fails for you due to a network error (it often does for me in my Travis-CI builds)

until brew update; do
  sleep 30
done

should do the trick.
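A bounded variant of the same retry loop, sketched under the assumption that giving up after a few attempts is preferable to looping until the 50-minute job limit is hit. The retry helper and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: re-run a command until it succeeds, but give up
# after a fixed number of attempts instead of retrying forever.
# Usage: retry <max-attempts> <delay-seconds> <command> [args...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

With this, retry 5 30 brew update would behave like the loop above for the first five attempts and then fail the build with a clear message.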
