Nixpkgs: chromium: do we still need to build it on half of available cores because of a 3-year-old Hydra issue?

Created on 14 Oct 2020  ·  8 Comments  ·  Source: NixOS/nixpkgs

Do we still need to build chromium on half of the available cores because of a 3-year-old Hydra issue?
Shouldn't this be fixed in Hydra instead (by adding "critical path" semaphores, or at least by upgrading memory or increasing swap space)?
Or maybe it is already fixed?

https://github.com/NixOS/nixpkgs/blob/e24a4b950cea82b7c9ae6aaadaf272535d7bf63c/pkgs/applications/networking/browsers/chromium/common.nix#L323-L331
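
For readers who don't want to follow the link, the general idea of building on half of the allotted cores can be sketched roughly as below. This is a minimal, hypothetical package expression, not the actual contents of common.nix: the package name, the `src`, and the exact rounding are assumptions made for illustration. It only shows how a build phase can pass ninja half of the cores that Nix provides through NIX_BUILD_CORES.

```nix
# Hypothetical sketch; NOT the real chromium/common.nix.
{ stdenv, ninja }:

stdenv.mkDerivation {
  pname = "example-big-build"; # hypothetical name
  version = "0.0.0";
  src = ./.;                   # assumed to contain a ninja build

  nativeBuildInputs = [ ninja ];

  buildPhase = ''
    runHook preBuild
    # Run only half as many parallel jobs as Nix allots (rounded up),
    # trading build speed for lower peak memory use.
    ninja -j$(( (NIX_BUILD_CORES + 1) / 2 ))
    runHook postBuild
  '';
}
```

With a scheme like this, a builder configured with 8 cores would run ninja with 4 jobs.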

@vcunat

bug

All 8 comments

Perhaps. I don't know from memory how it's set up there right now and I don't have admin access. I see it takes two hours (and a couple minutes).

On a laptop it takes 6 hours at 50% CPU load.

There was a period when a low core count (half of two, i.e. a single core) caused the aarch64 build to time out after two days (related: #78347), but it seems OK now. I think the problem was solved by using big-parallel builders with a higher core allocation per job.

This is also mildly annoying for me personally. I test the aarch64 build on chromium update PRs[1] and always compensate by specifying double my core count via --option cores 32.

[1] though I admit I've missed a few recently due to hardware troubles.

Is there a meta tracking / issue for the hydra build timeout problem somewhere?
I don't know much about the internals of Hydra, but from the comment it looks like the timeout is wall-clock time rather than CPU time, which it probably should not be (or maybe it should be a combination of both)?

> Is there a meta tracking / issue for the hydra build timeout problem somewhere?

Not sure. But this (meta.timeout) is basically a "feature" (https://nixos.org/manual/nixpkgs/unstable/#sec-standard-meta-attributes):

> A timeout (in seconds) for building the derivation. If the derivation takes longer than this time to build, it can fail due to breaking the timeout. However, all computers do not have the same computing power, hence some builders may decide to apply a multiplicative factor to this value. When filling this value in, try to keep it approximately consistent with other values already present in nixpkgs.

Not sure why we use it, though: maybe to notice regressions in build times, to avoid prolonged channel delays, or to cancel builds running on slow hosts (although requiredSystemFeatures = [ "big-parallel" ] should already solve that).
Relevant commits for Chromium: 537d14f4e2c8ef908641223bc80fe8e2bca74e90 and 7679891e2b41f13cfcb3692e9e26d5f9cfa4edb2.

But if there isn't already an issue for this, it might indeed be a good idea to open one, e.g. to discover its origins and to document when and why we use meta.timeout.
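
To make the two attributes mentioned in this comment concrete, here is a minimal sketch of a derivation that sets both. The package name and the timeout value are hypothetical (172800 seconds is simply two days, picked to match the aarch64 timeout mentioned earlier in the thread), and nothing here is copied from the Chromium expression.

```nix
# Hypothetical sketch; names and values are illustrative only.
{ stdenv }:

stdenv.mkDerivation {
  pname = "example-slow-build"; # hypothetical name
  version = "0.0.0";
  src = ./.;

  # Only schedule this build on machines that advertise the
  # "big-parallel" system feature.
  requiredSystemFeatures = [ "big-parallel" ];

  meta = {
    # Build timeout in seconds, as described in the Nixpkgs manual.
    # 172800 s = 48 hours (illustrative value).
    timeout = 172800;
  };
}
```

Whether a given builder honours meta.timeout exactly, or scales it, depends on that builder, as the manual text quoted above notes.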

Ok, so from #39476 (linked from #39570) it looks like this was indeed done to avoid slow Chromium builds blocking a channel. But since Chromium is part of the release-critical job (tested), that shouldn't work AFAIK: the channel update wouldn't succeed if the Chromium build fails (assuming a timeout also causes tested to fail).

No, the timeout is there by default and we make it higher for chromium so that its build has enough time to finish.

Generally a build can get stuck (in a loop or something). In NixOS tests it's not too rare.

> No, the timeout is there by default and we make it higher for chromium so that its build has enough time to finish.

Oh :facepalm:, that makes way more sense :D I was quite confused while glancing over #39476... Thanks!
