This is an umbrella issue of problems that arise from using build tools that have their own internal parallelism.
In this Google Groups thread, @jmmv asked to file an issue about this:
https://groups.google.com/d/msg/bazel-discuss/_oHaU50P5Rg/imx5Y49MAwAJ
A little context: swiftc is the Swift compiler driver. It's a non-traditional compiler: rather than building one source file at a time, it builds one module of N source files at a time. swiftc spawns "swift frontend" invocations, and the number of spawned processes is very often greater than 1.
There are two related problems: Bazel doesn't know how much internal parallelism an action already uses, and an action has no way to learn how much parallelism Bazel could give it.
In the first case, it would be good if the action API could express to Bazel how much parallelism an action uses. This avoids the problem of N Bazel actions each running M sub-processes and oversubscribing the machine.
In the second case, it would be good if the action API could express a range of parallelism an action is capable of using. This would really help the performance of bottleneck actions on the critical path. For example, Bazel could see that it's not using its full number of jobs and donate the extra parallelism to the bottleneck action. We see this as particularly useful at the tail end of builds, where fewer targets remain to build. The problem shows up even more in incremental builds, where the action graph is often much flatter, sometimes even linear.
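To make the donation idea concrete, here is a minimal sketch of the scheduling arithmetic. All names here are hypothetical illustrations, not real Bazel APIs; `min_jobs`/`max_jobs` stand in for the proposed "range of parallelism" an action could declare.

```python
# Hypothetical sketch: donating unused job slots to a bottleneck action.
# None of these names come from Bazel's actual APIs.

def donated_jobs(total_jobs, running_actions, min_jobs=1, max_jobs=None):
    """How many job slots one running action may use.

    total_jobs: the global --jobs value.
    running_actions: actions currently executing (including this one).
    min_jobs/max_jobs: the parallelism range the action declared.
    """
    # Slots not consumed by the other concurrently running actions.
    spare = total_jobs - (running_actions - 1)
    jobs = max(min_jobs, spare)
    if max_jobs is not None:
        jobs = min(jobs, max_jobs)
    return jobs

# Mid-build: 8 jobs, 6 actions running -> little extra to donate.
print(donated_jobs(8, 6))              # 3
# Tail of the build: one action left -> it may use all 8 slots.
print(donated_jobs(8, 1))              # 8
# The action declared it can use at most 4 threads.
print(donated_jobs(8, 1, max_jobs=4))  # 4
```

At the tail of the build the single remaining action absorbs every idle slot, which is exactly the incremental-build case described above.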
As @allevato pointed out in the Google Groups thread, this would require some way for actions to pass arguments that are known not to affect output, such as a `-j<N>` flag, without changing the action's cache key.
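A sketch of what "known not to affect output" could mean for cache keys. This is illustrative only: Bazel's real cache-key computation works differently, and the `EXECUTION_ONLY_PREFIXES` list is an assumption for the example.

```python
import hashlib

# Hypothetical: flags that change how an action executes,
# but not what it outputs.
EXECUTION_ONLY_PREFIXES = ("-j",)

def cache_key(args):
    """Hash the command line, skipping execution-only flags."""
    significant = [a for a in args
                   if not a.startswith(EXECUTION_ONLY_PREFIXES)]
    return hashlib.sha256("\0".join(significant).encode()).hexdigest()

# Two invocations differing only in -j share one cache entry.
a = cache_key(["swiftc", "-j4", "-emit-module", "Foo.swift"])
b = cache_key(["swiftc", "-j8", "-emit-module", "Foo.swift"])
print(a == b)  # True
```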
This feature would let us avoid two problems we currently hit:
The first issue can happen with any Swift module of more than 25 files: the default batching logic creates one swift frontend for each group of 25 files, so a module with 100 files spawns 4 sub-processes, unbeknownst to Bazel.
As mentioned, the second case causes slowdowns for incremental development builds.
If needed, I can make a rules_swift project that demonstrates the issue.
We see the problem in our builds by looking at `--experimental_generate_json_trace_profile` output and by comparing against Xcode's builds, which can sometimes be faster due to Xcode's seemingly hard-coded use of `-j8`.
OS: macOS
Output of `bazel info release`: release 1.2.0
As mentioned above, a small amount of discussion happened on Google Groups:
https://groups.google.com/d/msg/bazel-discuss/_oHaU50P5Rg/imx5Y49MAwAJ
I've also posted a general (non-Bazel) question to the Swift Forums:
https://forums.swift.org/t/globally-optimized-build-parallelism/31802/2
Thanks for creating this issue!
Since @jmmv specifically asked for it to be created, I am removing the untriaged label and giving it an initial P2 priority.
Just one more addition, since I just closed #11275: if we do this and explicitly tell an action to use X threads, we also have to go the other way and ensure the action doesn't use more than X threads (1 in the general case!) when told so.
> we also have to go the other way and ensure the action doesn't use more than X threads (1 in the general case!) when told so.
I don't see this as a requirement, at least not until someone provides a use case. We have a local patch that lets us set concurrency, and we have cases where an action may briefly peak at 4 threads but empirically has a steady state of 2 threads, for example. A multi-threaded process does not usually consume cores in 100% core increments.
Consider `xz -T0`, which uses all available cores. Assume:

```
genrule(
    name = name + "-xz",
    srcs = [name],
    outs = [name + ".xz"],
    cmd = "xz -T0 -f $< > $@",
)
```
I would like to tell Bazel "this rule uses N-1 workers, where N is the number of available cores"; I would still like to leave 1-2 slots free, in case the workers are I/O bound.
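The "N-1 workers" policy above can be sketched like this (Python stdlib used purely for illustration; in practice Bazel would derive N from its own local resource estimate rather than `os.cpu_count()`):

```python
import os

def xz_threads(reserved=1):
    """Thread count to pass as xz -T<N>: leave `reserved` cores
    free for I/O-bound work, but never drop below 1 thread."""
    cores = os.cpu_count() or 1
    return max(1, cores - reserved)

print(xz_threads())            # e.g. 7 on an 8-core machine
print(xz_threads(reserved=2))  # e.g. 6 on an 8-core machine
```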