When using TreeArtifacts with cc_library that needs header as also source files, bazel does
not produce correct outputs, which leads to the need to run bazel clean.
Our generator does not necessarily produce stable interfaces (that is a problem on its own which unfortunately lies beyond our reach) resulting in the following scenario. We produce tree artifacts for all off generated sources and interface descriptions. In the generated tree(s) consumer.c needs a function whose actual name is known through interface description provider.h of provider.c (e.g. pre-processor macro). When a generator input changes, there are new provider.c and corresponding provider.h, however consumer.c remains unchanged. When building, bazel considers consumer.c, it (correctly) determines it to be identical and not knowing about its dependency on and effects of provider.h (and provider.c changes) it just takes the "old" object file result for library archiving. provider.c did get updated (and exposes new symbol name discarding the old one) and gets recompiled with its new object file landing in the result. With that, we're set for linker to crash and burn.
bazel build //:myexecmydef.txt to for instance fceAbazel build //:myexecbazel cleanbazel build //:myexecWe have also tried to give bazel a bit clearer understanding of what the sources and headers are in split_headers branch of the same repo, but it has no appreciateble effect on consideration for recompiling individual sources from the incoming tree artifact(s).
Linux (Ubuntu 16.04, Slackware-14.2, Slackare-current).
bazel info release?release 0.15.0
release 0.15.2
release 0.16.0
Only slightly related:
https://github.com/bazelbuild/bazel/issues/5092
Normally I'd say the use case is fairly convoluted and (even without using tree artifacts) such generators and approach to interface stability is asking for trouble. Interestingly enough though, without tree artifacts involved, it seem to actually be able to chew over that mess. I've added "single_files" branch adjusted to that effect into the aforementioned repo.
I'm sorry for the radio silence, I somehow forgot we have a new triaging policy :) I'll take a look next week.
I can repro, definitely a P1. Looking into it.
Root cause - error propagating middlemen artifacts don't get invalidated properly, plus they don't compute their cache key using expanded tree artifacts. Fix in the baking.
This is a nice issue, thanks for reporting :)
Status update, this is still near and dear, I'm currently investigating removal of middleman artifacts completely, and instead teaching skyframe about nested sets.
Are there any news or maybe points where one could support?
Yeah I got distracted and the longer I was away from this the harder it was to come back. I'll reserve this Friday for this and update the issue afterwards. Sorry for this taking forever.
Results after first day: while bazel has a correctness issue exposed by the repro, blaze (pretty much bazel without external repositories with include scanning hooked in that is used internally at google) cannot even build the repro, so I'm in the process of fixing that.
Thanks for the update!
@hlopko I tried to have a look at this again and there is a Pull-Request (https://github.com/bazelbuild/bazel/pull/7336) that would fix the issue described in the repro.
But I'm not sure if the solution might add side-effects I'm not aware of. But since the respective function should only be called for TreeArtifacts, I don't think there are.
I'm back! I'm tired of rediscovering this issue and dark corners of bazel again and again, so I'll dump my brain now :) Sorry for TMI :)
Skyframe (lazy evaluation engine behind Bazel) creates forward and backward edges for all action inputs to be able to properly invalidate when files are changed. This is quadratic in the number of inputs if inputs are repeated in dependant actions (which C++ headers are). Because of that, C++ actions don't depend on headers directly, but through a virtual artifact created by a virtual action. Dependant actions share the same virtual artifact. And only the virtual action depends on all headers. In other words, per header edges are created only in 1 action, not in all dependant actions.
We use this optimization in filegroups, java runtime rule, and C++ rules. It works fine with first two, I believe also when using tree artifacts. These 2 use what we call aggregating middleman artifact. These cannot contain nested aggregating middleman artifacts, and they propagate invalidation. C++ uses error propagating middleman, which can be nested, but doesn't propagate invalidation (and we're getting into the core of the problem).
The C++ story is further complicated by what we call header pruning. To compile a C++ source, we need all transitive headers, so #includes in #includes in #includes work :) We could collect all transitive cc_library headers and make them available to compile action, but it turns out it's wasteful. It's common that many transitive cc library headers are not used. So to prevent useless recompilations of changed but unused headers, we prune unused headers.
Internally we do this pruning before the compile action (so we don't have to upload unused headers to remote execution). In Bazel, we rely on the .d file created during the compilation. After the file is created, Bazel reads the file, discards the middleman artifact, and updates the compile action inputs from the .d file.
So, for incremental builds, action does show quadratic memory growth, but because the number of headers is much smaller than what is declared in the BUILD file, it is fine. I'll have to measure how much waste are we talking about, but that's outside of the problem at hand.
So we see that .d file is essential for incremental correctness. I guess the interested reader will quickly figure out what the root problem of this issue is by looking at https://source.bazel.build/bazel/+/39ffd9780d7484e1de0570f625e9f246c06a90a3:src/main/java/com/google/devtools/build/lib/rules/cpp/CcCompilationHelper.java;l=1378:
private Artifact createCompileActionTemplate(...) {
...
// TODO: Dotd file output is not supported yet.
builder.setOutputs(outputFiles, /* dotdFile= */ null);
...
}
@castler please tell me this worked :) You cannot imagine how happy I am for submitting that fix finally.
@hlopko You cannot imagine how happy I am :D! - I guess I owe you a beer ;)
I tried the master with your fix on the repro and it worked! I will also test it in the more complex environment were the issue was discovered, but I guess that would work fine. If not I will write again.
In any case I am really thankful for submitting the patch and specially for the explanation you gave above, that was really helpful. I also tried to implement this stuff - but I'm just not deep enough within the bazel internals ^^ (yet).