Rust: Incremental compilation results in linker error when method use is removed

Created on 29 Mar 2019 · 20Comments · Source: rust-lang/rust

See: https://github.com/jonas-schievink/rubble/compare/incr-linker-error

Build the first commit (jonas-schievink/rubble@04cf1d1), then the second (jonas-schievink/rubble@cedf7be) using cargo build, and you should get a linker error like this:

  = note: rust-lld: error: undefined symbol: _$LT$rubble..ble..time..Duration$u20$as$u20$core..fmt..Display$GT$::fmt::h343296729f83afc4
          >>> referenced by time.rs:129 (src/ble/time.rs:129)
          >>>               /home/jonas/dev/rubble/target/thumbv7em-none-eabi/debug/deps/rubble-a38354bf4561073f.3qtzuenakrd0qcaf.rcgu.o:(_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$::fmt::h26ac05bd6f4f461e)

Note that this needs the thumbv7em-none-eabi target, so you might need to run rustup target add thumbv7em-none-eabi beforehand.

This is on rustc 1.33.0 (2aa4c46cf 2019-02-28).
Also reproduces on rustc 1.35.0-nightly (237bf3244 2019-03-28).

A-incr-comp A-linkage C-bug P-high T-compiler WG-embedded

Source

jonas-schievink

Most helpful comment

I minimized this to ~60 lines in https://github.com/jonas-schievink/rubble/pull/62.

dtolnay on 19 Jun 2019

❤3

All 20 comments

I minimized this to ~60 lines in https://github.com/jonas-schievink/rubble/pull/62.

dtolnay on 19 Jun 2019

❤3

61917 looks like it reproduces this in a non-Cargo environment. Some observations from that issue:

It only happens with incremental builds. If I either remove the incremental DB or disable incremental builds, it works

It only happens at opt-level=1; opt-level=0 and opt-level=3 are fine

It goes away if I make small code changes, but it doesn't seem to matter what (which says incremental)

Changing codegen-units doesn't affect anything.

@jsgf

dtolnay on 19 Jun 2019

pre-triage: P-high; leaving nominated in hopes it increases chance that we discuss it today

pnkfelix on 27 Jun 2019

discussed at T-compiler meeting. Tentatively assigning to @Zoxc for initial investigation.

@Zoxc, if you do not expect to be able to do much investigation, let me know (and unassign yourself)

pnkfelix on 27 Jun 2019

Is this specific to the thumbv7em-none-eabi target?

Zoxc on 27 Jun 2019

Not specific to thumbv7em-none-eabi. @jsgf told me #61917 happened when building for x86_64-unknown-linux-gnu. We don't yet have a reliable repro on x86 ("build this commit followed by this commit") and the project in #61917 is not using Cargo so minimizing from there may be more complicated.

dtolnay on 27 Jun 2019

triage: Reassigning from @Zoxc to self.

pnkfelix on 12 Sep 2019

spent a little while trying to reproduce this today, and failed.

Of note: even just trying to do cargo +nightly-2019-03-29 build atop https://github.com/jonas-schievink/rubble/pull/62, I am getting

error: language item required, but not found: eh_personality

I tried to work around this (by adding my own fake definition for eh_personality), but that just led linking problems for the basic case:

  = note: /usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/9/../../../../lib64/Scrt1.o: in function `_start':
          (.text+0x16): undefined reference to `__libc_csu_fini'
          /usr/bin/ld: (.text+0x1d): undefined reference to `__libc_csu_init'
          /usr/bin/ld: (.text+0x2a): undefined reference to `__libc_start_main'
          /usr/bin/ld: /home/pnkfelix/.rustup/toolchains/nightly-2019-03-29-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore\
-ed8b00cc009d913d.rlib(core-ed8b00cc009d913d.core.95lhmtlm-cgu.0.rcgu.o): in function `equal<u8>':
          /rustc/237bf3244fffef501cf37d4bda00e1fce3fcfb46//src/libcore/slice/mod.rs:5074: undefined reference to `memcmp'

(I assume there's some cross-compilation system library support that I'm missing.)

pnkfelix on 29 Oct 2019

Okay, I managed to now replicate it atop @dtolnay 's reduction from jonas-schievink/rubble#62 (I was not installing the thumb target correctly before this point; I was probably not doing it with rustup target add --toolchain or something to synchronize it with my rustc version.)

pnkfelix on 6 Nov 2019

👍1

Here is a repo with my (further-reduced) reduction:
https://github.com/pnkfelix/rust-issue-59535

I got the whole thing down to a chain of three crates: a 13-line crate at the root, a 65-line crate at the leaf, and a .... 1643-line crate in the middle...

Unfortunately the nature of the bug makes it quite difficult to remove any of those 1643 lines
"Fortunately" the majority of the lines (over 1000 of them) are impl R { }. Which is ... interesting...

It still requires the thumbv7em-none-eabi target.

At this point I think I've gotten it at small as I can with my limited knowledge of embedded rust.

So now I'll switch to trying to understand what's going on with the code-generation here.

pnkfelix on 8 Nov 2019

(for my narrowed test case, passing -C codegen-units=1 does seem to make the problem go away, which could be interpreted as contradicting the last bullet in an earlier comment. but given the heisenbug nature of this issue, that could just be masking over the problem rather than showing something fundamental here...)

pnkfelix on 8 Nov 2019

At this point I think I've gotten it at small as I can with my limited knowledge of embedded rust.

Embedded Rust is pretty much regular Rust sans magic done by the operating system and libc in a regular application. The reduced 1.6k middle crate is just an abstraction over fixed bits at fixed memory addresses to talk to built-in peripherals (data busses, timers, I/O pins, radios...) generated from a chip manufacturer provided register description.

Having multiple impl R { }s on the same struct R in the same module sounds quite wrong, is that even allowed? I could imagine that this confuses the heck out of the name mangling.

therealprof on 8 Nov 2019

Having multiple impl R { }s on the same struct R in the same module sounds quite wrong, is that even allowed? I could imagine that this confuses the heck out of the name mangling.

Its absolutely allowed. See:

https://doc.rust-lang.org/stable/book/ch05-03-method-syntax.html#multiple-impl-blocks

pnkfelix on 12 Nov 2019

@pnkfelix Yes, I have learned that last weekend in Barcelona and makes sense to me. Should have updated my comment to not waste your time, sorry about that.

It's still kind of weird that Rust accepts multiple identical impls. That sounds like something which should possibly be at least a lint.

therealprof on 12 Nov 2019

Current status: using -C save-temps and llvm-dis, I have been inspecting the contents of the distinct codegen units that rustc generates.

As a reminder, the undefined symbol is: _$LT$rubble..ble..Duration$u20$as$u20$core..fmt..Display$GT$::fmt::hcaf51ebe05a39b3b, (referenced by mks4f5lxe2i4sqc in /tmp/issue59535/out/rubble2/rubble.mks4f5lxe2i4sqc.rcgu.o). So searching for hcaf51ebe05a39b3b in the disassembled bitcode files (*.ll) seemed like a reasonable step to take.

From there, I discovered that rubble.3bvob4s3e3jl7a31.rcgu.no-opt.ll has a defintion for the aforementioned symbol:

; Function Attrs: nounwind optsize
define internal zeroext i1 @"_ZN60_$LT$rubble..ble..Duration$u20$as$u20$core..fmt..Display$GT$3fmt17hcaf51ebe05a39b3bE"(i32* noalias readonly align 4 dereferenceable(4), %"core::fmt::Formatter"* align 4 dereferenceable(52)) unnamed_addr #1 {
...
}

but rubble.3bvob4s3e3jl7a31.rcgu.ll is missing that definition. So I think that means the definition is getting eliminated by llvm optimization passes.

One thing that stood out to me here: The function definition in the llvm bitcode says define internal. But this definition is referenced by another codegen unit. Surely that should not be internal ? (Maybe this is why the LLVM optimization passes are able to justify removing it?)

Still, seems odd that this only crops up during this combination of factors (namely, the fact that incremental compilation seems to be involved...)

Ah: The first clean run, that didn't have incremental build products, does not mark the function definition as internal; that definition is merely marked hidden

define hidden zeroext i1 @"_ZN60_$LT$rubble..ble..Duration$u20$as$u20$core..fmt..Display$GT$3fmt17hcaf51ebe05a39b3bE"(i32* noalias readonly align 4 dereferenceable(4), %"core::fmt::Formatter"* align 4 dereferenceable(52)) unnamed_addr #1 {
...
}

pnkfelix on 12 Nov 2019

@pnkfelix Yes, I have learned that last weekend in Barcelona and makes sense to me. Should have updated my comment to not waste your time, sorry about that.

It's still kind of weird that Rust accepts multiple identical impls. That sounds like something which should possibly be at least a lint.

Well, multiple identical non-trivial impl blocks is impossible: You get "duplicate definition" errors.

And multiple non-identical impl blocks is useful, e.g. for code-grouping. As mentioned in a very recent tweet:

rust documentation code layout idea: group functions together in separate impl blocks, even if the blocks have the same trait bounds or are in the same file

then apply doc comments to the impl blocks that rustdoc can show to create "sections"
— a quiet misdreavus (@QuietMisdreavus) November 11, 2019

So I think the only case here that one might want to lint for is trivial (i.e. empty) impl blocks. (But even there I would want to exercise care to not lint cases where all the items in the block were removed due to cfg attributes.)

pnkfelix on 12 Nov 2019

More data, after adding some debug! instrumentation:

The module rustc_mir::monomorphize::partitioning has fn internalize_symbols, which tries to mark symbols internal that are only referenced by the current codegen unit.

The first (i.e. clean) run does not mark rubble[317d]::ble[0]::{{impl}}[0]::fmt[0] as internal, because it has these two accessors:

Fn(Instance { def: Item(DefId(0:19 ~ rubble[317d]::ble[0]::{{impl}}[1]::fmt[0])), substs: [] }),
Fn(Instance { def: Item(DefId(0:6 ~ rubble[317d]::ble[0]::link[0]::process_data_packet[0])), substs: [] })

The second run with incremental build produces and an empty process_data_packet definition does mark rubble[317d]::ble[0]::{{impl}}[0]::fmt[0] as internal, while noting that it has this one accessor:

Fn(Instance { def: Item(DefId(0:17 ~ rubble[317d]::ble[0]::{{impl}}[1]::fmt[0])), substs: [] })

The logic of the relevant code is here:

https://github.com/rust-lang/rust/blob/a19f93410d4315408f8775e1be29536302adc223/src/librustc_mir/monomorphize/partitioning.rs#L648-L661

(The debug print outs above do not include the codegen unit placements for the accessors, so I do not actually know for sure how much sense all of this makes. But it certainly makes sense that the aforementioned change to process_data_packet would cause it to be removed from the accessors list, so there is no smoking gun here yet...)

pnkfelix on 12 Nov 2019

More data after adding instrumentation of the codegen units involved: The codegen unit that contains the unresolved reference is named mks4f5lxe2i4sqc. It has a single fn item (one of the fmt fns, specifically core::fmt[0]::{{impl}}[49]::fmt[0]<rubble::ble[0]::Duration[0]>), and that fn item accesses rubble[317d]::ble[0]::{{impl}}[1]::fmt[0].

Note that rubble[317d]::ble[0]::{{impl}}[1]::fmt[0] (which belongs to codegen unit 3bvob4s3e3jl7a31) has previously been found to be an accessor of rubble[317d]::ble[0]::{{impl}}[0]::fmt[0] (whcih belongs to the same codegen unit). The latter item is the item that being marked as Internal and subsequently has its definition optimized away by LLVM.

In short: it looks like we have a call chain of methods: A calls B calls C, A is in one code-gen unit, B and C are in another, and we make the decision to mark C internal (since it is only reachable from B, in the same codegen-unit), but independently, the compiler (presumably LLVM) has inlined the body of B into A, and so we now have a call from A to an internal method (C) in a distinct codegen-unit.

Is there some sort of inlining happening here that implies we need to do a deeper traversal of the accessor-chain before we commit to marking a reachable item as internal?

From looking at the pre- and post-optimized versions of rubble.mks4f5lxe2i4sqc.rcgu.ll, I am thinking that this seems like strong possibility.
Another thing to note: it seems to me like the inlining in question is happening for both the first (clean) build and also for the the second (reusing) build.
- (Is it possible that the second build is reusing the object code for the mks4f5lxe2i4sqc codegen unit from the first build? Is that somehow relevant, or a red herring?)
- In particular: Using the simplified A,B,C model I summarized above, I would have imagined that LLVM could not inline the body of B into A since it is making a call to the internal function C. But maybe the issue is that we are reusing the results of a previous compile where it was legal to inline B into A (because during that compile, C was not marked internal), and independently we are now deciding to recategorize C as internal.
  - Note however that I also would have thought that LLVM could not inline the body of B into A because they belong to distinct codegen units. So there are multiple points of personal confusion here for me to resolve in my theories above.
  - Further update: it seems like the inlining of B into A is happening during link-time optimization (specifically ThinLTO), which does seem like it would jibe with the rules for link-time optimization...

pnkfelix on 12 Nov 2019

👍1

Okay, further investigation: rustc itself controls whether we can reuse the post-ThinLTO version of a module (or if we have to go back and reoptimize it).

And, thank goodness, we already had debug instrumentation in place to emit information about such reuse:

[DEBUG rustc::dep_graph::cgu_reuse_tracker] set_actual_reuse("37it3t5gwe6y9f2", PostLto)
[INFO  rustc_codegen_llvm::back::lto]  - mks4f5lxe2i4sqc: re-used
...
  = note: lld: error: undefined symbol: _$LT$rubble..ble..Duration$u20$as$u20$core..fmt..Display$GT$::fmt::hcaf51ebe05a39b3b
          >>> referenced by mks4f5lxe2i4sqc
          >>>               /tmp/issue59535/out/rubble2/rubble.mks4f5lxe2i4sqc.rcgu.o:(_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$::fmt::he13cc5d4e65d5ab9)

In other words: This lends credence to my theory that we are reusing an artifact from a previous incremental compile that should have been invalidated by the changes to what symbols were marked internal by rustc_mir::monomorphize::partitioning::internalize_symbols

The code that dictates whether a post-ThinLTO object file for a codegen unit can be reused is here:

https://github.com/rust-lang/rust/blob/5e380b797b22e5361a43b2b82f6278df17d89f3e/src/librustc_codegen_llvm/back/lto.rs#L499-L526

I'm going to read over this more carefully and see if I can figure out how to make it aware of the impact of internalize_symbols....

pnkfelix on 14 Nov 2019

@michaelwoerister and I are continuing to investigate this. (zulip archive)

My most recent hypothesis as to what the bug is (transcribed here):

lto.rs, when it decides which modules can be reused in PostLTO form, looks at the import-list and sees if they are all on the green modules list. But it is looking at the import list that it currently gets from LLVM. Should we be instead (or additionally) looking at the old import list, from the time that we did the LTO optimization that we are reusing? (Presumably we would just serialize the ThinLTOImports and re-load that, or add it to the dep-graph maybe.)

Specifically:

https://github.com/rust-lang/rust/blob/0f6f66fcdc4abf110171ee06b1a72bdd2883b74f/src/librustc_codegen_llvm/back/lto.rs#L473-L475

is relying on this:

https://github.com/rust-lang/rust/blob/0f6f66fcdc4abf110171ee06b1a72bdd2883b74f/src/librustc_codegen_llvm/back/lto.rs#L853-L858

which reflects the whatever the current LTO imports are computed, which may differ from the set of imports that were used to drive the original LTO optimization that we are reusing. In particular, they may differ by being a subset of the original set of imports, which means we can overlook hidden dependencies.

pnkfelix on 27 Nov 2019

Was this page helpful?

0 / 5 - 0 ratings