This example (reduced from `rustc_codegen_llvm`) either crashes LLVM or triggers an LLVM assertion failure:
fn dummy() {}
mod llvm {
    pub(crate) struct Foo;
}
mod foo {
    pub(crate) struct Foo<T>(T);
    impl Foo<::llvm::Foo> {
        pub(crate) fn foo() {
            for _ in 0..0 {
                for _ in &[::dummy()] {
                    ::dummy();
                    ::dummy();
                    ::dummy();
                }
            }
        }
    }
    pub(crate) fn foo() {
        Foo::foo();
        Foo::foo();
    }
}
pub fn foo() {
    foo::foo();
}
I suspect the `for` loops aren't needed, but the control-flow there affects the bug in finicky ways.
The `llvm` name of the module appears to be important and tied to the fact that we end up emitting `llvm::` in a symbol name (because of the `impl`), which gets mangled as `llvm..` (note the two dots).
Among other things, changing the `.` we use to mangle `:` to `$` makes the bug go away.
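To illustrate why the separator choice matters, here is a rough sketch of the sanitization step. It is not rustc's actual mangler: `sanitize_path` is a hypothetical helper, `krate` is a made-up crate name, and the sketch assumes the `llvm` module is preceded by another path segment in the symbol.

```rust
/// Hypothetical, heavily simplified stand-in for the path sanitization
/// described above: every `:` is replaced by `sep`.
/// This is NOT rustc's real symbol mangler.
fn sanitize_path(path: &str, sep: char) -> String {
    path.chars()
        .map(|c| if c == ':' { sep } else { c })
        .collect()
}

fn main() {
    // `krate` is a made-up crate name used only for illustration.
    let path = "krate::llvm::Foo";

    let dotted = sanitize_path(path, '.');
    let dollar = sanitize_path(path, '$');

    // With `.` as the replacement, the reserved-looking `.llvm.` shows up.
    assert!(dotted.contains(".llvm."));
    // With `$` as the replacement, it does not.
    assert!(!dollar.contains(".llvm."));

    println!("{dotted}");
    println!("{dollar}");
}
```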
Assertion message (if LLVM assertions are enabled):
rustc: /checkout/src/llvm/lib/Transforms/IPO/FunctionImport.cpp:961:
auto llvm::thinLTOInternalizeModule(llvm::Module &, const llvm::GVSummaryMapTy &)::(anonymous class)::operator()(const llvm::GlobalValue &) const:
Assertion `GS != DefinedGlobals.end()' failed.
Stack backtrace for SIGSEGV (if LLVM assertions are disabled):
#0 0x00007fffed5240d2 in std::_Function_handler<bool (llvm::GlobalValue const&), llvm::thinLTOInternalizeModule(llvm::Module&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&)::$_2>::_M_invoke(std::_Any_data const&, llvm::GlobalValue const&) ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#1 0x00007fffed5409be in llvm::InternalizePass::maybeInternalize(llvm::GlobalValue&, std::set<llvm::Comdat const*, std::less<llvm::Comdat const*>, std::allocator<llvm::Comdat const*> > const&) ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#2 0x00007fffed540e26 in llvm::InternalizePass::internalizeModule(llvm::Module&, llvm::CallGraph*) ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#3 0x00007fffed520079 in llvm::thinLTOInternalizeModule(llvm::Module&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&) ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#4 0x00007fffecda34f8 in LLVMRustPrepareThinLTOInternalize ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#5 0x00007fffeccb1a2c in rustc_codegen_llvm::back::lto::LtoModuleCodegen::optimize ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
#6 0x00007fffecc5fbb2 in rustc_codegen_llvm::back::write::execute_work_item ()
from /home/eddy/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_codegen_llvm-llvm.so
cc @sunfishcode @denismerigoux @alexcrichton @rkruppe
Found the culprit in `ModuleSummaryIndex::{getGlobalNameForLocal, getOriginalNameBeforePromote}`:
/// Convenience method for creating a promoted global name
/// for the given value name of a local, and its original module's ID.
static std::string getGlobalNameForLocal(StringRef Name, ModuleHash ModHash) {
  SmallString<256> NewName(Name);
  NewName += ".llvm.";
  NewName += utostr((uint64_t(ModHash[0]) << 32) |
                    ModHash[1]); // Take the first 64 bits
  return NewName.str();
}

/// Helper to obtain the unpromoted name for a global value (or the original
/// name if not promoted).
static StringRef getOriginalNameBeforePromote(StringRef Name) {
  std::pair<StringRef, StringRef> Pair = Name.split(".llvm.");
  return Pair.first;
}
LLVM implicitly assumes that `.llvm.` cannot show up randomly in symbol names, and rustc breaks that assumption - but LLVM never checks it, so it shares part of the blame here, IMO.
Probably the correct thing to do for LLVM is to split the string at the last `.llvm.`, instead of the first.
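A small Rust model of the two LLVM helpers makes the difference between splitting at the first and at the last `.llvm.` concrete. This is only a sketch: the function names are my own, the module hash is reduced to the two 32-bit words the C++ code actually reads, and the symbol fragment is made up.

```rust
/// Models `getGlobalNameForLocal`: append ".llvm." plus the first
/// 64 bits of the module hash (simplified here to two `u32` words).
fn global_name_for_local(name: &str, mod_hash: [u32; 2]) -> String {
    let hash = (u64::from(mod_hash[0]) << 32) | u64::from(mod_hash[1]);
    format!("{name}.llvm.{hash}")
}

/// Models the current `getOriginalNameBeforePromote`:
/// split at the FIRST ".llvm." (whole name if there is none).
fn original_name_split_first(name: &str) -> &str {
    name.split_once(".llvm.").map_or(name, |(head, _)| head)
}

/// Models the suggested alternative: split at the LAST ".llvm.".
fn original_name_split_last(name: &str) -> &str {
    name.rsplit_once(".llvm.").map_or(name, |(head, _)| head)
}

fn main() {
    // Made-up symbol fragment that already contains ".llvm.".
    let local = "krate..llvm..Foo";
    let promoted = global_name_for_local(local, [1, 2]);

    // Splitting at the first occurrence truncates the original name...
    assert_eq!(original_name_split_first(&promoted), "krate.");
    // ...while splitting at the last occurrence recovers it.
    assert_eq!(original_name_split_last(&promoted), local);

    println!("{promoted}");
}
```

Note that splitting at the last occurrence only helps for names that were actually promoted: applied to the unpromoted `krate..llvm..Foo`, both variants still return `krate.`, so keeping `.llvm.` out of symbol names in the first place remains the more robust fix.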
Why are computers
The workaround I introduced in the above commit was suggested by @eddyb and fixes the segfault from this issue.
Sanitizing `:` to `$` and adding a debug assertion would also be a good idea.