On nightly channel, having RUST_BACKTRACE=1 or RUST_BACKTRACE=full when running cargo test throws this error on any panic:
error: process didn't exit successfully: `/.../target/debug/deps/test_a-...`
(signal: 10, SIGBUS: access to undefined memory)
#[test]
fn test_1() {
assert!(false);
}
I have also been able to reproduce this. The same code got a SIGBUS on Mac OS, but the backtrace was printed as expected on Linux. Perhaps related to https://github.com/rust-lang/rust/pull/44251?
Stack trace of the SIGBUS:
* thread #2, name = 'test_1', stop reason = EXC_BAD_ACCESS (code=10, address=0x1001ef000)
* frame #0: 0x0000000100055201 a-cfbe649daab6d4af`macho_get_commands + 145
frame #1: 0x00000001000564cc a-cfbe649daab6d4af`macho_try_dwarf + 396
frame #2: 0x0000000100056f86 a-cfbe649daab6d4af`macho_try_dsym + 726
frame #3: 0x0000000100057519 a-cfbe649daab6d4af`macho_add + 1225
frame #4: 0x0000000100057707 a-cfbe649daab6d4af`backtrace_initialize + 263
frame #5: 0x0000000100054a9e a-cfbe649daab6d4af`fileline_initialize + 558
frame #6: 0x0000000100054b7d a-cfbe649daab6d4af`backtrace_syminfo + 45
frame #7: 0x000000010003fd52 a-cfbe649daab6d4af`core::slice::{{impl}}::position::{{closure}}<std::sys_common::backtrace::Frame,closure> [inlined] std::sys_common::gnu::libbacktrace::resolve_symname<closure> at libbacktrace.rs:79 [opt]
frame #8: 0x000000010003fd1f a-cfbe649daab6d4af`core::slice::{{impl}}::position::{{closure}}<std::sys_common::backtrace::Frame,closure> [inlined] std::sys::imp::backtrace::printing::resolve_symname<closure> at mod.rs:36 [opt]
frame #9: 0x000000010003fd1f a-cfbe649daab6d4af`core::slice::{{impl}}::position::{{closure}}<std::sys_common::backtrace::Frame,closure> [inlined] std::sys_common::backtrace::filter_frames::{{closure}} at backtrace.rs:108 [opt]
frame #10: 0x000000010003fd18 a-cfbe649daab6d4af`core::slice::{{impl}}::position::{{closure}}<std::sys_common::backtrace::Frame,closure> at mod.rs:1204 [opt]
frame #11: 0x0000000100047df1 a-cfbe649daab6d4af`std::sys_common::backtrace::_print [inlined] core::slice::{{impl}}::search_while<std::sys_common::backtrace::Frame,core::option::Option<usize>,closure> at mod.rs:1271 [opt]
frame #12: 0x0000000100047daf a-cfbe649daab6d4af`std::sys_common::backtrace::_print [inlined] core::slice::{{impl}}::position<std::sys_common::backtrace::Frame,closure> at mod.rs:1203 [opt]
frame #13: 0x0000000100047da3 a-cfbe649daab6d4af`std::sys_common::backtrace::_print [inlined] std::sys_common::backtrace::filter_frames at backtrace.rs:106 [opt]
frame #14: 0x0000000100047d7f a-cfbe649daab6d4af`std::sys_common::backtrace::_print at backtrace.rs:71 [opt]
frame #15: 0x000000010004d224 a-cfbe649daab6d4af`std::panicking::default_hook::{{closure}} [inlined] std::sys_common::backtrace::print at backtrace.rs:58 [opt]
frame #16: 0x000000010004d203 a-cfbe649daab6d4af`std::panicking::default_hook::{{closure}} at panicking.rs:381 [opt]
frame #17: 0x000000010004cd73 a-cfbe649daab6d4af`std::panicking::default_hook at panicking.rs:391 [opt]
frame #18: 0x000000010004d703 a-cfbe649daab6d4af`std::panicking::rust_panic_with_hook at panicking.rs:577 [opt]
frame #19: 0x0000000100000ffe a-cfbe649daab6d4af`std::panicking::begin_panic::hcb69775319227ce6 + 110
frame #20: 0x0000000100000f26 a-cfbe649daab6d4af`a::test_1::h499e1b269bece65b + 38
frame #21: 0x0000000100014a62 a-cfbe649daab6d4af`test::{{impl}}::call_box<(),closure> [inlined] test::run_test::{{closure}} at lib.rs:1480 [opt]
frame #22: 0x0000000100014a5d a-cfbe649daab6d4af`test::{{impl}}::call_box<(),closure> [inlined] core::ops::function::FnOnce::call_once<closure,(())> at function.rs:223 [opt]
frame #23: 0x0000000100014a5d a-cfbe649daab6d4af`test::{{impl}}::call_box<(),closure> at lib.rs:141 [opt]
frame #24: 0x000000010005876f a-cfbe649daab6d4af`panic_unwind::__rust_maybe_catch_panic at lib.rs:99 [opt]
frame #25: 0x0000000100006ce1 a-cfbe649daab6d4af`std::sys_common::backtrace::__rust_begin_short_backtrace<closure,()> [inlined] std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> at panicking.rs:459 [opt]
frame #26: 0x0000000100006c9c a-cfbe649daab6d4af`std::sys_common::backtrace::__rust_begin_short_backtrace<closure,()> [inlined] std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> at panic.rs:361 [opt]
frame #27: 0x0000000100006c9c a-cfbe649daab6d4af`std::sys_common::backtrace::__rust_begin_short_backtrace<closure,()> [inlined] test::run_test::run_test_inner::{{closure}} at lib.rs:1419 [opt]
frame #28: 0x0000000100006b49 a-cfbe649daab6d4af`std::sys_common::backtrace::__rust_begin_short_backtrace<closure,()> at backtrace.rs:134 [opt]
frame #29: 0x000000010000f508 a-cfbe649daab6d4af`std::panicking::try::do_call<std::panic::AssertUnwindSafe<closure>,()> [inlined] std::thread::{{impl}}::spawn::{{closure}}::{{closure}}<closure,()> at mod.rs:402 [opt]
frame #30: 0x000000010000f4f2 a-cfbe649daab6d4af`std::panicking::try::do_call<std::panic::AssertUnwindSafe<closure>,()> [inlined] std::panic::{{impl}}::call_once<(),closure> at panic.rs:296 [opt]
frame #31: 0x000000010000f4f2 a-cfbe649daab6d4af`std::panicking::try::do_call<std::panic::AssertUnwindSafe<closure>,()> at panicking.rs:480 [opt]
frame #32: 0x000000010005876f a-cfbe649daab6d4af`panic_unwind::__rust_maybe_catch_panic at lib.rs:99 [opt]
frame #33: 0x00000001000209c2 a-cfbe649daab6d4af`alloc::boxed::{{impl}}::call_box<(),closure> [inlined] std::panicking::try<(),std::panic::AssertUnwindSafe<closure>> at panicking.rs:459 [opt]
frame #34: 0x0000000100020989 a-cfbe649daab6d4af`alloc::boxed::{{impl}}::call_box<(),closure> [inlined] std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()> at panic.rs:361 [opt]
frame #35: 0x0000000100020989 a-cfbe649daab6d4af`alloc::boxed::{{impl}}::call_box<(),closure> [inlined] std::thread::{{impl}}::spawn::{{closure}}<closure,()> at mod.rs:401 [opt]
frame #36: 0x000000010002093c a-cfbe649daab6d4af`alloc::boxed::{{impl}}::call_box<(),closure> at boxed.rs:762 [opt]
frame #37: 0x000000010004c83c a-cfbe649daab6d4af`std::sys::imp::thread::{{impl}}::new::thread_start [inlined] alloc::boxed::{{impl}}::call_once<(),()> at boxed.rs:772 [opt]
frame #38: 0x000000010004c839 a-cfbe649daab6d4af`std::sys::imp::thread::{{impl}}::new::thread_start [inlined] std::sys_common::thread::start_thread at thread.rs:24 [opt]
frame #39: 0x000000010004c7be a-cfbe649daab6d4af`std::sys::imp::thread::{{impl}}::new::thread_start at thread.rs:90 [opt]
frame #40: 0x00007fff6d23e6c1 libsystem_pthread.dylib`_pthread_body + 340
frame #41: 0x00007fff6d23e56d libsystem_pthread.dylib`_pthread_start + 377
frame #42: 0x00007fff6d23dc5d libsystem_pthread.dylib`thread_start + 13
No repro when using a custom stage1 instead of nightly 馃槙
@JohnColanduoni Any idea?
Repro steps:
rustc --test -g 1.rsRUST_BACKTRACE=1 ./1I suspect it has something to do with threading.
This bug is no-repro on beta, so #44251 is not the sole cause of the SIGBUS. Maybe we could first rust-bisect it.
Regression was introduced in this range -- https://github.com/rust-lang/rust/compare/4279e2b4c...8493813cd
Between nightly-2017-10-22 and nightly-2017-10-23
So this implies to me two things:
I'm going to send a PR to disable libbacktrace on OSX due to the latter point.
@kennytm What makes you suspect threading? IIRC libbacktrace functions are protected on the Rust side with a global lock.
@JohnColanduoni because I can't repro it if I used #[bench] (everything runs in the main thread) instead of #[test] (the code is run in worker thread).
That said, the following isn't causing SIGBUS either:
fn main() {
std::thread::spawn(|| panic!()).join().unwrap();
}
This is definitely related to loading libtest in some way, I'm able to reproduce with the following:
extern crate libc;
use std::{env};
use std::ffi::CString;
fn main() {
env::set_var("RUST_BACKTRACE", "1");
let libtest_path = CString::new("/Users/john/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/libtest-bbaa058d367874e2.dylib").unwrap();
let handle = unsafe {
libc::dlopen(libtest_path.as_ptr(), libc::RTLD_NOW | libc::RTLD_GLOBAL)
};
if handle.is_null() {
panic!("failed to open libtest");
}
assert!(false);
}
It causes the appropriate crash on both nightly and beta (as long as you dlopen the nightly libtest).
The offending line is
https://github.com/rust-lang/rust/blob/ddd123ed9a35ec76103d42cecf322ee8d2896bf9/src/libbacktrace/macho.c#L225
Which means backtrace_get_view (which mmaps the file) is silently returning an invalid address. Now I'm very confused, because it's a bog-standard call to mmap that doesn't touch any non-constant parameters except the file descriptor (which if invalid would just cause mmap to fail). Am I missing something?
Something really hinky is definitely going on here. The memory access failure isn't the usual KERN_INVALID_ADDRESS or KERN_PROTECTION_FAILURE, it's KERN_MEMORY_ERROR (code 10) which is documented as:
/* During a page fault, the memory object indicated
* that the data could not be returned. This failure
* may be temporary; future attempts to access this
* same data may succeed, as defined by the memory
* object.
*/
Sure enough, according to vmmap, the given address is mapped correctly (with read access) to the contents of the dSYM for the main executable, exactly what the program is intending. It turns out that file has size 0 and apparently OS X will happily entrap you by letting you mmap past the end of a file. Sure enough, the man page does not list that case as an error condition (it is on e.g. Linux).
I have no idea why (a) the dSYM is empty (b) loading libtest has anything to do with it (it's probably something more general) but it appears this isn't actually a memory safety issue (since the access will always either succeed or cause SIGBUS). The most immediate solutions are to either:
backtrace_get_view to check the file length before mapping the file. This is inherently racy, but ultimately not that dangerous since it'll just result in a clean (though confusing) SIGBUS.mmap based backtrace_get_view on macOS, since libbacktrace has a simple fallback that just reads from the file, though I half expect to get a SIGPIPE or something due to some obscure behavior.In any case I don't think https://github.com/rust-lang/rust/pull/45760 is necessary, since this is a clean crash.
Nice! It does seem weird that the dSYM is empty.
The SIGBUS doesn't happen upon mapping the file, but upon access, right? It seems that even if we add a length check in backtrace_get_view there's still an issue where something is performing an out of bounds read on the mapped region.
@sfackler Yes. The read is "in bounds" in the sense that (a) it is the right address and (b) there is a virtual memory mapping at that location that will prevent something else from getting mapped there. There's no danger that e.g. a malloc will use that memory for something else until munmap is called. It's essentially a guard page, with a SIGBUS instead of SIGSEGV. That said, I think the second option is better since producing panic backtraces is not a performance-sensitive area.
Ah, sure - I was thinking more of a "semantically" out of bounds read. The underlying file is N bytes, and libbacktrace is trying to ready byte N + 10 or whatever. It's either going to fault or you'll be reading bogus zeroed data.
I agree that turning off the mmap implementation is a reasonable approach (maybe everywhere?).
I've figured out (broadly) what produces the empty dSYM: any (AFAICT) compilation with rustc --test -g. The example I posted above was a fluke; if you run cargo test your target/debug directory now has an empty dSYM with the name of the test executable in it, which will cause this crash whenever any Rust executable in the same directory tries to produce a panic backtrace.
I opened up a separate issue for that problem here.
This also affects Servo (in a #[should_panic] unit test, on OS X). I can reproduce with nightly-2017-10-23 but not nightly-2017-10-22. Corresponding range: 4279e2b4c...8493813cd
Strangely, after seeing a failure if I downgrade the compiler to a normally not-affected version I鈥檒l still see the same failure until I run cargo clean.
@SimonSapin may be that was rust-lang/cargo#4699
The test run-pass/issue-45731.rs consistently fails on macOS 10.13 with Rust 1.22.1. (I have not checked 1.22.0, stable, or master. Please tell if I should.) Here is the output: run-pass-issue-45731.log. Discovered in https://github.com/NixOS/nixpkgs/pull/31982#issuecomment-346970578.
@orivej Hi could you file a new issue for this? The log doesn't indicate a SIGBUS happened.
Most helpful comment
Regression was introduced in this range -- https://github.com/rust-lang/rust/compare/4279e2b4c...8493813cd
Between
nightly-2017-10-22andnightly-2017-10-23