The test suites of a few crates are crashing with a SIGILL in the latest beta.
cc @PJB3005
Crate consistenttime also regressed (crater log).
cc @valarauca
It might be related to my crate not being thread safe. This is because the library is intended to be invoked from a proprietary C++ program (BYOND) using FFI, and said program isn't threaded at all, so I saw no need to make it thread safe and just used a mutable static.
I don't test the crate directly, but a library I made using the crate does and I did both RUST_TEST_THREADS=1 and --jobs 1 on the cargo test job for it: https://github.com/vgstation-coders/vgstation13/blob/2edf31f9d0aeea4045240571090156a8a6c88254/.travis.yml#L43
I tested your crate again with just one thread and it still crashes. Can you please manually test it?
Be sure to use the beta channel though: the error doesn't appear on stable (only beta and nightly), and the travis configuration you linked just tests on stable.
Ok yeah I can definitely confirm it's crashing due to a SIGILL on beta. How strange...
Ok I figured it out. Seems like panicking across an extern "C" causes the SIGILL. Example code causing it:
extern "C" fn foo() {
panic!("uh oh");
}
#[test]
fn does_it_break() {
foo()
}
Now whether or not this should be legal I'm not sure of, so if this was just an implementation detail I'll just remove the bad tests.
On top of this, looking at the other crate mentioned here, it definitely seems like that one also has tests checking a panic across an extern "C", so that's the same issue probably.
Panicking across FFI boundaries is definitely undefined behavior.
Yeah I don't doubt it, but on the other hand none of this is explicitly marked unsafe so I don't know how that fits in here.
This is caused by #46833 protecting from undefined behavior by aborting instead.
Huh... I would've thought you'd need unsafe to call any extern "C" function.
No reason to worry — abort is violent but memory safe. And it is now mentioned in the reference https://github.com/rust-lang-nursery/reference/pull/196
extern "C" only specifies the calling convention to use¹. I’m not sure we want to call that a "FFI boundary", as both caller and callee are (safe) Rust code, but it is also something that was never specified to not be a "FFI boundary", so breakage seems fine.
Crate mujs is also affected. cc @hean01
Crate seckey is also affected. cc @quininer
I believe this is expected -- @rust-lang/lang can you confirm that we are okay with this regression? To be clear, https://github.com/rust-lang/rust/pull/46833 made it so that we do not unwind across FFI boundaries, and instead abort. Currently inclined to close as these crates were depending on UB.
How hard would it be, in debug mode, to provide more detailed panic information in this case rather than just an abort?
FWIW, I'm fine with this being "regressed".
@joshtriplett @diwic or @nikomatsakis can probably comment on providing more information in debug mode. I think that sounds like something we could improve and add to... seems simple in theory, but may have implications I'm not aware of. There's some comments on #46833 I think.
This is expected behavior. The idea roughly is that specifying the ABI does indeed specify whether a panic is allowed -- it's not so much about the FFI boundary, as about the fact taht extern "C" functions don't have a defined way to propagate unwinding (particularly since our panic mechanism is an undefined impl detail, and hence specific to the Rust ABI).
Obviously it'd be nice to give a better error message.
Closing as "expected behavior" in this case =)
The easy answer is that the extern C ABI does not support unwinding and therefore it must be stopped. But that is not entirely accurate, because we do have a few extern C function that can start unwinding. These are marked with the #[unwind] attribute (this attribute is only available on nightly). If the unwind attribute is not present on an extern C function (and some other ABIs as well, such as e g stdcall), we tell LLVM that the function is a "nounwind" function, and unwinding from a function that is "nounwind" - that is, my friends, undefined behaviour.
Now, we could have just told LLVM that no functions are "nounwind". But - apart from possible negative impact on optimisations and/or code size - that would be a against the practical goal here, which is to avoid as many panics into FFI land as possible, where the panic could cause all sorts of unexpected problems.