When I build Firefox on Solaris SPARC it fails with:
Compiling gkrust v0.1.0 (/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/toolkit/library/rust)
Running `CARGO=/builds/psumbera/rustc-1.44.1/bin/cargo CARGO_MANIFEST_DIR=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/toolkit/library/rust CARGO_PKG_AUTHORS='[email protected]' CARGO_PKG_DESCRIPTION='Rust code for libxul' CARGO_PKG_HOMEPAGE= CARGO_PKG_NAME=gkrust CARGO_PKG_REPOSITORY= CARGO_PKG_VERSION=0.1.0 CARGO_PKG_VERSION_MAJOR=0 CARGO_PKG_VERSION_MINOR=1 CARGO_PKG_VERSION_PATCH=0 CARGO_PKG_VERSION_PRE= LD_LIBRARY_PATH='/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/release/deps:/builds/psumbera/rustc-1.44.1/lib' /builds/psumbera/rustc-1.44.1/bin/rustc --crate-name gkrust toolkit/library/rust/lib.rs --error-format=json --json=diagnostic-rendered-ansi --crate-type staticlib --emit=dep-info,link -C opt-level=2 -C panic=abort -C codegen-units=1 -C lto --cfg 'feature="bindgen"' --cfg 'feature="cubeb_pulse_rust"' --cfg 'feature="moz_memory"' --cfg 'feature="moz_places"' --cfg 'feature="quantum_render"' --cfg 'feature="servo"' -C metadata=02c56202166b3eec -C extra-filename=-02c56202166b3eec --out-dir /builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps --target sparcv9-sun-solaris -C linker=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/build/cargo-linker -L dependency=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps -L dependency=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/release/deps --extern gkrust_shared=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libgkrust_shared-9712e354bc55934b.rlib --extern mozilla_central_workspace_hack=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libmozilla_central_workspace_hack-4044d7b7b8cbdafc.rlib -C opt-level=2 --cap-lints warn -L 
native=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/build/lmdb-rkv-sys-c379f0737c738302/out`
error: could not compile `gkrust`.
Caused by:
process didn't exit successfully: `CARGO=/builds/psumbera/rustc-1.44.1/bin/cargo CARGO_MANIFEST_DIR=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/toolkit/library/rust CARGO_PKG_AUTHORS='[email protected]' CARGO_PKG_DESCRIPTION='Rust code for libxul' CARGO_PKG_HOMEPAGE= CARGO_PKG_NAME=gkrust CARGO_PKG_REPOSITORY= CARGO_PKG_VERSION=0.1.0 CARGO_PKG_VERSION_MAJOR=0 CARGO_PKG_VERSION_MINOR=1 CARGO_PKG_VERSION_PATCH=0 CARGO_PKG_VERSION_PRE= LD_LIBRARY_PATH='/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/release/deps:/builds/psumbera/rustc-1.44.1/lib' /builds/psumbera/rustc-1.44.1/bin/rustc --crate-name gkrust toolkit/library/rust/lib.rs --error-format=json --json=diagnostic-rendered-ansi --crate-type staticlib --emit=dep-info,link -C opt-level=2 -C panic=abort -C codegen-units=1 -C lto --cfg 'feature="bindgen"' --cfg 'feature="cubeb_pulse_rust"' --cfg 'feature="moz_memory"' --cfg 'feature="moz_places"' --cfg 'feature="quantum_render"' --cfg 'feature="servo"' -C metadata=02c56202166b3eec -C extra-filename=-02c56202166b3eec --out-dir /builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps --target sparcv9-sun-solaris -C linker=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/build/cargo-linker -L dependency=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps -L dependency=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/release/deps --extern gkrust_shared=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libgkrust_shared-9712e354bc55934b.rlib --extern mozilla_central_workspace_hack=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libmozilla_central_workspace_hack-4044d7b7b8cbdafc.rlib -C opt-level=2 --cap-lints warn 
-L native=/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/build/lmdb-rkv-sys-c379f0737c738302/out` (signal: 11, SIGSEGV: invalid memory reference)
make[4]: *** [/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/firefox-68.10.0/config/makefiles/rust.mk:240: force-cargo-library-build] Error 101
make[4]: Leaving directory '/builds/psumbera/userland-ff-68.10.0/components/desktop/firefox/build/sparcv9/toolkit/library/rust'
This worked well up to Rust 1.42; since Rust 1.43 it fails. There is no problem on x86 Solaris.
I was able to bisect this to the following change:
e82ec2315e5adb1c291c3702cd2ac1f46ecd0fcf is the first bad commit
commit e82ec2315e5adb1c291c3702cd2ac1f46ecd0fcf
Author: Dylan MacKenzie <[email protected]>
Date: Tue Mar 3 11:26:51 2020 -0800
Use correct place for `enum_place`
PR #69562, which fixed a bug that was causing clippy to ICE, passed the
place for the *result* of `Rvalue::Discriminant` instead of the
*operand* to `apply_discriminant_switch_effect`. As a result, no effect
was applied at all, and we lost the perf benefits from marking
inactive enum variants as uninitialized.
https://github.com/rust-lang/rust/commit/e82ec2315e5adb1c291c3702cd2ac1f46ecd0fcf
@ecstatic-morse any idea what could be wrong with it on SPARC (a 64-bit big-endian system)?
A core file is also generated:
0007fdd75ddddcf1 libc.so.1`_memcpy%sun4v-hwcap3+0x108(9fd94b2180, 10, 1, 10, 7fdd774a58000, 0)
0007fdd75dddde51 libstd-18d6dd3f485e207d.so`_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..SpecIntoVec$GT$::into_vec::h866051ec29816e57+0xb4(7fdd75dddec00, 1, 10, 0, 0, 9fd94b2210)
0007fdd75ddddf31 librustc_driver-64d0fcc570abb6ec.so`rustc_codegen_llvm::back::lto::run_fat::hcb53787b9ab702b0+0xfe4(9fd945aac0, 2, 7fdd75dddec00, 9fd945aaf8, 7fdd75dddeb68, 7fdd75dddeb61)
0007fdd75ddde431 librustc_driver-64d0fcc570abb6ec.so`rustc_codegen_ssa::back::write::generate_lto_work::h29d8f36a78087208+0xbc(7fdd75dddf370, 7fdd75dddefc8, 7fdd75dddf4f0, 7fdd75dddf3d0,
7fdd75dddf5c0, 9fda425120)
0007fdd75ddde5d1 librustc_driver-64d0fcc570abb6ec.so`std::sys_common::backtrace::__rust_begin_short_backtrace::h122f5d501f74f5dc+0x400(7fdd75dddf370, 7fdd75dddf5c8, 7fdd75dddf3d8, 0, 0,
7fdd75dddf5f8)
0007fdd75dddeee1 librustc_driver-64d0fcc570abb6ec.so`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h784dca9a9559144b+0x98(7fdd75dddfc28, 7fdd75dddfb40, 9fda3fce90,
7fdd75dddf870, 168, 0)
0007fdd75dddf5a1 libstd-18d6dd3f485e207d.so`std::sys::unix::thread::Thread::new::thread_start::h4f6832cd8c1f06fc+0x20(7fdd774004000, 7fdd77871be94, 9fda3e2c90, 9fda3fce90, 7fdd77e5558b0, 0)
0007fdd75dddf671 libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
This is concerning. To confirm, this occurs with recent nightlies as well?
The commit you bisected to (part of #69676) re-enabled the optimization around drop elaboration in #68528, which could have plausibly caused this kind of issue. There's nothing target-dependent about that PR, so while it's possible that #68528 triggered a SPARC-only LLVM bug, it's very possible that there's a latent miscompilation on other platforms as well that manifests as a segmentation fault due to some detail of SPARC. However, it's surprising that we didn't see any issues from users on Intel machines.
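To make the discussion a bit more concrete, here is a minimal sketch of the kind of code this optimization reasons about: a `match` on an enum where only one variant owns a value with a destructor. All names (`Payload`, `Slot`, `consume`) are hypothetical and chosen for illustration; this is not a reproducer for the crash, only a picture of what "marking inactive enum variants as uninitialized" refers to.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how many `Payload` values have been dropped on this thread.
    static DROPS: Cell<u32> = Cell::new(0);
}

/// Hypothetical type with a destructor, standing in for any `Drop` payload.
struct Payload(u32);

impl Drop for Payload {
    fn drop(&mut self) {
        DROPS.with(|d| d.set(d.get() + 1));
    }
}

/// Hypothetical enum: only `Full` owns something that needs dropping.
enum Slot {
    Full(Payload),
    Empty,
}

fn consume(slot: Slot) -> u32 {
    match slot {
        // The payload is moved out of `slot` and dropped at the end of this arm.
        Slot::Full(p) => p.0,
        // The discriminant says `Empty`, so the `Full` payload cannot be
        // live here and no destructor has to run for it in this arm.
        Slot::Empty => 0,
    }
}

/// Number of `Payload` destructors that have run so far on this thread.
fn drops() -> u32 {
    DROPS.with(|d| d.get())
}

fn main() {
    assert_eq!(consume(Slot::Full(Payload(7))), 7);
    assert_eq!(drops(), 1);
    assert_eq!(consume(Slot::Empty), 0);
    assert_eq!(drops(), 1);
}
```

The observable behaviour must be the same with or without the optimization: exactly one destructor call for `Full`, none for `Empty`. A miscompilation in this area would typically show up as a missing or duplicated drop.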
The innermost frame points to this implementation detail of `CString::new`. There's no `match` on a type that has a drop implementation along that code path, so presumably the root cause is elsewhere. My next step for debugging this would probably be ASan, although I haven't had to use it in anger on Rust code yet. WDYT @psumbera?
Nominating for @rust-lang/compiler discussion to put this on their radar, since it could affect tier 1 platforms as well.
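For readers unfamiliar with the frame names in the backtrace: the `SpecIntoVec` frame belongs to `CString::new` called with a `&str`, which, as far as I can tell, copies the string's bytes into a fresh buffer and appends a trailing NUL. A minimal sketch of that call as seen from the caller's side follows; the helper name and the input strings are made up for illustration and are not taken from `run_fat`.

```rust
use std::ffi::CString;

/// Hypothetical helper: converts a name to a C string, returning `None`
/// if the input contains an interior NUL byte.
fn to_c_string(name: &str) -> Option<CString> {
    CString::new(name).ok()
}

fn main() {
    let c = to_c_string("gkrust").expect("no interior NUL");
    // The original bytes are copied unchanged...
    assert_eq!(c.as_bytes(), &b"gkrust"[..]);
    // ...and a single NUL terminator is appended.
    assert_eq!(c.as_bytes_with_nul(), &b"gkrust\0"[..]);
    // Interior NUL bytes are rejected rather than truncated.
    assert!(to_c_string("bad\0name").is_none());
}
```

Nothing on this path matches on an enum with a destructor, which is why a crash inside the copy suggests that the arguments handed to `CString::new` were already corrupted by the caller.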
> This is concerning. To confirm, this occurs with recent nightlies as well?
The latest version I have tested is Rust 1.44.1. With the commit reverted, I can use it to build Firefox.
I will try the latest nightly build as well, but I will need to build Rust 1.45 first (there are no bootstrap archives for Solaris).
> The innermost frame points to this implementation detail of `CString::new`. There's no `match` on a type that has a drop implementation along that code path, so presumably the root cause is elsewhere. My next step for debugging this would probably be ASan, although I haven't had to use it in anger on Rust code yet. WDYT @psumbera?
Unfortunately, my knowledge of Rust and its internals is very limited. Plus I believe ASan is not available for SPARC.
> I will try the latest nightly build as well, but I will need to build Rust 1.45 first (there are no bootstrap archives for Solaris).
I'm not able to build it now because of #74628.
This kind of issue is often caught by LLVM assertions, so as a starting step I would recommend enabling them if you haven't tried that already, possibly together with debuginfo=1 for a more complete backtrace.
Assigning P-medium as discussed as part of the Prioritization Working Group procedure and removing I-prioritize.
> Plus I believe ASan is not available for SPARC.
Oof. This is gonna be a tough one to figure out.
> This kind of issue is often caught by LLVM assertions.
CI builds and tests rustc with LLVM assertions enabled for all tier 1 platforms. I'd be surprised if we were violating an assertion late enough in the LLVM pipeline that it only triggered on SPARC, but it doesn't hurt to check.
Self-assigning to investigate whether it's reasonable to replicate this atop the GCC build farm.
Here's a datapoint for you:
I have exactly the same issue on Linux/PPC64 (big endian) with rust 1.44.1. Reverting e82ec23 fixes the problem for me as well.
@zeldin when you say "exactly the same problem", are you also attempting to build firefox and seeing this failure on the gkrust crate?
Or do you have a smaller example test case to illustrate the problem on Linux/PPC64?
@pnkfelix Yes, I'm attempting to build firefox, and getting the SIGSEGV building gkrust. After rebuilding rust 1.44.1 with e82ec23 reverted, I can compile firefox without problems. I have not been able to reproduce the issue with a smaller test case.
@zeldin okay, thanks. While I was hoping you'd have a smaller test case, the info you have given is nonetheless very helpful, since I think I'll have an easier time investigating this on PPC64, which is (I believe) a higher tier platform than SPARC Solaris. (At the very least, PPC64 has rustup-support...)
Just for the record: since there is no ASan for SPARC, I tried running the problematic rustc command with ADISTACK and ADIHEAP enabled (https://docs.oracle.com/cd/E37838_01/html/E61021/sysauth-adistack.html).
Enabling or disabling them makes no difference:
v4v-t7-1j-prg06 11:34 /builds/psumbera/userland/components/desktop/firefox/firefox-68.11.0: /builds/psumbera/rustc-1.44.0/bin/rustc --crate-name gkrust toolkit/library/rust/lib.rs --error-format=json --json=diagnostic-rendered-ansi --crate-type staticlib --emit=dep-info,link -C opt-level=2 -C panic=abort -C codegen-units=1 -C lto --cfg 'feature="bindgen"' --cfg 'feature="cubeb_pulse_rust"' --cfg 'feature="moz_memory"' --cfg 'feature="moz_places"' --cfg 'feature="quantum_render"' --cfg 'feature="servo"' -C metadata=02c56202166b3eec -C extra-filename=-02c56202166b3eec --out-dir /builds/psumbera/userland/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps --target sparcv9-sun-solaris -C linker=/builds/psumbera/userland/components/desktop/firefox/firefox-68.11.0/build/cargo-linker -L dependency=/builds/psumbera/userland/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps -L dependency=/builds/psumbera/userland/components/desktop/firefox/build/sparcv9/release/deps --extern gkrust_shared=/builds/psumbera/userland/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libgkrust_shared-9712e354bc55934b.rlib --extern mozilla_central_workspace_hack=/builds/psumbera/userland/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/deps/libmozilla_central_workspace_hack-4044d7b7b8cbdafc.rlib -C opt-level=2 --cap-lints warn -L native=/builds/psumbera/userland/components/desktop/firefox/build/sparcv9/sparcv9-sun-solaris/release/build/lmdb-rkv-sys-c379f0737c738302/out
Segmentation Fault (core dumped)
v4v-t7-1j-prg06 11:34 /builds/psumbera/userland/components/desktop/firefox/firefox-68.11.0: mdb core
Loading modules: [ libc.so.1 ld.so.1 ]
rustc:core> $G
C++ symbol demangling enabled
rustc:core> $C
001ffc587e7edd01 libc.so.1`_memcpy%sun4v-hwcap4+0x178(800000f72ce636e0, 2, 10, 1, 10, 800000f72ce636e0)
001ffc587e7eddb1 libstd-c4661e6d7c8d6da6.so`core::slice::_$LT$impl$u20$$u5b$T$u5d$$GT$::copy_from_slice::h166d66d659ae0c54+0x34(800000f72ce636e0, 10, 1, 10, 1ffc5888e5a000, 0)
001ffc587e7edf11 libstd-c4661e6d7c8d6da6.so`_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..SpecIntoVec$GT$::into_vec::h4795adcfb8bd244e+0xb4(1ffc587e7eecc0, 1, 10, 0, 0,
b00000f72ed3cfe0)
001ffc587e7edff1 librustc_driver-3c8c0481c1cf7450.so`rustc_codegen_llvm::back::lto::run_fat::ha2464fbe0a02f1ea+0xfe4(c00000f72b235990, 2, 1ffc587e7eecc0, c00000f72b2359c8, 1ffc587e7eec28,
1ffc587e7eec21)
001ffc587e7ee4f1 librustc_driver-3c8c0481c1cf7450.so`rustc_codegen_ssa::back::write::generate_lto_work::h2a65c10501651b11+0xbc(1ffc587e7ef430, 1ffc587e7ef088, 1ffc587e7ef5b0, 1ffc587e7ef490,
1ffc587e7ef680, b00000f72d2b3210)
001ffc587e7ee691 librustc_driver-3c8c0481c1cf7450.so`std::sys_common::backtrace::__rust_begin_short_backtrace::h9e597b6c1402aedf+0x400(1ffc587e7ef430, 1ffc587e7ef688, 1ffc587e7ef498, 0, 0,
1ffc587e7ef6b8)
001ffc587e7eefa1 librustc_driver-3c8c0481c1cf7450.so`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h89fb6c455081aef0+0x98(1ffc587e7efce8, 1ffc587e7efc00,
600000f72d28f5b0, 1ffc587e7ef930, 168, 0)
001ffc587e7ef661 libstd-c4661e6d7c8d6da6.so`std::sys::unix::thread::Thread::new::thread_start::h363b2080c72fef07+0x20(1ffc5888e86000, 1ffc588cb19bc4, b00000f72d28f770, 600000f72d28f5b0,
1ffc5892985470, 0)
001ffc587e7ef731 libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
rustc:core>
Same issue with Rust 1.45.2 on ArchPOWER (ppc64le) with Firefox 79.0 (both from release tarballs).
Also hit this on gentoo/ppc64le. I stumbled across this fedora bug which led me to this LLVM patch. After applying the patch to llvm and rebuilding rust, firefox and gkrust build successfully for me.
@shawnanastasio is this patch committed in LLVM 11?
Rust nightly uses LLVM 11, so it should no longer be affected.
@shawnanastasio This patch does not fix the issue for me on big endian PPC64 (and is of course unlikely to fix the OP's problem on Solaris/SPARC). Still, it's probably a good idea to apply it if using PPC64 as a platform to look for the issue, so as not to go down the wrong rabbit hole. :-)
An update for anyone following this:
`-Zprecise-enum-drop-elaboration=no`. However, we soon learned that this family of optimizations was not the root cause of the issue. In fact, a recent change had caused it to manifest regardless of whether #68528 was disabled. We're (well, mostly "they're" :smile:) still working to diagnose the exact cause.
It's possible that something similar is occurring on SPARC and PowerPC architectures. I'm not sure how hard it is to compile Firefox with custom RUSTFLAGS, but it would be nice to know if the miscompilation occurs on the latest nightlies with #68528 disabled.
@ecstatic-morse Tested on ppc64 with 1.49.0-nightly (25c8c53dd 2020-10-03), building firefox 79.0.
I configured with
../configure --with-clang-path=/usr/lib/llvm/10/bin/clang --with-libclang-path=/usr/lib/llvm/10/lib64/ CARGO=/tmp/rust-nigthly/bin/cargo RUSTC=/tmp/rust-nigthly/bin/rustc
However, the build broke on something unrelated:
error: expected literal
--> servo/components/style_traits/viewport.rs:12:1
|
12 | / define_css_keyword_enum! {
13 | | pub enum UserZoom {
14 | | Zoom = "zoom",
15 | | Fixed = "fixed",
16 | | }
17 | | }
| |_^
|
= note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
Not sure what is the best way to proceed here...
Also, it seems that firefox-80.0.1 can be built with rust-1.45.2 without any patches. Whether this is due to changes in rust or in firefox I do not know.
(This is the error message I get if I add RUSTFLAGS="-Z macro-backtrace" to the configure flags.)
error: expected literal
--> /tmp/firefox-79.0/third_party/rust/cssparser/src/macros.rs:52:27
|
36 | / macro_rules! match_ignore_ascii_case {
37 | | ( $input:expr,
38 | | $(
39 | | $( #[$meta: meta] )*
... |
52 | | $( $( $pattern )+ )+
| | ^^^^^^^^
... |
65 | | };
66 | | }
| |_- in this expansion of `match_ignore_ascii_case!` (#2)
|
::: servo/components/style_traits/viewport.rs:12:1
|
12 | / define_css_keyword_enum! {
13 | | pub enum UserZoom {
14 | | Zoom = "zoom",
15 | | Fixed = "fixed",
16 | | }
17 | | }
| |_- in this macro invocation (#1)
|
::: servo/components/style_traits/values.rs:468:1
|
468 | / macro_rules! define_css_keyword_enum {
469 | | (pub enum $name:ident { $($variant:ident = $css:expr,)+ }) => {
470 | | #[allow(missing_docs)]
471 | | #[cfg_attr(feature = "servo", derive(Deserialize, Serialize))]
... |
499 | / match_ignore_ascii_case! { ident,
500 | $($css => Ok($name::$variant),)+
501 | _ => Err(())
502 | | }
| |_________________- in this macro invocation (#2)
...
519 | | };
520 | | }
| |_- in this expansion of `define_css_keyword_enum!` (#1)
@zeldin: That looks like an issue caused by one of the recent fixes in the proc-macro system (None-delimited groups are now preserved in more places). I don't know how the firefox build system works, but I would try updating syn, proc_macro_hack, procedural-masquerade, or cssparser.
So, in order to sort out what is what, I tried downloading binaries of 1.44.1 and 1.45.2 from https://static.rust-lang.org/dist/ and using them to build the same firefox 79.0 tree.
options '-C embed-bitcode=no' and '-C lto' are incompatible in my face
--enable-rust-debug to the configure options, then 1.45.2 builds gkrust without complaining or crashing :disappointed:
Ha, that annoying error actually seems to have been a clue:
hakua:/tmp% cat hello.rs
fn main() {
println!("Hello, world!");
}
hakua:/tmp% /tmp/rust-1.44.1/bin/rustc -C lto hello.rs
Segmentation fault
hakua:/tmp% /tmp/rust-1.45.2/bin/rustc -C lto hello.rs
hakua:/tmp% /tmp/rust-nigthly/bin/rustc -C lto hello.rs
hakua:/tmp%
So -C lto triggers the ICE with 1.44.1, but seems ok with 1.45.2 and nightly.
Bisecting 1.45 shows that this is the commit that fixed the issue:
commit 9f128235b49199f766f40df08c8a7eb25e143ae9 (refs/bisect/god)
Author: Nikita Popov <[email protected]>
Date: Tue Dec 31 15:46:46 2019 +0100
Update LLVM submodule
So, an LLVM bug then.
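For anyone who wants to repeat the `hello.rs` check across several toolchains, a small driver along these lines may help. It is only a sketch: the function names `describe` and `try_lto` are made up, the toolchain paths simply mirror the session above, and it is Unix-only because it relies on `ExitStatusExt::signal`.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::{Command, ExitStatus};

/// Describes how a child process ended: by exit code or by signal.
fn describe(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        (Some(code), _) => format!("exit code {}", code),
        (None, Some(sig)) => format!("signal {}", sig),
        (None, None) => "unknown".to_string(),
    }
}

/// Runs `<rustc> -C lto <source>` and reports the outcome.
fn try_lto(rustc: &str, source: &str) -> std::io::Result<String> {
    let status = Command::new(rustc).args(&["-C", "lto", source]).status()?;
    Ok(describe(status))
}

fn main() {
    // Assumed locations, taken from the shell session above; adjust as needed.
    let toolchains = ["/tmp/rust-1.44.1/bin/rustc", "/tmp/rust-1.45.2/bin/rustc"];
    for rustc in toolchains.iter() {
        match try_lto(rustc, "hello.rs") {
            Ok(outcome) => println!("{}: {}", rustc, outcome),
            Err(err) => println!("{}: could not run ({})", rustc, err),
        }
    }
}
```

A compiler that segfaults is reported as `signal 11`, matching the `signal: 11, SIGSEGV` that Cargo printed in the original report, while an ordinary compile error shows up as a non-zero exit code.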
Based on the above comment:
@rustbot modify labels: A-LLVM
> This is concerning. To confirm, this occurs with recent nightlies as well?
> The commit you bisected to (part of #69676) re-enabled the optimization around drop elaboration in #68528, which could have plausibly caused this kind of issue. There's nothing target-dependent about that PR, so while it's possible that #68528 triggered a SPARC-only LLVM bug, it's very possible that there's a latent miscompilation on other platforms as well that manifests as a segmentation fault due to some detail of SPARC. However, it's surprising that we didn't see _any_ issues from users on Intel machines.
> The innermost frame points to this implementation detail of `CString::new`. There's no `match` on a type that has a drop implementation along that code path, so presumably the root cause is elsewhere. My next step for debugging this would probably be ASan, although I haven't had to use it in anger on Rust code yet. WDYT @psumbera?
> Nominating for @rust-lang/compiler discussion to put this on their radar, since it could affect tier 1 platforms as well.
I'm seeing this bug when building firefox 1.43.0 from source on ubuntu focal (20.04) for amd64, so agreeing that this is not target-dependent.