On any recent MSVC nightly, compiling with the release profile and RUSTFLAGS = "-C target-cpu=native" results in either STATUS_ACCESS_VIOLATION or STATUS_HEAP_CORRUPTION, depending on the crate. Many crates work, but others don't. Among those that fail, some use SIMD. target-cpu=native resolves to target-cpu=znver1 on my machine.
This seems to be related to #63361 and the LLVM upgrade, again. It does not happen when target-cpu is not set.
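One way to see what the flag actually changes is to print the target features a binary was compiled with, once with RUSTFLAGS set and once without. The following is a minimal sketch, not part of the original report; the feature list is an arbitrary selection and compiled_features is a hypothetical helper name.

```rust
/// Returns a few x86 target features together with whether this binary
/// was compiled with them enabled, as seen through `cfg!(target_feature)`.
/// The selection of features is illustrative only.
fn compiled_features() -> Vec<(&'static str, bool)> {
    vec![
        ("sse2", cfg!(target_feature = "sse2")),
        ("sse4.2", cfg!(target_feature = "sse4.2")),
        ("avx", cfg!(target_feature = "avx")),
        ("avx2", cfg!(target_feature = "avx2")),
        ("fma", cfg!(target_feature = "fma")),
        ("bmi2", cfg!(target_feature = "bmi2")),
    ]
}

fn main() {
    // Build this twice, with and without `-C target-cpu=...` in RUSTFLAGS,
    // and compare the two listings.
    for (name, enabled) in compiled_features() {
        println!("{:<8} {}", name, if enabled { "enabled" } else { "disabled" });
    }
}
```

Comparing the two listings shows which instruction set extensions the optimizer is allowed to use for every crate in the build once the flag is set.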
Everything works on https://github.com/rust-lang/rust/commit/07e0c3651ce2a7b326f7678e135d8d5bbbbe5d18 but fails after https://github.com/rust-lang/rust/commit/38798c6d68394874686dfa3d03e56e12a3ff3d54, same as the aforementioned issue. I am not sure how to reproduce it in a single crate, but I will look into it.
LLVM 9 just doesn't like AMD.
Although, another issue of mine: https://github.com/CraneStation/cranelift/issues/900 also fails in a similar manner before the LLVM upgrade, so it's worth noting.
Here is the verbose build output:
PS F:\code\projects\active\raygon\private\raygon-test> cargo +nightly-msvc build --verbose --release
Fresh unicode-xid v0.2.0
Fresh semver-parser v0.7.0
Fresh cc v1.0.40
Fresh autocfg v0.1.6
Fresh lazy_static v1.3.0
Fresh nodrop v0.1.13
Fresh unicode-xid v0.1.0
Fresh cfg-if v0.1.9
Fresh scopeguard v1.0.0
Fresh version_check v0.1.5
Fresh rustc-demangle v0.1.16
Fresh ppv-lite86 v0.2.5
Fresh ieee754 v0.2.6
Fresh itoa v0.4.4
Fresh either v1.5.2
Fresh adler32 v1.0.3
Fresh rand_core v0.4.2
Fresh copyless v0.1.4
Fresh bytecount v0.4.0
Fresh color_quant v1.0.1
Fresh lzw v0.10.0
Fresh quote v0.3.15
Fresh glob v0.2.11
Fresh scoped_threadpool v0.1.9
Fresh inflections v1.1.1
Fresh take_mut v0.2.2
Fresh rle-decode-fast v1.0.1
Fresh arc-swap v0.3.11
Fresh bytesize v1.0.0
Fresh linked-hash-map v0.5.2
Fresh regex-syntax v0.6.11
Fresh float-ord v0.2.0
Fresh crossbeam v0.2.12
Fresh tobj v0.1.10
Fresh crossbeam-utils v0.6.6
Fresh c2-chacha v0.2.2
Fresh fast-math v0.1.1
Fresh inflate v0.3.4
Fresh inflate v0.4.5
Fresh lock_api v0.3.1
Fresh thread_local v0.3.6
Fresh proc-macro2 v1.0.1
Fresh libc v0.2.62
Fresh arrayvec v0.4.11
Fresh proc-macro2 v0.4.30
Fresh winapi v0.3.7
Fresh getrandom v0.1.11
Fresh rand_core v0.3.1
Fresh peg v0.5.7
Fresh gif v0.9.2
Fresh gif v0.10.2
Fresh quote v1.0.2
Fresh quote v0.6.13
Fresh backtrace-sys v0.1.31
Fresh ryu v1.0.0
Fresh winapi-util v0.1.2
Fresh num_cpus v1.10.1
Fresh crossbeam-queue v0.1.2
Fresh rand_core v0.5.0
Fresh byteorder v1.3.2
Fresh bitflags v1.1.0
Fresh rand v0.4.6
Fresh packed_simd v0.3.3
Fresh remove_dir_all v0.5.2
Fresh typenum v1.10.0
Fresh rand_os v0.1.3
Fresh rand_jitter v0.1.4
Fresh crc32fast v1.2.0
Fresh time v0.1.42
Fresh crossbeam-channel v0.3.9
Fresh dirs v1.0.5
Fresh clocksource v0.5.0
Fresh atty v0.2.13
Fresh num-format v0.4.0
Fresh syn v1.0.4
Fresh num-traits v0.2.8
Fresh backtrace v0.3.35
Fresh syn v0.15.44
Fresh same-file v1.0.5
Fresh rand_chacha v0.2.1
Fresh pulldown-cmark v0.2.0
Fresh tempdir v0.3.7
Fresh deflate v0.7.20
Fresh rand_chacha v0.1.1
Fresh rand_hc v0.1.0
Fresh rand_xorshift v0.1.1
Fresh rand_xoshiro v0.3.1
Fresh rand_isaac v0.1.1
Fresh rand_pcg v0.1.2
Fresh memchr v2.2.1
Fresh slog v2.5.2
Fresh rand v0.3.23
Fresh log v0.4.8
Fresh base64 v0.10.1
Fresh libflate v0.1.27
Fresh serde_derive v1.0.99
Fresh proc-macro-hack v0.5.9
Fresh deepsize_derive v0.1.1 (F:\code\projects\active\raygon\private\deps\deepsize\deepsize_derive)
Fresh error-chain v0.12.1
Fresh num-integer v0.1.41
Fresh num-traits v0.1.43
Fresh raygon-core v0.1.0 (F:\code\projects\active\raygon\private\raygon-core)
Fresh walkdir v2.2.9
Fresh rand v0.7.0
Fresh num-derive v0.2.5
Fresh png v0.15.0
Fresh rand v0.6.5
Fresh approx v0.3.2
Fresh gltf-derive v0.12.0
Fresh lifecycle-derive v0.1.0 (F:\code\projects\active\raygon\private\deps\lifecycle-derive)
Fresh term v0.5.2
Fresh slog-scope v4.1.2 (F:\code\projects\active\raygon\private\deps\slog-scope)
Fresh aho-corasick v0.7.6
Fresh fbxcel v0.4.4
Fresh slog-async v2.3.0
Fresh log v0.3.9
Fresh random_color v0.4.4
Fresh serde v1.0.99
Fresh paste-impl v0.1.6
Fresh num-iter v0.1.39
Fresh enum_primitive v0.1.1
Fresh rand_distr v0.2.1
Fresh num-rational v0.1.42
Fresh tiff v0.3.1
Fresh num-rational v0.2.2
Fresh lifecycle v0.1.0 (F:\code\projects\active\raygon\private\deps\lifecycle)
Fresh chrono v0.4.7
Fresh cgmath v0.17.0
Fresh regex v1.2.1
Fresh slog-stdlog v3.0.5
Fresh semver v0.9.0
Fresh half v1.3.0
Fresh paste v0.1.6
Fresh serde_json v1.0.40
Fresh smallvec v0.6.10
Fresh generic-array v0.13.2
Fresh png v0.11.0
Fresh bitflags_serde_shim v0.2.1
Fresh png v0.14.1
Fresh slog-term v2.4.1
Fresh rustc_version v0.2.3
Fresh deepsize v0.1.2 (F:\code\projects\active\raygon\private\deps\deepsize)
Compiling cargo_metadata v0.6.4
Fresh expr v0.1.0 (F:\code\projects\active\expr)
Fresh numeric-array v0.4.1
Compiling gltf-json v0.12.0
Fresh serde_shims v0.2.1
Running `rustc --crate-name cargo_metadata C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\cargo_metadata-0.6.4\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"backtrace\"" --cfg "feature=\"default\"" -C metadata=ef5f1281a4361110 -C extra-filename=-ef5f1281a4361110 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern error_chain=F:\code\projects\active\raygon\private\target\release\deps\liberror_chain-f51a2bcf3c86d64a.rmeta --extern semver=F:\code\projects\active\raygon\private\target\release\deps\libsemver-c876c4f33d575b41.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native`
Running `rustc --edition=2018 --crate-name gltf_json C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\gltf-json-0.12.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"extras\"" --cfg "feature=\"names\"" -C metadata=9ff3e20903dd6c8c -C extra-filename=-9ff3e20903dd6c8c --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern gltf_derive=F:\code\projects\active\raygon\private\target\release\deps\gltf_derive-4645197b1e4f4fc9.dll --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native`
Compiling raygon-geometry v0.1.0 (F:\code\projects\active\raygon\private\raygon-geometry)
Fresh memoffset v0.5.1
Fresh parking_lot_core v0.6.2
Running `rustc --edition=2018 --crate-name raygon_geometry raygon-geometry\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=39ac77e6fba3b64e -C extra-filename=-39ac77e6fba3b64e --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern copyless=F:\code\projects\active\raygon\private\target\release\deps\libcopyless-dd1647a18b2d8872.rmeta --extern deepsize=F:\code\projects\active\raygon\private\target\release\deps\libdeepsize-feae989679bc200b.rmeta --extern expr=F:\code\projects\active\raygon\private\target\release\deps\libexpr-803b5b1429934623.rmeta --extern fast_math=F:\code\projects\active\raygon\private\target\release\deps\libfast_math-63589e7c60336843.rmeta --extern half=F:\code\projects\active\raygon\private\target\release\deps\libhalf-6ff4dff974c834ba.rmeta --extern ieee754=F:\code\projects\active\raygon\private\target\release\deps\libieee754-ce756a49f0860dff.rmeta --extern num_traits=F:\code\projects\active\raygon\private\target\release\deps\libnum_traits-7a18e0f1ba17eb3e.rmeta --extern packed_simd=F:\code\projects\active\raygon\private\target\release\deps\libpacked_simd-67697b4169867fe9.rmeta --extern raygon_core=F:\code\projects\active\raygon\private\target\release\deps\libraygon_core-89ce21278a0834d2.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta -C target-cpu=native`
Fresh crossbeam-epoch v0.7.2
Compiling parking_lot v0.9.0
Compiling crossbeam-deque v0.6.3
Compiling crossbeam-deque v0.7.1
Running `rustc --edition=2018 --crate-name parking_lot C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\parking_lot-0.9.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"nightly\"" -C metadata=f5c96506dcb815d8 -C extra-filename=-f5c96506dcb815d8 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern lock_api=F:\code\projects\active\raygon\private\target\release\deps\liblock_api-759d6fb025c0d123.rmeta --extern parking_lot_core=F:\code\projects\active\raygon\private\target\release\deps\libparking_lot_core-065b8a044fbf420f.rmeta --cap-lints allow -C target-cpu=native --cfg has_sized_atomics --cfg has_checked_instant`
Running `rustc --crate-name crossbeam_deque C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-deque-0.6.3\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=7ee3d9f5a4b9f93d -C extra-filename=-7ee3d9f5a4b9f93d --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
Running `rustc --crate-name crossbeam_deque C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-deque-0.7.1\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=5980eed31e4953ff -C extra-filename=-5980eed31e4953ff --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
Compiling rayon-core v1.5.0
Running `rustc --crate-name rayon_core C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\rayon-core-1.5.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=25c3622dd04447e5 -C extra-filename=-25c3622dd04447e5 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-7ee3d9f5a4b9f93d.rmeta --extern crossbeam_queue=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_queue-ef5b77e6d85fb6eb.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --extern lazy_static=F:\code\projects\active\raygon\private\target\release\deps\liblazy_static-edeb315d8a13eb54.rmeta --extern num_cpus=F:\code\projects\active\raygon\private\target\release\deps\libnum_cpus-0a772d2f62861421.rmeta --cap-lints allow -C target-cpu=native`
Compiling crossbeam v0.7.2
Running `rustc --crate-name crossbeam C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-0.7.2\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"crossbeam-channel\"" --cfg "feature=\"crossbeam-deque\"" --cfg "feature=\"crossbeam-queue\"" --cfg "feature=\"default\"" --cfg "feature=\"nightly\"" --cfg "feature=\"std\"" -C metadata=a425dcdcd1cdbe61 -C extra-filename=-a425dcdcd1cdbe61 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern cfg_if=F:\code\projects\active\raygon\private\target\release\deps\libcfg_if-d89f1e8e289aff1a.rmeta --extern crossbeam_channel=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_channel-fa0e9e7c0ad3946c.rmeta --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-5980eed31e4953ff.rmeta --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_queue=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_queue-ef5b77e6d85fb6eb.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
Compiling slog-stdlog v4.0.0
Running `rustc --edition=2018 --crate-name slog_stdlog C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\slog-stdlog-4.0.0\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=8819e32849620aaf -C extra-filename=-8819e32849620aaf --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam-a425dcdcd1cdbe61.rmeta --extern log=F:\code\projects\active\raygon\private\target\release\deps\liblog-a5cb25dad3acaed3.rmeta --extern slog=F:\code\projects\active\raygon\private\target\release\deps\libslog-9dd0303d4e15c44a.rmeta --extern slog_scope=F:\code\projects\active\raygon\private\target\release\deps\libslog_scope-db6d0649eb15794a.rmeta --cap-lints allow -C target-cpu=native`
Compiling slog-envlogger v2.2.0
Running `rustc --crate-name slog_envlogger C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\slog-envlogger-2.2.0\src/lib.rs --color always --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"regex\"" -C metadata=e27a333ddc8f0523 -C extra-filename=-e27a333ddc8f0523 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern log=F:\code\projects\active\raygon\private\target\release\deps\liblog-a5cb25dad3acaed3.rmeta --extern regex=F:\code\projects\active\raygon\private\target\release\deps\libregex-6e0be316675b61d9.rmeta --extern slog=F:\code\projects\active\raygon\private\target\release\deps\libslog-9dd0303d4e15c44a.rmeta --extern slog_async=F:\code\projects\active\raygon\private\target\release\deps\libslog_async-9f929658b0d160d5.rmeta --extern slog_scope=F:\code\projects\active\raygon\private\target\release\deps\libslog_scope-db6d0649eb15794a.rmeta --extern slog_stdlog=F:\code\projects\active\raygon\private\target\release\deps\libslog_stdlog-8819e32849620aaf.rmeta --extern slog_term=F:\code\projects\active\raygon\private\target\release\deps\libslog_term-4a83c400864e5d10.rmeta --cap-lints allow -C target-cpu=native`
Compiling rayon v1.1.0
Running `rustc --crate-name rayon C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\rayon-1.1.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=4fd697185ba64dd3 -C extra-filename=-4fd697185ba64dd3 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-7ee3d9f5a4b9f93d.rmeta --extern either=F:\code\projects\active\raygon\private\target\release\deps\libeither-91f2bccb9927ab72.rmeta --extern rayon_core=F:\code\projects\active\raygon\private\target\release\deps\librayon_core-25c3622dd04447e5.rmeta --cap-lints allow -C target-cpu=native`
error: Could not compile `cargo_metadata`.
Caused by:
process didn't exit successfully: `rustc --crate-name cargo_metadata C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\cargo_metadata-0.6.4\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"backtrace\"" --cfg "feature=\"default\"" -C
metadata=ef5f1281a4361110 -C extra-filename=-ef5f1281a4361110 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern error_chain=F:\code\projects\active\raygon\private\target\release\deps\liberror_chain-f51a2bcf3c86d64a.rmeta --extern semver=F:\code\projects\active\raygon\private\target\release\deps\libsemver-c876c4f33d575b41.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
warning: build failed, waiting for other jobs to finish...
error: Could not compile `gltf-json`.
Caused by:
process didn't exit successfully: `rustc --edition=2018 --crate-name gltf_json C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\gltf-json-0.12.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"extras\""
--cfg "feature=\"names\"" -C metadata=9ff3e20903dd6c8c -C extra-filename=-9ff3e20903dd6c8c --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern gltf_derive=F:\code\projects\active\raygon\private\target\release\deps\gltf_derive-4645197b1e4f4fc9.dll --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
warning: build failed, waiting for other jobs to finish...
error: Could not compile `raygon-geometry`.
Caused by:
process didn't exit successfully: `rustc --edition=2018 --crate-name raygon_geometry raygon-geometry\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=39ac77e6fba3b64e -C extra-filename=-39ac77e6fba3b64e --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern copyless=F:\code\projects\active\raygon\private\target\release\deps\libcopyless-dd1647a18b2d8872.rmeta --extern deepsize=F:\code\projects\active\raygon\private\target\release\deps\libdeepsize-feae989679bc200b.rmeta --extern expr=F:\code\projects\active\raygon\private\target\release\deps\libexpr-803b5b1429934623.rmeta --extern fast_math=F:\code\projects\active\raygon\private\target\release\deps\libfast_math-63589e7c60336843.rmeta --extern half=F:\code\projects\active\raygon\private\target\release\deps\libhalf-6ff4dff974c834ba.rmeta --extern ieee754=F:\code\projects\active\raygon\private\target\release\deps\libieee754-ce756a49f0860dff.rmeta --extern num_traits=F:\code\projects\active\raygon\private\target\release\deps\libnum_traits-7a18e0f1ba17eb3e.rmeta --extern packed_simd=F:\code\projects\active\raygon\private\target\release\deps\libpacked_simd-67697b4169867fe9.rmeta --extern raygon_core=F:\code\projects\active\raygon\private\target\release\deps\libraygon_core-89ce21278a0834d2.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta -C target-cpu=native` (exit code: 0xc0000374, STATUS_HEAP_CORRUPTION)
warning: build failed, waiting for other jobs to finish...
error: build failed
The errors are the same with a clean build, but I used a subsequent attempt to cut down on the log size.
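For reference, the two exit codes in the log are Windows NTSTATUS values. Below is a small sketch of decoding them; ntstatus_name is a hypothetical helper, and the two extra entries are well-known NTSTATUS values that do not appear in this log.

```rust
/// Maps a Windows NTSTATUS process exit code to its symbolic name.
/// Only a handful of crash-related codes are covered.
fn ntstatus_name(code: u32) -> Option<&'static str> {
    match code {
        // The two codes reported by cargo in the log above.
        0xC000_0005 => Some("STATUS_ACCESS_VIOLATION"),
        0xC000_0374 => Some("STATUS_HEAP_CORRUPTION"),
        // Not from the log; included for comparison.
        0xC000_00FD => Some("STATUS_STACK_OVERFLOW"),
        0xC000_0409 => Some("STATUS_STACK_BUFFER_OVERRUN"),
        _ => None,
    }
}

fn main() {
    // With std::process::ExitStatus on Windows, `status.code()` yields an
    // i32; casting it with `as u32` gives the value to pass in here.
    for &code in &[0xc000_0005u32, 0xc000_0374] {
        println!("{:#010x} => {}", code, ntstatus_name(code).unwrap_or("unknown"));
    }
}
```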
A simplified test case is simply adding cargo_metadata to an empty crate.
[package]
name = "cpu-bug"
version = "0.1.0"
authors = ["novacrazy <[email protected]>"]
edition = "2018"
[dependencies]
cargo_metadata = "0.8.2"
[profile.release] # My release profile
opt-level = 3
lto = 'fat'
incremental = false
debug-assertions = false
codegen-units = 1
extern crate cargo_metadata;
fn main() {
println!("Hello, world!");
}
$env:RUSTFLAGS = "-C target-cpu=znver1"
cargo run --release
codegen-units=1 seems to be partially responsible. Removing that fixes it. So it's not LTO at least.
I can also trigger this on the dev profile by changing:
[profile.dev]
opt-level = 3
codegen-units = 1
with RUSTFLAGS = "-C target-cpu=znver1"
opt-level=2 does not trigger it.
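The observations so far suggest the crash needs opt-level=3, codegen-units=1 and a target-cpu together. Here is a sketch of how the combinations could be tried systematically. It assumes a cargo version that accepts profile overrides through CARGO_PROFILE_RELEASE_* environment variables and a Cargo.toml without its own [profile.release] section; BuildConfig, build_matrix and cargo_command are hypothetical names.

```rust
use std::process::Command;

/// One combination of the settings discussed above.
#[derive(Debug, Clone, PartialEq)]
struct BuildConfig {
    opt_level: u8,
    codegen_units: Option<u32>,       // None leaves cargo's default
    target_cpu: Option<&'static str>, // None leaves RUSTFLAGS unset
}

/// Enumerates every combination of opt-level, codegen-units and target-cpu.
fn build_matrix() -> Vec<BuildConfig> {
    let mut configs = Vec::new();
    for &opt_level in &[2u8, 3] {
        for &codegen_units in &[None, Some(1u32)] {
            for &target_cpu in &[None, Some("znver1")] {
                configs.push(BuildConfig { opt_level, codegen_units, target_cpu });
            }
        }
    }
    configs
}

/// Builds the `cargo build --release` invocation for one configuration.
fn cargo_command(cfg: &BuildConfig) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(&["build", "--release"]);
    cmd.env("CARGO_PROFILE_RELEASE_OPT_LEVEL", cfg.opt_level.to_string());
    if let Some(n) = cfg.codegen_units {
        cmd.env("CARGO_PROFILE_RELEASE_CODEGEN_UNITS", n.to_string());
    }
    match cfg.target_cpu {
        Some(cpu) => { cmd.env("RUSTFLAGS", format!("-C target-cpu={}", cpu)); }
        None => { cmd.env_remove("RUSTFLAGS"); }
    }
    cmd
}

fn main() {
    // Only list the configurations unless `--run` is passed explicitly.
    let run = std::env::args().any(|a| a == "--run");
    for cfg in build_matrix() {
        if !run {
            println!("{:?}", cfg);
            continue;
        }
        match cargo_command(&cfg).status() {
            Ok(status) => println!("{:?} => {}", cfg, status),
            Err(e) => println!("{:?} => could not run cargo: {}", cfg, e),
        }
    }
}
```

Run from the directory of the test crate with --run to actually invoke cargo; without it the program only lists the eight configurations.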
Checking in from @rust-lang/compiler triage:
This seems to be related to our LLVM upgrade. The linked issue (#63361) was blamed on LLVM bug 42935 and fixed by @nikic via a LLVM submodule update (https://github.com/rust-lang/rust/pull/63415).
cc @nikic and @nagisa -- Any thoughts on what's going on here?
Tagging as P-high for now. Not sure who to assign to.
(Should this be labeled as I-unsound?)
Is it possible to get a backtrace for the segfault? I don't have a windows system (or a zen system for that matter) to reproduce this on.
I don't know whether Windows has assertion-enabled builds, but if it does, it might be worth calling https://github.com/kennytm/rustup-toolchain-install-master with the -a argument and checking whether the toolchain it downloads triggers an assertion failure.
(Should this be labeled as I-unsound?)
Probably not. The crash is in the compiler process, not in code it output, and presumably in C++ code at that, so there's no reason to expect there's anything going wrong with any safe Rust code. While it's the scary sort of crash that sounds like it could hypothetically also result in completely bogus machine code being generated, there's no evidence of this actually happening / being possible. I mean I guess we could decide to tag stuff I-unsound merely because "something's gone really wrong in the C++ and that could have arbitrarily bad consequences", but if we do that we should also blanket tag all LLVM assertion failures as I-unsound, but we don't currently do that nor do I think it would be useful.
Could not reproduce with https://github.com/rust-lang/rust/issues/63959#issuecomment-525482889 or https://github.com/rust-lang/rust/issues/63959#issuecomment-525515756 on a Zen 2000 based system, either using Linux GNU or by cross-compiling to Windows GNU. I'll check the native Windows GNU toolchain later.
Happens with both MSVC and GNU builds on Windows for me.
How would I go about enabling backtraces on a Windows build of rustc? EDIT: Would rustc even produce a backtrace on segfault? I know I've seen proper backtraces with ICEs, but this is different.
9b91b9c10e3c87ed333a1e34c4f46ed68f1eee06-alt (just the alt version of the last nightly I had) does not appear to respond to RUST_BACKTRACE=1
It's not a rustc panic but an LLVM segfault, so you should use gdb with the Windows GNU toolchain; no idea about MSVC.
On Windows rustc exits with 0xc0000005 and GDB only prints: No stack. There are no alternative builds for the Windows GNU toolchain, so I won't be able to do anything until I do a debug build.
On Windows rustc exits with 0xc0000005 and GDB only prints: No stack. There are no alternative builds for Windows GNU toolchain so I won't be able to do anything until I do debug build.
Make sure you are running the real rustc and not the wrapper from rustup. What you're seeing here is a typical symptom of failing to account for the wrapper.
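One way to find the real compiler binary behind the rustup wrapper is to ask for the sysroot and look in its bin directory (rustup which rustc should report the same path). This is a minimal sketch; rustc_in_sysroot is a hypothetical helper name.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Given a toolchain sysroot, returns where the real rustc binary lives.
fn rustc_in_sysroot(sysroot: &Path) -> PathBuf {
    let exe = if cfg!(windows) { "rustc.exe" } else { "rustc" };
    sysroot.join("bin").join(exe)
}

fn main() {
    // The wrapper forwards `--print sysroot` to the active toolchain.
    let output = Command::new("rustc").args(&["--print", "sysroot"]).output();
    match output {
        Ok(out) if out.status.success() => {
            let sysroot = String::from_utf8_lossy(&out.stdout).trim().to_string();
            println!("{}", rustc_in_sysroot(Path::new(&sysroot)).display());
        }
        Ok(out) => eprintln!("rustc exited with {}", out.status),
        Err(e) => eprintln!("could not run rustc: {}", e),
    }
}
```

The printed path is the binary to hand to the debugger, together with the arguments from the failing Running line in the verbose cargo output.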
Hmm, I could swear I could debug rustc crash on Linux without caring about the wrapper.
Anyway the stack is corrupt:
#0 0x000000006399edb4 in syn::path::parsing::<impl syn::path::Path>::get_ident () from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#1 0x00000000638ceef1 in core::iter::traits::iterator::Iterator::try_for_each::call::{{closure}} ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#2 0x00000000638c1582 in <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::next ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#3 0x00000000638eddf8 in serde_derive::internals::attr::Variant::from_ast ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#4 0x00000000638dbc20 in <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::next ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#5 0x00000000638e0c38 in serde_derive::internals::ast::Container::from_ast ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#6 0x0000000063907224 in serde_derive::de::expand_derive_deserialize ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#7 0x0000000063964e72 in serde_derive::derive_deserialize ()
from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#8 0x00000000639caa44 in proc_macro::bridge::client::__run_expand1::{{closure}}::{{closure}} () at src\libproc_macro\bridge/client.rs:358
#9 proc_macro::bridge::scoped_cell::ScopedCell<T>::set::{{closure}} ()
at src\libproc_macro\bridge/scoped_cell.rs:79
#10 proc_macro::bridge::scoped_cell::ScopedCell<T>::replace ()
at src\libproc_macro\bridge/scoped_cell.rs:74
#11 proc_macro::bridge::scoped_cell::ScopedCell<T>::set ()
at src\libproc_macro\bridge/scoped_cell.rs:79
#12 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter::{{closure}} () at src\libproc_macro\bridge/client.rs:309
#13 std::thread::local::LocalKey<T>::try_with ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd\thread/local.rs:262
#14 std::thread::local::LocalKey<T>::with ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd\thread/local.rs:239
#15 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter ()
at src\libproc_macro\bridge/client.rs:309
#16 proc_macro::bridge::client::__run_expand1::{{closure}} ()
at src\libproc_macro\bridge/client.rs:351
#17 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panic.rs:315
#18 std::panicking::try::do_call ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panicking.rs:296
#19 0x0000000063a28019 in __rust_maybe_catch_panic ()
at src\libpanic_unwind\lib.rs:80
#20 0x00000000639d100e in std::panicking::try ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panicking.rs:275
#21 std::panic::catch_unwind ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panic.rs:394
#22 proc_macro::bridge::client::__run_expand1 ()
at src\libproc_macro\bridge/client.rs:350
#23 0x0000000002ad0c6d in proc_macro::bridge::server::run_server ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#24 0x0000000002bba233 in <syntax::ext::proc_macro::ProcMacroDerive as syntax::ext::base::MultiItemModifier>::expand ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#25 0x0000000002bafa58 in syntax::ext::expand::MacroExpander::fully_expand_fragment ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#26 0x0000000002baeabd in syntax::ext::expand::MacroExpander::expand_crate ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#27 0x0000000000ff6312 in rustc_interface::passes::configure_and_expand_inner::{{closure}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#28 0x0000000000fe89a7 in rustc::util::common::time ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#29 0x0000000000f6859d in rustc_interface::passes::configure_and_expand_inner
()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#30 0x0000000000fcb48d in rustc_interface::passes::configure_and_expand::{{closure}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#31 0x0000000000f9efff in rustc_data_structures::box_region::PinnedGenerator<I,A,R>::new ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#32 0x0000000000f6ee01 in rustc_interface::queries::Query<T>::compute ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#33 0x0000000000ff73da in rustc_interface::queries::<impl rustc_interface::interface::Compiler>::expansion ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#34 0x0000000000e75a68 in rustc_interface::interface::run_compiler_in_existing_thread_pool ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#35 0x0000000000e9c8af in std::thread::local::LocalKey<T>::with ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#36 0x0000000000eb0eda in scoped_tls::ScopedKey<T>::set ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#37 0x0000000000ecf781 in syntax::with_globals ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#38 0x0000000000e4fb2d in std::sys_common::backtrace::__rust_begin_short_backtrace ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#39 0x0000000066c75859 in __rust_maybe_catch_panic ()
at src\libpanic_unwind\lib.rs:80
#40 0x0000000000e781b3 in core::ops::function::FnOnce::call_once{{vtable-shim}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#41 0x0000000066c46836 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\liballoc/boxed.rs:922
#42 0x0000000066c72dd7 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\liballoc/boxed.rs:922
#43 std::sys_common::thread::start_thread ()
at src\libstd\sys_common/thread.rs:13
#44 std::sys::windows::thread::Thread::new::thread_start ()
at src\libstd\sys\windows/thread.rs:47
#45 0x00007ffcaec77bd4 in KERNEL32!BaseThreadInitThunk ()
from C:\WINDOWS\System32\kernel32.dll
#46 0x00007ffcaedcce71 in ntdll!RtlUserThreadStart ()
from C:\WINDOWS\SYSTEM32\ntdll.dll
#47 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Other threads are just waiting.
I'll return with debug Rust build if I somehow manage to build LLVM in debug mode on PC with 16 GiB of RAM...
What led you to the conclusion of a corrupt stack? It looks fairly reasonable to me.
Hmm, I could swear I could debug a rustc crash on Linux without caring about the wrapper.
I think the wrapper switched to exec at some point (i.e. the process gets reused, without forking).
AFAIK, that would allow debugging to continue to the real rustc.
I don't think anything like this is possible on Windows (without manually loading the executable you want to run in your address space, of course).
I'll return with debug Rust build if I somehow manage to build LLVM in debug mode on PC with 16 GiB of RAM...
You don't need to, this is not in LLVM, it's in syn. You can probably reproduce with just cargo check (or rustc --emit=meta / rustc --pretty=expanded).
What led you to the conclusion of a corrupt stack? It looks fairly reasonable to me.
Process gave me exit code for stack corruption before, took quick look at the trace and it didn't make any sense to me.
On second look I noticed the crash happened inside proc macro...
You don't need to, this is not in LLVM, it's in syn. You can probably reproduce with just cargo check (or rustc --emit=meta / rustc --pretty=expanded).
Yeah, it hit me later. Running cargo check on the cargo-metadata crate with the changes from https://github.com/rust-lang/rust/issues/63959#issuecomment-525515756 reliably reproduces it.
In case you find it useful, here is the trace from a debug build and the disassembly: https://gist.github.com/mati865/e93d3bf12408df00ecf47327fa196af7
The assembly for the znver1 and generic get_ident is the same, so the problem is somewhere earlier. What is the best way to proceed here: a compiler with assertions, or tearing down cargo-metadata to something more handy?
@mati865 Since AFAICT the bug happens during the macro expansion in cargo-metadata, you should be able to get rid of most of it.
Not even names need to be resolvable, other than invoking serde_derive's macros.
So, for example, you can remove all dependencies of cargo-metadata, other than serde_derive, because they're not needed in the reproduction.
I'm worried this is a miscompilation of rustc/std itself, at this point.
EDIT: wait, no, it must be code compiled with -C target-cpu that's getting miscompiled, so it's all within serde_derive/syn.
Could you try to run ./x.py test --stage 1 src/test/ui with -C target-cpu=znver1 hardcoded somewhere? (presumably in src/tools/compiletest)
I've been struggling for the past 2 days to build Rust because of https://github.com/rust-lang/rust/issues/61561, so I'm unable to make progress on this issue.
Any luck with this? I'm still stuck on 07e0c3651ce2a7b326f7678e135d8d5bbbbe5d18 because of it.
Odd. On my personal project, this appears mostly fixed on the latest nightly (6ef275e6c 2019-09-24), but the issue still occurs on the cargo_metadata example.
visiting for triage. It's not clear to me who on our team can investigate this; has anyone from @rust-lang/compiler managed to reproduce this locally? Sounds like the answer to that is likely "no."
Just noticed it's fully broken again on 702b45e40 2019-10-01.
triage: @mati865 are you still looking at this? should I try to identify someone else with this hardware setup who might be able to help further?
I'll try to find time to set up an environment with an older C toolchain today (to work around https://github.com/rust-lang/rust/issues/61561).
Status update:
I built current master for windows-gnu with verify-llvm-ir and assertions for both rustc and LLVM, but it does not reproduce the crash.
I can still reproduce it on nightly though.
visiting for triage.
@mati865, things certainly sound fishy (and frustrating) when a local build does not replicate the problem but the CI-produced nightly artifacts do.
Would it be feasible for you to replicate something closer to what the CI setup is, e.g. via a docker image?
Finally reproduced with a local build, but the experiment proposed by @eddyb only broke 4 unrelated tests (cross compilation, simd detection...).
Tested passing either -Ctarget-cpu=znver1 or -Ctarget-cpu=znver1 -Copt-level=3 -Ccodegen-units=1.
Back to the drawing board I guess.
Also had this happen spuriously on the Windows GNU toolchain (latest nightly), but running cargo build again continued where it left off and finished just fine.
Is there anything I can do to help speed up fixing this issue? It's been nearly two months since I've been able to compile benchmarks, and it's interfering with my work.
I don't actually use cargo_metadata directly, it was just one of the many crates that was previously crashing rustc. However, on my system criterion almost always crashes rustc, or even somehow compiles but hangs during execution (#65618). It's very unpredictable, and sometimes works on tiny one-crate projects, but for my primary work project workspace, it's totally unusable. Even compiling non-criterion benchmarks crashes rustc sometimes. I can't even compile Criterion as a regular crate binary (rather than using the bench profile).
Amazingly my primary binary somehow started compiling again about a month ago, or else I'd be totally screwed, but everything else is broken on Zen 1. At this rate I'll have to roll my own benchmarking tools to continue my work.
Removed previous comment as it was just PEBCAK.
@eddyb not sure if it helps but I reduced cargo_metadata to this:
extern crate serde;
#[macro_use]
extern crate serde_derive;
#[derive(Serialize)]
pub enum A {
///
B,
}
It does not reproduce directly but crashes when another crate depends on a crate with this snippet.
It appears to require a specific combination: derive (de)serialize, an enum, a doc comment inside, the -Ctarget-cpu=znver1 rustc argument, and Windows as the OS. Otherwise it just works.
Anybody got ideas?
Backtrace
(gdb) bt
#0 core::option::Option<T>::is_some (self=<optimized out>)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore/option.rs:186
#1 core::option::Option<T>::is_none (self=<optimized out>)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore/option.rs:209
#2 syn::path::parsing::<impl syn::path::Path>::get_ident (self=0x0)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\syn-1.0.5\src/path.rs:478
#3 0x000000000b5b155c in syn::path::parsing::<impl syn::path::Path>::is_ident
(self=0x0, ident=...)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\syn-1.0.5\src/path.rs:460
#4 serde_derive::internals::symbol::<impl core::cmp::PartialEq<serde_derive::internals::symbol::Symbol> for syn::path::Path>::eq (self=0x0,
word=<optimized out>)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/symbol.rs:53
#5 serde_derive::internals::attr::get_serde_meta_items (attr=0x0)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/attr.rs:1632
#6 core::ops::function::FnMut::call_mut ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\ops/function.rs:152
#7 core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut (self=<optimized out>, args=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\ops/function.rs:265
#8 core::iter::traits::iterator::Iterator::find_map::check::{{closure}} (
x=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\traits/iterator.rs:2009
#9 core::iter::traits::iterator::Iterator::try_fold (self=0x49dfe00, f=...,
init=<optimized out>)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\traits/iterator.rs:1694
#10 core::iter::traits::iterator::Iterator::find_map (self=0x49dfe00,
f=0x49dfe00)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\traits/iterator.rs:2015
#11 <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::next (self=0x49dfe00)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\adapters/mod.rs:966
#12 0x000000000b5ddf18 in serde_derive::internals::attr::Variant::from_ast (
cx=0x49e1420, variant=0xab252b0)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/attr.rs:909
#13 0x000000000b5cbfb0 in serde_derive::internals::ast::enum_from_ast::{{closure}} (variant=0xab252b0)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/ast.rs:148
#14 core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once (self=0x49e0750, args=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\ops/function.rs:275
#15 core::option::Option<T>::map (self=..., f=0x49e0750)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore/option.rs:447
#16 <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::next (self=0x49e0740)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\adapters/mod.rs:710
#17 0x000000000b5d0e48 in <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter (iterator=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\liballoc/vec.rs:1943
#18 <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter (iter=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\liballoc/vec.rs:1856
#19 core::iter::traits::iterator::Iterator::collect (self=...)
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libcore\iter\traits/iterator.rs:1478
#20 serde_derive::internals::ast::enum_from_ast (cx=0x49e1420,
variants=0x49e1500, container_default=<optimized out>)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/ast.rs:145
#21 serde_derive::internals::ast::Container::from_ast (cx=0x49e1420,
item=0x49e1440, derive=serde_derive::internals::Derive::Serialize)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\internals/ast.rs:72
#22 0x000000000b651e0a in serde_derive::ser::expand_derive_serialize (
input=0x49e1440)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src/ser.rs:14
#23 serde_derive::derive_serialize (input=...)
at C:\Users\mateusz\.cargo\registry\src\github.com-1ecc6299db9ec823\serde_derive-1.0.99\src\lib.rs:82
#24 0x000000000b6ba1a4 in proc_macro::bridge::client::__run_expand1::{{closure}}::{{closure}} () at src\libproc_macro\bridge/client.rs:358
#25 proc_macro::bridge::scoped_cell::ScopedCell<T>::set::{{closure}} ()
at src\libproc_macro\bridge/scoped_cell.rs:79
#26 proc_macro::bridge::scoped_cell::ScopedCell<T>::replace ()
at src\libproc_macro\bridge/scoped_cell.rs:74
#27 proc_macro::bridge::scoped_cell::ScopedCell<T>::set ()
at src\libproc_macro\bridge/scoped_cell.rs:79
#28 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter::{{closure}} () at src\libproc_macro\bridge/client.rs:309
#29 std::thread::local::LocalKey<T>::try_with ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd\thread/local.rs:262
#30 std::thread::local::LocalKey<T>::with ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd\thread/local.rs:239
#31 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter ()
at src\libproc_macro\bridge/client.rs:309
#32 proc_macro::bridge::client::__run_expand1::{{closure}} ()
at src\libproc_macro\bridge/client.rs:351
#33 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd/panic.rs:317
#34 std::panicking::try::do_call ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd/panicking.rs:288
#35 0x000000000b71b469 in __rust_maybe_catch_panic ()
at src\libpanic_unwind\lib.rs:80
#36 0x000000000b6c0f5e in std::panicking::try ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd/panicking.rs:267
#37 std::panic::catch_unwind ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\libstd/panic.rs:396
#38 proc_macro::bridge::client::__run_expand1 ()
at src\libproc_macro\bridge/client.rs:350
#39 0x000000006d281a8d in proc_macro::bridge::server::run_server ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#40 0x000000006d202f2f in <syntax::ext::proc_macro::ProcMacroDerive as syntax::ext::base::MultiItemModifier>::expand ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#41 0x000000006d36cefa in syntax::ext::expand::MacroExpander::fully_expand_fragment ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#42 0x000000006d36bcf8 in syntax::ext::expand::MacroExpander::expand_crate ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#43 0x000000006b70c189 in rustc_interface::passes::configure_and_expand_inner::{{closure}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#44 0x000000006b7099b7 in rustc::util::common::time ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#45 0x000000006b6ee7c3 in rustc_interface::passes::configure_and_expand_inner
()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#46 0x000000006b74dabb in rustc_interface::passes::configure_and_expand::{{closure}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#47 0x000000006b727f8f in rustc_data_structures::box_region::PinnedGenerator<I,A,R>::new ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#48 0x000000006b6f5871 in rustc_interface::queries::Query<T>::compute ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#49 0x000000006b74e56a in rustc_interface::queries::<impl rustc_interface::interface::Compiler>::expansion ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#50 0x000000006b65565f in std::thread::local::LocalKey<T>::with ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#51 0x000000006b63ff1a in scoped_tls::ScopedKey<T>::set ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#52 0x000000006b65b321 in syntax::with_globals ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#53 0x000000006b67489d in std::sys_common::backtrace::__rust_begin_short_backtrace ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#54 0x0000000000ee8839 in __rust_maybe_catch_panic ()
at src\libpanic_unwind\lib.rs:80
#55 0x000000006b684d13 in core::ops::function::FnOnce::call_once{{vtable-shim}} ()
from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-103d8e73eb1d1085.dll
#56 0x0000000000eb8176 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\liballoc/boxed.rs:932
#57 0x0000000000ee5de7 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
at /rustc/0e8a4b441c5da21a2cb19448728ade5baa299c66\src\liballoc/boxed.rs:932
#58 std::sys_common::thread::start_thread ()
at src\libstd\sys_common/thread.rs:13
#59 std::sys::windows::thread::Thread::new::thread_start ()
at src\libstd\sys\windows/thread.rs:47
#60 0x00007fff34fd7bd4 in KERNEL32!BaseThreadInitThunk ()
from C:\WINDOWS\System32\kernel32.dll
#61 0x00007fff3558ced1 in ntdll!RtlUserThreadStart ()
from C:\WINDOWS\SYSTEM32\ntdll.dll
#62 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
You should use cargo expand to get the expanded version of that and start reducing.
I doubt you need serde, just a few traits from it that you could copy over etc.
EDIT: disregard me, that makes no sense, the crash happens in the proc macro.
@eddyb I tried cargo expand while working on latest comment. It produced this snippet but it does not crash when using it from another crate:
#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::v1::*;
#[macro_use]
extern crate std;
extern crate serde;
#[macro_use]
extern crate serde_derive;
pub enum A {
///
B,
}
#[allow(non_upper_case_globals, unused_attributes, unused_qualifications)]
const _IMPL_SERIALIZE_FOR_A: () = {
#[allow(unknown_lints)]
#[allow(rust_2018_idioms)]
extern crate serde as _serde;
#[automatically_derived]
impl _serde::Serialize for A {
fn serialize<__S>(&self, __serializer: __S) -> _serde::export::Result<__S::Ok, __S::Error>
where
__S: _serde::Serializer,
{
match *self {
A::B => _serde::Serializer::serialize_unit_variant(__serializer, "A", 0u32, "B"),
}
}
}
};
I'm available on both Discord and Zulip btw.
I encountered another odd behavior with these crashes when creating a crate specifically for custom benchmarks in my workspace.
The file structure is simply the following, with only one binary, because I just started:
src/
lib.rs
bin/
bench1.rs
Running the specific binary like cargo run bench1 --release appears to work fine right now, no rustc crashes.
However, earlier I accidentally typed cargo run --release instead of cargo run bench1 --release, and it crashed on compiling one of the other crates in the workspace. One that was already compiled and cached, at that.
Normally it would simply compile the only binary available, but instead it crashes, as if something about specifying the binary name prevents it from crashing. I thought that odd enough to mention.
Actually, I take that back. I was passing the binary name wrong.
If I instead do cargo run --bin bench1 --release, it also crashes, even with a single binary.
This makes it impossible to have more than one binary, so I guess I'm screwed again.
One binary magically works, but only if I pass its name incorrectly.
Apparently the presence of some crates can magically allow other (previously crashing) crates to successfully compile. Adding structopt unbreaks things in some cases, probably because of some chain of features reaching all the way down into syn.
Something is seriously broken in rustc, and I'm amazed this isn't getting more attention. I'm betting my personal future on Rust, and it can't even compile benchmarks on my system.
Is AMD not a supported platform anymore?
EDIT: I apologize. It's easy to become emotional with so much personal time invested in projects.
Is AMD not a supported platform anymore?
It works perfectly fine on Linux when targeting znver1.
Something is seriously broken in rustc
Something is broken in LLVM but it's very hard to narrow it down.
I'm betting my personal future on Rust, [...]
I do not recommend doing this.
@eddyb I tried cargo expand while working on latest comment. It produced this snippet but it does not crash when using it from another crate:
Oops, while suggesting cargo expand I forgot that the crash is in the proc macro itself.
So far, I think what's happening here is:
1. a proc macro crate (serde_derive) is compiled with target-cpu=znver1
2. the compiler loads and executes that miscompiled code
Ideally we should be able to get a reproduction without part 2, i.e. without the compiler being involved in executing the miscompiled code. This could be done with e.g. proc-macro2.
I realized recently that we happen to have one laptop with a Ryzen CPU in the office, if I get access to it I'll post the results of trying to reproduce/reduce this on it.
Confirmed repro on that laptop with RUSTFLAGS=-Ctarget-cpu=znver1 and:
[package]
edition = "2018"
[dependencies]
serde = "1"
serde_derive = "1"
[profile.release]
codegen-units = 1
#[derive(serde_derive::Serialize)]
enum A {
#[allow()] X,
}
If you need something simpler than serde, I learned this morning that no_panic also triggers the miscompilation sometimes:
[package]
edition = "2018"
[dependencies]
no-panic = "0.1.11"
[profile.release]
codegen-units = 1
use no_panic::no_panic;
#[no_panic]
fn demo(s: &str) -> &str {
&s[1..]
}
fn main() {
println!("{}", demo("input string"));
}
Running `rustc --edition=2018 --crate-name cpu_bug src\main.rs --color always --crate-type bin --emit=dep-info,link -C opt-level=3 -C codegen-units=1 -C debuginfo=2 -C debug-assertions=on -C metadata=2a806a81570c94f1 --out-dir F:\code\projects\bugs\cpu-bug\target\debug\deps -C incremental=F:\code\projects\bugs\cpu-bug\target\debug\incremental -L dependency=F:\code\projects\bugs\cpu-bug\target\debug\deps --extern no_panic=F:\code\projects\bugs\cpu-bug\target\debug\deps\no_panic-9d376b9b716382ce.dll -C target-cpu=znver1`
error: could not compile `cpu-bug`.
Caused by:
process didn't exit successfully: `rustc --edition=2018 --crate-name cpu_bug src\main.rs --color always --crate-type bin --emit=dep-info,link -C opt-level=3 -C codegen-units=1 -C debuginfo=2 -C debug-assertions=on -C metadata=2a806a81570c94f1 --out-dir F:\code\projects\bugs\cpu-bug\target\debug\deps -C incremental=F:\code\projects\bugs\cpu-bug\target\debug\incremental -L dependency=F:\code\projects\bugs\cpu-bug\target\debug\deps --extern no_panic=F:\code\projects\bugs\cpu-bug\target\debug\deps\no_panic-9d376b9b716382ce.dll -C target-cpu=znver1` (exit code: 0xc0000374, STATUS_HEAP_CORRUPTION)
So far reduced to cargo run --release with RUSTFLAGS=-Ctarget-cpu=znver1 and:
[dependencies]
syn = "1"
[profile.release]
codegen-units = 1
use syn::parse::{Parse, Parser};
fn main() {
syn::DeriveInput::parse.parse_str("enum A { X }").unwrap();
}
No more proc macros are involved, eliminating part 2 from https://github.com/rust-lang/rust/issues/63959#issuecomment-549774373.
However, this leaves the entirety of syn left to reduce.
I'm now left with ~1000 lines of proc-macro2 and ~300 lines of syn, no dependencies other than std left, not even on the builtin proc_macro (which is good, because I'd rather not reduce that).
As you might be able to tell from types such as [u64; 26], certain types appear to only matter for their size, and there's also a dance like let x = *Box::new(x); at some point.
Would be interesting to see if I can remove all of the unsafe code from the syn side, because until then it will remain a potential suspect, however unlikely that may be.
The actual crash appears to be happening as a direct result of Cursor::bump. Somehow self.ptr is pointing at the end of an array, and incrementing past that causes the corruption. You can check with
unsafe fn bump(self) -> Cursor<'a> {
assert_ne!(self.ptr, self.scope);
Cursor::create(self.ptr.offset(1), self.scope)
}
which shows different output with/without codegen-units=1
EDIT: No, I changed something and that caused different results... but the following is still true
It looks as if the comparison of ptr == scope here is being optimized away:
unsafe fn create(mut ptr: *const Entry, scope: *const Entry) -> Self {
while let Entry::End(exit) = *ptr {
if ptr == scope {
break;
}
ptr = exit;
}
Cursor {
ptr,
scope,
marker: PhantomData,
}
}
if you change it to std::hint::black_box(ptr) == std::hint::black_box(scope), that fixes everything.
So it could be that the creation of ptr and scope is undefined behavior.
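To illustrate the black_box workaround described above, here is a minimal, safe sketch. It is not syn's code: the Entry enum, the index-based create function, and the sample fixture are hypothetical stand-ins, and it assumes a toolchain where std::hint::black_box is available (stable since Rust 1.66; at the time of this thread it was nightly-only). It only shows how wrapping both sides of the comparison is meant to keep the optimizer from reasoning about it; black_box is a best-effort hint, not a guarantee.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for syn's buffer entries: either a token or an
/// `End` marker holding the index to continue from.
#[derive(Clone, Copy, Debug)]
enum Entry {
    Token(char),
    End(usize),
}

/// Index-based analogue of `Cursor::create`: skip over `End` markers until
/// a token is reached or the cursor is sitting on `scope`.
fn create(entries: &[Entry], mut pos: usize, scope: usize) -> usize {
    while let Entry::End(exit) = entries[pos] {
        // black_box makes both operands opaque to the optimizer, so the
        // comparison is not a candidate for being folded away.
        if black_box(pos) == black_box(scope) {
            break;
        }
        pos = exit;
    }
    pos
}

/// Small fixture: index 1 ends an inner group and exits to index 3,
/// which is the end of the whole buffer (the scope).
fn sample() -> [Entry; 4] {
    [Entry::Token('a'), Entry::End(3), Entry::Token('b'), Entry::End(3)]
}
```

In this sketch, starting at index 1 with scope 3 follows the exit to index 3 and stops there. If the comparison were dropped, the loop would keep following the End marker at the scope, which is the index-based counterpart of walking past the end of the buffer.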
After lots of fiddling with it, and some false leads, the highest up I can fix it is by changing:
let value = Box::new(value);
inner.push(*value);
to
let value = Box::new(value);
inner.push(std::hint::black_box(*value));
Seems like that is the root cause of the invalid optimization, which was even stated in your last comment @eddyb . Suppose I should have tried that first.
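The workaround above can be tried in isolation with a sketch like the following. The Big type and the push_via_box and first_words helpers are hypothetical (the [u64; 26] payload just echoes the size-only types mentioned earlier in the thread), and std::hint::black_box is assumed to be available (stable since Rust 1.66). It shows the shape of the workaround only and is not expected to reproduce the miscompilation by itself.

```rust
use std::hint::black_box;

/// Hypothetical large value, only interesting for its size.
struct Big {
    words: [u64; 26],
}

/// Box the value, then move it back out through `black_box` before pushing,
/// mirroring the workaround quoted above.
fn push_via_box(inner: &mut Vec<Big>, value: Big) {
    let value = Box::new(value);
    inner.push(black_box(*value));
}

/// Helper for checking that the first pushed value survived intact.
fn first_words(inner: &[Big]) -> [u64; 26] {
    inner[0].words
}
```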
As of Rust 1.39.0, this is now in Stable.
FWIW I hit this on a Ryzen 3000 processor as well. I couldn't repro the code in this comment. My repro is a bit larger (sorry, I spent a couple hours minimizing as much as I could) but is essentially:
#![feature(specialization)]
use pyo3::ffi::Py_TYPE;
use pyo3::prelude::*;
use pyo3::types::{IntoPyDict, PyType};
#[pyfunction]
pub fn loads<'a>(s: PyObject, py: Python) -> PyResult<PyType> {
unsafe {
let p = s.as_ptr();
let tp = Py_TYPE(p);
PyType::from_type_ptr(py, tp)
}
}
[package]
name = "simple"
version = "0.1.0"
authors = ["None <None>"]
edition = "2018"
[dependencies.pyo3]
version = "0.8"
features = ["extension-module"]
[profile.release]
codegen-units = 1
On a different note, given tier 1 platforms are "guaranteed to work", I am surprised this was allowed into stable given x86_64-pc-windows-msvc is a tier 1 platform. I totally understand that resources are spread thin and people are busy, but I just wanted to point this out so that if it was undesired, some thought could go into ways of avoiding release blocker bugs creeping into stable. Thank you for your awesome programming language :)
@ethanhs Does it crash during compilation? If so, it's likely because pyo3-derive-backend uses syn.
On a different note, given tier 1 platforms are "guaranteed to work", I am surprised this was allowed into stable given x86_64-pc-windows-msvc is a tier 1 platform.
This does not affect all of x86_64-pc-windows-msvc, it hasn't been reported on any Intel CPU nor non-Ryzen AMD CPUs and we're not even 100% sure it's LLVM's or rustc's fault.
Part of the reason this sat for so long was that there simply aren't that many Rust developers with Ryzen hardware around (and it doesn't appear to affect CI builds either). Ironically, my coworker has had that Ryzen laptop for months but I didn't connect the dots until recently.
I've spent a bit more time on it and I've updated my reduction.
There's no more unsafe pointer iteration or lifetime juggling from syn left, so I think that pretty much rules out syn UB as the culprit.
I've also switched to #![no_std] and brought in wee_alloc instead, which seems to produce a consistent STATUS_ACCESS_VIOLATION instead of STATUS_HEAP_CORRUPTION.
Probably the "best" news is that @mati865 and I have managed to repro the crash under wine, on both WSL and native Linux and even on my (older) IvyBridge i7!
It does appear to require compiling for Windows, still, though, in order to reproduce.
wee_alloc's reliable crashes let me get farther with the reduction, resulting in this version, which relies on black_box to keep LLVM cooperating, and doesn't even need wee_alloc anymore:
(EDIT: replaced the heap corruption crash with an outright assertion failure)
(EDIT2: removed the heap allocations completely, without using any unsafe)
(EDIT3: made the padding smaller, and replaced Option<Evil>::unwrap with a copy)
(EDIT4: made fn opaque_iter_next an extern "win64" to reproduce on linux + AVX)
(EDIT5: replaced opaque_iter_next with a simpler opaque_id)
rustc main.rs -C opt-level=3 -C codegen-units=1 -C target-cpu=znver1 --edition=2018
#![feature(test, maybe_uninit_extra)]
use core::mem::MaybeUninit;
#[inline(never)]
extern "win64" fn opaque_id<T>(x: T) -> T {
x
}
#[repr(C)]
struct Evil {
data: ([u8; 8], [u8; 8], [u8; 8]),
padding: MaybeUninit<[u64; 22]>,
}
fn main() {
let mut allocator = [MaybeUninit::uninit()];
let mut allocator = allocator.iter_mut();
loop {
let evil3 = {
let evil1 = {
if core::hint::black_box(false) {
unreachable!()
}
Evil {
data: ([1; 8], [2; 8], [3; 8]),
padding: MaybeUninit::uninit(),
}
};
let evil2 = evil1;
evil2
};
let evil4 = evil3;
let allocated = opaque_id(allocator.next()).unwrap();
let data = &allocated.write(evil4).data;
if core::hint::black_box(true) {
assert_eq!(([1; 8], [2; 8], [3; 8]), { *data });
break;
}
}
}
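As a side note on the testcase above, a small sketch can confirm how large Evil is. Only the struct definition is copied from the testcase; the evil_size_in_u64s helper is hypothetical and added purely for illustration. The expectation is 3 u64-sized words of data plus 22 words of padding, i.e. 25 words (200 bytes) on a typical target.

```rust
use core::mem::{size_of, MaybeUninit};

// Same layout as in the testcase: 24 bytes of data followed by
// 22 u64-sized words of (possibly uninitialized) padding.
#[repr(C)]
struct Evil {
    data: ([u8; 8], [u8; 8], [u8; 8]),
    padding: MaybeUninit<[u64; 22]>,
}

/// Size of `Evil` measured in `u64`-sized (8-byte) words.
/// Hypothetical helper, not part of the original testcase.
fn evil_size_in_u64s() -> usize {
    size_of::<Evil>() / size_of::<u64>()
}
```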
@eddyb
Does it crash during compilation? If so, it's likely because pyo3-derive-backend uses syn.
Yes it does! I'm pretty sure that is the reason as well.
This does not affect all of x86_64-pc-windows-msvc
Right, I'm aware. But "guaranteed to work" reads to me that stable rustc on that platform should always work, perhaps that is a misunderstanding on my part?
Part of the reason this sat for so long was that there simply aren't that many Rust developers with Ryzen hardware around
I definitely understand. But my point is that this bug would not have entered stable if it had been marked a release blocker, and more attention would have been given to it. If I had seen people (say on my Twitter timeline) asking for someone to lend hardware with these specifications for debugging, I would gladly have volunteered. Again, I don't mean to sound like what happened was wrong, more that I am making suggestions so that regressions don't enter stable Rust (which I think we can all agree we don't want :smiley: )
Also I'm very glad to hear you've been able to minimize it further! Hopefully that can help in figuring out what the root issue is.
I think I'll stop editing my previous comment (https://github.com/rust-lang/rust/issues/63959#issuecomment-552234597).
In the final version I've numbered the evil{1,2,3,4} variables, because they're all copies with the same value, except I can't seem to be able to simplify their weird scoping without the bug going away.
Initially I attributed the sensitivity to scoping to MIR drop order, but with Evil not needing drop, I think it's the MIR Storage{Live,Dead}, which end up in LLVM IR as llvm.lifetime.{start,end}.
So this could be a stack layout overlap issue? Maybe there's something different about stack frames on Windows which can cause this? Not sure where to go from here.
EDIT: looking at the assembly, the frame sizes are definitely different between Windows and Linux, but also the calling conventions and therefore the register usage.
I likely won't spend more time on this myself, but if someone wants me to run some testcases or dump some data using the Ryzen laptop I have access to, I can help with that.
cc @nagisa @nikic @rkruppe
@rustbot ping icebreakers-llvm
Hey LLVM ICE-breakers! This bug has been identified as a good
"LLVM ICE-breaking candidate". In case it's useful, here are some
[instructions] for tackling these sorts of bugs. Maybe take a look?
Thanks! <3
cc @hdhoang @heyrutvik @jryans @mmilenko @nagisa @nikic @rkruppe @SiavoshZarrasvand @spastorino @vertexclique @vgxbj
I'm happy to take this one on. Although I lack the target CPU myself. Can I still progress using cross-compilation, or is that a requirement?
@SiavoshZarrasvand I'd suggest trying my reduced testcase with -C target-cpu=znver1 either on Windows, or on Linux via cross-compilation + Wine (IIRC, @mati865 got that to work).
If you can get the assertion failure and not a SIGILL or some other crash, then I assume that's enough to work with that testcase and dive into LLVM internals responsible for it etc.
If that doesn't work, I have a server with the needed hardware, and I could probably spin up a VM if that helps.
@ethanhs That would definitely do the trick. Pretty sure I should be able to get it to repeat on my hardware though. Let me try during the weekend and confirm on Monday.
I built the initial example on an ASUS ROG with Ubuntu 18.04. The commands I used to build and run on wine were:
cargo rustc --release --target=x86_64-pc-windows-gnu -- -C target-cpu=znver1 -C linker=x86_64-w64-mingw32-gcc
wine /path/to/target/release/cpu_bug.exe
What should I change to force the error? Somehow it compiles and runs fine for me.
# rustc --version
rustc 1.39.0 (4560ea788 2019-11-04)
@SiavoshZarrasvand Those testcases aren't useful for reproduction (especially outside of Windows on AMD Ryzen CPUs) as they crash rustc from a proc macro, which is messy.
You should only use https://github.com/rust-lang/rust/issues/63959#issuecomment-552234597 which relies on an assertion failure instead of a crash (so if you get a crash that likely means your CPU doesn't support certain znver1 instructions).
That worked. I needed to switch to nightly and used the following to compile
rustc main.rs -C opt-level=3 -C codegen-units=1 -C target-cpu=znver1 --edition=2018 --target=x86_64-pc-windows-gnu -C linker=x86_64-w64-mingw32-gcc
Running it in Wine produces the following error (which I believe is what would be expected):
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `([1, 1, 1, 1, 1, 1, 1, 1], [2, 2, 2, 2, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3])`,
right: `([16, 250, 50, 0, 0, 0, 0, 0], [32, 32, 32, 32, 32, 32, 32, 32], [3, 3, 3, 3, 3, 3, 3, 3])`', ./src/main.rs:37:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
I dropped the ball while initially looking at the ASM diff, but then @Speedy37 pointed out that xmm6 was getting trashed (after looking at it in a debugger), so I took a closer look, ignoring (irrelevant) stack offsets.
Both use vmovaps xmmN, xmmword ptr [rip + .LCPI5_0] followed by a later use of xmmN to refer to that value, but both the N and the initialization point of xmmN differ:
- Linux: initializes xmm0 right before using it
- Windows: initializes xmm6 much earlier
  - saves/restores xmm6 before/after the body of the function (i.e. it's a "callee-saved" register?)
  - calls opaque_iter_next / opaque_id in between the initialization of xmm6 and its use, suggesting LLVM has hoisted the initialization past a call because it's a callee-saved register
  - uses ymm6 in between, which trashes xmm6
@nagisa confirmed that xmm6 is the first callee-saved ("non-volatile registers") xmm register, so that fits, but why is LLVM taking advantage of that when ymm6 is also in use?
(EDIT: found the "xmm6-15 are callee-saved" part in the LLVM source)
Also, this finally explains the relevancy of the size!
One ymm register is 4 u64s, so you need at least 4*6+1 (25) u64s for ymm6 to be used.
Out of those 25 u64s, 3 of them are in data, and 22 are in padding.
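The arithmetic above can be written out as a small sketch. The function name is hypothetical, and it assumes (as a simplification) that a copy fills ymm0, ymm1, ... in order, which is not guaranteed by LLVM's register allocator:

```rust
/// Number of u64 lanes in one 256-bit ymm register.
pub const U64S_PER_YMM: usize = 256 / 64;

/// Smallest number of u64s a copy has to move before register `ymm<index>`
/// is needed, ASSUMING registers ymm0, ymm1, ... are filled in order.
/// (Hypothetical helper for illustration only.)
pub fn min_u64s_to_reach_ymm(index: usize) -> usize {
    // ymm0..ymm<index-1> hold `index` full registers; one more u64
    // spills into ymm<index>.
    U64S_PER_YMM * index + 1
}
```

Under that assumption, min_u64s_to_reach_ymm(6) gives the 25 u64s mentioned above, which matches 3 in data plus 22 in padding.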
I've just updated https://github.com/rust-lang/rust/issues/63959#issuecomment-552234597 (I know, I said I wouldn't) with a small change to make this reproduce on a Linux Intel IvyBridge i7 laptop (i.e. it has AVX).
The only reason Windows was relevant was the fact that it has callee-saved xmm registers at all, which we can also get by making opaque_iter_next / opaque_id an extern "win64" fn.
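As a rough sketch of that change: the signature below is a hypothetical stand-in, not the one from the linked testcase, and the fallback exists only so the snippet builds on hosts where the "win64" ABI is unavailable.

```rust
// Hypothetical stand-in for the testcase's `opaque_id`; the real helper in
// the linked comment may have a different signature.
#[cfg(target_arch = "x86_64")]
#[inline(never)]
pub extern "win64" fn opaque_id(x: u64) -> u64 {
    // With the win64 ABI, xmm6-xmm15 are callee-saved, so the caller may
    // keep values in them across this call, even on Linux.
    // black_box keeps the optimizer from seeing through the function.
    std::hint::black_box(x)
}

// Fallback for non-x86_64 hosts, where "win64" does not exist.
#[cfg(not(target_arch = "x86_64"))]
#[inline(never)]
pub fn opaque_id(x: u64) -> u64 {
    std::hint::black_box(x)
}
```

The function still behaves as an identity (opaque_id(42) returns 42); only the calling convention seen by the caller changes.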
@eddyb It is good that you posted here as it reminds me to update my code on my next debugging session with this issue.
With this setup I was able to get bugpoint to reduce the LLVM IR, which I then cleaned up (mostly because bugpoint likes to replace values with undef) to get this:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define win64cc void @opaque() #0 {
ret void
}
define i32 @main() #0 {
start:
%dummy0 = alloca [22 x i64], align 8
%dummy1 = alloca [22 x i64], align 8
%dummy2 = alloca [22 x i64], align 8
%data = alloca <2 x i64>, align 8
br label %fake-loop
fake-loop: ; preds = %fake-loop, %start
%dummy0.cast = bitcast [22 x i64]* %dummy0 to i8*
%dummy1.cast = bitcast [22 x i64]* %dummy1 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy1.cast, i8* nonnull align 8 %dummy0.cast, i64 176, i1 false)
%dummy1.cast.copy = bitcast [22 x i64]* %dummy1 to i8*
%dummy2.cast = bitcast [22 x i64]* %dummy2 to i8*
call void @llvm.lifetime.start.p0i8(i64 176, i8* nonnull %dummy2.cast)
call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy2.cast, i8* nonnull align 8 %dummy1.cast.copy, i64 176, i1 false)
call win64cc void @opaque()
store <2 x i64> <i64 1010101010101010101, i64 2020202020202020202>, <2 x i64>* %data, align 8
%opaque-false = icmp eq i8 0, 1
br i1 %opaque-false, label %fake-loop, label %exit
exit: ; preds = %fake-loop
%data.cast = bitcast <2 x i64>* %data to i64*
%0 = load i64, i64* %data.cast, align 8
%1 = icmp eq i64 %0, 1010101010101010101
%2 = select i1 %1, i32 0, i32 -1
ret i32 %2
}
; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1 immarg) #1
; Function Attrs: argmemonly nounwind
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1
attributes #0 = { "target-cpu"="znver1" }
attributes #1 = { argmemonly nounwind }
Assembly output:
.text
.file "bad.ll"
.globl opaque # -- Begin function opaque
.p2align 4, 0x90
.type opaque,@function
opaque: # @opaque
.cfi_startproc
# %bb.0:
retq
.Lfunc_end0:
.size opaque, .Lfunc_end0-opaque
.cfi_endproc
# -- End function
.section .rodata.cst16,"aM",@progbits,16
.p2align 4 # -- Begin function main
.LCPI1_0:
.quad 1010101010101010101 # 0xe04998456557eb5
.quad 2020202020202020202 # 0x1c093308acaafd6a
.text
.globl main
.p2align 4, 0x90
.type main,@function
main: # @main
.cfi_startproc
# %bb.0: # %start
subq $584, %rsp # imm = 0x248
.cfi_def_cfa_offset 592
vmovaps .LCPI1_0(%rip), %xmm6 # xmm6 = [1010101010101010101,2020202020202020202]
xorl %esi, %esi
.p2align 4, 0x90
.LBB1_1: # %fake-loop
# =>This Inner Loop Header: Depth=1
vmovups 552(%rsp), %ymm0
vmovups 536(%rsp), %ymm1
vmovups 408(%rsp), %ymm6
vmovups 472(%rsp), %ymm2
vmovups 504(%rsp), %ymm3
vmovups %ymm0, 192(%rsp)
vmovups %ymm1, 176(%rsp)
vmovups 440(%rsp), %ymm1
vmovups %ymm3, 144(%rsp)
vmovups %ymm2, 112(%rsp)
vmovups %ymm6, 48(%rsp)
vmovups %ymm3, 320(%rsp)
vmovups %ymm2, 288(%rsp)
vmovups %ymm6, 224(%rsp)
vmovups %ymm1, 80(%rsp)
vmovups %ymm1, 256(%rsp)
vmovups 192(%rsp), %ymm5
vmovups 176(%rsp), %ymm4
vmovups %ymm5, 368(%rsp)
vmovups %ymm4, 352(%rsp)
vzeroupper
callq opaque
vmovaps %xmm6, 32(%rsp)
testb %sil, %sil
jne .LBB1_1
# %bb.2: # %exit
movabsq $1010101010101010101, %rcx # imm = 0xE04998456557EB5
xorl %eax, %eax
cmpq %rcx, 32(%rsp)
sete %al
decl %eax
addq $584, %rsp # imm = 0x248
.cfi_def_cfa_offset 8
retq
.Lfunc_end1:
.size main, .Lfunc_end1-main
.cfi_endproc
# -- End function
.section ".note.GNU-stack","",@progbits
Takeaways:
- xmm6-xmm15 are callee-saved on the win64 calling convention (used by opaque here)
- LLVM relies on xmm6 being callee-saved to hoist the <i64 1010101010101010101, i64 2020202020202020202> constant across the opaque call and out of the (fake) loop, keeping it around in xmm6
- the two memcpys (%dummy0 -> %dummy1 and %dummy1 -> %dummy2) result in enough AVX registers being used (ymm0-ymm6) to overlap with xmm6, corrupting it
- at the store to %data, whatever happened to be in ymm6's lower half gets stored, instead of the hoisted constant
EDIT: reported as https://bugs.llvm.org/show_bug.cgi?id=44140
EDIT2: and someone wrote a patch already! https://reviews.llvm.org/D70699
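To make the sequence in the assembly easier to follow, here is a toy model of it. This is only a sketch: VecReg and replay_miscompile are hypothetical names, and the model covers a single register, not real codegen.

```rust
/// Toy model of one vector register with four u64 lanes (256 bits).
/// The "xmm" view is the low two lanes, the "ymm" view is all four.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct VecReg([u64; 4]);

impl VecReg {
    /// 128-bit write; VEX-encoded xmm writes zero the upper lanes.
    pub fn write_xmm(&mut self, v: [u64; 2]) {
        self.0 = [v[0], v[1], 0, 0];
    }
    /// 256-bit write; overwrites the xmm half as well.
    pub fn write_ymm(&mut self, v: [u64; 4]) {
        self.0 = v;
    }
    /// 128-bit read of the low half.
    pub fn read_xmm(&self) -> [u64; 2] {
        [self.0[0], self.0[1]]
    }
}

/// The constant that gets hoisted out of the loop in the reduced IR.
pub const HOISTED: [u64; 2] = [1010101010101010101, 2020202020202020202];

/// Replays the problematic order of operations on register 6 and returns
/// the value that ends up stored to `%data`.
pub fn replay_miscompile(memcpy_chunk: [u64; 4]) -> [u64; 2] {
    let mut reg6 = VecReg([0; 4]);
    reg6.write_xmm(HOISTED); // vmovaps .LCPI1_0(%rip), %xmm6 (hoisted)
    reg6.write_ymm(memcpy_chunk); // vmovups 408(%rsp), %ymm6 (memcpy scratch)
    // `callq opaque` preserves xmm6, but it already holds the wrong value.
    reg6.read_xmm() // vmovaps %xmm6, 32(%rsp)
}
```

Whatever the memcpy chunk contains, its low two lanes are what the final store writes, not HOISTED.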
Did you ever learn why this only popped up on AMD Zen targets? The linked LLVM patch doesn't seem to touch target-specific code (that I see). I'm mostly asking this out of curiosity.
@novacrazy IIRC someone (@mati865?) speculated a while back that the cost tables for znver1 were different enough to cause different instructions/registers to be used.
I'm guessing that would be the 256-bit AVX (ymm) registers.
AIUI, the bug relies on xmmN and ymmN being distinct registers in LLVM, but which overlap in hardware (and the patch adds overlap handling to one part of LLVM which was missing it).
If LLVM uses only 128-bit (xmm) registers for memcpys (or calls the C memcpy function), then there's no chance for the bug to occur.
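The difference between treating xmmN and ymmN as unrelated names and treating them as overlapping can be sketched like this. The types and functions are hypothetical and are not LLVM's actual data structures or the code from the patch:

```rust
/// Hypothetical register names, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Reg {
    Xmm(u8),
    Ymm(u8),
}

impl Reg {
    /// Index of the underlying hardware register.
    fn index(self) -> u8 {
        match self {
            Reg::Xmm(n) | Reg::Ymm(n) => n,
        }
    }

    /// xmmN is the low half of ymmN, so they alias whenever N matches.
    pub fn overlaps(self, other: Reg) -> bool {
        self.index() == other.index()
    }
}

/// Name-only check: misses a ymm6 write when asking about xmm6.
pub fn clobbered_exact(live: Reg, writes: &[Reg]) -> bool {
    writes.iter().any(|&w| w == live)
}

/// Alias-aware check: a write to any overlapping register counts.
pub fn clobbered_alias_aware(live: Reg, writes: &[Reg]) -> bool {
    writes.iter().any(|&w| w.overlaps(live))
}
```

With writes to ymm0 and ymm6, the name-only check reports xmm6 as untouched while the alias-aware check reports it as clobbered. If only xmm registers were written, both checks would agree.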
> IIRC someone (@mati865?) speculated a while back that the cost tables for znver1 were different enough to cause different instructions/registers to be used.
Yeah, we talked about it on Discord.
> Did you ever learn why this only popped up on AMD Zen targets? The linked LLVM patch doesn't seem to touch target-specific code (that I see). I'm mostly asking this out of curiosity.
https://reviews.llvm.org/D70699 has a single test and it uses -mcpu=znver1. So the bug was there for a long time but was exposed in LLVM 9.
I think one of the recent optimisations, when paired with the znver1 scheduler (znver2 uses the same scheduler right now), generated code that triggered the faulty optimisation.
https://reviews.llvm.org/D70699 has landed in llvm-project: https://github.com/llvm/llvm-project/commit/9283681e168141bab9a883e48ce1da80b86afca3, but not yet in rust-lang's fork of llvm-project.
🎉 thank you all for your hard work on fixing this!
The fix will be available in the next nightly.