The build time of html5ever has gone up by more than 70% for debug builds recently:
http://perf.rust-lang.org/compare.html?start=4745092d608e65ec869c0ebdb27c535f27606ea4&end=6a87289fa4f49b6bdd62f33f69a580026223421f&stat=instructions:u
There are quite a few candidates for introducing this regression in the changeset:
https://github.com/rust-lang/rust/compare/4745092d608e65ec869c0ebdb27c535f27606ea4...6a87289fa4f49b6bdd62f33f69a580026223421f
Would anybody from @rust-lang/wg-compiler-performance or @rust-lang/infra care to bisect further?
The "tuple stress" benchmark is affected even more while most of the other things have improved. Maybe that gives a hint?
My first guess would be quadratic or exponential blowup. Maybe someone can do a perf focus?
Yes, good idea!
Bisection for tuple-stress points to #50282 (cc @fitzgen)
Before that PR, rustc main.rs took ~4s; afterwards I killed it after 2m because I didn't want to wait for it to finish.
That may also be the source of the html5ever regression. cargo rustc --lib -- -Z time-passes takes 10s before that commit (the entire cargo rustc time) and 21s afterwards.
Thanks, @alexcrichton! cc also @oli-obk & @eddyb
@alexcrichton Could you post instructions for replicating the slowdown? Neither @nnethercote nor I could reproduce it, despite running perf on the appropriate commit range locally.
So should we revert that PR's commits?
@Mark-Simulacrum sure yeah, using rustup-toolchain-install-master I did the equivalent of:
$ rustup-toolchain-install-master c1168be5360f17516b233be85ebb193bb4e612bf a2726846f6d6280b752019472b6bd791c0752006
$ pwd
/home/alex/code/rustc-perf
$ git rev-parse HEAD
254d28f6181cd5d20cedcd3fa9ae36df847da958
$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 collector/benchmarks/tuple-stress/src/main.rs
rustc +a2726846f6d6280b752019472b6bd791c0752006 4.58s user 0.72s system 100% cpu 5.264 total
$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf collector/benchmarks/tuple-stress/src/main.rs
^C
rustc +c1168be5360f17516b233be85ebb193bb4e612bf 109.70s user 0.60s system 99% cpu 1:50.31 total
We should probably only run the deaggregation pass if the opt level is > 1. That's what's done for the copy prop pass already.
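A minimal sketch of what gating a pass on the optimization level could look like. Everything here is hypothetical: the `Pass` struct, `passes_to_run`, `pipeline`, and the pass names are illustrative stand-ins, not rustc's real pass manager or `MirPass` trait. It only shows the idea of skipping expensive passes when the opt level is 1 or lower.

```rust
/// Hypothetical stand-in for a MIR pass; not rustc's real `MirPass` trait.
struct Pass {
    name: &'static str,
    /// Lowest optimization level at which the pass should run.
    min_opt_level: u32,
}

/// Returns the names of the passes that would run at `opt_level`.
fn passes_to_run(passes: &[Pass], opt_level: u32) -> Vec<&'static str> {
    passes
        .iter()
        .filter(|p| opt_level >= p.min_opt_level)
        .map(|p| p.name)
        .collect()
}

/// Illustrative pipeline: the two expensive passes only run when the
/// opt level is > 1, the cleanup pass always runs.
fn pipeline() -> Vec<Pass> {
    vec![
        Pass { name: "always-on-cleanup", min_opt_level: 0 },
        Pass { name: "copy-prop", min_opt_level: 2 },
        Pass { name: "deaggregator", min_opt_level: 2 },
    ]
}

fn main() {
    let passes = pipeline();
    for level in 0..=2 {
        println!("opt level {}: {:?}", level, passes_to_run(&passes, level));
    }
}
```

With this shape, a debug build (low opt level) never pays for the gated passes, at the cost of less optimized MIR.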
@fitzgen assuming @Mark-Simulacrum and @nnethercote can reproduce I think yeah we'll probably want to start off by reverting and possibly doing what @oli-obk suggests in a follow-up
I cannot reproduce, which is odd: I can't tell what could differ between our setups that would make this much of a difference...
perf.rlo$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 collector/benchmarks/tuple-stress/src/main.rs
real 0m4.688s
user 0m4.559s
sys 0m0.160s
perf.rlo$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf collector/benchmarks/tuple-stress/src/main.rs
real 0m4.852s
user 0m4.659s
sys 0m0.219s
No repro on macOS, although the c116 version is consistently slower than a272.
~/Downloads:92$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs
real 0m6.634s
user 0m6.154s
sys 0m0.474s
~/Downloads:93$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs
real 0m6.171s
user 0m5.848s
sys 0m0.353s
~/Downloads:94$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs
real 0m6.492s
user 0m6.083s
sys 0m0.434s
~/Downloads:95$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs
real 0m6.178s
user 0m5.848s
sys 0m0.360s
Ok I swear I'm not going crazy! I think though given this data it may only be reproducible on Ubuntu 16.04? I can reproduce with this Dockerfile:
FROM ubuntu:16.04
RUN apt-get update -y && apt-get install -y curl gcc libssl-dev file make pkg-config time
RUN curl https://sh.rustup.rs | sh -s -- -y
ENV PATH=$PATH:/root/.cargo/bin
RUN cargo install rustup-toolchain-install-master
RUN rustup-toolchain-install-master \
c1168be5360f17516b233be85ebb193bb4e612bf \
a2726846f6d6280b752019472b6bd791c0752006
RUN curl -Lo main.rs \
https://github.com/rust-lang-nursery/rustc-perf/raw/254d28f6181cd5d20cedcd3fa9ae36df847da958/collector/benchmarks/tuple-stress/src/main.rs
RUN time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs
RUN time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs
Can y'all reproduce the slowdown for html5ever?
Sure enough starting the docker image from 18.04 means I cannot reproduce the regression.
Maybe some glibc bug was fixed which I was accidentally running into? The html5ever regression though is still worrisome and seems like it may not be connected to a glibc bug
👍 I can repro the tuple-stress slowdown on 16.04.4 LTS.
-Ztime-passes indicates it is stuck at the translation item collection step.
GDB backtrace sample
#0 __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:325
#1 0x00007ffff49a23b3 in rustc::mir::BasicBlockData::expand_statements::hf1d4ebe84ecd6e03 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#2 0x00007ffff48b9bb5 in _$LT$rustc_mir..transform..deaggregator..Deaggregator$u20$as$u20$rustc_mir..transform..MirPass$GT$::run_pass::hbb873b9298ecbcc7 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#3 0x00007ffff4900f48 in rustc_mir::transform::optimized_mir::_$u7b$$u7b$closure$u7d$$u7d$::h31fd21092d0543e7 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#4 0x00007ffff48ea8d1 in rustc_mir::transform::optimized_mir::h1131a13b2ae69a43 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#5 0x00007ffff3cdb059 in rustc::ty::maps::_$LT$impl$u20$rustc..ty..maps..config..QueryConfig$LT$$u27$tcx$GT$$u20$for$u20$rustc..ty..maps..queries..optimized_mir$LT$$u27$tcx$GT$$GT$::compute::hc1ffed4360780ff2 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#6 0x00007ffff37e2a31 in rustc::dep_graph::graph::DepGraph::with_task_impl::h3ce3074a9b38daa4 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#7 0x00007ffff39a83d7 in rustc::ty::context::tls::with_related_context::h2449d2461fb7e66f () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#8 0x00007ffff3a301a2 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h6c5f6a6ee29b4c8c ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#9 0x00007ffff3a9fa43 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h6b180b58d3e77599 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#10 0x00007ffff3b0d17b in rustc::ty::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::maybe_optimized_mir::hbb43c33ed1c2e50f ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#11 0x00007ffff492c71b in _$LT$rustc_mir..interpret..eval_context..EvalContext$LT$$u27$a$C$$u20$$u27$mir$C$$u20$$u27$tcx$C$$u20$M$GT$$GT$::load_mir::hdadbb8aaab5a0add ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#12 0x00007ffff4775200 in rustc_mir::interpret::const_eval::eval_body_and_ecx::hb4b25a904afc078e () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#13 0x00007ffff4779068 in rustc_mir::interpret::const_eval::const_eval_provider::he924d56230e7faca () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#14 0x00007ffff3cdbb57 in rustc::ty::maps::_$LT$impl$u20$rustc..ty..maps..config..QueryConfig$LT$$u27$tcx$GT$$u20$for$u20$rustc..ty..maps..queries..const_eval$LT$$u27$tcx$GT$$GT$::compute::h8881ebdb8a6b3fc9 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#15 0x00007ffff37f9789 in rustc::dep_graph::graph::DepGraph::with_task_impl::h970e6f3457ee9779 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#16 0x00007ffff39fd5cf in rustc::ty::context::tls::with_related_context::hf6aa86ba669caacf () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#17 0x00007ffff3a315d8 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h767d969b981cd0e1 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#18 0x00007ffff3a898a6 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h402310c1d566e53b ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#19 0x00007ffff3b0eacd in rustc::ty::maps::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$tcx$C$$u20$$u27$lcx$GT$$GT$::const_eval::h0dc1da25520f77c1 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#20 0x00007ffff4868556 in rustc_mir::monomorphize::collector::collect_items_rec::h62a919462e2120d0 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#21 0x00007ffff4867e82 in rustc_mir::monomorphize::collector::collect_crate_mono_items::h95367eff4f34f085 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#22 0x00007fffed1dd4ba in rustc::util::common::time::h2e056e8595b0c79b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#23 0x00007fffed2a02c9 in rustc_trans::base::collect_and_partition_translation_items::h675aeb9c6f31a5ac ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#24 0x00007ffff37da225 in rustc::dep_graph::graph::DepGraph::with_task_impl::h2495eb20a211973b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#25 0x00007ffff39e38fc in rustc::ty::context::tls::with_related_context::hb6f352ea70136898 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#26 0x00007ffff3a231f0 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h20d94168c3140725 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#27 0x00007ffff3a8f472 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h47577a0ffde54cdb ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#28 0x00007fffed29e541 in rustc_trans::base::trans_crate::hee88ee89ffeda7af () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#29 0x00007fffed201440 in _$LT$rustc_trans..LlvmTransCrate$u20$as$u20$rustc_trans_utils..trans_crate..TransCrate$GT$::trans_crate::h800dcda488afb790 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#30 0x00007ffff7a2ed36 in rustc::util::common::time::hcf488eceeda0a9cc () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#31 0x00007ffff7a1ea4d in rustc_driver::driver::phase_4_translate_to_llvm::h13463ad7ae997e4b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#32 0x00007ffff7ac3016 in rustc_driver::driver::compile_input::_$u7b$$u7b$closure$u7d$$u7d$::h31e9ff1f64156b74 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#33 0x00007ffff7ab6af5 in rustc::ty::context::tls::enter_context::hb86cbd2c401dfb18 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#34 0x00007ffff7a818be in _$LT$std..thread..local..LocalKey$LT$T$GT$$GT$::with::h19074a28b3607093 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#35 0x00007ffff79ceb5c in rustc::ty::context::TyCtxt::create_and_enter::h3499c83bc74e5346 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#36 0x00007ffff7a17a0b in rustc_driver::driver::compile_input::ha789cddfb4d7cf8b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#37 0x00007ffff7ad5735 in rustc_driver::run_compiler_impl::he0bf21247a724f12 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#38 0x00007ffff7a36cf5 in _$LT$scoped_tls..ScopedKey$LT$T$GT$$GT$::set::h0a4a9cc70840e625 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#39 0x00007ffff79eae95 in syntax::with_globals::h426e57a0d566387f () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#40 0x00007ffff79dde2e in _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h31900b1f5e62de36 ()
from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#41 0x00007ffff766ff7a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#42 0x00007ffff7ad2814 in rustc_driver::run::hbcc80b82a59ea630 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#43 0x00007ffff7ae02bb in rustc_driver::main::hd5d5ac3338638be9 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#44 0x0000555555554c13 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hbdc018091d795d35 ()
#45 0x00007ffff7630073 in std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h0e173cc915a5091f () at libstd/rt.rs:59
#46 std::panicking::try::do_call::h40752f40c51cf13d () at libstd/panicking.rs:310
#47 0x00007ffff766ff7a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#48 0x00007ffff7645cf6 in std::panicking::try::h431a7a5b8edb0415 () at libstd/panicking.rs:289
#49 std::panic::catch_unwind::h1eb313d318d299fa () at libstd/panic.rs:374
#50 std::rt::lang_start_internal::h83a5cbcfc3241381 () at libstd/rt.rs:58
#51 0x0000555555554c74 in main ()
Code surrounding the call of that memmove
0x00007ffff49a2303 <+3411>: je 0x7ffff49a237b <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3531>
0x00007ffff49a2305 <+3413>: mov 0x100(%rsp),%rax
0x00007ffff49a230d <+3421>: mov %rax,0xa0(%rsp)
0x00007ffff49a2315 <+3429>: movaps 0xf0(%rsp),%xmm0
0x00007ffff49a231d <+3437>: movaps %xmm0,0x90(%rsp)
0x00007ffff49a2325 <+3445>: movaps 0xb0(%rsp),%xmm0
0x00007ffff49a232d <+3453>: movaps 0xc0(%rsp),%xmm1
0x00007ffff49a2335 <+3461>: movdqa 0xd0(%rsp),%xmm2
0x00007ffff49a233e <+3470>: movaps 0xe0(%rsp),%xmm3
0x00007ffff49a2346 <+3478>: movaps %xmm3,0x80(%rsp)
0x00007ffff49a234e <+3486>: movdqa %xmm2,0x70(%rsp)
0x00007ffff49a2354 <+3492>: movaps %xmm1,0x60(%rsp)
0x00007ffff49a2359 <+3497>: movaps %xmm0,0x50(%rsp)
0x00007ffff49a235e <+3502>: mov %r12,%rdi
0x00007ffff49a2361 <+3505>: callq 0x7ffff46c8880 <_ZN4core3ptr13drop_in_place17h3188964708fd179cE.llvm.12071625485570946806>
0x00007ffff49a2366 <+3510>: jmpq 0x7ffff49a2290 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3296>
0x00007ffff49a236b <+3515>: nopl 0x0(%rax,%rax,1)
0x00007ffff49a2370 <+3520>: movl $0x9,0xb0(%rsp)
0x00007ffff49a237b <+3531>: mov 0x368(%rsp),%rax
0x00007ffff49a2383 <+3539>: test %rax,%rax
0x00007ffff49a2386 <+3542>: je 0x7ffff49a23c4 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3604>
0x00007ffff49a2388 <+3544>: mov 0x380(%rsp),%r15
0x00007ffff49a2390 <+3552>: mov (%r15),%rcx
0x00007ffff49a2393 <+3555>: mov 0x10(%r15),%rbx
0x00007ffff49a2397 <+3559>: imul $0x58,0x360(%rsp),%rsi
0x00007ffff49a23a0 <+3568>: add %rcx,%rsi
0x00007ffff49a23a3 <+3571>: imul $0x58,%rbx,%rdi
0x00007ffff49a23a7 <+3575>: add %rcx,%rdi
0x00007ffff49a23aa <+3578>: imul $0x58,%rax,%rdx
0x00007ffff49a23ae <+3582>: callq 0x7ffff46be350 <memmove@plt>
=> 0x00007ffff49a23b3 <+3587>: add 0x368(%rsp),%rbx
0x00007ffff49a23bb <+3595>: mov %rbx,0x10(%r15)
0x00007ffff49a23bf <+3599>: mov 0x28(%rsp),%r15
0x00007ffff49a23c4 <+3604>: mov %r14,%rdi
0x00007ffff49a23c7 <+3607>: callq 0x7ffff46e3080 <_ZN71_$LT$alloc..vec..IntoIter$LT$T$GT$$u20$as$u20$core..ops..drop..Drop$GT$4drop17h62e13b79d8b9a38aE>
0x00007ffff49a23cc <+3612>: mov 0x3b0(%rsp),%rdi
0x00007ffff49a23d4 <+3620>: mov $0x38,%esi
0x00007ffff49a23d9 <+3625>: mov $0x8,%edx
0x00007ffff49a23de <+3630>: callq 0x7ffff46bbd40 <__rust_dealloc@plt>
0x00007ffff49a23e3 <+3635>: lea 0x3b8(%rsp),%rdi
0x00007ffff49a23eb <+3643>: callq 0x7ffff4993fd0 <_ZN4core3ptr13drop_in_place17hdc52da31c044fba0E.llvm.13745968970215452383>
0x00007ffff49a23f0 <+3648>: cmpl $0x9,0x3f8(%rsp)
0x00007ffff49a23f8 <+3656>: je 0x7ffff49a2407 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3671>
0x00007ffff49a23fa <+3658>: lea 0x3f8(%rsp),%rdi
0x00007ffff49a2402 <+3666>: callq 0x7ffff49911e0 <_ZN4core3ptr13drop_in_place17h3188964708fd179cE>
0x00007ffff49a2407 <+3671>: mov 0x48(%rsp),%rsi
0x00007ffff49a240c <+3676>: cmp 0x40(%rsp),%rsi
0x00007ffff49a2411 <+3681>: lea 0x120(%rsp),%r14
0x00007ffff49a2419 <+3689>: jne 0x7ffff49a1ff3 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+2627>
0x00007ffff49a241f <+3695>: lea 0x30(%rsp),%rdi
0x00007ffff49a2424 <+3700>: callq 0x7ffff46e2b30 <_ZN71_$LT$alloc..vec..IntoIter$LT$T$GT$$u20$as$u20$core..ops..drop..Drop$GT$4drop17h3943334a18285159E>
0x00007ffff49a2429 <+3705>: lea -0x28(%rbp),%rsp
0x00007ffff49a242d <+3709>: pop %rbx
0x00007ffff49a242e <+3710>: pop %r12
0x00007ffff49a2430 <+3712>: pop %r13
0x00007ffff49a2432 <+3714>: pop %r14
0x00007ffff49a2434 <+3716>: pop %r15
0x00007ffff49a2436 <+3718>: pop %rbp
0x00007ffff49a2437 <+3719>: retq
0x00007ffff49a2438 <+3720>: add $0xffffffffffffffff,%rsi
0x00007ffff49a243c <+3724>: lea 0x31231d(%rip),%rdi # 0x7ffff4cb4760
0x00007ffff49a2443 <+3731>: callq 0x7ffff46bc360 <_ZN4core9panicking18panic_bounds_check17hcc582842c20cc97bE@plt>
0x00007ffff49a2448 <+3736>: jmp 0x7ffff49a245d <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3757>
0x00007ffff49a244a <+3738>: add $0xffffffffffffffff,%r15
0x00007ffff49a244e <+3742>: lea 0x312323(%rip),%rdi # 0x7ffff4cb4778
0x00007ffff49a2455 <+3749>: mov %r15,%rsi
0x00007ffff49a2458 <+3752>: callq 0x7ffff46bc360 <_ZN4core9panicking18panic_bounds_check17hcc582842c20cc97bE@plt>
0x00007ffff49a245d <+3757>: ud2
0x00007ffff49a245f <+3759>: lea 0x312c52(%rip),%rax # 0x7ffff4cb50b8
0x00007ffff49a2466 <+3766>: mov %rax,0x120(%rsp)
0x00007ffff49a246e <+3774>: movq $0x1,0x128(%rsp)
0x00007ffff49a247a <+3786>: movq $0x0,0x130(%rsp)
0x00007ffff49a2486 <+3798>: lea 0x324ab(%rip),%rax # 0x7ffff49d4938 <byte_str.f.llvm.13745968970215452383>
Presuming the same two commits would be responsible for the html5ever regression, I also cannot reproduce it on 18.04 Ubuntu.
Using this dockerfile:
FROM ubuntu:16.04
RUN apt-get update -y && apt-get install -y curl gcc libssl-dev file make pkg-config time git
RUN curl https://sh.rustup.rs | sh -s -- -y
ENV PATH=$PATH:/root/.cargo/bin
RUN cargo install rustup-toolchain-install-master
RUN rustup-toolchain-install-master \
c1168be5360f17516b233be85ebb193bb4e612bf \
a2726846f6d6280b752019472b6bd791c0752006
RUN git clone https://github.com/rust-lang-nursery/rustc-perf
RUN cargo +a2726846f6d6280b752019472b6bd791c0752006 \
fetch --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml
RUN cargo +a2726846f6d6280b752019472b6bd791c0752006 \
build --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml
RUN cargo +c1168be5360f17516b233be85ebb193bb4e612bf \
build --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml
I can reproduce the regression from 10s to 20s on html5ever.
Starting from 18.04 it does indeed not show a regression.
The perf server is indeed 16.04.
Ok so it looks like this is a regression specific to something that changed between glibc 2.23 and 2.27. With the profiles it looks like it's something related to the efficiency of memcpy/memmove. It also seems like this isn't reproducible on OSX (thanks @kennytm). I can, however, reproduce it locally on Windows 64-bit MSVC.
Since @kennytm could reproduce a slight slowdown on OSX, I wonder if these code paths do something like exercise a ton of moves? It may be the case that glibc optimized one of its routines in the 2.23 -> 2.27 transition (and OSX already had it). In that case it may appear like there's no regression on Linux/OSX but on glibc 2.23 and Windows they may have unoptimized versions of the routines causing slowdowns.
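To make the "ton of moves" hypothesis concrete, here is a minimal sketch of how expanding elements in place can turn into a large number of memmove-style tail shifts. This is not rustc's `expand_statements` code; the function names `expand_in_place` and `expand_rebuild` and the sizes are made up for illustration. It only shows that inserting into the middle of a `Vec` shifts the whole tail each time, so the total work grows roughly quadratically and becomes sensitive to how fast the platform's memmove is.

```rust
use std::time::Instant;

/// Expands every element into `fanout` copies by inserting into the
/// middle of the vector. Each `insert` shifts the tail with a
/// memmove-style copy, so the total work grows quadratically.
fn expand_in_place(items: &mut Vec<u64>, fanout: usize) {
    assert!(fanout >= 1, "fanout must be at least 1");
    let mut i = 0;
    while i < items.len() {
        let value = items[i];
        for _ in 1..fanout {
            items.insert(i + 1, value); // shifts items[i + 1..] right by one
        }
        i += fanout;
    }
}

/// Produces the same result in one pass by writing into a fresh vector,
/// so no tail is ever shifted.
fn expand_rebuild(items: &[u64], fanout: usize) -> Vec<u64> {
    assert!(fanout >= 1, "fanout must be at least 1");
    let mut out = Vec::with_capacity(items.len() * fanout);
    for &value in items {
        for _ in 0..fanout {
            out.push(value);
        }
    }
    out
}

fn main() {
    let input: Vec<u64> = (0..5_000).collect();

    let mut shifted = input.clone();
    let start = Instant::now();
    expand_in_place(&mut shifted, 4);
    let in_place = start.elapsed();

    let start = Instant::now();
    let rebuilt = expand_rebuild(&input, 4);
    let rebuild = start.elapsed();

    // Both strategies must agree on the result; only the cost differs.
    assert_eq!(shifted, rebuilt);
    println!("in-place insert: {:?}, rebuild: {:?}", in_place, rebuild);
}
```

The absolute timings depend on the machine and libc, which is the point: a pattern like the first function spends most of its time inside memmove, so a slower memmove implementation shows up directly in the total.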
All in all I think that this is a regression that is still worth fixing, despite it not immediately being applicable on glibc 2.27 or OSX. The fact that it affects somewhat-recent versions of Ubuntu and Windows is enough for me.
Isn't glibc 2.27 the version released after the spectre/meltdown/whatever fixes that slow (primarily Intel but to a lesser extent AMD as well) systems down?
No, sorry, glibc 2.27 is just what I tested. I've now narrowed it down (conveniently) to glibc 2.23 and 2.24. Ubuntu 16.04 ships with 2.23 and 16.10 ships with 2.24. On 2.24 I'm seeing a faster compilation.
Perusing the glibc release announcement it looks like this change may be why we're not seeing the regression in newer Linux
Ok I've found the source of the regression, will post a PR soon.
It's a pleasure watching you folks work! :D
I've posted what I believe is a fix to https://github.com/rust-lang/rust/pull/50575
triage: P-high
Has pending fix.