Rust: Severe regression in html5ever build time

Created on 7 May 2018  ·  24 Comments  ·  Source: rust-lang/rust

The build time of html5ever has gone up by more than 70% for debug builds recently:
http://perf.rust-lang.org/compare.html?start=4745092d608e65ec869c0ebdb27c535f27606ea4&end=6a87289fa4f49b6bdd62f33f69a580026223421f&stat=instructions:u

There are quite a few candidates for introducing this regression in the changeset:
https://github.com/rust-lang/rust/compare/4745092d608e65ec869c0ebdb27c535f27606ea4...6a87289fa4f49b6bdd62f33f69a580026223421f

Would anybody from @rust-lang/wg-compiler-performance or @rust-lang/infra care to bisect further?

I-compiletime O-linux P-high T-compiler WG-compiler-performance regression-from-stable-to-beta

All 24 comments

The "tuple stress" benchmark is affected even more while most of the other things have improved. Maybe that gives a hint?

My first guess would be quadratic or exponential blowup. Maybe someone can do a perf focus?

Yes, good idea!

Bisection for tuple-stress points to #50282 (cc @fitzgen)

Before that PR, rustc main.rs took ~4s; afterwards I killed it after 2m because I didn't want to wait for it to finish.

That may also be the source of the html5ever regression. cargo rustc --lib -- -Z time-passes takes 10s before that commit (the entire cargo rustc time) and 21s afterwards.

Thanks, @alexcrichton! cc also @oli-obk & @eddyb

@alexcrichton Could you post instructions for replicating the slowdown? Neither @nnethercote nor I could reproduce it, despite running perf on the appropriate commit range locally.

So should we revert that PR's commits?

@Mark-Simulacrum sure yeah, using rustup-toolchain-install-master I did the equivalent of:

$ rustup-toolchain-install-master c1168be5360f17516b233be85ebb193bb4e612bf a2726846f6d6280b752019472b6bd791c0752006
$ pwd
/home/alex/code/rustc-perf
$ git rev-parse HEAD
254d28f6181cd5d20cedcd3fa9ae36df847da958
$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 collector/benchmarks/tuple-stress/src/main.rs
rustc +a2726846f6d6280b752019472b6bd791c0752006   4.58s user 0.72s system 100% cpu 5.264 total
$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf collector/benchmarks/tuple-stress/src/main.rs
^C
rustc +c1168be5360f17516b233be85ebb193bb4e612bf   109.70s user 0.60s system 99% cpu 1:50.31 total
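To put the two user times above (4.58s and 109.70s) side by side, here is a minimal sketch of the arithmetic. The helper names `slowdown_factor` and `percent_increase` are made up for illustration and are not part of rustc-perf. Since the second run was interrupted with ^C, whatever these numbers give is only a lower bound on the real slowdown.

```rust
/// Hypothetical helper: ratio of `after` to `before`.
/// Values above 1.0 mean the second measurement is slower.
fn slowdown_factor(before_secs: f64, after_secs: f64) -> f64 {
    after_secs / before_secs
}

/// Hypothetical helper: percentage increase from `before` to `after`.
fn percent_increase(before_secs: f64, after_secs: f64) -> f64 {
    (after_secs - before_secs) / before_secs * 100.0
}
```

Calling `slowdown_factor(4.58, 109.70)` gives a factor of roughly 24, so the interrupted run was already at least about 24 times slower in user time.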

We should probably only run the deaggregation pass if the opt level is > 1. That's what's done for the copy prop pass already.

@fitzgen assuming @Mark-Simulacrum and @nnethercote can reproduce I think yeah we'll probably want to start off by reverting and possibly doing what @oli-obk suggests in a follow-up

I cannot reproduce, which is odd, since I can't tell what the difference could be that would make this much of a difference...

perf.rlo$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 collector/benchmarks/tuple-stress/src/main.rs

real    0m4.688s
user    0m4.559s
sys     0m0.160s
perf.rlo$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf collector/benchmarks/tuple-stress/src/main.rs

real    0m4.852s
user    0m4.659s
sys     0m0.219s

No repro on macOS, although the c116 version is consistently slower than a272.

~/Downloads:92$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs

real    0m6.634s
user    0m6.154s
sys     0m0.474s

~/Downloads:93$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs

real    0m6.171s
user    0m5.848s
sys     0m0.353s

~/Downloads:94$ time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs

real    0m6.492s
user    0m6.083s
sys     0m0.434s

~/Downloads:95$ time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs

real    0m6.178s
user    0m5.848s
sys     0m0.360s

Ok I swear I'm not going crazy! I think though given this data it may only be reproducible on Ubuntu 16.04? I can reproduce with this Dockerfile:

FROM ubuntu:16.04

RUN apt-get update -y && apt-get install -y curl gcc libssl-dev file make pkg-config time

RUN curl https://sh.rustup.rs | sh -s -- -y

ENV PATH=$PATH:/root/.cargo/bin
RUN cargo install rustup-toolchain-install-master
RUN rustup-toolchain-install-master \
  c1168be5360f17516b233be85ebb193bb4e612bf  \
  a2726846f6d6280b752019472b6bd791c0752006

RUN curl -Lo main.rs \
  https://github.com/rust-lang-nursery/rustc-perf/raw/254d28f6181cd5d20cedcd3fa9ae36df847da958/collector/benchmarks/tuple-stress/src/main.rs

RUN time rustc +a2726846f6d6280b752019472b6bd791c0752006 main.rs
RUN time rustc +c1168be5360f17516b233be85ebb193bb4e612bf main.rs

Can y'all reproduce the slowdown for html5ever?

Sure enough starting the docker image from 18.04 means I cannot reproduce the regression.

Maybe some glibc bug was fixed which I was accidentally running into? The html5ever regression though is still worrisome and seems like it may not be connected to a glibc bug

👍 I can repro the tuple-stress slowdown on 16.04.4 LTS.

-Ztime-passes indicates it is stuck at the translation item collection step.

GDB backtrace sample

#0  __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:325
#1  0x00007ffff49a23b3 in rustc::mir::BasicBlockData::expand_statements::hf1d4ebe84ecd6e03 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#2  0x00007ffff48b9bb5 in _$LT$rustc_mir..transform..deaggregator..Deaggregator$u20$as$u20$rustc_mir..transform..MirPass$GT$::run_pass::hbb873b9298ecbcc7 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#3  0x00007ffff4900f48 in rustc_mir::transform::optimized_mir::_$u7b$$u7b$closure$u7d$$u7d$::h31fd21092d0543e7 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#4  0x00007ffff48ea8d1 in rustc_mir::transform::optimized_mir::h1131a13b2ae69a43 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#5  0x00007ffff3cdb059 in rustc::ty::maps::_$LT$impl$u20$rustc..ty..maps..config..QueryConfig$LT$$u27$tcx$GT$$u20$for$u20$rustc..ty..maps..queries..optimized_mir$LT$$u27$tcx$GT$$GT$::compute::hc1ffed4360780ff2 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#6  0x00007ffff37e2a31 in rustc::dep_graph::graph::DepGraph::with_task_impl::h3ce3074a9b38daa4 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#7  0x00007ffff39a83d7 in rustc::ty::context::tls::with_related_context::h2449d2461fb7e66f () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#8  0x00007ffff3a301a2 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h6c5f6a6ee29b4c8c ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#9  0x00007ffff3a9fa43 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h6b180b58d3e77599 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#10 0x00007ffff3b0d17b in rustc::ty::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::maybe_optimized_mir::hbb43c33ed1c2e50f ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#11 0x00007ffff492c71b in _$LT$rustc_mir..interpret..eval_context..EvalContext$LT$$u27$a$C$$u20$$u27$mir$C$$u20$$u27$tcx$C$$u20$M$GT$$GT$::load_mir::hdadbb8aaab5a0add ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#12 0x00007ffff4775200 in rustc_mir::interpret::const_eval::eval_body_and_ecx::hb4b25a904afc078e () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#13 0x00007ffff4779068 in rustc_mir::interpret::const_eval::const_eval_provider::he924d56230e7faca () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#14 0x00007ffff3cdbb57 in rustc::ty::maps::_$LT$impl$u20$rustc..ty..maps..config..QueryConfig$LT$$u27$tcx$GT$$u20$for$u20$rustc..ty..maps..queries..const_eval$LT$$u27$tcx$GT$$GT$::compute::h8881ebdb8a6b3fc9 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#15 0x00007ffff37f9789 in rustc::dep_graph::graph::DepGraph::with_task_impl::h970e6f3457ee9779 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#16 0x00007ffff39fd5cf in rustc::ty::context::tls::with_related_context::hf6aa86ba669caacf () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#17 0x00007ffff3a315d8 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h767d969b981cd0e1 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#18 0x00007ffff3a898a6 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h402310c1d566e53b ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#19 0x00007ffff3b0eacd in rustc::ty::maps::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$tcx$C$$u20$$u27$lcx$GT$$GT$::const_eval::h0dc1da25520f77c1 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#20 0x00007ffff4868556 in rustc_mir::monomorphize::collector::collect_items_rec::h62a919462e2120d0 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#21 0x00007ffff4867e82 in rustc_mir::monomorphize::collector::collect_crate_mono_items::h95367eff4f34f085 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc_mir-6950d58feb41fa53.so
#22 0x00007fffed1dd4ba in rustc::util::common::time::h2e056e8595b0c79b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#23 0x00007fffed2a02c9 in rustc_trans::base::collect_and_partition_translation_items::h675aeb9c6f31a5ac ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#24 0x00007ffff37da225 in rustc::dep_graph::graph::DepGraph::with_task_impl::h2495eb20a211973b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#25 0x00007ffff39e38fc in rustc::ty::context::tls::with_related_context::hb6f352ea70136898 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#26 0x00007ffff3a231f0 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::force_query_with_job::h20d94168c3140725 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#27 0x00007ffff3a8f472 in rustc::ty::maps::plumbing::_$LT$impl$u20$rustc..ty..context..TyCtxt$LT$$u27$a$C$$u20$$u27$gcx$C$$u20$$u27$tcx$GT$$GT$::get_query::h47577a0ffde54cdb ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/../lib/librustc-6264c07cbba9680a.so
#28 0x00007fffed29e541 in rustc_trans::base::trans_crate::hee88ee89ffeda7af () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#29 0x00007fffed201440 in _$LT$rustc_trans..LlvmTransCrate$u20$as$u20$rustc_trans_utils..trans_crate..TransCrate$GT$::trans_crate::h800dcda488afb790 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends/librustc_trans-llvm.so
#30 0x00007ffff7a2ed36 in rustc::util::common::time::hcf488eceeda0a9cc () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#31 0x00007ffff7a1ea4d in rustc_driver::driver::phase_4_translate_to_llvm::h13463ad7ae997e4b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#32 0x00007ffff7ac3016 in rustc_driver::driver::compile_input::_$u7b$$u7b$closure$u7d$$u7d$::h31e9ff1f64156b74 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#33 0x00007ffff7ab6af5 in rustc::ty::context::tls::enter_context::hb86cbd2c401dfb18 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#34 0x00007ffff7a818be in _$LT$std..thread..local..LocalKey$LT$T$GT$$GT$::with::h19074a28b3607093 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#35 0x00007ffff79ceb5c in rustc::ty::context::TyCtxt::create_and_enter::h3499c83bc74e5346 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#36 0x00007ffff7a17a0b in rustc_driver::driver::compile_input::ha789cddfb4d7cf8b () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#37 0x00007ffff7ad5735 in rustc_driver::run_compiler_impl::he0bf21247a724f12 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#38 0x00007ffff7a36cf5 in _$LT$scoped_tls..ScopedKey$LT$T$GT$$GT$::set::h0a4a9cc70840e625 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#39 0x00007ffff79eae95 in syntax::with_globals::h426e57a0d566387f () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#40 0x00007ffff79dde2e in _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h31900b1f5e62de36 ()
   from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#41 0x00007ffff766ff7a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#42 0x00007ffff7ad2814 in rustc_driver::run::hbcc80b82a59ea630 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#43 0x00007ffff7ae02bb in rustc_driver::main::hd5d5ac3338638be9 () from /home/kennytm/.rustup/toolchains/c1168be5360f17516b233be85ebb193bb4e612bf/bin/../lib/librustc_driver-faeae2bec58daae3.so
#44 0x0000555555554c13 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hbdc018091d795d35 ()
#45 0x00007ffff7630073 in std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h0e173cc915a5091f () at libstd/rt.rs:59
#46 std::panicking::try::do_call::h40752f40c51cf13d () at libstd/panicking.rs:310
#47 0x00007ffff766ff7a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#48 0x00007ffff7645cf6 in std::panicking::try::h431a7a5b8edb0415 () at libstd/panicking.rs:289
#49 std::panic::catch_unwind::h1eb313d318d299fa () at libstd/panic.rs:374
#50 std::rt::lang_start_internal::h83a5cbcfc3241381 () at libstd/rt.rs:58
#51 0x0000555555554c74 in main ()

Code surrounding the call of that memmove

   0x00007ffff49a2303 <+3411>:  je     0x7ffff49a237b <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3531>
   0x00007ffff49a2305 <+3413>:  mov    0x100(%rsp),%rax
   0x00007ffff49a230d <+3421>:  mov    %rax,0xa0(%rsp)
   0x00007ffff49a2315 <+3429>:  movaps 0xf0(%rsp),%xmm0
   0x00007ffff49a231d <+3437>:  movaps %xmm0,0x90(%rsp)
   0x00007ffff49a2325 <+3445>:  movaps 0xb0(%rsp),%xmm0
   0x00007ffff49a232d <+3453>:  movaps 0xc0(%rsp),%xmm1
   0x00007ffff49a2335 <+3461>:  movdqa 0xd0(%rsp),%xmm2
   0x00007ffff49a233e <+3470>:  movaps 0xe0(%rsp),%xmm3
   0x00007ffff49a2346 <+3478>:  movaps %xmm3,0x80(%rsp)
   0x00007ffff49a234e <+3486>:  movdqa %xmm2,0x70(%rsp)
   0x00007ffff49a2354 <+3492>:  movaps %xmm1,0x60(%rsp)
   0x00007ffff49a2359 <+3497>:  movaps %xmm0,0x50(%rsp)
   0x00007ffff49a235e <+3502>:  mov    %r12,%rdi
   0x00007ffff49a2361 <+3505>:  callq  0x7ffff46c8880 <_ZN4core3ptr13drop_in_place17h3188964708fd179cE.llvm.12071625485570946806>
   0x00007ffff49a2366 <+3510>:  jmpq   0x7ffff49a2290 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3296>
   0x00007ffff49a236b <+3515>:  nopl   0x0(%rax,%rax,1)
   0x00007ffff49a2370 <+3520>:  movl   $0x9,0xb0(%rsp)
   0x00007ffff49a237b <+3531>:  mov    0x368(%rsp),%rax
   0x00007ffff49a2383 <+3539>:  test   %rax,%rax
   0x00007ffff49a2386 <+3542>:  je     0x7ffff49a23c4 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3604>
   0x00007ffff49a2388 <+3544>:  mov    0x380(%rsp),%r15
   0x00007ffff49a2390 <+3552>:  mov    (%r15),%rcx
   0x00007ffff49a2393 <+3555>:  mov    0x10(%r15),%rbx
   0x00007ffff49a2397 <+3559>:  imul   $0x58,0x360(%rsp),%rsi
   0x00007ffff49a23a0 <+3568>:  add    %rcx,%rsi
   0x00007ffff49a23a3 <+3571>:  imul   $0x58,%rbx,%rdi
   0x00007ffff49a23a7 <+3575>:  add    %rcx,%rdi
   0x00007ffff49a23aa <+3578>:  imul   $0x58,%rax,%rdx
   0x00007ffff49a23ae <+3582>:  callq  0x7ffff46be350 <memmove@plt>
=> 0x00007ffff49a23b3 <+3587>:  add    0x368(%rsp),%rbx
   0x00007ffff49a23bb <+3595>:  mov    %rbx,0x10(%r15)
   0x00007ffff49a23bf <+3599>:  mov    0x28(%rsp),%r15
   0x00007ffff49a23c4 <+3604>:  mov    %r14,%rdi
   0x00007ffff49a23c7 <+3607>:  callq  0x7ffff46e3080 <_ZN71_$LT$alloc..vec..IntoIter$LT$T$GT$$u20$as$u20$core..ops..drop..Drop$GT$4drop17h62e13b79d8b9a38aE>
   0x00007ffff49a23cc <+3612>:  mov    0x3b0(%rsp),%rdi
   0x00007ffff49a23d4 <+3620>:  mov    $0x38,%esi
   0x00007ffff49a23d9 <+3625>:  mov    $0x8,%edx
   0x00007ffff49a23de <+3630>:  callq  0x7ffff46bbd40 <__rust_dealloc@plt>
   0x00007ffff49a23e3 <+3635>:  lea    0x3b8(%rsp),%rdi
   0x00007ffff49a23eb <+3643>:  callq  0x7ffff4993fd0 <_ZN4core3ptr13drop_in_place17hdc52da31c044fba0E.llvm.13745968970215452383>
   0x00007ffff49a23f0 <+3648>:  cmpl   $0x9,0x3f8(%rsp)
   0x00007ffff49a23f8 <+3656>:  je     0x7ffff49a2407 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3671>
   0x00007ffff49a23fa <+3658>:  lea    0x3f8(%rsp),%rdi
   0x00007ffff49a2402 <+3666>:  callq  0x7ffff49911e0 <_ZN4core3ptr13drop_in_place17h3188964708fd179cE>
   0x00007ffff49a2407 <+3671>:  mov    0x48(%rsp),%rsi
   0x00007ffff49a240c <+3676>:  cmp    0x40(%rsp),%rsi
   0x00007ffff49a2411 <+3681>:  lea    0x120(%rsp),%r14
   0x00007ffff49a2419 <+3689>:  jne    0x7ffff49a1ff3 <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+2627>
   0x00007ffff49a241f <+3695>:  lea    0x30(%rsp),%rdi
   0x00007ffff49a2424 <+3700>:  callq  0x7ffff46e2b30 <_ZN71_$LT$alloc..vec..IntoIter$LT$T$GT$$u20$as$u20$core..ops..drop..Drop$GT$4drop17h3943334a18285159E>
   0x00007ffff49a2429 <+3705>:  lea    -0x28(%rbp),%rsp
   0x00007ffff49a242d <+3709>:  pop    %rbx
   0x00007ffff49a242e <+3710>:  pop    %r12
   0x00007ffff49a2430 <+3712>:  pop    %r13
   0x00007ffff49a2432 <+3714>:  pop    %r14
   0x00007ffff49a2434 <+3716>:  pop    %r15
   0x00007ffff49a2436 <+3718>:  pop    %rbp
   0x00007ffff49a2437 <+3719>:  retq   
   0x00007ffff49a2438 <+3720>:  add    $0xffffffffffffffff,%rsi
   0x00007ffff49a243c <+3724>:  lea    0x31231d(%rip),%rdi        # 0x7ffff4cb4760
   0x00007ffff49a2443 <+3731>:  callq  0x7ffff46bc360 <_ZN4core9panicking18panic_bounds_check17hcc582842c20cc97bE@plt>
   0x00007ffff49a2448 <+3736>:  jmp    0x7ffff49a245d <_ZN5rustc3mir14BasicBlockData17expand_statements17hf1d4ebe84ecd6e03E+3757>
   0x00007ffff49a244a <+3738>:  add    $0xffffffffffffffff,%r15
   0x00007ffff49a244e <+3742>:  lea    0x312323(%rip),%rdi        # 0x7ffff4cb4778
   0x00007ffff49a2455 <+3749>:  mov    %r15,%rsi
   0x00007ffff49a2458 <+3752>:  callq  0x7ffff46bc360 <_ZN4core9panicking18panic_bounds_check17hcc582842c20cc97bE@plt>
   0x00007ffff49a245d <+3757>:  ud2    
   0x00007ffff49a245f <+3759>:  lea    0x312c52(%rip),%rax        # 0x7ffff4cb50b8
   0x00007ffff49a2466 <+3766>:  mov    %rax,0x120(%rsp)
   0x00007ffff49a246e <+3774>:  movq   $0x1,0x128(%rsp)
   0x00007ffff49a247a <+3786>:  movq   $0x0,0x130(%rsp)
   0x00007ffff49a2486 <+3798>:  lea    0x324ab(%rip),%rax        # 0x7ffff49d4938 <byte_str.f.llvm.13745968970215452383>

Presuming the same two commits would be responsible for the html5ever regression, I also cannot reproduce it on 18.04 Ubuntu.

Using this dockerfile:

FROM ubuntu:16.04

RUN apt-get update -y && apt-get install -y curl gcc libssl-dev file make pkg-config time git

RUN curl https://sh.rustup.rs | sh -s -- -y

ENV PATH=$PATH:/root/.cargo/bin
RUN cargo install rustup-toolchain-install-master
RUN rustup-toolchain-install-master \
  c1168be5360f17516b233be85ebb193bb4e612bf  \
  a2726846f6d6280b752019472b6bd791c0752006

RUN git clone https://github.com/rust-lang-nursery/rustc-perf

RUN cargo +a2726846f6d6280b752019472b6bd791c0752006 \
  fetch --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml
RUN cargo +a2726846f6d6280b752019472b6bd791c0752006 \
  build --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml
RUN cargo +c1168be5360f17516b233be85ebb193bb4e612bf \
  build --manifest-path rustc-perf/collector/benchmarks/html5ever/Cargo.toml

I can reproduce the regression from 10s to 20s on html5ever.

Starting from 18.04 it does indeed not show a regression.

The perf server is indeed 16.04.

Ok so it looks like this is a regression specific to something that changed between glibc 2.23 and 2.27. With the profiles it looks like it's something related to the efficiency of memcpy/memmove. It also seems like this isn't reproducible on OSX (thanks @kennytm). I can, however, reproduce it locally on Windows 64-bit MSVC.

Since @kennytm could reproduce a slight slowdown on OSX, I wonder if these code paths do something like exercise a ton of moves? It may be the case that glibc optimized one of its routines in the 2.23 -> 2.27 transition (and OSX already had it). In that case it may appear like there's no regression on Linux/OSX but on glibc 2.23 and Windows they may have unoptimized versions of the routines causing slowdowns.
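As a rough illustration of how "a ton of moves" can add up, the sketch below counts how many existing elements get shifted when items are inserted into a `Vec` one at a time at the front. This is a generic example and an assumption about the kind of pattern involved, not the actual rustc code path; the function name `insert_front_counting` is hypothetical.

```rust
/// Hypothetical helper: insert `0..n` one at a time at index 0 and count
/// how many already-present elements each insertion has to shift.
fn insert_front_counting(n: usize) -> (Vec<usize>, usize) {
    let mut v: Vec<usize> = Vec::new();
    let mut shifted = 0;
    for i in 0..n {
        // Inserting at index 0 moves every element currently in the vector.
        shifted += v.len();
        v.insert(0, i);
    }
    (v, shifted)
}
```

The count grows as n * (n - 1) / 2, so the total work is quadratic in the number of insertions, and any per-byte inefficiency in the underlying move routine gets multiplied accordingly.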

All in all I think that this is a regression that is still worth fixing, despite it not immediately being applicable on glibc 2.27 or OSX. The fact that it affects somewhat-recent versions of Ubuntu and Windows is enough for me.

Isn't glibc 2.27 the version released after the spectre/meltdown/whatever fixes that slow (primarily Intel but to a lesser extent AMD as well) systems down?

No, sorry, glibc 2.27 is just what I tested. I've now narrowed it down (conveniently) to glibc 2.23 and 2.24. Ubuntu 16.04 ships with 2.23 and 16.10 ships with 2.24. On 2.24 I'm seeing a faster compilation.

Perusing the glibc release announcement it looks like this change may be why we're not seeing the regression in newer Linux

Ok I've found the source of the regression, will post a PR soon.

It's a pleasure watching you folks work! :D

I've posted what I believe is a fix to https://github.com/rust-lang/rust/pull/50575

triage: P-high

Has pending fix.
