A local build of Servo's style crate shows a huge amount of time (46 seconds!) spent in name resolution. It turns out that basically all of it is spent calling span_to_snippet, specifically this one. That's a lot of unused imports!

Sure enough, if we generate synthetic files locally with this script:
```sh
N=20000
echo '#![allow(unused)]' > lots-of-unused.rs
for i in `seq 1 $N`; do
    echo "use foo as foo$i;" >> lots-of-unused.rs
done
echo 'fn foo() {}' >> lots-of-unused.rs
echo 'fn main() {}' >> lots-of-unused.rs

echo > lots-of-used.rs
for i in `seq 1 $N`; do
    echo "use test::foo$i;" >> lots-of-used.rs
done
echo 'mod test {' >> lots-of-used.rs
for i in `seq 1 $N`; do
    echo "pub fn foo$i() {}" >> lots-of-used.rs
done
echo '}' >> lots-of-used.rs
echo 'fn main() {' >> lots-of-used.rs
for i in `seq 1 $N`; do
    echo "foo$i();" >> lots-of-used.rs
done
echo '}' >> lots-of-used.rs
```
we find:
```console
$ rustc +nightly -V
rustc 1.21.0-nightly (aac223f4f 2017-07-30)
$ sh foo.sh
$ rustc +nightly lots-of-used.rs -Z time-passes 2>&1 | grep 'name resolution'
time: 0.033; rss: 164MB name resolution
$ rustc +nightly lots-of-unused.rs -Z time-passes 2>&1 | grep 'name resolution'
time: 1.139; rss: 136MB name resolution
```
That's a lot of time spent processing unused imports!
The slowdown here isn't quite the same as the style crate's, unfortunately: the style crate only has ~7k unused imports, which apparently takes it 48s in that profile. In any case the point is clear: lots of unused imports seem to slow down compilation, and the style crate looks to have a lot of unused imports!
cc @rust-lang/compiler
We could address this quickly by just emitting the first 10 unused-import warnings and then a final warning that there are more unused imports not shown.
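A minimal sketch of that capping logic (the function and constant names here are hypothetical, not rustc's actual diagnostics API):

```rust
// Sketch of the proposed cap: emit at most MAX_SHOWN unused-import
// warnings, then a single summary line for the rest.
const MAX_SHOWN: usize = 10;

fn render_unused_import_warnings(unused: &[String]) -> Vec<String> {
    let mut out: Vec<String> = unused
        .iter()
        .take(MAX_SHOWN)
        .map(|name| format!("warning: unused import: `{}`", name))
        .collect();
    let hidden = unused.len().saturating_sub(MAX_SHOWN);
    if hidden > 0 {
        out.push(format!("warning: {} more unused imports not shown", hidden));
    }
    out
}
```

For 20k unused imports this would emit 11 lines instead of 20k, and crucially would only need snippets for the first 10.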
I can never deal with more than 10 lints at a time myself anyway, so that seems like a reasonable solution to me!
Dumping out a ton of warnings is useful when you're hooked into an IDE and can see them as, e.g., green squiggles under the offending text, as opposed to 10k lines in a terminal.
I think the reason span_to_snippet is slow is because it calls ensure_filemap_source_present, which does this:
https://github.com/rust-lang/rust/blob/master/src/libsyntax/codemap.rs#L564
```rust
fn ensure_filemap_source_present(&self, file_map: Rc<FileMap>) -> bool {
    let src = self.file_loader.read_file(Path::new(&file_map.name)).ok();
    return file_map.add_external_src(src)
}
```
I don't think we want to read source files multiple times like that.
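The fix direction would be to read each file at most once and cache the contents, so repeated span_to_snippet calls don't re-read the file. A self-contained sketch, with illustrative stand-in types rather than rustc's real FileMap/CodeMap:

```rust
use std::collections::HashMap;

// Sketch: a loader that memoizes file contents by path.
struct CachingLoader {
    cache: HashMap<String, String>,
}

impl CachingLoader {
    fn new() -> Self {
        CachingLoader { cache: HashMap::new() }
    }

    // `read` stands in for the real filesystem read; it runs at most
    // once per path, after which the cached contents are returned.
    fn get_source<F: FnOnce() -> String>(&mut self, path: &str, read: F) -> &str {
        self.cache.entry(path.to_string()).or_insert_with(read)
    }
}
```

With something like this in place, 20k snippet lookups into one file cost one read plus 20k hash lookups instead of 20k reads.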
Thank you for looking into Servo compile times, Alex, and great find here! Could you CC me or someone from the Servo team on Servo-related issues? Thanks.
Can do!
In general I've always wanted the lint system to avoid even running in #[allow] blocks.
Currently the lint system runs all lints in parallel on the tree. This is great since it means we only traverse it once, and the cost of the traversal is consolidated. We don't need to reduce that cost itself (which is nice because then we don't have to worry about #[warn] inside #[allow]).
_Most_ lints do not maintain state. The lint framework has the capability to do so, and we use it a couple of times in Clippy, but most lints are `struct FooPass; impl LateLintPass for FooPass {...}`, so they won't notice if `check_expr` or whatever isn't called in a region of #[allow].
The "allowness" of a lint is tied to where in the tree the lint traversal is, _not_ the span emitted by the lint. It's perfectly possible to #[allow] a block of code and have the lint still be emitted there, because a different block of code (where it is #[warn]) hit the lint but emitted a span it got from elsewhere. We've had trouble with this before in Clippy, iirc.
All of this enables this proposal: If we had a specialized StatelessLateLintPass trait for this, then when we have an #[allow] we can turn off calling lint methods entirely, and when there's a #[warn] turn it on again.
It doesn't even have to be a trait, really; it could just be a defaulted fake-static method (until https://github.com/rust-lang/rfcs/pull/2027 happens) on LateLintPass.
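The defaulted-method variant of the proposal could look something like this. The trait, method names, and `FooPass` lint here are made up for illustration; rustc's real LateLintPass has many more methods:

```rust
// Sketch: LateLintPass grows a method, defaulting to "stateful",
// that stateless passes override. The framework can then skip a
// stateless pass entirely inside #[allow] regions.
trait LintPass {
    fn is_stateless(&self) -> bool {
        false // conservative default: assume the pass keeps state
    }
    fn check_expr(&mut self, expr: &str) -> Option<String>;
}

struct FooPass; // a typical stateless pass, as described above

impl LintPass for FooPass {
    fn is_stateless(&self) -> bool {
        true
    }
    fn check_expr(&mut self, expr: &str) -> Option<String> {
        if expr.contains("()") {
            Some("warning".to_string())
        } else {
            None
        }
    }
}

// The traversal consults the flag: stateless passes are skipped in
// #[allow] regions, stateful ones still run everywhere.
fn should_run(pass: &dyn LintPass, in_allow_region: bool) -> bool {
    !(in_allow_region && pass.is_stateless())
}
```

The default of `false` means existing stateful passes keep their current behavior, and passes opt in to being skipped.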
I suspect this will get significant speedups for crates like style. I _may_ be able to take a crack at this if y'all think this is a good approach.
It occurs to me that a boolean check per lint at each tree node might actually end up being more expensive. I'll have a look at this code more closely. It might be possible to make this fast by keeping a second array of trait objects that contains only the active lint passes.
It seems like we do a large enough amount of work that the whole "vector of active lints" stuff won't really be necessary.
However, I just realized that this bug is not in the lint pass but in the resolution phase (which defers linting to the actual lint pass). This optimization won't help on its own, though we can fix the bug by doing the span_to_snippet in the actual lint and then making this optimization.
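The shape of that fix is to have resolution record only the span of the unused import, and have the lint pass do the expensive snippet lookup lazily, only when the lint actually fires. A rough sketch with hypothetical names (`Span` here is a toy stand-in, and the `allowed` flag stands in for the lint level at the emission site):

```rust
// Sketch: defer the span_to_snippet analogue to lint time, so code
// under #[allow(unused)] never pays for reading the source text.
struct Span {
    lo: usize,
    hi: usize,
}

fn unused_import_message(span: &Span, allowed: bool, source: &str) -> Option<String> {
    if allowed {
        // Under #[allow(unused)] we return early and never touch the
        // source text, so no file read happens at all.
        return None;
    }
    // Only a lint that actually fires pays for the snippet.
    Some(format!("unused import: `{}`", &source[span.lo..span.hi]))
}
```

This is exactly the case in the style crate: with `#![allow(unused)]` at the top, all 7k lookups would be skipped.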
So in that case I suspect this optimization is unnecessary. In general we've rarely had performance issues with lints in clippy (which has way more lints than rustc), and we've been pretty careful to measure impact there.
Still, if folks think this optimization can help, let me know.