cargo 🚀 - Add an option to optimize just dependencies

Would this apply to the standard library as well? (I.e., compiling your code in debug mode while the standard library is compiled in release mode.)

mystise on 11 Mar 2015

Unfortunately no, the standard library currently only comes as optimized.

alexcrichton on 11 Mar 2015

Oh, excellent. That's actually what I was hoping would happen, as then most of the bottlenecks are going to be in my code that calls the std lib rather than in the std lib itself.

mystise on 11 Mar 2015

+1. I want this for Servo ipc-channel.

pcwalton on 23 Jul 2015

@alexcrichton I think I can work on this.

From the design point of view, we can add an optimize-dependencies flag, or we can add a separate dev-dependencies profile. The first option is simpler; the second is more flexible and would allow implementing a default like

[profiles.dev-deps] 
opt-level = 2 
debug = true 
debug-assertions = true

which may be better than the current default of slow dependencies.

These two options are somewhat orthogonal, and it should be possible to implement one first and add the other later.

Another design question is what exactly a dependency is? If you have some path dependencies, do you want them to be optimized? And what about overridden dependencies? I can suggest two solutions here.

A package is _not_ a dependency if it is

1) a root package.
2) a package for which source_id().is_path() is true (this includes the root package)

From the implementation point of view, I hope that the only thing that needs to change is the lib_profile function :)

matklad on 7 Feb 2016

👍2

@matklad awesome! I think that this definitely has a bit of a design aspect to it before charging ahead, although I _think_ you're correct in that the function you highlighted is the only one that needs to change (convenient, eh?).

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:

[profiles.dev] 
dependency-profile = 'release'

or something like that. Basically you select a _profile_ for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

I also agree about your heuristic about what a dependency is and what isn't. In general I think that all path dependencies are just implementation details (or units of incrementality) that are part of the current crate being worked on. That being said, however, the name "dependency" may be a bit misleading there because they are indeed dependencies.

I wonder if perhaps the term "upstream" could be used? That may help distinguish "other people's code" from "my local code". The terms "upstream" and "downstream" are sometimes conflated, though, so that may not be best...

And all _that_ being said you may still want to optimize path dependencies. For example if you temporarily override a perf-critical dependency to a local crate, you probably still want to optimize it even though you're working on it.

I think that I might somewhat lean towards "dependency" meaning "anything not the root package". We could eventually support something where you can specify profiles for specific dependencies in a Cargo.toml, and that could perhaps be used to optimize everything but a few packages.

alexcrichton on 8 Feb 2016

Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

Yes, I think it is the best approach.

I wonder if perhaps the term "upstream" could be used?

I find "upstream" even more confusing than "dependency". What about "local package"?

I think that I might somewhat lean towards "dependency" meaning "anything not the root package".

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages, however, are rebuilt more often (n times) than deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages, the right solution might be to add opt-level=2 to profiles.dev.

Out of curiosity, how will this work with monomorphization? If I have a generic function in a dependency compiled in release mode, and use it from a crate compiled in debug mode, will I get any benefits? Suppose that the generic function does not call other non-generic functions.

matklad on 8 Feb 2016

"local" does seem kinda on the right track, yeah, although it may be kind of odd saying:

[profile.dev]
non-local-dependency-profile = 'release'

Quite a mouthful!

This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into.

alexcrichton on 9 Feb 2016

Maybe vendor instead of non-local?

whitequark on 9 Feb 2016

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:
[profiles.dev] 
dependency-profile = 'release'

Hm, in this piece of TOML two ideas are expressed:

1) specify dependency profile override by name,
2) specify dependency override in the profile itself.

I totally agree with 1, but 2 does not feel right: ideally a profile should be set per package, but dependency-profile is a global property.

What about

[profile-overrides.dev]
vendor-dependencies="release"

? This can be extended in the future to handle more fine-grained overrides.

matklad on 9 Feb 2016

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one _already_ needs to be reflected in another, so maybe it's not so bad?

I'd be somewhat wary of inventing new top-level keys like profile-overrides (maybe we could stick it in [project] if we want?). I'm also not sure that "vendor" conveys the right information here because to me "vendor" typically means what's literally included locally, rather than what I'm using from crates.io

alexcrichton on 10 Feb 2016

👍1

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?

I'd like to ask what exactly a profile is. I mean, there is a test profile and a release profile, but there are also cargo test and cargo test --release, which do different things.

Here is a model I have in mind for this feature:

A profile is basically a description of a set of flags, applied during compilation _of a package_. And there is a "compilation mode", which determines what profiles are applied to what packages. For example, there is the --release mode, which applies the release profile to all packages.

As far as I understand, the notion of a compilation mode is currently not first-class in Cargo. For example, in Context, build_config.release and unit.profile.test are used to make decisions about compilation. This leads to surprising (?) effects. If I have

[profile.test] 
opt-level = 3

in Cargo.toml, then for cargo test only my crate will be built with optimizations enabled, but all dependencies will use flags from the dev profile.

matklad on 11 Feb 2016

Ah yeah to clarify, when I say profile I mean what you're writing down in Cargo.toml like test, release, dev, and bench. I will agree, however, that Cargo's treatment of these profiles is kinda ad-hoc and in general "not great" as it may be surprising (as you're encountering here).

That being said, though, my motivation of foo = "profile-name" is because we may one day support custom profiles, and otherwise I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

alexcrichton on 11 Feb 2016

I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

Yes, if we specify override globally, like

[project]
profile-overrides = { upstream-dependencies = "profile-name" }

Likely not, if we use something like

[profiles.test]
profile-overrides = { upstream-dependencies = "profile-name" }

matklad on 11 Feb 2016

@alexcrichton what about

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages, however, are rebuilt more often (n times) than deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages, the right solution might be to add opt-level=2 to profiles.dev.

matklad on 11 Feb 2016

Yeah we could in theory support a top-level "always override dependencies with this profile", but I suspect that it's likely to be a local decision per-profile (especially if we grow custom profiles) rather than an always-true option. Note that the dev/test profile management here may want to be improved...

I can also see the "dependency = anything non local" logic, yeah, although I think it can work both ways. You may _sometimes_ be frobbing all the path dependencies at once, but there could also be legitimate cases where only the main one is being modified. I suspect though that the "anything non local" case is more common.

At least in terms of performance, I would expect any bottlenecked dependencies to all be upstream rather than being worked on locally.

alexcrichton on 11 Feb 2016

There are some concerns on #2380, outlining a few tensions to balance when implementing this.

alexcrichton on 26 Feb 2016

@alexcrichton I agree with your analysis. One thing I want to add is that there is a possible case of optimizing (O1?) dependencies by default, which still can (maybe) bring some benefits.

And for future reference, I've done some dirty "benchmarks" of the approach in #2380. I've benchmarked this crate (this crate was the reason I found this issue :) ). It does some OpenGL rendering in a super inefficient and unidiomatic way. The major run time bottleneck is texture loading (which is done by a library).

So here are the measurements of a recompile cycle after a trivial whitespace edit in "src/lib.rs" and a corresponding time to see something rendered.

Usual debug build

$CARGO build --bin mirror  24.93s user 0.92s system 100% cpu 25.840 total
$CARGO build --bin mirror  25.22s user 0.89s system 100% cpu 26.096 total
./target/debug/mirror  68.11s user 0.24s system 97% cpu 1:10.14 total
./target/debug/mirror  68.26s user 0.22s system 98% cpu 1:09.60 total

Usual --release build

$CARGO build --release --bin mirror  72.13s user 0.72s system 100% cpu 1:12.81 total
$CARGO build --release --bin mirror  69.92s user 0.73s system 100% cpu 1:10.61 total
./target/release/mirror  1.59s user 0.21s system 63% cpu 2.836 total
./target/release/mirror  1.66s user 0.20s system 71% cpu 2.602 total

Debug build with release deps

[profile.release]
debug = true
debug-assertions = true

[profile.dev]
dependencies-profile = "release"

$CARGO build --bin mirror  24.13s user 0.71s system 100% cpu 24.832 total
$CARGO build --bin mirror  24.27s user 0.75s system 100% cpu 25.005 total
./target/debug/mirror  25.40s user 0.22s system 97% cpu 26.354 total
./target/debug/mirror  25.38s user 0.23s system 96% cpu 26.536 total

Debug build with opt-level=1 deps

[profile.release]
debug = true
debug-assertions = true
opt-level = 1

[profile.dev]
dependencies-profile = "release"

$CARGO build --bin mirror  24.65s user 0.80s system 100% cpu 25.432 total
$CARGO build --bin mirror  24.58s user 0.84s system 100% cpu 25.404 total
./target/debug/mirror  50.54s user 0.25s system 98% cpu 51.672 total
./target/debug/mirror  50.30s user 0.23s system 98% cpu 51.325 total

Debug build with opt-level=1 for everything

[profile.dev]
opt-level = 1

$CARGO build --bin mirror  38.91s user 0.81s system 100% cpu 39.699 total
$CARGO build --bin mirror  38.68s user 0.80s system 100% cpu 39.459 total
./target/debug/mirror  34.61s user 0.26s system 95% cpu 36.404 total
./target/debug/mirror  34.55s user 0.25s system 96% cpu 36.139 total

| | debug | release | deps-release | deps opt-level=1 | all opt-level=1 |
| --- | --- | --- | --- | --- | --- |
| recompile | 25s | 70s | 25s | 25s | 38s |
| run | 68s | 2s | 25s | 51s | 36s |

I kinda like how compile and run time coincide in the third case =)

matklad on 27 Feb 2016

Thanks for the data @matklad! Out of curiosity, what do the numbers look like if opt-level is set to 1?

alexcrichton on 29 Feb 2016

@alexcrichton updated the table

matklad on 29 Feb 2016

Oh sorry, I meant for O1 that the entire graph had opt-level 1, not just the dependencies. Does that reduce the runtime at all?

alexcrichton on 29 Feb 2016

yep, updated the table.

matklad on 1 Mar 2016

Thanks! Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

alexcrichton on 1 Mar 2016

Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

Not sure. First of all, I don't think that these benchmarks are really representative (the target crate is just some quick and dirty OpenGL exercises, nothing close to production). In particular, the target crate is pretty small, and I suppose that compile time grows linearly with code size, while run time is sublinear. That is, for large code bases, the O1 compile time overhead may kill any run time benefits.

And given current largish compile times, I won't feel comfortable sacrificing compile time for anything at all :)

matklad on 1 Mar 2016

Just want to say that I'd really like to see this or something like it implemented. My use case is building Rust programs for microcontrollers that may have as little persistent storage (Flash memory) as 8 KB. I'm currently mainly working with a device that has 128 KB of Flash, and some programs compiled with the dev profile can be as big as ~40 KB, whereas in release mode they would be 1-4 KB. If I were working with an 8 KB device, I wouldn't even be able to flash programs compiled with the dev profile into the device ...

Some thoughts:

I can't just always use release mode because I want to be able to debug my program. Executing my program statement by statement inside GDB is how I usually debug my programs and release builds have awful source maps because of inlining.
Most of the time, I don't care about debug information of my dependencies because I want to debug my program, not my dependencies.
Run time of programs compiled with the dev profile is horrible, no surprises there. If I were doing more time sensitive stuff then the dev profile would probably be unusable. Being able to compile the dependencies with O3 will probably improve run time a lot.
If this doesn't get implemented by the time Cargo becomes "std-aware", I expect that people will continue using Xargo instead of Cargo because "std-aware" Cargo will compile the core crate without optimizations in dev builds and that's going to aggravate all the problems I mentioned above. (Xargo compiles the core crate and the rest of the sysroot with --release)
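
For readers arriving later, here is a minimal sketch of how this use case might be expressed with the profile override syntax documented in the Cargo reference linked at the end of this thread. That syntax was not available when this comment was written. The choice of `opt-level = "s"` and `debug = false` for dependencies is an assumption made for illustration, and whether it shrinks a given binary enough would need measuring.

```toml
# Cargo.toml (illustrative sketch; values are assumptions, not measurements)

[profile.dev]
opt-level = 0      # keep your own crate unoptimized so stepping in GDB stays readable
debug = true       # full debug info for your own code

# "*" matches every dependency that is not a workspace member
[profile.dev.package."*"]
opt-level = "s"    # optimize dependencies for size
debug = false      # skip debug info for code you do not plan to step through
```

As discussed earlier in the thread, generic code from a dependency is optimized at the level of the crate it is instantiated into, so heavily generic dependencies may benefit less from such an override.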

japaric on 2 Nov 2016

❤8

Well, I would argue that dependencies should always be compiled in release mode, and compiled in debug mode only when needed.

Most of the time, I don't care about debug information of my dependencies because I want to debug my program, not my dependencies.

Also, it probably hurts newcomers' first impression a little when the default cargo build is not release mode and they see "slowness".

tyoc213 on 21 Nov 2016

This is very noticeable when using dependencies for build scripts that do heavy work. For example, in Gecko we plan to use bindgen at build time, and compiling it in debug mode is pretty slow. cc @upsuper

emilio on 12 Dec 2016

👍1

and compiling it in debug mode is pretty slow

Not compiling it... but running the build script which uses bindgen is pretty slow.
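
For later readers: the profile documentation linked at the end of this thread describes a `build-override` table that applies to build scripts, proc macros, and their dependencies. A minimal sketch for the bindgen case described above, assuming a Cargo version that supports that table (`opt-level = 3` is an illustrative choice, not a recommendation from this thread):

```toml
# Cargo.toml (sketch)

# Applies to build scripts, proc macros, and their dependencies,
# so a bindgen-based build.rs runs optimized even in a dev build.
[profile.dev.build-override]
opt-level = 3
```

The trade-off is the one debated throughout this issue: the build-time dependencies take longer to compile once, in exchange for a faster build script on every run.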

upsuper on 15 Dec 2016

Another example: decoding ogg/vorbis streams. It can decode in debug mode faster than real time, but if you have a game and, e.g., want to decode the file before you start playing it (for performance or other reasons), then there is a noticeable impact, especially when people compare the performance to native C performance. People actually get confused by this. And that person had the courage to file an issue report; you shouldn't forget all the people who didn't.

est31 on 11 Apr 2017

👍4

I personally would like more fine grained control over this

Something like this:

[dependencies]
foo = { path = "../foo", dev-build = "release" }
bar = { path = "../bar", dev-build = "debug" }
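
For comparison, the override syntax that was eventually documented (see the link at the end of this thread) places per-package settings under the profile rather than under `[dependencies]`. A sketch using the `foo` and `bar` names from the proposal above; the `opt-level` value is illustrative:

```toml
# Cargo.toml (sketch)

[dependencies]
foo = { path = "../foo" }
bar = { path = "../bar" }

# Build only `foo` with optimizations in dev builds.
[profile.dev.package.foo]
opt-level = 3

# `bar` has no override, so it uses the plain [profile.dev] settings.
```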

MaikKlein on 9 May 2017

👍2

@MaikKlein yeah, we probably want to completely rethink the whole "profiles" thing to allow tweaking options more easily and predictably. I'll probably try to write a pre-pre-rfc on the topic.

matklad on 10 May 2017

❤5

Has there been any progress on this? https://github.com/rust-lang/cargo/pull/1826 got pretty close to creating a solution that'd work for me, unfortunately it was never merged. It was abandoned in favor of https://github.com/rust-lang/cargo/issues/942. Was a solution implemented based on that?

mkovacs on 11 Jun 2017

@mkovacs currently, no, there hasn't been much progress on this. This feature remains unimplemented but @matklad sounds like he's got thoughts above

alexcrichton on 13 Jun 2017

Yeah, this is basically blocked on profiles, and I have not yet written a pre RFC because I don't know what to write in the "proposed solution" section, but I'll link my current take on the problem formulation: https://github.com/rust-lang/cargo/issues/4140#issuecomment-306955783

matklad on 13 Jun 2017

Could profiles be separated from an overall option to build dependencies as release?

For example, the new default could be debug local code + release crate dependencies with a --debug-all as a counterpart to the existing --release parameter?

mqudsi on 16 Jul 2017

I think we should fix the issue in two steps:

First, provide an unstable option to support this feature somehow
Second, work on profiles 2.0 which hopefully will support something like this, deprecating and removing the unstable option

est31 on 16 Jul 2017

👍5

Is there currently any workaround to make one of the dependencies build in release mode (optimized, without debug info)? I have one dependency that is way slower in debug than in release.

chyvonomys on 12 Sep 2017

@ozkriff @not-fl3 yesterday at the Rust gamedev meetup in Saint Petersburg you both said that this feature would be very useful for gamedev. So I've decided to update my PR so that you can check whether this feature indeed brings the benefits you expect (there's a fear that, due to monomorphization, the optimization level for deps might be irrelevant).

Given that we now have unstable cargo features, I think that we could probably land this in some unstable form. For sure, we won't stabilize anything close to my implementation, because profiles need to be fixed first. However, that probably won't happen soon, so, if this is useful, we can try to help at least nightly users. As this only affects the development experience of binary crate authors, I don't think there's a high risk in adding this under a feature flag.

To try this out, install Cargo from this branch: https://github.com/matklad/cargo/tree/fast-dependencies-2

cp ~/.cargo/bin/cargo ~/.cargo/bin/rustup-cargo
cargo install --git https://github.com/matklad/cargo --branch fast-dependencies-2 --force

Then add to your Cargo.toml (for workspaces this must be a non-virtual manifest):

cargo-features = ["always-optimize-deps"]
always-optimize-deps = true

matklad on 15 Sep 2017

❤10 👍7 🎉6

Perhaps rustc could compile monomorphized code from external crates according to the optimization flags given when those crates were built, distinct from the flags applying to the current crate?

Ralith on 15 Sep 2017

@matklad It works beautifully for my use cases! :-D

ozkriff on 15 Sep 2017

@matklad great work!

Made a quick hack https://github.com/not-fl3/cargo/commit/fd2a0b5c97686a4ebc9ff6fa8f393d1fc9103604 to cover all of my needs.

It's a way to force even local dependencies to be built in release mode.

Two use cases I wanted to cover:

1) marking slow local sub-crates as "force-release" to improve the runtime speed of debug builds.

For example, a local mesh loading crate. It's not on crates.io, so it's added as a local lib. But at the same time it's not actively developed, so it's really nice to compile it once in release.

2) marking as "force-release" local crates that are actually separate apps running as child processes. Helps a lot in testing.

For example, a local server with its entry point in lib.rs. Nice to have an option to make it fast during development of the client.

It would be really great to have something like this in nightly cargo!

not-fl3 on 16 Sep 2017

👍1

I haven't read anything about monomorphization, but from my point of view the only thing that needs to be like "This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into." is release mode.

tyoc213 on 16 Sep 2017

I also support the idea of specifying optimization level for individual dependencies.

I'm developing a program that deals with images, and it's a major pain to test it, because the image crate takes forever to load images when compiled with the debug profile.

crumblingstatue on 18 Nov 2017

👍6

@Manishearth proposed an RFC: https://github.com/rust-lang/rfcs/pull/2282

luser on 18 Jan 2018

❤3

As there hasn't been any activity here in over 6 months I've marked this as stale and if no further activity happens for 7 days I will close it.

I'm a bot so this may be in error! If this issue should remain open, could someone (the author, a team member, or any interested party) please comment to that effect?

The team would be especially grateful if such a comment included details such as:

Is this still relevant?
If so, what is blocking it?
Is it known what could be done to help move this forward?

Thank you for contributing!

(The cargo team is currently evaluating the use of Stale bot, and using #6035 as the tracking issue to gather feedback.)

If you're reading this comment from the distant future, fear not if this was closed automatically. If you believe it's still an issue please leave a comment and a team member can reopen this issue. Opening a new issue is also acceptable!

stale[bot] on 17 Sep 2018

This has been implemented on nightly and is tracked at rust-lang/rust, so I'm going to close in favor of those locations.

alexcrichton on 17 Sep 2018

🎉4 👍1

@alexcrichton Your "implemented" link goes to the docs for profiles in .cargo/config, which as I understand apply to all crates for a given cargo invocation. So it is not directly relevant to this issue. Are crate-specific profiles implemented too? If not, shouldn’t this issue stay open?

SimonSapin on 17 Sep 2018

@SimonSapin I think the link should be: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-overrides The "*" override covers all dependencies.

ehuss on 17 Sep 2018

👍1

Oh I see, never mind then :)

SimonSapin on 17 Sep 2018

I found my way here while wondering if this feature existed, and since this is a years-old thread, looks like the feature is no longer "unstable", and docs are now here: https://doc.rust-lang.org/nightly/cargo/reference/profiles.html#overrides
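
A minimal sketch of what the linked documentation describes, for the case in this issue's title (optimize dependencies, leave your own code in debug mode). The `opt-level` value is an illustrative choice:

```toml
# Cargo.toml

# The "*" override covers all dependencies (any package that is
# not a workspace member); your own crates keep the dev defaults.
[profile.dev.package."*"]
opt-level = 3
```

With this in place, a plain `cargo build` compiles dependencies with optimizations once and keeps rebuilding local code with the regular dev settings.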

dylemma on 26 Apr 2020

👍6 🎉1
