The package being built (and all targets within) will not be built in release mode, but all dependencies would be built in release mode.
cc #784
cc https://github.com/rust-lang/cargo/issues/784#issuecomment-76165401
Would this apply to the standard library as well? (I.e., compiling your code in debug mode while the standard lib is compiled in release mode.)
Unfortunately no, the standard library currently only comes as optimized.
Oh, excellent. That's actually what I was hoping would happen, as then most of the bottlenecks are going to be in my code that calls the std lib rather than in the std lib itself.
+1. I want this for Servo ipc-channel.
@alexcrichton I think I can work on this.
From the design point of view, we can add an optimize-dependencies flag, or we can add a separate dev-dependencies profile. The first option is simpler; the second is more flexible and would allow us to implement a default like
[profiles.dev-deps]
opt-level = 2
debug = true
debug-assertions = true
which may be better than the current default of slow dependencies.
These two options are somewhat orthogonal, and it should be possible to implement one first and add the other later.
Another design question is what exactly a dependency is? If you have some path dependencies, do you want them to be optimized? And what about overridden dependencies? I can suggest two solutions here.
A package is _not_ a dependency if it is
1) a root package.
2) a package for which source_id().is_path() is true (this includes the root package)
From the implementation point of view I hope that the only thing that should be changed is lib_profile function :)
@matklad awesome! I think that this definitely has a bit of a design aspect to it before charging ahead, although I _think_ you're correct in that the function you highlighted is the only one that needs to change (convenient, eh?).
I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:
[profiles.dev]
dependency-profile = 'release'
or something like that. Basically you select a _profile_ for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.
I also agree about your heuristic about what a dependency is and what isn't. In general I think that all path dependencies are just implementation details (or units of incrementality) that are part of the current crate being worked on. That being said, however, the name "dependency" may be a bit misleading there because they are indeed dependencies.
I wonder if perhaps the term "upstream" could be used? That may help distinguish "other people's code" from "my local code". The term "upstream" and "downstream" are sometimes conflated, though, so that may not be best...
And all _that_ being said you may still want to optimize path dependencies. For example if you temporarily override a perf-critical dependency to a local crate, you probably still want to optimize it even though you're working on it.
I think that I might somewhat lean towards "dependency" meaning "anything not the root package". We could eventually support something where you can specify profiles for specific dependencies in a Cargo.toml, and that could perhaps be used to optimize everything but a few packages.
> Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.
Yes, I think it is the best approach.
> I wonder if perhaps the term "upstream" could be used?
I find "upstream" even more confusing than "dependency". What about "local package"?
> I think that I might somewhat lean towards "dependency" meaning "anything not the root package".
I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages, however, are rebuilt more often (n times) than deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages, the right solution might be to add opt-level=2 to profiles.dev.
Out of curiosity, how will this work with monomorphization? If I have a generic function in a dependency compiled in release mode, and use it from a crate compiled in debug mode, will I get any benefits? Suppose that the generic function does not call other non-generic functions.
"local" does seem kinda on the right track, yeah, although it may be kind of odd saying:
[profile.dev]
non-local-dependency-profile = 'release'
Quite a mouthful!
This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into.
Maybe vendor instead of non-local?
> I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:
> [profiles.dev]
> dependency-profile = 'release'
Hm, in this piece of TOML two ideas are expressed:
I totally agree with 1, but 2 does not feel right: ideally profile should be set per package, but dependency-profile is global property.
What about
[profile-overrides.dev]
vendor-dependencies="release"
? This can in future be extended to handle more fine-grain overrides.
Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one _already_ needs to be reflected in another, so maybe it's not so bad?
I'd be somewhat wary of inventing new top-level keys like profile-overrides (maybe we could stick it in [project] if we want?). I'm also not sure that "vendor" conveys the right information here because to me "vendor" typically means what's literally included locally, rather than what I'm using from crates.io
> Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?
I'd like to ask what exactly a profile is? I mean, there is test profile and release profile, but there is also cargo test and cargo test --release, which do different things.
Here is a model I have in mind for this feature:
A profile is basically a description of a set of flags, applied during compilation _of a package_. And there is "compilation mode" which determines what profiles are applied to what packages. Like, there is --release mode, which applies release profile to all packages.
As far as I understand, the notion of compilation mode is currently not first-class in Cargo. For example, in Context build_config.release and unit.profile.test are used to make decisions about compilation. This leads to surprising (?) effects. If I have
[profile.test]
opt-level = 3
in Cargo.toml, then for cargo test only my crate will be built with optimizations enabled, but all dependencies will use flags from the dev profile.
Ah yeah to clarify, when I say profile I mean what you're writing down in Cargo.toml like test, release, dev, and bench. I will agree, however, that Cargo's treatment of these profiles is kinda ad-hoc and in general "not great" as it may be surprising (as you're encountering here).
That being said, though, my motivation of foo = "profile-name" is because we may one day support custom profiles, and otherwise I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?
> I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?
Yes, if we specify the override globally, like
[project]
profile-overrides = { upstream-dependencies = "profile-name" }
Likely not, if we use something like
[profiles.test]
profile-overrides = { upstream-dependencies = "profile-name" }
@alexcrichton what about
> I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages, however, are rebuilt more often (n times) than deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages, the right solution might be to add opt-level=2 to profiles.dev.
Yeah we could in theory support a top-level "always override dependencies with this profile", but I suspect that it's likely to be a local decision per-profile (especially if we grow custom profiles) rather than an always-true option. Note that the dev/test profile management here may want to be improved...
I can also see the "dependency = anything non local" logic, yeah, although I think it can work both ways. You may _sometimes_ be frobbing all the path dependencies at once, but there could also be legitimate cases where only the main one is being modified. I suspect though that the "anything non local" case is more common.
At least in terms of performance, I would expect any bottlenecked dependencies to all be upstream rather than being worked on locally.
There are some concerns on #2380, outlining a few tensions to balance when implementing this.
@alexcrichton I agree with your analysis. One thing I want to add is that there is a possible case of optimizing (O1?) dependencies by default, which still can (maybe) bring some benefits.
And for future reference, I've done some dirty "benchmarks" of the approach in #2380. I've benchmarked this crate (this crate was the reason I found this issue :) ). It does some OpenGL rendering in a super inefficient and unidiomatic way. The major run time bottleneck is texture loading (which is done by a library).
So here are the measurements of a recompile cycle after a trivial whitespace edit in "src/lib.rs" and a corresponding time to see something rendered.
$CARGO build --bin mirror 24.93s user 0.92s system 100% cpu 25.840 total
$CARGO build --bin mirror 25.22s user 0.89s system 100% cpu 26.096 total
./target/debug/mirror 68.11s user 0.24s system 97% cpu 1:10.14 total
./target/debug/mirror 68.26s user 0.22s system 98% cpu 1:09.60 total
$CARGO build --release --bin mirror 72.13s user 0.72s system 100% cpu 1:12.81 total
$CARGO build --release --bin mirror 69.92s user 0.73s system 100% cpu 1:10.61 total
./target/release/mirror 1.59s user 0.21s system 63% cpu 2.836 total
./target/release/mirror 1.66s user 0.20s system 71% cpu 2.602 total
[profile.release]
debug = true
debug-assertions = true
[profile.dev]
dependencies-profile = "release"
$CARGO build --bin mirror 24.13s user 0.71s system 100% cpu 24.832 total
$CARGO build --bin mirror 24.27s user 0.75s system 100% cpu 25.005 total
./target/debug/mirror 25.40s user 0.22s system 97% cpu 26.354 total
./target/debug/mirror 25.38s user 0.23s system 96% cpu 26.536 total
[profile.release]
debug = true
debug-assertions = true
opt-level = 1
[profile.dev]
dependencies-profile = "release"
$CARGO build --bin mirror 24.65s user 0.80s system 100% cpu 25.432 total
$CARGO build --bin mirror 24.58s user 0.84s system 100% cpu 25.404 total
./target/debug/mirror 50.54s user 0.25s system 98% cpu 51.672 total
./target/debug/mirror 50.30s user 0.23s system 98% cpu 51.325 total
[profile.dev]
opt-level = 1
$CARGO build --bin mirror 38.91s user 0.81s system 100% cpu 39.699 total
$CARGO build --bin mirror 38.68s user 0.80s system 100% cpu 39.459 total
./target/debug/mirror 34.61s user 0.26s system 95% cpu 36.404 total
./target/debug/mirror 34.55s user 0.25s system 96% cpu 36.139 total
| | debug | release | deps-release | deps opt-level=1 | all opt-level=1 |
| --- | --- | --- | --- | --- | --- |
| recompile | 25s | 70s | 25s | 25s | 38s |
| run | 68s | 2s | 25s | 51s | 36s |
I kinda like how compile and run time coincide in the third case =)
Thanks for the data @matklad! Out of curiosity, what do the numbers look like if opt-level is set to 1?
@alexcrichton updated the table
Oh sorry, I meant for O1 that the entire graph had opt-level 1, not just the dependencies. Does that reduce the runtime at all?
yep, updated the table.
Thanks! Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?
> Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?
Not sure. First of all, I don't think that these benchmarks are really representative (the target crate is just some quick and dirty OpenGL exercises, nothing close to production). In particular, the target crate is pretty small, and I suppose that compile time grows linearly with code size, while run time is sublinear. That is, for large code bases, the O1 compile time overhead may kill any run time benefits.
And given current largish compile times, I won't feel comfortable sacrificing compile time for anything at all :)
Just want to say that I'd really like to see this or something like it implemented. My use case is building Rust programs for microcontrollers that may have as little persistent storage (Flash memory) as 8 KB. I'm currently mainly working with a device that has 128 KB of Flash, and some programs compiled with the dev profile can be as big as ~40 KB whereas in release mode they would be 1-4 KB. If I were working with an 8 KB device I wouldn't even be able to flash programs compiled with the dev profile into the device ...
Some thoughts:
core crate without optimizations in dev builds and that's going to aggravate all the problems I mentioned above. (Xargo compiles the core crate and the rest of the sysroot with --release)
Well, I would argue that dependencies should always be compiled in release mode, and compiled in debug mode only when needed.
Most of the time, I don't care about debug information for my dependencies because I want to debug my program, not my dependencies.
Also it will probably impact a little the first impression of newcomers when the default cargo build is not release mode and they see "slowness".
This is very noticeable when using dependencies for build scripts that do heavy work. For example, in Gecko we plan to use bindgen at build time, and compiling it in debug mode is pretty slow. cc @upsuper
> and compiling it in debug mode is pretty slow
Not compiling it... but running the build script which uses bindgen is pretty slow.
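For readers arriving later: the profile overrides documented in the Cargo reference (linked at the end of this thread) include a `build-override` table aimed at this build-script case. A minimal sketch, assuming a recent stable Cargo; the `opt-level` value is only an illustration, not a recommendation from this thread:

```toml
# Cargo.toml
# Optimize build scripts, proc macros, and their dependencies
# (for example bindgen used from build.rs), even in dev builds.
[profile.dev.build-override]
opt-level = 3
```

This trades a slower one-time compile of the build-time dependencies for faster build-script runs afterwards.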
Another example: decoding ogg/vorbis streams. It can decode in debug mode faster than real time, but if you have a game and e.g. want to decode the file before you start playing it for performance reasons/other reasons, then there is a noticeable impact, especially when they compare the performance to native C performance. People actually get confused by this. And that person had the courage to file an issue report; you shouldn't forget all the people who didn't have it.
I personally would like more fine grained control over this
Something like this:
[dependencies]
foo = { path = "../foo", dev-build = "release" }
bar = { path = "../bar", dev-build = "debug" }
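The `dev-build` key above is a proposal and is not something Cargo accepts. The per-package profile overrides that Cargo later gained can express roughly the same intent, with the setting placed in the profile rather than in the dependency entry. A sketch assuming a recent stable Cargo, reusing the package names `foo` and `bar` from the proposal:

```toml
# Cargo.toml
[dependencies]
foo = { path = "../foo" }
bar = { path = "../bar" }

# Build `foo` with optimizations in dev builds.
[profile.dev.package.foo]
opt-level = 3

# `bar` needs no entry: it keeps the regular dev profile settings.
```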
@MaikKlein yeah, we probably want to completely rethink the whole "profiles" thing to allow tweaking options more easily and predictably. I'll probably try to write a pre-pre-rfc on the topic.
Has there been any progress on this? https://github.com/rust-lang/cargo/pull/1826 got pretty close to creating a solution that'd work for me, unfortunately it was never merged. It was abandoned in favor of https://github.com/rust-lang/cargo/issues/942. Was a solution implemented based on that?
@mkovacs currently, no, there hasn't been much progress on this. This feature remains unimplemented but @matklad sounds like he's got thoughts above
Yeah, this is basically blocked on profiles, and I have not yet written a pre RFC because I don't know what to write in the "proposed solution" section, but I'll link my current take on the problem formulation: https://github.com/rust-lang/cargo/issues/4140#issuecomment-306955783
Could profiles be separated from an overall option to build dependencies as release?
For example, the new default could be debug local code + release crate dependencies with a --debug-all as a counterpart to the existing --release parameter?
I think we should fix the issue in two steps:
Is there any workaround currently to make one of the dependencies to build in release mode (optimized, without debug info)? I have one dependency that is way slower in debug than in release.
@ozkriff @not-fl3 yesterday at Rust gamedev meetup at Saint Petersburg you've both said that this feature would be very useful for gamedev. So I've decided to update my PR so that you can check if this feature indeed brings the benefits you expect (there's a fear that due to monomorphization optimization level for deps might be irrelevant).
Given that we now have unstable cargo features, I think that we could probably land this in some unstable form. For sure, we won't stabilize anything close to my implementation, because profiles need to be fixed first. However, this probably won't happen soon, so, if this is useful, we can try to help at least nightly users. As this only affects the development experience of binary crate authors, I don't think there's a high risk behind adding this under a feature flag.
To try this out, install Cargo from this branch: https://github.com/matklad/cargo/tree/fast-dependencies-2
cp ~/.cargo/bin/cargo ~/.cargo/bin/rustup-cargo
cargo install --git https://github.com/matklad/cargo --branch fast-dependencies-2 --force
Then add to your Cargo.toml (for workspaces this must be a non-virtual manifest):
cargo-features = ["always-optimize-deps"]
always-optimize-deps = true
Perhaps rustc could compile monomorphized code from external crates according to the optimization flags given when those crates were built, distinct from the flags applying to the current crate?
@matklad It works beautifully for my use cases! :-D
@matklad great work!
Made a quick hack https://github.com/not-fl3/cargo/commit/fd2a0b5c97686a4ebc9ff6fa8f393d1fc9103604 to cover all of my needs.
It's a way to force even local dependencies to be built in release mode.
Two use cases I wanted to cover:
For example, a local mesh loading crate. It's not on crates.io, so it's added as a local lib. But at the same time it's not actively developed, so it's really nice to compile it once in release.
For example, local server with entrypoint in lib.rs. Nice to have an option to make it fast during development of client.
It would be really great to have something like this in nightly cargo!
I haven't read anything about monomorphization, but from my point of view the only thing that needs to be like
> This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into.
is release mode.
I also support the idea of specifying optimization level for individual dependencies.
I'm developing a program that deals with images, and it's a major pain to test it, because the image crate takes forever to load images when compiled with debug profile.
@Manishearth proposed an RFC: https://github.com/rust-lang/rfcs/pull/2282
This has been implemented on nightly and is tracked at rust-lang/rust, so I'm going to close in favor of those locations.
@alexcrichton Your "implemented" link goes to the docs for profiles in .cargo/config, which as I understand apply to all crates for a given cargo invocation. So it is not directly relevant to this issue. Are crate-specific profiles implemented too? If not, shouldn't this issue stay open?
@SimonSapin I think the link should be: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-overrides The "*" override covers all dependencies.
Oh I see, never mind then :)
I found my way here while wondering if this feature existed, and since this is a years-old thread, looks like the feature is no longer "unstable", and docs are now here: https://doc.rust-lang.org/nightly/cargo/reference/profiles.html#overrides
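For anyone landing here with the original question (debug build for your own code, optimized dependencies), the overrides described at that link can be used roughly as follows. This is a minimal sketch assuming a recent stable Cargo; the specific `opt-level` values are illustrative:

```toml
# Cargo.toml
[profile.dev]
opt-level = 0   # your own code: unoptimized, fast rebuilds (the dev default)

# The "*" override applies to all dependencies, but not to workspace members.
[profile.dev.package."*"]
opt-level = 3
```

As noted earlier in the thread, generic code is still optimized at the level of the crate that instantiates it, so heavily generic dependencies may benefit less than expected.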