Cargo has to recompile dependencies whenever the toolchain is updated or changed, and it leaves all previous artifacts intact. This means that as you update toolchains, the target dir just grows and grows.
Currently you can cargo clean and rebuild, but it would be nice to be able to clean only the dependencies that are not compatible with the current toolchain. Perhaps cargo clean --incompatible?
For example, building a small crate:
# cargo +nightly-2018-02-07 build; du -hd2 ./target
57M ./target/debug/deps
4.0K ./target/debug/examples
528K ./target/debug/.fingerprint
4.0K ./target/debug/native
18M ./target/debug/incremental
7.6M ./target/debug/build
82M ./target/debug
82M ./target
# cargo +nightly-2018-02-08 build; du -hd2 ./target
113M ./target/debug/deps
4.0K ./target/debug/examples
1.1M ./target/debug/.fingerprint
4.0K ./target/debug/native
35M ./target/debug/incremental
16M ./target/debug/build
163M ./target/debug
163M ./target
# cargo +nightly build; du -hd2 ./target
169M ./target/debug/deps
4.0K ./target/debug/examples
1.6M ./target/debug/.fingerprint
4.0K ./target/debug/native
52M ./target/debug/incremental
23M ./target/debug/build
245M ./target/debug
245M ./target
This can obviously be much more problematic on larger crates.
This is currently a feature of Cargo that it aggressively caches, but we could always add a command to delete otherwise stale artifacts!
Yes, I don't think the current behaviour is wrong. It would just be nice to be able to say _"get rid of the old cached stuff now, I'm not going back to rust 1.22 any time soon"_. Particularly for things like RLS, where we can expect users to update the toolchain fairly often but don't really inform them about cleaning ./target/rls/.
Since I always develop on nightly, this is true every day. I'm kind of in the habit now of deleting ./target each morning.
However, it seems like this also affects the cache used by Travis CI, as noted in https://github.com/seanmonstar/reqwest/pull/259
I've experienced this problem both for local development and in Travis.
For example, my Travis CI cache has 56 copies of every dependency; sometimes the dependency is updated and sometimes the compiler is updated. For one crate I looked at, each build was 250K. In total, my cache weighed in at 1.4 GB. This caused the cache to take about 7-9 minutes to download and upload, compared to a build time of ~2 minutes! Some of that cache is from other parts of the build, but after clearing the cache and rebuilding, it's only ~175MB. Roughly 1.2 GB of the Travis cache was composed of these build artifacts.
I also develop primarily on nightly locally, and often find my laptop's disk running out of space because of multiple gigabytes' worth of outdated build artifacts.
I agree that cargo clean --outdated would be a good first step. Right now, I do some complicated shell-scriptery to find all the directories that contain a Cargo.toml and run cargo clean in them.
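For anyone without such a script, here is a minimal sketch of the "find every directory that contains a Cargo.toml" step in Rust, using only the standard library. The name find_cargo_projects is made up for illustration. The sketch only prints the cargo clean commands (a dry run) rather than executing them, and it assumes you want to skip target directories and not follow symlinks.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every directory under `root` that contains a Cargo.toml.
/// Directories named `target` are skipped and symlinks are not followed.
fn find_cargo_projects(root: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if root.join("Cargo.toml").is_file() {
        found.push(root.to_path_buf());
    }
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        // DirEntry::file_type does not follow symlinks, which avoids cycles.
        if entry.file_type()?.is_dir() && entry.file_name() != "target" {
            find_cargo_projects(&entry.path(), found)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Start from the directory given as the first argument, or the current one.
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let mut projects = Vec::new();
    find_cargo_projects(Path::new(&root), &mut projects)?;
    for dir in &projects {
        // Dry run: print the command instead of running it.
        println!(
            "cargo clean --manifest-path {}",
            dir.join("Cargo.toml").display()
        );
    }
    Ok(())
}
```

To actually clean, the printed lines could be piped to a shell, or the println! swapped for std::process::Command. Printing first makes it easier to check what would be deleted.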
A potential second step would be to do a bit of inspection during a build to note that there are (many? / large?) artifacts lying around that were not used and warn the user about them. A future thing would be akin to git's garbage collection, where unused artifacts older than some date are just automatically removed.
Aside from CI, how common is switching compilers for a given project? I feel like it's not very common.
Can we have an on-by-default pref that specifies whether or not these things are deleted? Travis's image can turn the pref off.
We can also implement something where we record whenever an artifact was last used, and unnecessary artifacts that haven't been touched for a couple days get deleted. But I'd rather go the flat deletion route.
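A rough sketch of the "haven't been touched for a couple days" check, in Rust with only the standard library. Nothing in this thread says Cargo records a last-used time, so the sketch assumes file modification time as a stand-in for it. The names is_stale and stale_entries are hypothetical, and the code only lists candidates without deleting anything.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// True when `last_used` lies more than `max_age` before `now`.
fn is_stale(last_used: SystemTime, now: SystemTime, max_age: Duration) -> bool {
    match now.duration_since(last_used) {
        Ok(age) => age > max_age,
        // A timestamp in the future is treated as fresh.
        Err(_) => false,
    }
}

/// List the direct children of `dir` whose modification time is older than `max_age`.
/// Assumption: mtime approximates "last used".
fn stale_entries(dir: &Path, now: SystemTime, max_age: Duration) -> io::Result<Vec<PathBuf>> {
    let mut stale = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let modified = entry.metadata()?.modified()?;
        if is_stale(modified, now, max_age) {
            stale.push(entry.path());
        }
    }
    stale.sort();
    Ok(stale)
}

fn main() -> io::Result<()> {
    // Directory to inspect: first argument, or the current directory.
    let dir = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let two_days = Duration::from_secs(2 * 24 * 60 * 60);
    for path in stale_entries(Path::new(&dir), SystemTime::now(), two_days)? {
        println!("stale: {}", path.display());
    }
    Ok(())
}
```

Passing `now` in as a parameter keeps the age check easy to test. A real implementation would also have to decide whether reading an artifact counts as "using" it, which mtime does not capture.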
> how common is switching compilers for a given project? I feel like it's not very common.
Ideally it happens every 6 weeks at a minimum (stable), or perhaps every few days (nightly).
I mean, switching _back and forth_ between compilers.
If you update your stable or nightly you don't need the old artifacts anymore. It's only an issue if you're switching back and forth, which I feel is rare.
Switching between current stable and current nightly is at least fairly common.
Off of CI? I find that a bit hard to believe.
Either way, this would at least work as a preference, preferably on by default.
> Switching between current stable and current nightly is at least fairly common.
probably only before clippy was in stable, though
I was thinking of cargo dev, actually. Perhaps it's more niche than I thought.
I think a very recurring problem we have is that the use cases best
represented on these issue trackers are those of compiler/tool hackers,
which are in a minority otherwise.
cargo dev is definitely a niche use case :)
On Sun, Sep 30, 2018, 12:13 PM Dale Wijnand wrote:
> I was thinking of cargo dev, actually. Perhaps it's more niche than I thought.
My hopefully slightly-less-niche use case was that I'd built software in rust that runs on a server (in my case, an IRC bot that I run on a VM in AWS), set up a cron job to keep the rust compiler up-to-date ($HOME/.cargo/bin/rustup update), and set up the script that starts the bot to update the source code and rebuild, so that restarting the bot would update it to the latest version. The combination of these things ended up filling up the disk space on my VM (i.e., filling up 2GB of storage out of 5GB configured) in about 14 months (April 2017 - June 2018).
To put in two cents from a non-compiler person, I do a lot of cross compiling, and I'd like for the caching to be more aggressive. cargo check seems to utilize a non-target dependent folder, and forces rebuilds when moving from a host build to target build. I'm not sure if auto-gc'ing or something like that is actually intuitive for a build system to do. As a user I'm fine with a cargo clean once in a while - but I can definitely see how this could be a problem in CI.
Maybe a cargo build --throw-away-outdated (excuse the bad arg name) might be the middle ground, and CI runners can always opt for that.
I feel like that's an orthogonal problem -- cargo check should be able to reuse metadata from already compiled crates -- it just doesn't.
If cargo check --target foo isn't reusing the target folder I feel like that's a bug.
I don't think what I'm proposing here affects your use case at all -- I'm proposing cleaning up after compiler upgrades, cargo check works differently. It's imperfect but it's not made worse by this proposal.
> cargo check should be able to reuse metadata from already compiled crates
Specifically it's #3501
@alexcrichton I might try to implement this. Do you have any pointers as to what I should look at?
It sort of depends on the method of implementing, but it'd likely all be around src/cargo/core/compiler/*
Just a rough idea: one option is to completely remove rustc from the metadata hash. It is still included in the fingerprint, so when rustc changes, cargo will recompile everything.
A more ambitious approach would be to make it so that you can keep multiple release channels cached at the same time. This would require extracting the channel from the version information and only including that in the metadata ("stable", "beta", "nightly"). The version string doesn't explicitly include the channel, so it might take some rough interpretation.
Also, I would be very careful to very fully test things in the rustc repo. Since it builds with multiple stages, I would double check that it won't suddenly start clobbering artifacts. There are some tricky issues involving __cargo_default_lib_metadata.
I don't know if this approach will work, but offhand I can't think of any major issues. I'm not sure how good rustc itself is at cleaning up its incremental directory and dealing with reusing it across versions.
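To make the channel-only idea a bit more concrete, here is a minimal sketch in Rust. It assumes the `rustc -V` line has the shape `rustc <version>[-<pre-release>] (...)` and guesses the channel from the pre-release part, which is exactly the kind of rough interpretation mentioned above. The names channel_from_version and metadata_hash are hypothetical and are not Cargo's actual code, DefaultHasher just stands in for whatever hash Cargo uses, and the version lines in main are made-up samples.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Guess the release channel from a `rustc -V` style line.
/// Returns None when the line has no version token.
fn channel_from_version(version_line: &str) -> Option<&'static str> {
    // Second whitespace-separated token, e.g. "1.44.0-nightly".
    let version = version_line.split_whitespace().nth(1)?;
    Some(match version.split_once('-') {
        None => "stable",
        Some((_, pre)) if pre.starts_with("beta") => "beta",
        Some((_, pre)) if pre.starts_with("nightly") => "nightly",
        // Anything else (for example "-dev") is lumped together here.
        Some(_) => "dev",
    })
}

/// Hypothetical metadata hash that depends on the channel only,
/// not on the full compiler version.
fn metadata_hash(package: &str, version_line: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    package.hash(&mut hasher);
    channel_from_version(version_line).hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Made-up sample lines; only their shape matters.
    let samples = [
        "rustc 1.42.0 (0000000 2020-03-09)",
        "rustc 1.43.0-beta.3 (0000000 2020-03-20)",
        "rustc 1.44.0-nightly (0000000 2020-04-01)",
        "rustc 1.45.0-nightly (0000000 2020-05-01)",
    ];
    for line in samples.iter() {
        println!(
            "{:?} -> {:?}, hash {:016x}",
            line,
            channel_from_version(line),
            metadata_hash("some-crate", line)
        );
    }
}
```

With this scheme the two nightly samples map to the same hash, so a newer nightly would overwrite the older nightly's files instead of adding a new set, while stable, beta, and nightly stay separate. The fingerprint would still have to carry the full version so that a compiler change triggers a rebuild.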
I want to add another problem that these artifacts are causing. The generated code usually ends up somewhere under target/debug/build/crate_name-id, and sometimes you need to take a look at it. When there are a bunch of crate_name-id directories with different ids, it's tough to figure out which one is current.
I was going to open this as an issue when I found this one.
I was thinking we need:
cargo clean --stale
Maybe there needs to be:
cargo clean --analyze
To let you see how bad things have gotten. If each incremental has a way to easily detect the version, then perhaps:
cargo clean --stale --version 1.34.0
Regards,
-Dave
Looks like this has been implemented as cargo-sweep.
I have proposed a change in #8073 so that nightly and beta artifacts use the same filenames between versions. Stable artifacts still use different versions as I'm not comfortable with doing that yet (imagine someone testing stable and then their msrv, it could be annoying to trigger rebuilds or break CI caches). My intent is that some kind of gc will get built-in to cargo at some point in the future to address that.
> My intent is that some kind of gc will get built-in to cargo at some point in the future to address that.
At that point in time, will the change proposed in #8073 be reverted?
I added a -Zseparate-nightlies flag to disable the new behavior just in case.
> At that point in time, will the change proposed in #8073 be reverted?
Probably. It is desired that the gc will be rustup-aware and know which toolchains are still installed. One downside of that approach is that it probably won't run on every build (if there is a performance issue), but I imagine we'll figure things out when the time comes.