Cargo: Rerun-if-changed without disabling other heuristics as a side-effect

Created on 6 Oct 2017  路  9Comments  路  Source: rust-lang/cargo

Currently any use rerun-if-changed opts in entire build into a completely "manual" tracking, disabling all built-in rules as a side effect. This means it's not possible to safely just add one file/env var check to the build without understanding all the dependencies of the entire build script.

In order to use rerun-if-changed features in build-time libraries without potentially breaking rest of the build, it's necessary to be able to use rerun-if-changed-like functionality without disabling the built-in heuristic (i.e. rerun if a var/file has changed in addition to all the other/default conditions that cause the build to be rebuilt).

https://github.com/alexcrichton/pkg-config-rs/pull/45#issuecomment-334552552

bikeshed syntax: cargo:rerun-if-also-changed=file, cargo:rerun-if-changed+=file, cargo:rerun-if-changed;keep-defaults=file

A-build-scripts

Most helpful comment

I'm also OK with changing the existing behavior to be additive without side effects, and adding another explicit option like cargo:disable-built-in-heuristics.

All 9 comments

The current semantics of rerun-if-changed and rerun-if-changed-env are confusing and a new key rerun-if-changed+ might just add to this confusion. Notice that

rerun-if-changed+=foo
rerun-if-changed+=bar
rerun-if-changed=foobar

is effectively the same as without the +-key.

rerun-if-changed+=foo
rerun-if-changed+=bar
rerun-if-change-env=CC

causes Cargo to still ignore the package sources due to a reference to an env-var and use foo, bar, [CC] as the only fingerprints.

IMHO rerun-if-changed+ in addition to the current system is inferior to changing rerun-if-changed and rerun-if-changed-env to add fingerprints to the then already-considered package sources.

The current implementation calls for the build-script to walkdir() it's sources, including those that might be out-of-tree. The rationale is that the user has already opted in to do dependency selection himself and a walkdir('./src') is easily implemented.
Another approach might be to selectively opt-out of package-sources and let rerun-if-changed work as rerun-if-changed+ would. The user can then escalate an "also consider this" to "just consider this".

I'm also OK with changing the existing behavior to be additive without side effects, and adding another explicit option like cargo:disable-built-in-heuristics.

Can disabling of built-in heuristics as a side effect be fixed (i.e. side-effect removed)?

Does building more than necessary count as a breaking change? If so, can it be done as part of 2018 edition?

Yes changing the default behavior would be a breaking change, but we could always add a new option for build scripts (or something like that) to say "use the default heuristics". Either that or a crate could be published to crates.io which is a one-liner to do what Cargo currently does.

we could always add a new option for build scripts (or something like that) to say "use the default heuristics"

I'd love to have that option. This seems like a relatively straightforward addition to Cargo. Modulo bikeshedding the name, what would this take to add? A full RFC, or just a PR and consensus from the cargo team?

Can you include some concrete use cases? Would this be for generic build libraries like pkg-config? Is this important for adding rerun-if for files outside of the package directory (like system libraries and binaries)?

The default behavior is pretty dumb and covers all files in the package directory (modulo include/exclude/gitignore). It is pretty rare where that is the best behavior, so it is preferred for the script to turn off the default behavior and narrow its change detection scope.

If there is a command to use default heuristics, then you couldn't turn it off? I think that would be bad if some generic library set that and then require the user to do something to turn it off.

Is it more important for env vs files? One idea is if there is only env instructions to not turn off the automatic file detection. Perhaps investigating how common that is could guide the decision.

It is for build-time dependencies, which may know about some subset of env vars and files which definitely need to cause a rebuild, but are not in charge of building the whole crate, so they can't make a global change to the default heuristic behavior.

Env vars are the primary use-case here. Env vars can significantly affect the build, but AFAIK they're not all automatically included in the default heuristics, so they really really should always be reported as affecting the build.

For files it's more of a nice to have, since the default heuristic of watching all files is correct most of the time.

The use case of build-time dependencies emitting build script output seems like an excellent argument against a global cargo directive for "keep the default heuristics", and an argument in favor of new directives for "add this environment variable but don't drop the defaults" and "add this file but don't drop the defaults".

As a relevant example where this is causing trouble:

prost_build which is used by various projects for generating code from protocol buffer specifications (and would ideally be able to automatically ensure such code is regenerated if the specifications change) currently shies away from emiting rerun-if-changed lines because they are concerned about the breaking side-effects it might have on projects:

https://github.com/danburkert/prost/blob/9551f2852e414df72339634087c46ce5a08e85b2/prost-build/src/lib.rs#L504

For my use case this then means I need to emit these lines manually, because I have .proto files that live outside of my project directory so the catch-all heuristics aren't enough.

It's not clear to me what could go wrong with making rerun-if-changed purely additive. Then as a separate issue maybe create some option for disabling the default catch-all dependency tracking heuristics.

Although it's technically a change in behaviour, projects are currently blocked from correctly handling dependencies automatically with this design which seems like a bigger issue than possibly triggering some redundant re-builds in some cases?

Was this page helpful?
0 / 5 - 0 ratings