Rustfmt: cargo fmt --all is slow in a workspace project

Created on 9 Jun 2020  路  19Comments  路  Source: rust-lang/rustfmt

Problem
cargo fmt --all runs unexpectedly slow in a medium-sized workspace project.

Steps

  1. Run a GitHub workflow that runs cargo fmt --all on the repository.
  2. Check execution time.

This run shows rustfmt and/or cargo pausing for more than 20 s before producing any output in verbose mode. Seen with timestamps in the raw log:

2020-06-08T17:05:32.3268959Z [command]/usr/share/rust/.cargo/bin/cargo fmt --all --verbose -- --check --verbose
2020-06-08T17:05:55.8733086Z [bench (2018)] "/home/runner/work/chain-libs/chain-libs/btree/benches/benchmark.rs"

Notes

Output of cargo version:
cargo 1.44.0 (05d080faa 2020-05-06)

Output of rustfmt --version
rustfmt 1.4.14-stable (e417356 2020-04-21)

Virtual environment information from GitHub workflow:
Operating System
Ubuntu
18.04.4
LTS
Virtual Environment
Environment: ubuntu-18.04
Version: 20200518.1
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu18/20200518.1/images/linux/Ubuntu1804-README.md

Most helpful comment

I think the main issue here is the disconnect between what the --all flag means and when it's actually needed with cargo fmt vs. what folks, understandbly but often incorrectly, assume based on what --all meant previously with other cargo subcommands. (Refs https://github.com/rust-lang/rustfmt/issues/3911)

Background

There's only a few use cases I recall where the --all flag with cargo fmt is currently helpful/meaningful:

  1. Running cargo fmt in a subdirectory of an explicit workspace where you want to run formatting commands against the entire workspace/all the explicit workspace members (all members are formatted by default when running from the root of an explicit workspace)
  2. Running cargo fmt in the root directory of an implicit workspace where you want to run formatting commands against all implicit workspace members
  3. Running cargo fmt against a given package which has one more local/path based dependencies where you want to run the formatting commands against the current package _and_ that local/path based package (as well as any local/path based deps that package may have, recursively).

I suspect that the majority of instances where folks are using the --all flag with cargo fmt falls into those first two buckets, while the third is relatively more of an edge case.

cargo fmt currently leverages the cargo metadata command in order to discover the workspace information which is then used to construct the corresponding rustfmt commands; cargo fmt does not directly parse Cargo.toml files.

Accordingly, in order to support the second (and third) use case, cargo fmt --all runs cargo metadata without the --no-deps flag (due to https://github.com/rust-lang/cargo/issues/7483) to ensure that the cargo metadata output contains the information that cargo fmt needs in order to properly locate and format those implicit workspace members/path-based dependencies.

AIUI, running cargo metadata without the --no-deps flag results in the registry index hit that is presumably the cause of the increased run time folks are seeing which is probably rather apparent in CI-type environments.

Next Steps

We've tried updating the help text for the --all flag though that certainly hasn't solved the disconnect, and I believe we probably need to take additional action to try to ameliorate the confusion.

Perhaps some non-mutually exlusive actions we could consider:

  • Create/augment cargo fmt documentation to try to provide better guidance on when --all is required vs. when it should likely be avoided.
  • Find a way to support formatting implicit workspaces without the index implications and associated perf hit
  • Change behavior of existing cargo fmt flags and/or add new flags (this would almost certainly result in a breaking change that'd require a major version bump). For example, we could _add_ a --workspace or even --explicit-workspace) flag that would support the first use case (explicit workspaces) and use --all, or even rename/replace it with --implicit-workspace to support the second and third use case

All 19 comments

Closing as the problem was tracked down to a cargo issue after all.

This is because rustfmt uses cargo-metadata to find the root file of each project.

By default, rustfmt uses the offline mode of cargo-metadata and falls back to the online mode when it fails. I assume the slowdown you observed is caused by the connection delay; when running cargo fmt --all, we need the cargo index file to be present.

As noted by @ehuss in https://github.com/rust-lang/cargo/issues/8345#issuecomment-641469796:

Sorry, I should have specified in my comment. The issue is here where cargo-fmt calls cargo metadata for every package, twice. It should only call it once for the entire workspace. I was going to make some recommendations over on the rustfmt tracker.

@mzabaluev - does your project have non-workspace member local/path-based dependencies that you also want to format? Running cargo fmt (without the --all) in the root of a workspace will format all the crates in a workspace by default. The --all flag isn't necessary in most situations.

From the cargo fmt help text:

--all                          Format all packages, and also their local path-based dependencies

FYI the multiple cargo metadata calls rustfmt is making is due to https://github.com/rust-lang/cargo/issues/7483 :smile:

@calebcartwright Thanks for the tip! One of our larger projects does use path-based dependencies pulled in by a git submodule, but these can be format-checked in a CI workflow of that submodule's repository.

Great! In that case I'd recommend just running cargo fmt and dropping the --all flag so you won't have to worry about the registry index at all

@calebcartwright I can help rework the logic for the --all case if you'd like. I don't see that it is necessary to re-fetch the metadata for each package, so it should be able to do it just once. I'm not too familiar with rustfmt development, but I can try to make a PR.

@ehuss - sure! The --all use case is a bit of a pain, especially with the need to maintain the behavior when cargo fmt is executed from subdirectories within the workspace, so any improvements there would certainly be appreciated.

Additionally, it would also be really helpful for rustfmt if we can get a solution/alternative to rust-lang/cargo#7483 so that we can always use --no-deps while still being able to locate the paths for the non-workspace member local/path based dependencies (and their respective deps) that also get formatted with --all

Ran into this today as well, and was confused both why cargo fmt --all took 20 s while cargo fmt took 1.3 s, and why in our workspace the appeared to work identically.

This was on Windows (though through Git Bash) in a workspace of 28 crates.

$ time cargo fmt --all

real    0m20.996s
user    0m0.000s
sys     0m0.000s

$ time cargo fmt

real    0m1.298s
user    0m0.000s
sys     0m0.016s

All of our crates are in our root Cargo.toml, so I guess don't need --all. But for other cargo commands we use --workspace (which used to be called --all) to indicate that we want to operate over all the crates in the workspace.

I think the main issue here is the disconnect between what the --all flag means and when it's actually needed with cargo fmt vs. what folks, understandbly but often incorrectly, assume based on what --all meant previously with other cargo subcommands. (Refs https://github.com/rust-lang/rustfmt/issues/3911)

Background

There's only a few use cases I recall where the --all flag with cargo fmt is currently helpful/meaningful:

  1. Running cargo fmt in a subdirectory of an explicit workspace where you want to run formatting commands against the entire workspace/all the explicit workspace members (all members are formatted by default when running from the root of an explicit workspace)
  2. Running cargo fmt in the root directory of an implicit workspace where you want to run formatting commands against all implicit workspace members
  3. Running cargo fmt against a given package which has one more local/path based dependencies where you want to run the formatting commands against the current package _and_ that local/path based package (as well as any local/path based deps that package may have, recursively).

I suspect that the majority of instances where folks are using the --all flag with cargo fmt falls into those first two buckets, while the third is relatively more of an edge case.

cargo fmt currently leverages the cargo metadata command in order to discover the workspace information which is then used to construct the corresponding rustfmt commands; cargo fmt does not directly parse Cargo.toml files.

Accordingly, in order to support the second (and third) use case, cargo fmt --all runs cargo metadata without the --no-deps flag (due to https://github.com/rust-lang/cargo/issues/7483) to ensure that the cargo metadata output contains the information that cargo fmt needs in order to properly locate and format those implicit workspace members/path-based dependencies.

AIUI, running cargo metadata without the --no-deps flag results in the registry index hit that is presumably the cause of the increased run time folks are seeing which is probably rather apparent in CI-type environments.

Next Steps

We've tried updating the help text for the --all flag though that certainly hasn't solved the disconnect, and I believe we probably need to take additional action to try to ameliorate the confusion.

Perhaps some non-mutually exlusive actions we could consider:

  • Create/augment cargo fmt documentation to try to provide better guidance on when --all is required vs. when it should likely be avoided.
  • Find a way to support formatting implicit workspaces without the index implications and associated perf hit
  • Change behavior of existing cargo fmt flags and/or add new flags (this would almost certainly result in a breaking change that'd require a major version bump). For example, we could _add_ a --workspace or even --explicit-workspace) flag that would support the first use case (explicit workspaces) and use --all, or even rename/replace it with --implicit-workspace to support the second and third use case

There are breaking changes planned for v2.0 sometime soon, right? If that can include the CLI interface, I think --all should be renamed as part of that to avoid this confusion, something like --also-path-deps makes it much more obvious _what_ it's doing and avoids the connotations with what --all means in other cargo subcommands.

(Whether to change the default to not do the entire workspace and add a --workspace flag is less pressing an issue, not having such an obvious "oh, this must be the workspace flag" might encourage users to read the help text better and notice that it defaults to the workspace (assuming that ... of the current crate ... is changed to ... of the current workspace ...)).

(Actually, even if there are no breaking changes, it'd be nice to rename this flag and add a deprecated alias for it, maybe even hiding that from the help text in the future).

@calebcartwright Since it looks like https://github.com/rust-lang/cargo/pull/8994 is probably going in the right direction, maybe it'd make sense to start up a draft PR here that uses it when available so we can maybe even fix this on the next beta (which I think is Dec 31st). What do you think? If you're busy I can try to find some spare cycles to write it up, though you know the code better than I do.

so we can maybe even fix this on the next beta (which I think is Dec 31st)

Won't commit to any target timeframe just yet, especially given the holidays, but yes will try to get this out soon once possible to proceed. Unfortunately due the various non-rustup channels through which rustfmt is distributed and can be used there will be a bit of extra complexity to handle working with older versions of cargo that may not the newer changes.

I'm also not sure if the cargo_metadata crate will need to first be updated as well before we can start taking advantage of https://github.com/rust-lang/cargo/pull/8994

Yeah, I think we'll probably have to track whether we see a local dependency (source == null) but not a manifest_path, and in that case fall back to the current way of doing things. It's not pretty, but I agree we can't drop support for older cargos.

I'll take a look at adding support to cargo_metadata 馃憤

There's a promising new upstream PR here that might unblock this: https://github.com/rust-lang/cargo/pull/9024

https://github.com/rust-lang/cargo/pull/8994 was merged, and will land (if all goes according to plan) in 1.51. The change to cargo_metadata is in https://github.com/oli-obk/cargo_metadata/pull/149.

https://github.com/oli-obk/cargo_metadata/pull/149 has been merged @calebcartwright, so I think we now have all we need to start working on an implementation in fmt!

Was this page helpful?
0 / 5 - 0 ratings