There should be an option to only build dependencies.
@nagisa,
Why do you want it?
I do not remember exactly why, but I do remember that I ended up just running rustc manually.
@posborne, @mcarton, @Devyn,
You reacted with thumbs up.
Why do you want it?
Sometimes you add a bunch of dependencies to your project and know it will take a while to compile the next time you run cargo build, but you want your computer to do that work as you start coding so the next cargo build is actually fast.
But I guess I got here searching for a cargo doc --dependencies-only, which allows you to get the doc of your dependencies while your project does not compile because you'd need the doc to know how exactly to fix that compilation error you've had for a half hour :smile:
As described in #3615, this is useful with build to set up a cache of all dependencies.
@gregwebs out of curiosity do you want to cache compiled dependencies or just downloaded dependencies? Caching compiled dependencies isn't implemented today (but would be with a command such as this) but downloading dependencies is available via cargo fetch.
Generally, as with my caching use case, the dependencies change infrequently and it makes sense to cache the compilation of them.
The Haskell tool stack went through all this and they seemed to generally decide to merge things into a single command where possible. For fetch they did end up with something kinda confusing though: build --dry-run --prefetch. For the build --dependencies-only mentioned here they do have an equivalent: build --only-dependencies.
@gregwebs ok thanks for the info!
@alexcrichton,
It looks like I should continue my work on the PR.
Will Cargo's team accept it?
@KalitaAlexey I personally wouldn't be convinced just yet, but it'd be good to canvas opinions from others on @rust-lang/tools as well
@alexcrichton,
Anyway I have no time right now)
I don't see much of a use case - you can just do cargo build and ignore the output for the last crate. If you really need to do this (for efficiency) then there is API you can use.
What's the API?
Implement an Executor. That lets you intercept every call to rustc and you can do nothing if it is the last crate.
I wasn't able to find any information about an Executor for cargo. Do you have any links to documentation?
Docs are a little thin, but start here: https://github.com/rust-lang/cargo/blob/609371f0b4d862a94e2e3b8e4e8c2a4a2fc7e2e7/src/cargo/ops/cargo_rustc/mod.rs#L62-L64
You can look at the RLS for an example of how to use them: https://github.com/rust-lang-nursery/rls/blob/master/src/build.rs#L288
A question on Stack Overflow wanted this feature. In that case, the OP wanted to build the dependencies for a Docker layer.
A similar situation exists for the playground, where I compile all the crates once. In my case, I just put in a dummy lib.rs / main.rs. All the dependencies are built, and the real code is added in the future.
@shepmaster unfortunately the proposed solution wouldn't satisfy that question because a Cargo.toml won't parse without associated files in src (e.g. src/lib.rs, etc). So that question would still require "dummy files", in which case it wouldn't specifically be serviced by this change.
I ended up here because I also am thinking about the Docker case. To do a good docker build I want to:
COPY Cargo.toml Cargo.lock /mything
RUN cargo build-deps --release # creates a layer that is cached
COPY src /mything/src
RUN cargo build --release # only rebuild this when src files changes
This means the dependencies would be cached between docker builds as long as Cargo.toml and Cargo.lock don't change.
I understand src/lib.rs src/main.rs are needed to do a good build, but maybe build-deps simply builds _all_ the deps.
I came to this thread because I also wanted the docker image to be cached after building the dependencies. After later resolving this issue, I posted something explaining docker caching, and was informed that the answer was already linked in the stackoverflow post. I made this mistake, someone else made this mistake, it's time to clarify.
RUN cd / && \
cargo new playground
WORKDIR /playground # a new project has a src/main.rs file
ADD Cargo.toml /playground/Cargo.toml
RUN cargo build # DEPENDENCIES ARE BUILT AND CACHED
RUN cargo build --release
RUN rm src/*.rs # delete dummy src files
# here you add your project src to the docker image
After building, changing only the source and rebuilding starts from the cached image with dependencies already built.
someone needs to relax...
Also @KarlFish what you're proposing is not actually working. If using FROM rust:1.20.0:
- cargo new playground fails because it wants the USER env variable to be set.
- RUN cargo build does not build dependencies for release, but for debug. Why do you need that?
Here's a better version.
FROM rust:1.20.0
WORKDIR /usr/src
# Create blank project
RUN USER=root cargo new umar
# We want dependencies cached, so copy those first.
COPY Cargo.toml Cargo.lock /usr/src/umar/
WORKDIR /usr/src/umar
# This is a dummy build to get the dependencies cached.
RUN cargo build --release
# Now copy in the rest of the sources
COPY src /usr/src/umar/src/
# This is the actual build.
RUN cargo build --release \
&& mv target/release/umar /bin \
&& rm -rf /usr/src/umar
WORKDIR /
EXPOSE 3000
CMD ["/bin/umar"]
You can always review the complete Dockerfile for the playground.
Hi!
What is the current state of the --deps-only idea? (mainly for dockerization)
I agree that it would be really cool to have a --deps-only option so that we could cache our filesystem layers better in Docker.
I haven't tried replicating this yet, but it looks very promising. This is with glibc and not musl, by the way. My main priority is to get to a build that doesn't take 3-5 minutes every time, not a 5 MB alpine-based image.
I ran into wanting this today.
@steveklabnik can you share more details about your case? Why would building only the dependencies be useful? Is starting with an empty lib.rs/main.rs not applicable to your situation?
As an aside, a cargo extension (cargo-prebuild?) could probably move aside the main.rs / lib.rs, replace it with an empty version, call cargo build, then move it back. That would "solve" the problem.
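A rough shell sketch of that move-aside idea, assuming a plain binary crate with only src/main.rs (a real cargo-prebuild would also need to handle lib targets, workspaces and build.rs):
mv src src.orig                       # stash the real sources
mkdir src && echo 'fn main() {}' > src/main.rs
cargo build --release                 # compiles only the dependencies plus the stub
rm -rf src && mv src.orig src         # restore the real sources
touch src/main.rs                     # bump the mtime so the next build recompiles the real crate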
I wanted to time a clean build, but of my crate only, and not its dependencies. The easiest way to do that is to cargo clean followed by a hypothetical cargo build --deps-only, then time a cargo build.
Yeah, I mean, I could replace everything, and then pull it back, but that feels like working around things rather than doing what I actually want to do.
@shepmaster the replacing stuff gets very hairy when you have a workspace project with custom build.rs in multiple subprojects. My Dockerfile is now _very_ fragile and complicated.
Also on the workspace note, i don't even try to cache/shorten build times for the subprojects, only the main, so it's suboptimal as well. Any source change in a subproject means doing a full build without caching anything.
with custom build.rs
What impact does a build.rs have on your case? What do you expect a hypothetical cargo build --deps-only to do in the presence of a build.rs? Do you believe that this behavior applies to all cases using a build.rs?
when you have a workspace
What do you expect a hypothetical cargo build --deps-only to do in a workspace with two crates (A and B) where A depends on B? Is B built? Is A built?
What impact does a build.rs have on your case?
My (several) build.rs are for invoking bindgen around c-libs. Some are simply: bindgen + include!(concat!(env!("OUT_DIR"), "/bindings.rs")); in the lib.rs, and some lib.rs add a bit more to the bindings
I guess my expectation of a cargo build --deps-only in a workspace scenario would be that, as a minimum, it would build any non-project-local deps of those submodules together with the non-project-local deps of my main lib (like libc, which is often in my submodules).
I only mention build.rs because the solution of doing empty projects, build, and then copying in the source becomes more messy when you also have a bespoke build with git submodules of various c sources built via a build.rs.
Maybe a cargo build --deps-only would build _only dependencies that are not local to the project_, i.e. any dep in a Cargo.toml (root level or submodule) that references the central cargo repo, own repos, or git sources.
I think it would be useful to decide whether to accept or reject this as a desirable feature (even though there is no implementation yet) and clearly state that at the top of the issue (or a new issue).
I'd also like to have this feature. I'm also doing a Docker cache thing.
@shepmaster 's Dockerfile isn't working in my case (cross compiling stuff).
I am very sad this got closed, this would be great for docker builds, and way less hacky than current solutions. I feel docker builds in their own right are also a good enough reason alone
@cmac4603 this has not been closed
Oh my goodness! I'm so glad! Yup, totally misread, thanks @mathroc
I would love to see this for docker builds then, particularly local dev builds involving docker-compose. Generally for final test/prod builds, I use a musl builder from scratch, but for development, would be a very nice-to-have in the toolkit
downloading dependencies is available via cargo fetch
cargo fetch downloads too much in general without having something like #5216 to allow the fetch to be constrained to a target.
I had a need for this feature today for pre-building a docker image cache. The trick to cargo init a fresh project and reusing only one's Cargo.toml _almost_ worked, but I have a build.rs script, so I had to make a stub build.rs file too. It would be easier to have a "standard" way of downloading and building the deps that doesn't depend on the particulars of your build and handles your corner cases correctly.
but I have a build.rs script, so I had to make a stub build.rs file too
@golddranks Why did you need it though? If the reason is that you have build = "build.rs" line in Cargo.toml, then just remove that line as it's not required nowadays and build.rs is automatically assumed to be a build script if found.
Ah, didn't know that. Thanks.
_Yet another_ reason to have this is that some CIs (AppVeyor in particular) have a limit on how much information is logged from the build. With cargo build --dependencies-only one can build dependencies normally, then switch to cargo build -vv for the source code that's the primary target of the CI build.
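As a rough sketch of that CI flow (the --dependencies-only flag is hypothetical here; --quiet and -vv are existing Cargo options):
cargo build --dependencies-only --quiet   # hypothetical: build deps with terse output
cargo build -vv                           # only our own crate is left to compile, logged in full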
I think we should probably consider the docker use-case in more detail. I have a hunch that there's more to the use-case than meets the eye.
I think that the CI use-case is a more interesting rationale for this feature, but I wonder if it argues instead for more control over the verbosity options (--crate-verbosity "my-crate=vv" or something along those lines), which I could imagine being useful for other purposes ("I really want to look more closely at what's going on when I build clap")
I came here looking for the same use-case of wanting to properly layer my Docker images to decrease build times.
I ended up with the following solution, in case this helps anyone else:
FROM ekidd/rust-musl-builder AS builder
RUN sudo chown -R rust:rust /home/rust
RUN mkdir src && touch src/lib.rs
COPY Cargo.lock .
COPY Cargo.toml .
RUN cargo build --release
ADD . .
RUN cargo build --release
FROM scratch
COPY --from=builder \
/home/rust/src/target/x86_64-unknown-linux-musl/release/my-app \
/my-app
ENTRYPOINT ["/my-app"]
Here's the result of the first run, vs a second run with only source code changes, no dependency changes:
collapsed wall of text
❯ time docker build -t hello .; and docker run hello
Sending build context to Docker daemon 225.3kB
Step 1/10 : FROM ekidd/rust-musl-builder AS builder
---> 792e654be291
Step 2/10 : RUN mkdir src/ && touch src/lib.rs
---> Running in 2208ab6053f4
Removing intermediate container 2208ab6053f4
---> 572926d2738a
Step 3/10 : COPY Cargo.lock .
---> e8eb753e4881
Step 4/10 : COPY Cargo.toml .
---> 43f3d04fe0e8
Step 5/10 : RUN cargo build --release
---> Running in 374ba4a8a864
Updating registry `https://github.com/rust-lang/crates.io-index`
<SNIP>
Finished release [optimized] target(s) in 43.65s
Removing intermediate container 374ba4a8a864
---> 843e36ec3ce7
Step 6/10 : ADD . .
---> c66cc9b7f4dc
Step 7/10 : RUN cargo build --release
---> Running in bcd95a11ff47
Compiling hello-world v0.1.0 (file:///home/rust/src)
Finished release [optimized] target(s) in 0.40s
Removing intermediate container bcd95a11ff47
---> 1789bdcc6ed3
Step 8/10 : FROM scratch
--->
Step 9/10 : COPY --from=builder /home/rust/src/target/x86_64-unknown-linux-musl/release/hello-world /hello-world
---> Using cache
---> e8501d0c8738
Step 10/10 : ENTRYPOINT ["/hello-world"]
---> Using cache
---> fc179245d85b
Successfully built fc179245d85b
Successfully tagged hello:latest
53.53 real 2.17 user 0.80 sys
hello world
❯ time docker build -t hello .; and docker run hello
Sending build context to Docker daemon 225.3kB
Step 1/10 : FROM ekidd/rust-musl-builder AS builder
---> 792e654be291
Step 2/10 : RUN mkdir src/ && touch src/lib.rs
---> Using cache
---> 572926d2738a
Step 3/10 : COPY Cargo.lock .
---> Using cache
---> e8eb753e4881
Step 4/10 : COPY Cargo.toml .
---> Using cache
---> 43f3d04fe0e8
Step 5/10 : RUN cargo build --release
---> Using cache
---> 843e36ec3ce7
Step 6/10 : ADD . .
---> a9698b8f6a88
Step 7/10 : RUN cargo build --release
---> Running in 980a6d70a441
Compiling hello-world v0.1.0 (file:///home/rust/src)
Finished release [optimized] target(s) in 0.35s
Removing intermediate container 980a6d70a441
---> 1661e048b501
Step 8/10 : FROM scratch
--->
Step 9/10 : COPY --from=builder /home/rust/src/target/x86_64-unknown-linux-musl/release/hello-world /hello-world
---> d8b93e6a4ff9
Step 10/10 : ENTRYPOINT ["/hello-world"]
---> Running in e26f084ed361
Removing intermediate container e26f084ed361
---> 6684afc6b641
Successfully built 6684afc6b641
Successfully tagged hello:latest
7.43 real 2.26 user 1.00 sys
hello universe!
Same docker use case here. We can do the above, but anyone reading it would wonder what's going on. It would be much clearer to have a --dependencies-only flag.
Is the question whether there are sufficient use cases (Docker, performance timing, any others)? (I'm assuming the cargo implementation would be fairly simple.)
To be fair docker is a pretty big use case...
Same docker use case here. I would _love_ this feature.
Same docker use case here. I don't want to have to pull and compile all the dependencies in my image everytime I change the source code.
I tried @JeanMertz approach but when you've got a workspace with a few crates in it, it's not so pretty. docker's COPY doesn't seem to like recursive globs, COPY **/Cargo.toml . for example doesn't seem to work for me.
That said, I'm not sure there's a nice way of doing this, because even if we said build deps only, if we copied in all our source to the docker image and the source had changed then it would be a cache miss anyway. I think the best we can ever do with docker is to generalise @JeanMertz 's approach.
Note that there is also https://crates.io/crates/cargo-build-deps.
@nrc i tried the 'Executor' path to no avail (https://github.com/azban/cargo-build-deps/blob/executor/src/main.rs). i had to hack around fingerprint updates because there is no way to be lenient here. but i finally got stuck on ignoring custom build scripts, because they don't use the executor and rather call commands directly here.
i ended up hacking around build_plan, which is messy because seemingly the only way to get it, is from stdout, and i was making assumptions about how to tell if something is an external dependency or not. (https://github.com/azban/cargo-build-deps/blob/build-plan/src/main.rs)
@JeanMertz @gilescope @emilk @anuraags , i ended up 'generalizing' the approach above in order to support more complex crates (workspaces mostly). mostly involves a separate build image which you only pass what's required to build (Cargo.toml, Cargo.lock, etc) into the docker context.. and then inherit from that image in a later build. you can check out an example here: https://github.com/paritytech/substrate/pull/1085/commits/a12e3040bca581e3e3a5b18c4de218499d3de937
Unfortunately, in many use cases an approach based on Cargo.toml and Cargo.lock can't really work, because you need to update at least the current crate version. And therefore Docker will invalidate the cache immediately, even if the dependency versions are still the same.
I was recently experimenting with BuildKit approach: every rustc or build script invocation is done in own build stage (something like this). The image builder can manage "cache invalidation" and building concurrency on its own then.
I can call it a successful experiment, but I decided to go even deeper - I'm currently trying to build a BuildKit frontend for Cargo crates. This will help to achieve a mind-blowing
docker build -t my-tag -f Cargo.toml .
Apart from these experiments, you can use an "almost" production-ready caching approach today! You need to have Docker 18.06+, DOCKER_BUILDKIT=1 environment variable, and a special Dockerfile syntax:
# syntax=docker/dockerfile-upstream:experimental
FROM rustlang/rust:nightly as builder
# Copy crate sources and workspace lockfile
WORKDIR /rust-src
COPY Cargo.lock /rust-src/Cargo.lock
COPY cargo-container-tools /rust-src
# Build with mounted cache
RUN --mount=type=cache,target=/rust-src/target \
--mount=type=cache,target=/usr/local/cargo/git \
--mount=type=cache,target=/usr/local/cargo/registry \
["cargo", "build", "--release"]
# Copy binaries into normal layers
RUN --mount=type=cache,target=/rust-src/target \
["cp", "/rust-src/target/release/cargo-buildscript", "/usr/local/bin/cargo-buildscript"]
RUN --mount=type=cache,target=/rust-src/target \
["cp", "/rust-src/target/release/cargo-test-runner", "/usr/local/bin/cargo-test-runner"]
RUN --mount=type=cache,target=/rust-src/target \
["cp", "/rust-src/target/release/cargo-ldd", "/usr/local/bin/cargo-ldd"]
# Copy the binaries into final stage
FROM debian:stable-slim
COPY --from=builder /usr/local/bin/cargo-buildscript /usr/local/bin/cargo-buildscript
COPY --from=builder /usr/local/bin/cargo-test-runner /usr/local/bin/cargo-test-runner
COPY --from=builder /usr/local/bin/cargo-ldd /usr/local/bin/cargo-ldd
# syntax=... defines a Dockerfile builder frontend image and caching is done with --mount flags of RUN instruction.
Update: corrected Dockerfile stage names.
Update 2: changed nightly Cargo registry cache path.
After trying out cargo-build-deps, I noticed that it runs cargo update and seems slower than latest cargo build in Rust nightly. I would say it probably makes sense to just bake this functionality into cargo main for Docker and caching purposes.
awesome @denzp .. i hadn't checked out buildkit, and it looks like good timing as it's no longer experimental on the docker-ce released today :). i'm going to favor this over cargo-build-deps.
@denzp That looks promising. Do you think it would be possible to support both cargo build and cargo check with this? Currently I'm doing something like this mess (based on the suggestions above + rust-aws-lambda):
ADD $SRC/Cargo.toml $SRC/Cargo.lock ./
RUN mkdir src && touch src/lib.rs
ARG checkonly
RUN if [ "x$checkonly" = "x" ] ; then cargo build --target $BUILD_TARGET --release ; else cargo check ; fi
ADD $SRC/src src
RUN if [ "x$checkonly" = "x" ] ; then cargo build --target $BUILD_TARGET --release ; else cargo check && echo "return early after check, this is not an error" && false ; fi
which seems to cache both branches (with/without checkonly supplied as build-arg) separately
@J-Zeitler in your example, change of checkonly build argument alters the whole command. And therefore Docker decides to drop the cached result layer (containing target dir with build artifacts).
With RUN --mount it shouldn't be a problem anymore, because regardless of the cached layer presence, target dir with Cargo cache will be preserved.
Looks like you can simplify the Dockerfile and use only the second command. But is there a reason to run cargo check during image building stage?
@denzp Cool, will try to find some time to test it.
The reason is that I'm on a windows machine and building for an "amazonlinux" host. I guess I could run cargo check locally. Will cargo check always produce the same output regardless of --target?
(also, I noticed that I actually don't even use --target in the check command so it might not do what I want anyways)
edit: also, yes, with a proper cache I could probably remove the second command. What I do now (as the others above) is use docker layers as a cache, and ADDing $SRC/src src will evict the cached layers below. That's the reason people poke in the fake src/lib.rs etc.
Any new information on this issue? I think a lot of people would love this feature. It would make development a lot easier in some cases. In my project, whenever the Rust guys change their code the build takes forever. Having a dependency layer would decrease the build time of our project from 15 minutes to about 45-60 seconds.
Everyone is on-board with solving the problem, though exactly how is a bit up for debate. We do plan to address it soon though.
@nrc Great to hear!
And here is my version of a workaround, in case anyone else is interested.
FROM rust:slim AS builder
# Copy dependency information only.
WORKDIR /home/
COPY ./Cargo.toml ./Cargo.lock ./
# Fetch dependencies to create docker cache layer.
# Workaround with empty main to pass the build, which must be purged after.
RUN mkdir -p ./src \
&& echo 'fn main() { println!("Dummy") }' > ./src/main.rs \
&& cargo build --release --bin helloworld \
&& rm -r ./target/release/.fingerprint/helloworld-*
# Cache layer with only my code
COPY ./ ./
# The real build.
RUN cargo build --frozen --release --bin helloworld
# The output image
FROM debian:stable-slim
COPY --from=builder /home/target/release/helloworld /opt/app
ENTRYPOINT [ "/opt/app" ]
Running into the same issue, and using Docker. Are we planning on tackling this?
Based on this Stackoverflow question, there are several workarounds, but a Cargo flag seems more convenient than what is currently possible. Workarounds:
- Create a dummy main.rs/lib.rs, then compile. Afterwards remove the fake source and add the real ones. [Caches dependencies, but several fake files with workspaces]
- Create a dummy main.rs/lib.rs, then compile. Afterwards create a new layer with the compiled dependencies and continue from there. [Similar to above]
- Use --mount: RUN --mount=type=cache,target=/the/path cargo build in the Dockerfile in new Docker versions. [Caches everything, seems like a good way, but currently too new to work for me. Executable not part of image]
Special note for workspaces: even with a flag like --dependencies-only it'll be a bit inconvenient to individually add all the Cargo files to Docker for this to work, but I don't think there's anything that can be done about that on the Cargo side.
It feels like, having --dependencies-only won't help for the Docker workflow: eventually you will need to bump a package version, or add dependency which will invalidate cache (due to either Cargo.lock or Cargo.toml change).
On the other hand, even after a while working with Rust microservices deployed as Docker images I can confirm that RUN --mount=type=cache works like a charm! But indeed, it requires a modern infrastructure - docker-ce v18.09.
I think that is the desired behavior: if the dependencies change, the "dependency-only" layer changes; if they don't, they don't. Slow full builds are not the end of the world, it's just them being the common case that is the problem.
@ohAitch is on point. Of course full builds will take a while. But there's a big difference between me changing a line in my source code and going through the full cargo dependency tree, versus just my source code. If dependencies change, obviously there is no way to not have to rebuild them, but that doesn't happen nearly as often as source code or resource changes.
It is possible to not have to rebuild all of them, which I think is the caching solution; but this issue's titular suggestion is a lot more explicit, and thus straightforward to integrate with other Docker-like systems such as Nix.
So, the dummy file doesn't really work on workspaces, which is the issue I currently have. And I think --deps-only would be handy specially for when you have workspaces as it could automate that entire process for you.
@felipellrocha - the dummy file _does_ work with workspaces... sort of. What you end up having to do is recreate the structure of your entire workspace in the Docker image, which is super brittle and not ideal.
A --dependencies-only flag is the right solution here, but know that you can still use workspaces while you wait for a native solution and not a hacky one.
So, if you have a Cargo.toml for your workspace similar to this one:
[workspace]
members = [
"shared_library",
"bin_one",
"bin_two",
"bin_three",
]
You're going to end up with a directory structure more or less similar to this:
.
├── Cargo.lock
├── Cargo.toml
├── bin_one
│ ├── Cargo.toml
│ └── src
│ ├── lib
│ │ └── ...
│ └── main.rs
├── bin_two
│ ├── Cargo.toml
│ └── src
│ ├── lib
│ │ └── ...
│ └── main.rs
├── bin_three
│ ├── Cargo.toml
│ └── src
│ ├── lib
│ │ └── ...
│ └── main.rs
├── Dockerfile
├── docker
│ └── dummy
│ ├── bin_one
│ │ ├── Cargo.toml
│ │ └── src
│ │ └── main.rs
│ ├── bin_two
│ │ ├── Cargo.toml
│ │ └── src
│ │ └── main.rs
│ ├── bin_three
│ │ ├── Cargo.toml
│ │ └── src
│ │ └── main.rs
│ └── shared_lib
│ ├── Cargo.toml
│ └── src
│ └── lib.rs
└── shared_lib
├── Cargo.toml
└── src
├── lib.rs
└── stuff
Where your root Cargo.toml, Cargo.lock and your docker/dummy/* all get copied into the Docker image so you can build your release dependencies. Then you can come back later and COPY your _actual_ source into the image and build _that_.
I haven't found a better solution for this, if you have one - I'd absolutely love to know, because right now it's a very manual process and I'd feel really bad for anyone who has to manage more complexity than this or has to figure this out on their own.
I've used the same strategy as @davidarmstronglewis for building my workspaces.
It's been finicky to set up, fragile to maintain, and difficult to document/explain to new team members.
is recreate the structure of your entire workspace in the Docker image, which is super brittle and not ideal.
A --dependencies-only flag is the right solution here
For Docker, wouldn't you have to copy in all of the same Cargo.toml files in the same directory structure in order to specify the dependencies? At the same time, you'd also have to avoid copying in any source files at that point, otherwise changes to your source code would invalidate the dependencies layer.
While I'd like to have a --dependencies-only flag, I don't follow how simply having it will actually make the Docker situation appreciably better. Is there a Dockerfile command to add all files matching a name, preserving their directory structure but not copying any other files?
Could someone sketch out exactly how this would work in a world where:
- --dependencies-only
I think the main difference of a situation like this would be not having to create dummy projects anymore. For my project I found a decent workaround:
cargo build --release && rm ./src/*.rs ./target/release/deps/<project>*
From what I can tell this situation would not greatly improve by having the flag, but it would reduce the initial barrier and crankiness of setting this up (especially when building multiple modules relying on each other). The flag would (in my example) allow skipping step 1 and the && part of step 3. I'm assuming it would also speed up the dependency build by a bit.
Of course this is only one example for how it would have affected me. At the time of commenting on this issue previously (a couple months ago) I did not have a working solution and it was really annoying. This approach does work well enough for my usage but of course having the --dependencies-only flag would be a nice to have for sure and clean up my Dockerfile/make it a lot easier to explain what's happening.
Furthermore it would prevent me from having to fiddle around in the /target directory. Maybe that's just me but I thought it was a really dirty solution
@shepmaster Yeah for workspaces, it doesn't help all that much, because we still have to add all Cargo files one by one.
What it saves is not having to generate dummy main/lib files, and delete them afterwards. So only 1/3 of the code is needed, but workspace members still have to be named individually.
even with a flag like --dependencies-only it'll be a bit inconvenient to individually add all the Cargo files to Docker for this to work
(And, if Docker ever decided to introduce a convenient way to add files by pattern but keep their structure, then it'd be really easy: COPY **/Cargo.toml. But it can't be solved by Cargo alone).
Concretely the Dockerfile would look like this:
COPY Cargo.toml ./Cargo.toml
COPY Cargo.lock ./Cargo.lock
COPY sub_one/Cargo.toml sub_one/Cargo.toml
COPY sub_two/Cargo.toml sub_two/Cargo.toml
COPY sub_three/Cargo.toml sub_three/Cargo.toml
COPY sub_four/Cargo.toml sub_four/Cargo.toml
RUN cargo build --dependencies-only
COPY . .
RUN cargo build
Which is better than
COPY Cargo.toml ./Cargo.toml
COPY Cargo.lock ./Cargo.lock
COPY sub_one/Cargo.toml sub_one/Cargo.toml
COPY sub_two/Cargo.toml sub_two/Cargo.toml
COPY sub_three/Cargo.toml sub_three/Cargo.toml
COPY sub_four/Cargo.toml sub_four/Cargo.toml
RUN printf "fn main() {}" > sub_one/main.rs && \
printf "" > sub_one/lib.rs && \
printf "" > sub_one/lib.rs && \
printf "fn main() {}" > sub_one/main.rs && \
cargo build && \
rm sub_one/main.rs sub_two/lib.rs sub_three/lib.rs sub_four/main.rs
COPY . .
RUN cargo build
but not ideal.
EDIT: build.rs must also be 'mocked' in the same way, if present. Build files need to contain fn main() {} (and maybe println!("cargo:rerun-if-changed=build.rs");?)
This is going a bit farther in the woods, but I wonder if it would help if cargo would be able to export a build plan and execute it later, independently of the project files:
cargo build --dependencies-only --export plan1234.toml
And then in Dockerfile:
COPY plan1234.toml .
RUN cargo build --import plan1234.toml
COPY . .
RUN cargo build
Hm, I think that still has the problem of the context in which you run the first command: when the dependencies do change, you still want the dockerfile to do the right thing. Though I guess if --export is a significantly cheaper "dry-run" that doesn't try to actually compile anything, the resulting lockfile might be useful as a cache key?
to export a build plan and execute it later
I was looking into something in a similar vein. Since Docker can import a tarball and automatically extract it, one could create a Cargo subcommand that creates the skeleton workspace and tars it up. Something like
cargo docker-plan -o plan.tar
ADD plan.tar
RUN cargo build
ADD ...source-files...
RUN cargo build
However, this feels unsatisfying as there's no automatic way to cause Docker to regenerate that plan.tar.
I'm not really excited about a solution that has to be done outside Dockerfile. It's a confusing workflow for other people or my future self. I don't think Docker will add a way to create a tar file as part of the build, as builds should not affect the host OS.
It is my belief that some change on the Docker side is needed for Cargo workspaces to work well with it - it can't be solved entirely on the Cargo side. My personal favourite is a structure-preserving copy command for Docker (15858, 29211): CP **/*.toml. Don't hold your breath though...
That said, --dependencies-only would still help.
Is it possible to synthesize CP **.toml with something like
FROM debian:stable-slim as toml
COPY . .
RUN find . -type f -not -name '*.toml' -delete
FROM rust:slim AS builder
COPY --from=toml . .
COPY ./Cargo.lock .
{...etc}
We discussed this in today's Cargo meeting. We would like to have some better support for Docker in particular, and there are a few other use cases for building just the dependencies of a project (e.g., the RLS did this for a while, or preparing a lab exercise). However, it seems that for all these use cases, just building deps is necessary but not sufficient and therefore just adding a flag is likely not to be the best solution.
Focusing on the Docker use case, I think that an ideal solution would be a third-party custom sub-command, cargo docker or something similar. However, we are missing a lot of detail to know if this is possible and what it would look like.
So we would like to know more specifics about the use case - what are the common tasks and workflows? How would we need to interact with Docker? How would users want to customise the tool? Given these requirements, is there anything that prevents cargo docker being built today? (There is API exposed by Cargo today to do a deps-only build, but it would mean linking against its own version of Cargo, which is sub-optimal).
@nrc Maybe it's good to try and put into words the goal, rather than the method of getting there.
Goal:
For me this is all about making efficient use of docker layers. I want to use the layers to speed up my rust builds so that a small source code change doesn't require a full rebuild. I have no problems with incurring the full rebuild if I change my Cargo.toml deps. I want this to work for both workspace and standalone.
Method in abstract:
Conceptually you achieve this in docker by:
ADD <dependency spec files>
RUN <build dependencies>
ADD <source files>
RUN <build final output>
As long as the files in row 1 are unchanged, Docker uses the cached layer of row 2. If I introduce a change in row 3, then docker will run row 4.
Echoing what @lolgesten said, this is the exact workflow I use. The only thing I'd like to add to that is that this seems to have become a very standard workflow in Dockerfiles for projects in any language with dependency management: I use that same pattern in Dockerized projects for Ruby, NodeJS, Go, Elixir... some variation of those exact same 4 steps:
| Ecosystem | Dep files for first COPY | Install and/or build deps only |
|-----------|--------------------------------------|------------------------------------|
| Ruby | Gemfile, Gemfile.lock | bundle install --deployment |
| Node | package.json, package-lock.json | npm ci |
| Go | go.mod, go.sum | go mod download |
| Elixir | mix.exs, mix.lock | mix deps.get, mix deps.compile |
| Rust | Cargo.toml, Cargo.lock | N/A |
For me personally, cargo build --deps-only et al. does seem like it could support that general use case without being _too_ Docker specific. _However, if Cargo was to adopt something different that is more clever/bespoke/unique, I'd want that solution to be sufficiently advantageous to overcome the additional cognitive overhead / discoverability cost of having to treat Rust projects differently than what I do most everywhere else._
@nrc With regards to a third-party cargo sub-command, the concern for me would be that this is now something you have to take steps to pre-install in your Dockerfile, rather than just having things work out of the box with a default official Rust docker image. This would then likely lead to a number of unofficial preconfigured rust-with-docker-cargo type images that community folks maintain and keep up to date, that an end-user would either have to discover, and/or copypasta some steps they found elsewhere to get things going. Not necessarily a problem, but certainly less "batteries included" than being able to do things with the official image.
@nrc Can you elaborate more on this?
However, it seems that for all these use cases, just building deps is necessary but not sufficient and therefore just adding a flag is likely not to be the best solution.
I am not sure where this idea came from, given the comments so far on this thread. What problems have been brought up that wouldn't be solved with --dependencies-only, and are they not orthogonal enough to be solved separately?
[edit] a little more:
Because honestly, this seems kind of dubious:
Focusing on the Docker use case, I think that an ideal solution would be a third-party custom sub-command, cargo docker or something similar. However, we are missing a lot of detail to know if this is possible and what it would look like.
We have a proposal for a simple solution that would solve a common use case. I'm worried about this. "We want to make a cargo docker subcommand instead of just implementing the simple solution, but we don't know what it should do yet." At least without hearing the use cases that you're already thinking of, it sounds like over-engineering at this stage.
At least without hearing the use cases that you're already thinking of, it sounds like over-engineering at this stage.
Adding on to @radix - not only that, but the solution would only be applicable to Docker. Any other piece of software that needs to build a project in two stages would need a re-implementation. I would argue instead that having the ability to build dependencies separately from the application(s) themselves would enable more composability for future needs as well as satisfying the current one.
There is still the "CP **.toml" workspaces problem, which could
hypothetically have a
cargo extract-workspace-configs dest/
command to solve it similar to the build-plan.tar idea; but it could only
really replace the "RUN find -delete" line in my proposed staging solution,
so having the Dockerfile explicitly know about *.toml files instead seems
honestly fine.
Regardless, it's definitely both a separate issue and not docker-specific:
I got linked this discussion from Nix land, which has a very similar "build
one step, hash the entire filesystem after that step, if the hash matches
you can skip ahead" execution model.
(I do get the impression the "but not sufficient" example may have come up
verbally in the meeting, which leads us back to "super curious what the
additional problems are" if there's anything other than workspaces)
Focusing on the Docker use case, I think that an ideal solution would be a third-party custom sub-command, cargo docker or something similar. However, we are missing a lot of detail to know if this is possible and what it would look like.
@nrc you may want to take a look at Haskell's stack tool integration with docker (or nix!). It is transparent in that you just configure it to use docker and no sub-command is needed. As mentioned previously, they do have a build --only-dependencies command. Although I am not up to date on whether there is a better workflow now for those being discussed here.
Hm, there seems to be an important inversion-of-control difference here of "cargo(/stack) runs docker to do a build" vs "a docker build runs cargo". Like, the former requires you to have rust and cargo installed directly on each machine you're building on, at the correct version etc; which will sometimes be the case, but does seem like a different environment than the "top-level Dockerfile (which may in turn be part of a docker-compose or such)" one originally mentioned.
@ohAitch if you are doing a build in docker driven by cargo, then you still don't need your compiler (rustc) installed: that would be in the docker image used. In stack, you essentially specify the compiler version, and if you enable docker it will pull a Linux image that has that version. If not, I should note that stack doesn't require you to have the Haskell compiler installed on your machine in any case: it will automatically download and use the version you have specified in your stack.yaml.
I agree that there are situations where one might prefer to have Docker drive cargo, for example a CI build environment that assumes that.
To show the scope of the problem for Rust docker builds: using the solutions above, this person was able to reduce rebuild time by 90%, and build size by 95%.
At PingCAP we are wanting this too, for the docker use case.
If a --deps-only style flag is not accepted into cargo, on a casual glance I think a cargo plugin could accomplish the same thing.
Since individual deps can be built with the -p flag, like cargo build -p failure, it is almost trivial to read the deps and issue all the individual build commands, wrapped up in a nice cargo build-deps command.
Is there anything obviously wrong with the idea?
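As an illustration, a rough sketch of such a plugin as a shell pipeline, assuming jq is available (it keeps only packages from cargo metadata with a non-null source, i.e. registry/git dependencies rather than local path crates, and glosses over the case of multiple versions of the same crate name):
cargo metadata --format-version 1 \
  | jq -r '.packages[] | select(.source != null) | .name' \
  | sort -u \
  | xargs -r -n1 cargo build --release -p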
@siddontang also had the idea to use the --exclude flag to exclude all the local packages, like cargo build --all --exclude tikv --exclude fuzzer-afl etc., which seems relatively easy, and again could be done in a plugin.
I admit though I don't understand the issues with docker layers, and why the workarounds all require building a "fake" source tree.
@brson The problem with Docker layers is one of cache invalidation. You want to separate the layer of seldom-changing artefacts (dependencies) from oft-changing artefacts (the root crate) so that you can invalidate just the oft-changing ones without having to re-build everything. If you include the project files in the same layer as the dependencies, you'll end up trashing your cache constantly.
Somewhat similar to the docker use case, we at Standard are looking to make cargo play nicely with our build system sandbox which does not allow network access during build.
In order to avoid unnecessary work we decided to approach this problem by implementing a solution which essentially vendors the dependencies. However, if the only thing vendored is the sources of the dependencies, then all the crates in the dependency tree have to be rebuilt on every build. To avoid the rebuilds, we could "vendor" the already-built dependencies (essentially caching the build output directory), but without cargo build --dependencies-only that is difficult to achieve.
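For reference, a minimal sketch of the source-vendoring route described above (cargo vendor and --offline exist in recent Cargo; as noted, this only removes the network dependency and does not cache compiled artifacts):
cargo vendor vendor                   # copy all dependency sources into ./vendor
mkdir -p .cargo
cat > .cargo/config.toml <<'EOF'
[source.crates-io]
replace-with = "vendored-sources"
[source.vendored-sources]
directory = "vendor"
EOF
# later, inside the sandbox, build without touching the network:
cargo build --offline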
A while back I commented an example of a Docker workflow without a --dependencies-only flag.
I'd like to note that I've found out that I forgot about build.rs in that example. If there is a build.rs, that also needs to be 'mocked' along with lib.rs and main.rs.
@brson I've just found this, looks a lot like what you've proposed: https://github.com/nacardin/cargo-build-deps
EDIT: it has appeared in this discussion before, and there's been some feedback already.
I found this because I had a Dockerfile that always produced a hello world. rm main.rs as in some examples didn't help either. I now added a touch main.rs before the second build to avoid the problem that the first build runs after the original main.rs was last modified.
Which interestingly seemed to work for a some time because the first build was cached and thus the main.rs modification time was later.
Does anyone know if this is an issue that is currently being discussed or worked on?
Maybe a status update? Maybe this is fixed by a peripheral tool and I just didn't notice?
I'm not aware of any progress.
This can also be useful to optimize docker builds.
Is there a decision or a guide on how to solve this in the meantime in order to have smaller images and not do a build on every change?
Is there a decision or a guide on how to solve this in the meantime in order to have smaller images and not do a build on every change?
@mghz
COPY ./Cargo.toml ./Cargo.lock ./
RUN mkdir src/
RUN echo "fn main() { }" > src/main.rs
RUN cargo build --release
RUN rm ./target/release/deps/project_name*
COPY . .
RUN cargo build --release
Don't forget to change project_name accordingly.
@KalitaAlexey wrote:
Why do you want it?
Because when building on a continuous integration server I want to split the whole thing into a reasonable number of steps. This improves readability: I know which step is failing at first glance, and the output from a given step contains less clutter.
Sorry for hijacking the conversation.
While I agree this feature is worth implementing for the simplicity that it'll provide, at least for the CI caching purpose that lots of people are talking about here, wouldn't it be better to use sccache now in this case? Although the Rust support is still experimental, it's become much more mature since this issue was opened in 2016. And compared with Docker, which ditches the whole cache layer if Cargo.toml/lock is modified, sccache is capable of incrementally compiling the dependencies, so you don't have to wait for everything to be rebuilt after a cargo add or cargo update.
Another benefit of using sccache is that it's got various storage options. Once the local storage grows too large and your CI cache downloading/uploading gets slowed down, it's basically hassle-less to migrate to S3/Redis/etc. Not to mention that it's got distributed compilation now which might be useful for CI/CD.
I hope there could be a solution for workspace repos too. Use case:
ADD Cargo.* /app/
ADD rust/display-sim-build-tools/Cargo.* rust/display-sim-build-tools/
ADD rust/display-sim-core/Cargo.* rust/display-sim-core/
ADD rust/display-sim-native-sdl2/Cargo.* rust/display-sim-native-sdl2/
ADD rust/display-sim-opengl-render/Cargo.* rust/display-sim-opengl-render/
ADD rust/display-sim-stub-render/Cargo.* rust/display-sim-stub-render/
ADD rust/display-sim-testing/Cargo.* rust/display-sim-testing/
ADD rust/display-sim-web-error/Cargo.* rust/display-sim-web-error/
ADD rust/display-sim-web-exports/Cargo.* rust/display-sim-web-exports/
ADD rust/display-sim-webgl-render/Cargo.* rust/display-sim-webgl-render/
ADD rust/enum-len-derive/Cargo.* rust/enum-len-derive/
ADD rust/enum-len-trait/Cargo.* rust/enum-len-trait/
RUN echo "Faking project structure." \
&& mkdir -p rust/display-sim-build-tools/src && touch rust/display-sim-build-tools/src/lib.rs \
&& mkdir -p rust/display-sim-core/src && touch rust/display-sim-core/src/lib.rs \
&& mkdir -p rust/display-sim-native-sdl2/src && touch rust/display-sim-native-sdl2/src/lib.rs \
&& mkdir -p rust/display-sim-opengl-render/src && touch rust/display-sim-opengl-render/src/lib.rs \
&& mkdir -p rust/display-sim-stub-render/src && touch rust/display-sim-stub-render/src/lib.rs \
&& mkdir -p rust/display-sim-testing/src && touch rust/display-sim-testing/src/lib.rs \
&& mkdir -p rust/display-sim-web-error/src && touch rust/display-sim-web-error/src/lib.rs \
&& mkdir -p rust/display-sim-web-exports/src && touch rust/display-sim-web-exports/src/lib.rs \
&& mkdir -p rust/display-sim-webgl-render/src && touch rust/display-sim-webgl-render/src/lib.rs \
&& mkdir -p rust/enum-len-derive/src && touch rust/enum-len-derive/src/lib.rs \
&& mkdir -p rust/enum-len-trait/src && touch rust/enum-len-trait/src/lib.rs \
&& mkdir -p rust/display-sim-testing/benches && touch rust/display-sim-testing/benches/whole_sim_bench.rs \
&& echo "fn main() { }" > rust/display-sim-wasm.rs && echo "fn main() { }" > rust/display-sim-default-entrypoint.rs \
&& cargo fetch && cargo --offline build --release
ADD rust/ /app/rust/
RUN cargo --offline test --all \
&& cargo --offline build --release \
&& cargo --offline clean
See above: I was forced to recreate the whole folder structure, with its lib files, bin files and bench files.
My personal take on this problem would be to propose a new sort of lock file that contains the data of all the dependencies of all the workspace members, and a new build option that considers only that single file as input, thus precompiling all the workspace dependencies.
After that, you would be good to use cargo --offline, you would be able to add code changes without having to recompile dependencies in docker builds, and you wouldn't need to maintain those lines in the Dockerfile, which is quite error prone.
Just my two cents, the idea of adding this as a sub-command to cargo (similar to how stack does it) is annoying in at least one way: it means that if I want to do builds using different languages/compilers on the same machine I now have to pre-install all those tools (in addition to Docker) to the build machine. Ideally I'd like to have the build machinery (cargo, rustc, etc.) as part of the throw-away container and know that I can get a (more or less) identical build on a different machine without having to check if the build tool versions are the same.
Also it means that if the CI tool you're using already supports Docker you don't have to worry about it also supporting cargo.
If it's a cargo sub-command, it would want to be a subcommand within the throw-away container, I expect.
Replying to @theypsilon 's https://github.com/rust-lang/cargo/issues/2644#issuecomment-537637178 , here is a shorter example that I think does about the same thing. I'm new to rust, so correct me if I've misused cargo. It's a similar pattern I've used before to work around docker's current copy limitations https://github.com/moby/moby/issues/15858#issuecomment-532016362 . Though this approach is more flexible given arbitrary scripts can be used to condition what breaks the docker cache layer:
# Add prior stage to cache/copy from
FROM rust AS cache
WORKDIR /tmp
# Copy from build context
COPY ./ ./ws
# Filter or glob files to cache upon
RUN mkdir ./cache && cd ./ws && \
find ./ -name "Cargo.*" -o \
-name "lib.rs" -o -name "main.rs" | \
xargs cp --parents -t ../cache && \
cd ../cache && \
find ./ -name "lib.rs" | \
xargs -I% sh -c "echo > %" && \
find ./ -name "main.rs" | \
xargs -I% sh -c "echo 'fn main() { }' > %"
# Continue with build stage
FROM rust as build
WORKDIR /ws
# copy workspace dependency metadata
COPY --from=cache /tmp/cache ./
# fetch and build dependencies
RUN cargo fetch && cargo build
# copy rest of the workspace code
COPY --from=cache /tmp/ws ./
# build code with cached dependencies
RUN cargo build
You may also want to explicitly ignore some files and folders using the .dockerignore, e.g. like to avoid copying your large local target folders into the rest of the docker daemon's build context.
The lack of this is incredibly painful for docker builds and it seems (on the surface) to be a relatively simple thing to add to cargo. As per https://github.com/rust-lang/cargo/issues/2644#issuecomment-537637178, copying Cargo.toml around doesn't work if using the workspaces feature. To add to that, it also doesn't work if the Cargo.toml has a [[bin]] specified, because then Cargo errors out when it can't find a file at bin.path.
Personally, I think that Docker should do a better job of enabling build cache tools. But I also believe that this can be fixed more quickly from Cargo's side by adding a --dependencies-only flag.
If I added this to cargo, what are the chances that the patch would be accepted? What are the considerations _against_ adding this flag?
wouldn't it be better to use sccache now in this case?
How do you imagine that sccache can be integrated into a docker build? You can't mount volumes or access the network during a docker build; the only option for getting data into the build is to copy the files (and there is no option for getting the data out of a build step, so no way to update sccache).
if you're using dockerkit (DOCKERKIT=1) you can try:
WORKDIR APP
RUN --mount=type=cache,id=cargo,target=/root/.cargo --mount=type=cache,id=target,target=/app/target cargo build --out-dir /app/build
I've not tried this yet but it should keep downloaded and compiled packages from being downloaded / compiled again
if you're using dockerkit (DOCKERKIT=1) you can try:
WORKDIR APP
RUN --mount=type=cache,id=cargo,target=/root/.cargo --mount=type=cache,id=target,target=/app/target cargo build --out-dir /app/build
I've not tried this yet but it should keep downloaded and compiled packages from being downloaded / compiled again
Thanks, mathroc! I assume you mean DOCKER_BUILDKIT=1? I do think this is now the best way to solve the issue, but it took me a loooong time to go from concept to reality on this.
I learned about DOCKER_BUILDKIT soon after posting my previous comment, and I've pretty much been banging my head against my desk trying to get it to work for the last two days, at least 16 hours of continual banging.
But now, it does work!
The caching incantation is:
RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/cargo \
--mount=type=cache,target=target \
cargo build
I encountered multiple issues before it actually worked.
First, the rust image sets CARGO_HOME to /usr/local/cargo, so of course mounting /root/.cargo does nothing in that base image.
Second, --mount=type=cache,target=${CARGO_HOME} produces a fresh volume on every build, so you have to manually expand that to /usr/local/cargo. This seems to be due to immaturity in the buildkit backend.
Third, you can't cache just the parts of $CARGO_HOME suggested here, or else you'll get a fingerprinting issue and the target/ artifacts will rebuild even if you've exposed the previous artifacts through the second cache mount.
Oof, that took a lot of time to figure out. But it's so, so worth it.
I just skimmed this thread and didn't see https://github.com/denzp/cargo-wharf mentioned. That seems to be the correct approach for improved caching with docker.
wouldn't it be better to use sccache now in this case?
How do you imagine that sccache can be integrated into a docker build? You can't mount volumes or access the network during a docker build; the only option for getting data into the build is to copy the files (and there is no option for getting the data _out_ of a build step, so no way to update sccache).
Of course you can access the network in a Docker build. That's how packages and stuff are installed.
Here's a full (!), working example using sccache. For multiple binaries/libraries, you'd have to create multiple dummy lib.rs/main.rs files.
Dockerfile:
FROM rust:1.37.0-buster as builder
WORKDIR /usr/src
RUN mkdir server
WORKDIR /usr/src/server
# Build dependencies as separately cached Docker step
# (not built into cargo: https://github.com/rust-lang/cargo/issues/2644)
RUN apt-get update \
&& apt-get install -y --no-install-recommends --no-install-suggests \
cmake=3.13.4-1 $(: "Needed for grpcio-sys build") \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
ARG USE_SCCACHE
ARG SCCACHE_VERSION
ENV SCCACHE_VERSION=${SCCACHE_VERSION:-0.2.12}
RUN \
if [ "${USE_SCCACHE}" = 1 ]; then \
set -x ;curl -fsSLo /tmp/sccache.tgz "https://github.com/mozilla/sccache/releases/download/${SCCACHE_VERSION}/sccache-${SCCACHE_VERSION}-x86_64-unknown-linux-musl.tar.gz" \
&& mkdir /tmp/sccache \
&& tar -xzf /tmp/sccache.tgz -C /tmp/sccache --strip-components 1; \
fi
COPY server/Cargo.lock server/Cargo.toml ./
RUN sed -i'' 's/build = "build.rs"/## no build.rs yet/' Cargo.toml
RUN mkdir src && echo 'fn main(){println!("Dummy");}' >src/main.rs
ARG SCCACHE_REDIS
RUN if [ "${USE_SCCACHE}" = 1 ]; then export RUSTC_WRAPPER=/tmp/sccache/sccache; fi \
&& cargo build --release
# keep dummy main.rs for later
RUN rm Cargo.toml
# Application dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends --no-install-suggests \
cmake=3.13.4-1 \
protobuf-compiler \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Build build.rs dependencies (such as protoc-grpcio) and generate code from protos
COPY server/build.rs server/Cargo.toml ./
RUN mkdir ../proto
COPY proto/ ../proto/
RUN mkdir src/grpc
RUN if [ "${USE_SCCACHE}" = 1 ]; then export RUSTC_WRAPPER=/tmp/sccache/sccache; fi \
&& cargo build --release
RUN rm src/main.rs
COPY server/ ./
# Enforce re-building the source file
RUN touch src/main.rs
RUN if [ "${USE_SCCACHE}" = 1 ]; then export RUSTC_WRAPPER=/tmp/sccache/sccache; fi \
&& cargo build --release
FROM debian:buster-20190812
ENV DEBIAN_FRONTEND=noninteractive
# hadolint ignore=DL3008,SC2006,SC2046
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ca-certificates \
libssl-dev \
\
`: "Install debugging tools"` \
curl \
net-tools \
vim-tiny \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
COPY --from=builder /usr/src/server/target/release/supercoolapp /supercoolapp
CMD ["/supercoolapp"]
And in the appropriate places in my bashrc:
# Start Redis LRU cache (make this run only once, e.g. through an alias!)
pkill redis-server && sleep 5; redis-server --protected-mode no --dir ~/ --dbfilename .sccache-redis-dump.rdb --save 900 1 --maxmemory 200mb --maxmemory-policy allkeys-lru
# Set sccache Redis host to my laptop's IP which is reachable via local Docker build
SCCACHE_REDIS="redis://$(ifconfig en0 | grep "inet " | awk '{print $2}')"
export SCCACHE_REDIS
This should only differ by the IP/host of the Redis server.
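For completeness, a build of the image above might then be invoked roughly like this (the tag is arbitrary; USE_SCCACHE and SCCACHE_REDIS are the build args declared in the Dockerfile):
docker build \
  --build-arg USE_SCCACHE=1 \
  --build-arg SCCACHE_REDIS="$SCCACHE_REDIS" \
  -t supercoolapp .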
Thanks, mathroc! I assume you mean DOCKER_BUILDKIT=1? I do think this is now the best way to solve the issue, but it took me a loooong time to go from concept to reality on this.
@masonk, yes, that’s the one :+1:
thx for doing the headbanging and for sharing the results !
Second, --mount=type=cache,target=${CARGO_HOME} produces a fresh volume on every build, so you have to manually expand that to /usr/local/cargo. This seems like just due to immaturity in the buildkit backend.
adding id=something to the --mount might solve that. I'm really not sure; again, I've not tried it yet
In case someone is wondering how to make the cache work with DOCKER_BUILDKIT, see this example: https://stackoverflow.com/a/55153182/2302437
Specifically:
# syntax=docker/dockerfile:experimental
RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/cargo \
    --mount=type=cache,target=target \
    cargo build --release
Just a note: I spent over an hour debugging what turned out to be a very stupid little thing in the end: needing to do touch src/main.rs to ensure a fresh build, instead of the "placeholder main.rs" getting built, as mentioned earlier in this thread.
I think something should be done for this issue.
I think it's even questionable that Cargo apparently checks only whether the file is newer than the old one. It should just check whether it has a different timestamp, and even do more involved fingerprinting (a quick checksum) for the main file.
I am trying to get a minimal version of this to work in accordance with @benmarten's comment.
This is my project:
❯ tree project
project
├── Cargo.toml
└── src
└── lib.rs
This is my Cargo.toml:
[package]
name = "project"
version = "0.1.0"
authors = ["Ethan Brooks <[email protected]>"]
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
rand = "0.7.3"
This is my Dockerfile:
# syntax=docker/dockerfile:experimental
FROM rustlang/rust:nightly
COPY project/ /root/project/
RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/cargo
\ --mount=type=cache,target=target \ cargo build --manifest-path /root/project/Cargo.toml
When I run I get
❯ DOCKER_BUILDKIT=1 docker build -t project .
[+] Building 5.1s (4/4) FINISHED
=> [internal] load .dockerignore 0.0s
=> => transferring context: 34B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 312B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 4.8s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:787107d7f7953cb2d95e 0.0s
failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: rpc error: code = Unknown desc = Dockerfile parse error line 6: unknown instruction: \
@ethanabrooks your Dockerfile has a syntax error, and it's mentioned at the very end of the last line of output. Your line begins with a backslash \. It looks like that backslash should actually be at the end of the previous line.
Right. Thanks for the correction. The working version of the Dockerfile is
# syntax=docker/dockerfile:experimental
FROM rustlang/rust:nightly
COPY project/ /root/project/
RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/cargo \
--mount=type=cache,target=target \
cargo build --manifest-path /root/project/Cargo.toml
The problem that I am having is that if I make a change to src/lib.rs my Dockerfile still rebuilds the rand dependency. Or at least it appears to:
❯ DOCKER_BUILDKIT=1 docker build -t project .
[+] Building 5.1s (9/10)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 34B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 0.7s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:787107d7f7953cb2d95e 0.0s
=> [internal] load metadata for docker.io/rustlang/rust:nightly 0.0s
=> CACHED [stage-0 1/3] FROM docker.io/rustlang/rust:nightly 0.0s
=> CACHED FROM docker.io/library/rust:latest 0.0s
=> => resolve docker.io/library/rust:latest 0.3s
=> [internal] load build context 0.0s
=> => transferring context: 321B 0.0s
=> [stage-0 2/3] COPY project/ /root/project/ 0.0s
=> [stage-0 3/3] RUN --mount=type=cache,target=/usr/local/cargo,from=rust,source=/usr/local/c 4.1s
=> => # Compiling getrandom v0.1.14
=> => # Compiling cfg-if v0.1.10
=> => # Compiling ppv-lite86 v0.2.6
=> => # Compiling rand_core v0.5.1
=> => # Compiling rand_chacha v0.2.2
Hi, is there any progress on this issue? @alexcrichton @ehuss We should focus on permanent solutions and not the workarounds presented in this thread. This issue is a huge blocker and, as it's almost 4 years old, it sends a negative message to anyone wanting to start with Rust these days. Is there any plan / roadmap? Can I help somehow? Is there anything blocking this issue from being resolved?
@elmariofredo It's not clear to me what you are asking for. It seems like this issue is mostly focused on "how to pre-cache in Docker", and AFAIK there isn't a single clear answer to that. Are you trying to cache with Docker? Does Docker Buildkit not work for you?
You can help by creating a project that solves the problem. It looks like several people have done that, but perhaps they don't fit your needs? It seems unlikely there will be a one-size-fits-all solution, but if there is general guidance, some documentation on how to properly use Docker sounds useful.
AFAICT, the original request of a --dependencies-only flag wouldn't help with Docker. If it would, it would help to write a clear design document of how that would work.
@ehuss I don't agree with this. A --dependencies-only would solve 99% of all docker caching issues.
This would be the minimal possible Dockerfile, if we had some way of expressing it.
FROM rust:latest
COPY Cargo.toml Cargo.lock /mything
RUN cargo build-deps --release # creates a layer that is cached
COPY src /mything/src
RUN cargo build --release # only rebuild this when src files changes
CMD ./target/release/mything
And it would work great with docker layer caching since on src changes, only the last RUN would have to run.
AFAICT, the original request of a --dependencies-only flag wouldn't help with Docker. If it would, it would help to write a clear design document of how that would work.
It would help with docker's layer caching
It's also relevant for pre-caching for packaging. For instance, to package a .deb, we need to prepare a complete source tree to package and ship it as "source package", and do a dedicated build step to compose a "binary package". FYI, debian is currently packaging things differently, with custom logic to prepare the source packages, but, in general, it'd be extremely useful to be able to pre-download dependencies. Docker packaging is not the only use case.
To answer the caching question:
Right now this is common:
Step 5/14 : COPY . .
---> b5f0526a3fdd
Step 6/14 : RUN cargo build --release
---> Running in 8a7eef036c01
Updating crates.io index
Downloading crates ...
Downloaded serde_json v1.0.48
Downloaded log v0.4.8
Downloaded rand v0.7.3
Downloaded failure v0.1.7
Downloaded serde v1.0.104
Downloaded reqwest v0.10.4
Downloaded lazy_static v1.4.0
[... a long time later ...]
Finished release [optimized] target(s) in 31m 37s
Removing intermediate container 8a7eef036c01
---> 4d6f1f8935fb
Now if we touch _any_ file and ask docker to build the container, cargo is going to 1) re-download the whole index and 2) recompile all dependencies:
Step 5/14 : COPY . .
---> 6427ed267359
Step 6/14 : RUN cargo build --release
---> Running in 8d9ca24ca7
Updating crates.io index
Downloading crates ...
Downloaded serde_json v1.0.48
Downloaded log v0.4.8
Downloaded rand v0.7.3
Downloaded failure v0.1.7
Downloaded serde v1.0.104
Downloaded reqwest v0.10.4
Downloaded lazy_static v1.4.0
[... a long time later ...]
Finished release [optimized] target(s) in 31m 37s
Removing intermediate container 8d9ca24ca7
---> 227d4df84225
Instead I'd like to do something like this:
Step 5/14 : COPY Cargo.toml Cargo.lock .
---> 6427ed267359
Step 6/14 : RUN cargo build --release --dependencies-only
---> Running in 8d9ca24ca7
Updating crates.io index
Downloading crates ...
Downloaded serde_json v1.0.48
Downloaded log v0.4.8
Downloaded rand v0.7.3
Downloaded failure v0.1.7
Downloaded serde v1.0.104
Downloaded reqwest v0.10.4
Downloaded lazy_static v1.4.0
[... a long time later ...]
Finished release [optimized] target(s) in 31m 37s
Removing intermediate container 8d9ca24ca7
---> 227d4df84225
Step 7/14 : COPY . .
---> 6427ed267359
Step 8/14 : RUN cargo build --release
---> Running in 8d9ca24ca7
Finished release [optimized] target(s) in 3m 37s
Removing intermediate container 8d9ca24ca7
---> 227d4df84225
And after a minor edit to src/lib.rs building the container should look like this:
Step 5/14 : COPY Cargo.toml Cargo.lock .
---> Using cache
---> 6427ed267359
Step 6/14 : RUN cargo build --release --dependencies-only
---> Using cache
---> 227d4df84225
Step 7/14 : COPY . .
---> 6427ed267359
Step 8/14 : RUN cargo build --release
---> Running in 8d9ca24ca7
Finished release [optimized] target(s) in 3m 37s
Removing intermediate container 8d9ca24ca7
---> 227d4df84225
Most of the dockerfiles posted to achieve this don't look very maintainable and pulling 3rd party binaries that implement this is not possible due to security policies in a lot of companies. This should "just work" with a stock rust docker image.
@ehuss I have tried to create a Pre-RFC here https://internals.rust-lang.org/t/pre-rfc-allow-cargo-build-dependencies-only/11987. Please let me know if you have any suggestions on how to improve it; it's my first Pre-RFC 😉
Thanks for the answers from @lolgesten, @NilsIrl, @MOZGIII and @kpcyrd. Given the speed and volume of responses, I see there is a real need for this to be solved. Please feel free to post on the Pre-RFC thread to support it or add more information. Thanks!
It seems like with that design, any change to Cargo.toml or Cargo.lock would completely invalidate the cache. Is that not a problem for you? Most projects I've been involved with, those files change fairly regularly. How would it work with a workspace? How would it handle path dependencies?
@ehuss personally I would have no problem with Cargo.toml/lock changes causing a cache invalidation. That is perfectly symmetric with how changes to package.json/lock affect nodejs builds.
@ehuss
those files change fairly regularly.
How often are we talking about here? I mean caching mainly for local development, with a possible extension to CI (but let's put that aside for now), so during the development process I don't expect to change dependencies on every source code change, and that is my main goal. I understand that at the very beginning of every project devs add dependencies like crazy, but at that point there are also fewer dependencies and thus build time is shorter. As a project matures, changing/adding dependencies is needed less often.
How would it work with a workspace?
I would expect that --dependencies-only would install all dependencies, no matter whether it's a workspace or a single crate. Cache invalidation would then work for all crates. Can you please elaborate on how a workspace would change this?
How would it handle path dependencies?
Good question. --dependencies-only should also download dependencies for path crates. For that, these dependencies need to be added using COPY prior to this step, and it's up to the developer whether to add the whole directory or to add Cargo.toml and Cargo.lock separately to leverage caching; otherwise dependencies will get installed each time the developer changes anything in the path crate's source code.
How often we are talk about here?
I'm thinking once or twice a week.
Would bind or volume mounts work for sharing the artifacts between runs?
I would expect that --dependencies-only would install all dependencies, no matter whether it's a workspace or a single crate.
I mean, for a workspace, you would need to copy in all of the manifests into Docker to get the correct settings (features and such), replicating the same filesystem layout. Which dependencies are built in a workspace? Workspaces often have path dependencies between crates, so those probably wouldn't work. What about local patches? Those also would cause problems.
If it already needs to replicate the manifests, it is a small step to also create skeleton files for source files (empty lib.rs, main.rs, etc.). With those, I don't think there is a need for only building "dependencies".
Good question --dependencies-only should also download dependencies for path crates.
Many workspaces use path dependencies to split a package into multiple crates. I don't think it is viable to say they need to be copied entirely.
@ehuss
How often we are talk about here?
I'm thinking once or twice a week.
Would bind or volume mounts work for sharing the artifacts between runs?
Yes but there are two issues with that:
I would expect that --dependencies-only would install all dependencies no matter if it's workspace or single crate.
I mean, for a workspace, you would need to copy in all of the manifests into Docker to get the correct settings (features and such), replicating the same filesystem layout. Which dependencies are built in a workspace?
I have never used the workspace feature, so I would need some real examples to fully understand what you want to point out. From what I understand, a workspace is a bundle of multiple crates, most probably dependent on each other. You would need to COPY the master Cargo.toml, Cargo.lock and the Cargo.toml of each crate. You are right that this makes the setup a bit complicated, but it's up to each developer to decide whether it's worth adding additional COPY statements to enable caching.
Workspaces often have path dependencies between crates, so those probably wouldn't work.
Why wouldn't they work? You will end up with the whole workspace moved into the docker build, so you will have the same file structure as on your local env, and addressing a local dependency within the workspace should work.
What about local patches? Those also would cause problems.
What do you mean? Changing the src of installed dependencies? I guess you should fetch the dependency into the workspace and treat it as any other local dependency, addressing it using path. Then it's just about adding this as another crate. Also, you can just add a COPY statement to override this dependency from the docker build context.
If it already needs to replicate the manifests, it is a small step to also create skeleton files for source files (empty lib.rs, main.rs, etc.). With those, I don't think there is a need for only building "dependencies".
Why would you create lib.rs, main.rs when the only thing you need is to install dependencies? Yes, I'm aware of the workaround of creating skeleton files and building a dummy application to download dependencies. I don't believe this is the final solution; it should be treated only as a workaround, as it makes the whole Dockerfile convoluted.
Good question --dependencies-only should also download dependencies for path crates.
Many workspaces use path dependencies to split a package into multiple crates. I don't think it is viable to say they need to be copied entirely.
Of course it depends case by case, and then it's up to developers to create a Dockerfile which fits their needs, but you will still need to install all dependencies. Do you think that would cause any issue?
Most of your questions are around workspaces and path dependencies; I would suggest starting to build a list of cases with real examples so we can have a better discussion about the final implementation. WDYT?
Why they won't work? You will end up with whole workspace moved into docker build
It won't work in the sense that you are now copying the source files, which defeats the purpose of avoiding cache invalidation when the source changes.
What do you mean? Changing src of installed dependencies?
[patch] tables. I think a project that uses a patch with a path dependency would require copying the entirety of the path dependency (and any of its path dependencies) in order to build properly. That might be fundamentally unavoidable using this approach for Docker caching.
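To make that concrete, the kind of manifest entry being discussed would look roughly like this (the crate name and path are hypothetical):
# patch a registry dependency with a local checkout;
# building it requires the patched sources to actually be present
[patch.crates-io]
some-crate = { path = "patched/some-crate" }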
Why would you create lib.rs, main.rs when only thing you need is install dependencies?
Because it is a simpler solution. If there is a tool (and I think there are a few listed above), which requires one or two lines to prepare a skeleton project, that seems easier to use. Compared to requiring the user to replicate their manifest hierarchy, and making fundamental changes to Cargo to support the requested behavior.
building list of cases with real examples
Use cases and examples would be great! One example would be Cargo itself which has 5 different internal path dependencies. Ideally, one could build a docker cache layer with minimal work, without copying any of its source files.
Would bind or volume mounts work for sharing the artifacts between runs?
docker build doesn't support bind or volume mounts.
Why they won't work? You will end up with whole workspace moved into docker build
It won't work in the sense that you are now copying the source files, which defeats the purpose of avoiding cache invalidation when the source changes.
cargo build --dependencies-only could simply build every crate that is not part of the current workspace. Doing that would enable the example I mentioned above, as cargo would only need to look at the Cargo.toml/Cargo.lock files.
What do you mean? Changing src of installed dependencies?
[patch] tables. I think a project that uses a patch with a path dependency would require copying the entirety of the path dependency (and any of its path dependencies) in order to build properly. That might be fundamentally unavoidable using this approach for Docker caching.
I see two options here, and I think that use case is so rare that we shouldn't let it block the other 99% that aren't using this feature:
- --dependencies-only skips everything depending on [patch] entries
- --dependencies-only builds the [patch] entries, which means you need to make sure they are in the image at that point
In both cases we assume that the [patch] entry is a local path, because otherwise there's no issue at all.
Why would you create lib.rs, main.rs when only thing you need is install dependencies?
Because it is a simpler solution. If there is a tool (and I think there are a few listed above), which requires one or two lines to prepare a skeleton project, that seems easier to use. Compared to requiring the user to replicate their manifest hierarchy, and making fundamental changes to Cargo to support the requested behavior.
I've already addressed this. The policy about what goes into the official docker rust image is really strict, and unofficial 3rd party base images are simply not an option in a lot of cases.
Creating the skeleton manually causes weird behavior in cargo. I'm going to post the current status of a project I'm working on. This is a larger, workspace-based project; the Dockerfile for the average project would look simpler. Consider this Dockerfile:
FROM rust:alpine3.11
ENV RUSTFLAGS="-C target-feature=-crt-static"
WORKDIR /usr/src/foo
RUN apk add --no-cache musl-dev openssl-dev postgresql-dev
# build dependencies only
RUN mkdir -p api/src common/src smtp/src
RUN touch api/src/lib.rs common/src/lib.rs smtp/src/lib.rs
COPY Cargo.toml Cargo.lock ./
COPY api/Cargo.toml api/
COPY common/Cargo.toml common/
COPY smtp/Cargo.toml smtp/
RUN cargo build --release
# build project
COPY . .
# (work around cargo+docker bug)
RUN touch api/src/lib.rs common/src/lib.rs smtp/src/lib.rs
RUN cd api; cargo build --release
First, the explicit second touch is there because otherwise cargo doesn't invalidate its internal cache for the src/lib.rs files correctly and you get very weird errors.
Second, running this produces the following output:
Step 1/21 : FROM rust:alpine3.11
---> 7e07363cacaf
Step 2/21 : ENV RUSTFLAGS="-C target-feature=-crt-static"
---> Using cache
---> 0ebefc6c3b2b
Step 3/21 : WORKDIR /usr/src/foo
---> Using cache
---> 53cbf7350018
Step 4/21 : RUN apk add --no-cache musl-dev openssl-dev postgresql-dev
---> Using cache
---> 5c5b8c186b33
Step 5/21 : RUN mkdir -p api/src common/src smtp/src
---> Using cache
---> 5a960c3e983a
Step 6/21 : RUN touch api/src/lib.rs common/src/lib.rs smtp/src/lib.rs
---> Using cache
---> 00d4b08900d4
Step 7/21 : COPY Cargo.toml Cargo.lock ./
---> e5d9c1e5e9ed
Step 8/21 : COPY api/Cargo.toml api/
---> 1324450afa1f
Step 9/21 : COPY common/Cargo.toml common/
---> fbacb54bd919
Step 10/21 : COPY smtp/Cargo.toml smtp/
---> bec0682f51be
Step 11/21 : RUN cargo build --release
---> Running in 399c11d14afa
Updating crates.io index
Downloading crates ...
Downloaded env_logger v0.7.1
Downloaded rand v0.7.3
Downloaded dotenv v0.15.0
Downloaded actix-rt v1.0.0
Downloaded failure v0.1.7
Downloaded log v0.4.8
Downloaded diesel v1.4.3
Downloaded diesel_migrations v1.4.0
Downloaded actix-files v0.2.1
Downloaded chrono v0.4.11
Downloaded reqwest v0.10.4
Downloaded lazy_static v1.4.0
Downloaded actix-web v2.0.0
Downloaded url v2.1.1
Downloaded publicsuffix v1.5.4
Downloaded handlebars v2.0.4
Downloaded r2d2 v0.8.8
Downloaded serde_json v1.0.48
Downloaded serde v1.0.104
Downloaded tokio v0.2.13
Downloaded bytes v0.5.4
Downloaded percent-encoding v2.1.0
Downloaded actix-router v0.2.4
Downloaded v_htmlescape v0.4.5
Downloaded actix-macros v0.1.1
Downloaded bitflags v1.2.1
Downloaded mime_guess v2.0.3
Downloaded derive_more v0.99.3
Downloaded itoa v0.4.5
Downloaded net2 v0.2.33
Downloaded actix-utils v1.0.6
Downloaded serde_derive v1.0.104
Downloaded copyless v0.1.4
Downloaded actix-testing v1.0.0
Downloaded futures v0.3.4
Downloaded mime v0.3.16
Downloaded regex v1.3.5
Downloaded migrations_internals v1.4.0
Downloaded rand_chacha v0.2.2
Downloaded backtrace v0.3.45
Downloaded rand_pcg v0.2.1
Downloaded actix-web-codegen v0.2.1
Downloaded humantime v1.3.0
Downloaded failure_derive v0.1.7
Downloaded http-body v0.3.1
Downloaded quick-error v1.2.3
Downloaded native-tls v0.2.4
Downloaded hyper-tls v0.4.1
Downloaded hyper v0.13.3
Downloaded getrandom v0.1.14
Downloaded libc v0.2.67
Downloaded actix-server v1.0.2
Downloaded mailparse v0.12.0
Downloaded cfg-if v0.1.10
Downloaded base64 v0.11.0
Downloaded pin-project-lite v0.1.4
Downloaded time v0.1.42
Downloaded fxhash v0.2.1
Downloaded awc v1.0.1
Downloaded tokio-tls v0.3.0
Downloaded diesel_derives v1.4.1
Downloaded byteorder v1.3.4
Downloaded matches v0.1.8
Downloaded idna v0.2.0
Downloaded walkdir v2.3.1
Downloaded pest_derive v2.1.0
Downloaded pq-sys v0.4.6
Downloaded actix-tls v1.0.0
Downloaded parking_lot v0.10.0
Downloaded num-traits v0.2.11
Downloaded num-integer v0.1.42
Downloaded actix-threadpool v0.3.1
Downloaded actix-codec v0.2.0
Downloaded hashbrown v0.5.0
Downloaded termcolor v1.1.0
Downloaded rand_core v0.5.1
Downloaded atty v0.2.14
Downloaded actix-service v1.0.5
Downloaded pin-project v0.4.8
Downloaded kuchiki v0.8.0
Downloaded fake v2.2.0
Downloaded pest v2.1.3
Downloaded futures-core v0.3.4
Downloaded error-chain v0.12.2
Downloaded scheduled-thread-pool v0.2.3
Downloaded ryu v1.0.3
Downloaded serde_urlencoded v0.6.1
Downloaded actix-http v1.0.1
Downloaded encoding_rs v0.8.22
Downloaded http v0.2.0
Downloaded futures-util v0.3.4
Downloaded migrations_macros v1.4.1
Downloaded actix-connect v1.0.2
Downloaded language-tags v0.2.2
Downloaded futures-channel v0.3.4
Downloaded rustc-demangle v0.1.16
Downloaded brotli2 v0.3.2
Downloaded futures-executor v0.3.4
Downloaded flate2 v1.0.13
Downloaded dtoa v0.4.5
Downloaded mio-uds v0.6.7
Downloaded mio v0.6.21
Downloaded syn v1.0.16
Downloaded either v1.5.3
Downloaded ucd-trie v0.1.3
Downloaded bytestring v0.1.4
Downloaded autocfg v1.0.0
Downloaded pin-project-internal v0.4.8
Downloaded futures-io v0.3.4
Downloaded threadpool v1.7.1
Downloaded aho-corasick v0.7.10
Downloaded same-file v1.0.6
Downloaded want v0.3.0
Downloaded cssparser v0.27.2
Downloaded openssl-probe v0.1.2
Downloaded tower-service v0.3.0
Downloaded base64 v0.10.1
Downloaded openssl-sys v0.9.54
Downloaded thread_local v1.0.1
Downloaded memchr v2.3.3
Downloaded unicase v2.6.0
Downloaded unicode-bidi v0.3.4
Downloaded unicode-normalization v0.1.12
Downloaded h2 v0.2.2
Downloaded openssl v0.10.28
Downloaded charset v0.1.2
Downloaded pest_generator v2.1.3
Downloaded quoted_printable v0.4.2
Downloaded parking_lot_core v0.7.0
Downloaded v_escape v0.7.4
Downloaded ppv-lite86 v0.2.6
Downloaded futures-sink v0.3.4
Downloaded regex-syntax v0.6.17
Downloaded tokio-util v0.2.0
Downloaded httparse v1.3.4
Downloaded signal-hook-registry v1.2.0
Downloaded fnv v1.0.6
Downloaded slab v0.4.2
Downloaded indexmap v1.3.2
Downloaded selectors v0.22.0
Downloaded lock_api v0.3.3
Downloaded version_check v0.9.1
Downloaded num_cpus v1.12.0
Downloaded synstructure v0.12.3
Downloaded sha1 v0.6.0
Downloaded proc-macro2 v1.0.9
Downloaded tokio-macros v0.2.5
Downloaded iovec v0.1.4
Downloaded futures-task v0.3.4
Downloaded quote v1.0.3
Downloaded backtrace-sys v0.1.34
Downloaded smallvec v1.2.0
Downloaded miniz_oxide v0.3.6
Downloaded crc32fast v1.2.0
Downloaded pin-utils v0.1.0-alpha.4
Downloaded try-lock v0.2.2
Downloaded cssparser-macros v0.6.0
Downloaded unicode-xid v0.2.0
Downloaded trust-dns-proto v0.18.0-alpha.2
Downloaded pest_meta v2.1.3
Downloaded brotli-sys v0.3.2
Downloaded foreign-types v0.3.2
Downloaded phf_codegen v0.8.0
Downloaded servo_arc v0.1.1
Downloaded arc-swap v0.4.4
Downloaded dtoa-short v0.3.2
Downloaded thin-slice v0.1.1
Downloaded trust-dns-resolver v0.18.0-alpha.2
Downloaded proc-macro-hack v0.5.11
Downloaded scopeguard v1.1.0
Downloaded proc-macro-nested v0.1.3
Downloaded futures-macro v0.3.4
Downloaded cc v1.0.50
Downloaded pkg-config v0.3.17
Downloaded v_escape_derive v0.5.6
Downloaded html5ever v0.25.1
Downloaded async-trait v0.1.24
Downloaded lru-cache v0.1.2
Downloaded nodrop v0.1.14
Downloaded socket2 v0.3.11
Downloaded enum-as-inner v0.3.2
Downloaded foreign-types-shared v0.1.1
Downloaded phf_generator v0.8.0
Downloaded stable_deref_trait v1.1.1
Downloaded phf v0.8.0
Downloaded phf_shared v0.8.0
Downloaded adler32 v1.0.4
Downloaded resolv-conf v0.6.3
Downloaded maplit v1.0.2
Downloaded precomputed-hash v0.1.1
Downloaded heck v0.3.1
Downloaded mac v0.1.1
Downloaded markup5ever v0.10.0
Downloaded nom v4.2.3
Downloaded linked-hash-map v0.5.2
Downloaded phf_macros v0.8.0
Downloaded hostname v0.3.1
Downloaded string_cache_codegen v0.5.1
Downloaded tendril v0.4.1
Downloaded string_cache v0.8.0
Downloaded unicode-segmentation v1.6.0
Downloaded siphasher v0.3.1
Downloaded version_check v0.1.5
Downloaded futf v0.1.4
Downloaded match_cfg v0.1.0
Downloaded utf-8 v0.7.5
Downloaded new_debug_unreachable v1.0.4
Compiling proc-macro2 v1.0.9
Compiling unicode-xid v0.2.0
Compiling libc v0.2.67
Compiling syn v1.0.16
Compiling cfg-if v0.1.10
Compiling log v0.4.8
Compiling memchr v2.3.3
Compiling lazy_static v1.4.0
Compiling smallvec v1.2.0
Compiling autocfg v1.0.0
Compiling cc v1.0.50
Compiling getrandom v0.1.14
Compiling futures-core v0.3.4
Compiling slab v0.4.2
Compiling bytes v0.5.4
Compiling futures-sink v0.3.4
Compiling serde v1.0.104
Compiling ppv-lite86 v0.2.6
Compiling fnv v1.0.6
Compiling bitflags v1.2.1
Compiling itoa v0.4.5
Compiling proc-macro-nested v0.1.3
Compiling arc-swap v0.4.4
Compiling pin-project-lite v0.1.4
Compiling futures-task v0.3.4
Compiling pin-utils v0.1.0-alpha.4
Compiling futures-io v0.3.4
Compiling scopeguard v1.1.0
Compiling matches v0.1.8
Compiling byteorder v1.3.4
Compiling siphasher v0.3.1
Compiling ryu v1.0.3
Compiling version_check v0.9.1
Compiling percent-encoding v2.1.0
Compiling failure_derive v0.1.7
Compiling rustc-demangle v0.1.16
Compiling dtoa v0.4.5
Compiling pkg-config v0.3.17
Compiling quick-error v1.2.3
Compiling encoding_rs v0.8.22
Compiling httparse v1.3.4
Compiling copyless v0.1.4
Compiling unicode-segmentation v1.6.0
Compiling either v1.5.3
Compiling regex-syntax v0.6.17
Compiling match_cfg v0.1.0
Compiling crc32fast v1.2.0
Compiling foreign-types-shared v0.1.1
Compiling version_check v0.1.5
Compiling linked-hash-map v0.5.2
Compiling mime v0.3.16
Compiling openssl v0.10.28
Compiling pq-sys v0.4.6
Compiling base64 v0.11.0
Compiling new_debug_unreachable v1.0.4
Compiling adler32 v1.0.4
Compiling native-tls v0.2.4
Compiling openssl-probe v0.1.2
Compiling ucd-trie v0.1.3
Compiling try-lock v0.2.2
Compiling mac v0.1.1
Compiling precomputed-hash v0.1.1
Compiling language-tags v0.2.2
Compiling utf-8 v0.7.5
Compiling maplit v1.0.2
Compiling sha1 v0.6.0
Compiling tower-service v0.3.0
Compiling stable_deref_trait v1.1.1
Compiling nodrop v0.1.14
Compiling v_htmlescape v0.4.5
Compiling same-file v1.0.6
Compiling termcolor v1.1.0
Compiling thin-slice v0.1.1
Compiling quoted_printable v0.4.2
Compiling dotenv v0.15.0
Compiling thread_local v1.0.1
Compiling unicode-normalization v0.1.12
Compiling futures-channel v0.3.4
Compiling bytestring v0.1.4
Compiling num-traits v0.2.11
Compiling num-integer v0.1.42
Compiling indexmap v1.3.2
Compiling http v0.2.0
Compiling lock_api v0.3.3
Compiling unicode-bidi v0.3.4
Compiling phf_shared v0.8.0
Compiling dtoa-short v0.3.2
Compiling unicase v2.6.0
Compiling error-chain v0.12.2
Compiling humantime v1.3.0
Compiling heck v0.3.1
Compiling backtrace-sys v0.1.34
Compiling brotli-sys v0.3.2
Compiling foreign-types v0.3.2
Compiling openssl-sys v0.9.54
Compiling lru-cache v0.1.2
Compiling nom v4.2.3
Compiling miniz_oxide v0.3.6
Compiling pest v2.1.3
Compiling futf v0.1.4
Compiling servo_arc v0.1.1
Compiling walkdir v2.3.1
Compiling idna v0.2.0
Compiling http-body v0.3.1
Compiling pest_meta v2.1.3
Compiling tendril v0.4.1
Compiling want v0.3.0
Compiling aho-corasick v0.7.10
Compiling iovec v0.1.4
Compiling net2 v0.2.33
Compiling signal-hook-registry v1.2.0
Compiling parking_lot_core v0.7.0
Compiling time v0.1.42
Compiling num_cpus v1.12.0
Compiling hostname v0.3.1
Compiling socket2 v0.3.11
Compiling atty v0.2.14
Compiling quote v1.0.3
Compiling url v2.1.1
Compiling fxhash v0.2.1
Compiling base64 v0.10.1
Compiling flate2 v1.0.13
Compiling regex v1.3.5
Compiling rand_core v0.5.1
Compiling mio v0.6.21
Compiling parking_lot v0.10.0
Compiling threadpool v1.7.1
Compiling resolv-conf v0.6.3
Compiling mime_guess v2.0.3
Compiling backtrace v0.3.45
Compiling charset v0.1.2
Compiling rand_pcg v0.2.1
Compiling rand_chacha v0.2.2
Compiling env_logger v0.7.1
Compiling publicsuffix v1.5.4
Compiling mio-uds v0.6.7
Compiling scheduled-thread-pool v0.2.3
Compiling brotli2 v0.3.2
Compiling mailparse v0.12.0
Compiling rand v0.7.3
Compiling synstructure v0.12.3
Compiling pest_generator v2.1.3
Compiling r2d2 v0.8.8
Compiling phf_generator v0.8.0
Compiling fake v2.2.0
Compiling phf_codegen v0.8.0
Compiling string_cache_codegen v0.5.1
Compiling selectors v0.22.0
Compiling proc-macro-hack v0.5.11
Compiling serde_derive v1.0.104
Compiling tokio-macros v0.2.5
Compiling pin-project-internal v0.4.8
Compiling derive_more v0.99.3
Compiling actix-macros v0.1.1
Compiling enum-as-inner v0.3.2
Compiling async-trait v0.1.24
Compiling diesel_derives v1.4.1
Compiling cssparser v0.27.2
Compiling v_escape_derive v0.5.6
Compiling cssparser-macros v0.6.0
Compiling html5ever v0.25.1
Compiling actix-web-codegen v0.2.1
Compiling pest_derive v2.1.0
Compiling futures-macro v0.3.4
Compiling phf_macros v0.8.0
Compiling tokio v0.2.13
Compiling failure v0.1.7
Compiling pin-project v0.4.8
Compiling actix-threadpool v0.3.1
Compiling v_escape v0.7.4
Compiling phf v0.8.0
Compiling futures-util v0.3.4
Compiling tokio-util v0.2.0
Compiling tokio-tls v0.3.0
Compiling serde_json v1.0.48
Compiling chrono v0.4.11
Compiling serde_urlencoded v0.6.1
Compiling string_cache v0.8.0
Compiling actix-router v0.2.4
Compiling hashbrown v0.5.0
Compiling foo-common v0.1.0 (/usr/src/foo/common)
Compiling actix-codec v0.2.0
Compiling futures-executor v0.3.4
Compiling actix-service v1.0.5
Compiling h2 v0.2.2
Compiling diesel v1.4.3
Compiling handlebars v2.0.4
Compiling markup5ever v0.10.0
Compiling futures v0.3.4
Compiling hyper v0.13.3
Compiling actix-rt v1.0.0
Compiling trust-dns-proto v0.18.0-alpha.2
Compiling actix-utils v1.0.6
Compiling hyper-tls v0.4.1
Compiling trust-dns-resolver v0.18.0-alpha.2
Compiling migrations_internals v1.4.0
Compiling reqwest v0.10.4
Compiling actix-server v1.0.2
Compiling actix-tls v1.0.0
Compiling actix-connect v1.0.2
Compiling kuchiki v0.8.0
Compiling migrations_macros v1.4.1
Compiling foo-smtp v0.1.0 (/usr/src/foo/smtp)
Compiling actix-testing v1.0.0
Compiling actix-http v1.0.1
Compiling awc v1.0.1
Compiling actix-web v2.0.0
Compiling diesel_migrations v1.4.0
Compiling actix-files v0.2.1
Compiling foo-api v0.1.0 (/usr/src/foo/api)
Finished release [optimized] target(s) in 16m 17s
Removing intermediate container 399c11d14afa
---> 2e25de18eb0c
Step 12/21 : COPY . .
---> 015c6d057f23
Step 13/21 : RUN touch api/src/lib.rs common/src/lib.rs smtp/src/lib.rs
---> Running in 3629ecefd5d6
Removing intermediate container 3629ecefd5d6
---> c867a409822a
Step 14/21 : RUN cd smtp; cargo build --release
---> Running in 5d0d20669796
Compiling syn v1.0.16
Compiling futures-task v0.3.4
Compiling futures-channel v0.3.4
Compiling futures-util v0.3.4
Compiling synstructure v0.12.3
Compiling tokio-macros v0.2.5
Compiling pin-project-internal v0.4.8
Compiling serde_derive v1.0.104
Compiling failure_derive v0.1.7
Compiling failure v0.1.7
Compiling tokio v0.2.13
Compiling tokio-util v0.2.0
Compiling tokio-tls v0.3.0
Compiling h2 v0.2.2
Compiling pin-project v0.4.8
Compiling hyper v0.13.3
Compiling hyper-tls v0.4.1
Compiling serde v1.0.104
Compiling serde_urlencoded v0.6.1
Compiling foo-common v0.1.0 (/usr/src/foo/common)
Compiling reqwest v0.10.4
Compiling foo-smtp v0.1.0 (/usr/src/foo/smtp)
Finished release [optimized] target(s) in 4m 24s
Removing intermediate container 5d0d20669796
---> 2eaad0ba1a65
Step 15/21 : FROM alpine:3.11
---> e7d92cdc71fe
[snip]
Copying the source for some reason causes a rebuild for some of the dependencies.
I've worked to get the set of things done in various stages of our CI pipeline to be as mutually exclusive as possible (update downloads deps, build builds everything, then test runs the test suite). I wonder if the patterns there couldn't be applied here as well. Judicious use of the --frozen flag helps to detect and prevent overlapping work. The --dependencies-only flag may still be of use here though since it seems the goal many want here is to make an image suitable for development.
Our pipeline (using gitlab-ci, would need translation into other CI pipeline dialects): https://gitlab.kitware.com/utils/rust-git-checks/pipelines/164527 (ignore the clippy problems; they showed up just this week and I haven't gotten around to fixing them yet). Basically, I download two sets of dependencies and cache their downloads (one for minimum-versions, the other for a standard cargo update), send these to a set of various build configurations, then those build results go to various test configurations (one set ends up testing against git's master branch, so that one build gets tested twice).
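A rough sketch of that split in plain cargo commands, rather than actual pipeline config (illustrative only):
# "update" stage: resolve and download dependencies, nothing else
cargo fetch
# "build" stage: compile without touching the network or the lockfile
cargo build --frozen
# "test" stage: run the suite against the already-built artifacts
cargo test --frozen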
I'll also note that it is possible to mount things into docker build steps (at least via volumes). I do this with podman at least (it can create Docker images and I assume it is mimicking "real" Docker here) to inject sccache into the container to make container rebuilds much faster. (podman also lets me avoid the Docker-in-Docker madness.) It's not a Rust project that I have doing this, but you can see the rebuild script and the associated Docker context.
I wonder if just recommending sccache might not be a better workaround with the tools available today (and would only make --dependencies-only even better anyways).
cargo build --dependencies-only would save lots of time and computing power spent on rebuilding dependencies over and over in dockerized build environments.
Actually it is hard to believe that such a simple (I suppose it requires no more than one if somewhere in the code) and useful feature has not been implemented in the 4 years since the issue was opened.
I suppose it requires no more than one if somewhere in the code
I've tried to fix it in the past and it's definitely not just a simple if.
related, but it'd be nice to get the .d files to figure out what files are actually used during compilation, without having to build the whole crate.
related, but it'd be nice to get the .d files to figure out what files are actually used during compilation, without having to build the whole crate.
To actually be correct, you at least need build.rs for that. Though it'd be nice if #6031 were fixed first so users could provide more accurate information as well.
I feel like this is a separate issue from this one however.
@NilsIrl I think I have an idea how to solve it.
But first I realised that the --dependencies-only name does not really reflect what we want to achieve, because as I understand all the above comments, the idea is to be able to split the build into 2 stages:
1. build everything that comes from outside the project (remote packages),
2. build the project's own (local) packages.
So in case a project has the following dependencies:
[dependencies]
atty = "0.2"
cargo-platform = { path = "crates/cargo-platform", version = "0.1.1" }
instead of building _all_ dependencies, cargo should download and build only atty and all remote dependencies of the cargo-platform sub-project. The compilation of the cargo-platform dependency itself from sources should be deferred.
So it is not about root package vs. its dependencies but it is more about local vs remote packages.
Based on this understanding, I created initial implementation of the feature. It introduces a new option (which I named --only-remote), and it allows executing the 2 stages in the following way:
1. cargo build --only-remote -Z unstable-options
2. cargo build
and as you would expect, the 2nd execution of cargo is much faster, as it reuses all packages downloaded & built in the first stage, and it builds only the remaining projects, i.e. those which have sources in the filesystem.
The commands, executed in the root directory of the Cargo project itself, produce the following output:
$ cargo build --only-remote -Z unstable-options
...
Compiling semver v0.9.0
Compiling rustfix v0.5.0
Compiling git2-curl v0.14.0
Finished dev [unoptimized + debuginfo] target(s) in 2m 10s
$ cargo build
Compiling cargo-platform v0.1.1 (/private/tmp/cargo2/crates/cargo-platform)
Compiling crates-io v0.31.0 (/private/tmp/cargo2/crates/crates-io)
Compiling cargo v0.45.0 (/private/tmp/cargo2)
Finished dev [unoptimized + debuginfo] target(s) in 1m 47s
A Dockerfile could look like this:
# ------ first stage ------
FROM rust:buster AS dependencies
COPY Cargo.toml /app/
# fooling cargo into believing it has sources
RUN mkdir /app/src/ && touch /app/src/main.rs
WORKDIR /app
RUN cargo build --only-remote -Z unstable-options
# ------ second stage ------
# reusing binaries build in the previous stage
FROM dependencies AS app_builder
# copying all project sources
COPY . /app/
RUN cargo build
I added the option just as a regular package selector along with the existing ones: --workspace, --package, and --exclude
$ cargo build --help
cargo-build
Compile a local package and all of its dependencies
USAGE:
cargo build [OPTIONS]
OPTIONS:
-q, --quiet No output printed to stdout
-p, --package <SPEC>... Package to build (see `cargo help pkgid`)
--all Alias for --workspace (deprecated)
--workspace Build all packages in the workspace
--exclude <SPEC>... Exclude packages from the build
--only-remote Exclude local packages from the build (but include their remote dependencies)
(unstable)
and the option can be combined with all the above options (if combined, it results in logical AND). For instance
cargo build --only-remote -Z unstable-options -p tar
It can also be used with other cargo subcommands: bench, check, doc, test.
For now it is just an initial implementation, which resides on my fork (dependencies-only branch). Please consider it as a proposal, and let me know if this is what you expected, and if so I will make a PR out of my changes (i.e. which means adding tests and going through feature submitting procedure). I never contributed to this project, so I don't want to make this effort before I am sure it is needed.
PS. and I am sorry that my previous comment was a bit harsh.
Thank you for doing this!
As a nitpick, "newly-added" read to me as if the option had already been accepted into cargo mainline, and no further work on this issue was necessary. Though I certainly look forward to the possibility of that being the case.
@ohAitch agree. It could be confusing. Corrected.
I brought it to the internals forum's attention.
@lguminski Thanks for creating this! I've verified, based on your fork, this does exactly what I need.
I can now build a docker image which contains all compiled dependencies for my project (which takes tens of minutes to build and doesn't change very often). And use this image to build my final project (which now takes just seconds, and happens frequently).
Your change makes this way of working way more straight forward and removes some hacks in our current build process which achieve the same goal.
Is there a PR link? (looking forward to trialing this out in nightly)
Hi. I've implemented --deps-only in my branch in a350501695cac7d78c75f7417e9ff5eb504b4733 .
It's still somewhat rough around the edges, but it seems it works for me in various cases including workspaces and in combination with --tests.
I've just noticed there's another effort by @lguminski , so maybe I shouldn't have done that, but...
So it is not about root package vs. its dependencies but it is more about local vs remote packages.
@lguminski can you explain the reasoning for this a bit more? I'm not convinced we need to separate local vs remote packages. In a company-wide setting, we have a bunch of our own libraries which are used throughout our codebase and they are basically local, since we use one big repo for the majority of code. We probably don't want to rebuild all of them in every CI pass.
@technimad my pleasure. thanks for verifying.
@vojtechkral for small projects, where you have just a root project with sources and a bunch of remote dependencies, it does not make any difference whether you divide along the root project/dependencies line or the local/remote packages line. Both divisions produce the same result. So I understand why it may appear confusing.
But for bigger projects, the difference between the 2 concepts becomes apparent. Cargo itself is such a project. If you look into its Cargo.toml you will find 2 local dependencies
[dependencies]
cargo-platform = { path = "crates/cargo-platform", version = "0.1.1" }
crates-io = { path = "crates/crates-io", version = "0.31" }
if you stuck to the idea that you only want to exclude the root project, then any change in the crates/ subdirectory of your sources would result in re-downloading and re-building the entire dependency tree on the build server. And this is not what we want. We want behaviour like the following:
$ cargo build --only-remote -Z unstable-options
...
Compiling semver v0.9.0
Compiling rustfix v0.5.0
Compiling git2-curl v0.14.0
Finished dev [unoptimized + debuginfo] target(s) in 2m 10s
$ cargo build
Compiling cargo-platform v0.1.1 (/private/tmp/cargo2/crates/cargo-platform)
Compiling crates-io v0.31.0 (/private/tmp/cargo2/crates/crates-io)
Compiling cargo v0.45.0 (/private/tmp/cargo2)
Finished dev [unoptimized + debuginfo] target(s) in 1m 47s
and this is what the --only-remote option delivers.
The situation you described at your company suggests that what the company is doing is basically source-level integration across many teams. This type of configuration management proved to cause many problems in the past, and for this reason the industry moved away from this concept and instead started using local repositories/registries that act as a bridge between teams. If I am right, the lack of this feature is not your company's biggest problem. It needs a local registry, and then it will be able to consume this feature easily. This is the way to go.
@gilescope the PR is ready here
@lguminski
If you look into its Cargo.toml you will find 2 local dependencies
That's not a very typical approach. Typically, if you have a larger project, you split it into multiple crates via a workspace. And in that case, all members of the workspace are root packages.
OTOH if you specify a dependency via path, IMO it is expected that the dependency does not change very frequently compared to the root packages.
In one of our projects, for example, we have a workspace of 5 root crates, and then about ~15 local dependencies outside of the project referenced by path. With your approach, the CI will have to recompile those 15 dependencies each time.
I'm having a hard time seeing the benefit of this. We should have _all_ the dependencies pre-built and only have to rebuild the packages in the workspace. I think you're optimizing for a pretty uncommon case.
If I am right, the lack of this feature is not your company's biggest problem. It needs a local registry, and then it will be able to consume this feature easily. This is the way to go.
I don't very much like the idea of forcing everyone to have to have a local registry.
Please notice that your approach only works well if the company has a local registry, whereas the --deps-only approach, as originally suggested, works well in both cases.
The only case where your approach really works better is when someone splits their project into multiple crates but doesn't use a workspace, which is quite weird, because workspaces were introduced exactly for this reason.
@vojtechkral thanks for the explanation. I understand better your case now.
First of all this solution has been tested with workspaces. Please see integration tests supplied with the PR.
Then regarding:
In one of our projects, for example, we have a workspace of 5 root crates, and then about ~15 local dependencies outside of the project referenced by path. With your approach, the CI will have to recompile those 15 dependencies each time.
let me explain. The idea behind this feature is to split dependencies into 2 groups: those that change frequently and those that change infrequently.
From the perspective of cargo internals, there are 5 locations where a package can come from
enum SourceKind {
/// A git repository.
Git(GitReference),
/// A local path.
Path,
/// A remote registry.
Registry,
/// A local filesystem-based registry.
LocalRegistry,
/// A directory-based registry.
Directory,
}
The current implementation considers any dependency in a local path (the Path enum value) as changing frequently, and all the rest (Git, Registry, LocalRegistry, and Directory sources) as changing infrequently.
If the 15 dependencies do not change frequently, your project should refer to them via their Git locations, and then cargo with this PR will consider them as changing infrequently. And it will pre-compile them in the first stage (with --only-remote flag provided).
But what I would like to stress is that if we would go for --deps-only approach, then whenever anything changes within your 5 root crates, this would trigger re-downloading and re-compiling all dependencies. And this is against the intention expressed here. So --deps-only is NOT a solution to the problem of docker containers getting rebuilt frequently.
What I would suggest is doing an experiment. Please do the following:
cargo install --root /tmp --git https://github.com/lguminski/cargo --branch dependencies-only
and then see how /tmp/bin/cargo behaves when your 15 dependencies are expressed via git repositories:
[dependencies]
dependency1 = { git = "https://company/dependency" }
But what I would like to stress is that if we would go for the --deps-only approach, then whenever anything changes within your 5 root crates, this would trigger re-downloading and re-compiling all dependencies.
No, it wouldn't. A change in the 5 root crates in the workspace doesn't trigger re-compile of the dependencies with my patch. I've tested this.
Perhaps you meant that a change in one of the ~15 deps linked with path would trigger a full rebuild, which is true. However, they are not expected to change nearly as often compared to the root packages.
The approach with git paths would probably work, but I'm not sure we want to do that. _Edit:_ I'm not sure at this point what the implications are. I think there might be a problem where a change in a dependency would not be picked up by the dependant crate unless you run cargo update.
The ultimate solution would be to have two flags, one for all deps and on for remote-only deps, such as:
--deps
--deps-remote
that way you could create separate docker layers for remote and local dependencies.
But that might be a bit over the top.
Perhaps you meant that a change in one of the ~15 deps linked with path would trigger a full rebuild, which is true. However, they are not expected to change nearly as often compared to the root packages.
I guess it really depends on how you work; I can see people updating a local crate quite often
again, cargo probably is such an example
I like the "over the top" option and would suggest --only-dependencies=[all|remote] to avoid confusion in case both options are used
another syntax could be --only=[dependencies|remote-dependencies]
I was thinking about it again recently and came to the conclusion that the name of the option that would be best understood by the Rust community is just --exclude-project-sources, because it reflects the idea behind it well. And this is how I renamed the option in the PR.
So now it is
cargo build --exclude-project-sources -Z unstable-options
I hope that you like it.
@lguminski what are the semantics now? Does it build remote dependencies or all non-root dependencies?
I described the option in the documentation as follows:
--exclude-project-sources
Exclude project sources, while keeping _all_ their git and registry dependencies included. This option is designed for building project dependencies separately, and storing their binaries in a Docker image for later use.
Implementation-wise I decided to keep it simple (which means there are no sub-options), because simplicity is what makes it powerful.
The exact behaviour of this option is configurable through project's Cargo.toml files:
- If a dependency is referenced (in one of the project's Cargo.toml files) via its path or workspace, this tells cargo that its source code belongs to the project's sources, i.e. its contents may change at any time, thus cargo --exclude-project-sources _won't build_ such a dependency.
- If a dependency is referenced via git or a registry, this tells cargo that the dependency is guarded by some sort of configuration management mechanism (even as simple as just a Cargo.lock file which can be committed into the repository), and it won't change very often. As such, cargo --exclude-project-sources _will build_ it, no matter where it sits in the structure of the dependency tree.

@vojtechkral I hope this clarifies the use of this option. I can refer to the integration tests and source code for more information on the implementation and usage.
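As a small illustration of those two cases (the crate names and URL below are made up):
[dependencies]
# path dependency: counted as project sources, so --exclude-project-sources skips it
local-helper = { path = "crates/local-helper" }
# registry dependency: considered infrequently changing, so it is pre-built
serde = "1.0"
# git dependency: likewise pre-built by --exclude-project-sources
some-lib = { git = "https://example.com/some-lib" }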
@lguminski I'm sorry but I don't think your PR really solves the problem. There are several issues:
- path dependencies outside of a workspace
- The trick with replacing the path dependency with a git dependency in the same repo won't work, because cargo won't pick up updates
- path) crate
- --exclude-project-sources (this might be quite subjective)

Honestly I took a step back and started doubting whether my implementation, although clean and elegant, brings any value. For building Docker images, the original use case I had in mind, it is not a better solution than just copying all Cargo.toml files to a separate folder, creating dummy main.rs and mod.rs files, and running a simple cargo build. That solution already creates a layer with all dependencies.
Summing up, I started thinking that this is a good solution without a good use case. It was fun though.
@technimad perhaps you could explain your use case, and tell if this solution brings really any value to you.
@vojtechkral
- The trick with replacing the path dependency with a git dependency in the same repo won't work, because cargo won't pick up updates
And this is exactly how it should be. Your software should not depend directly on source code outside of your repository (source-level integrations), because then any new version of that dependency gets automatically included. It is easy to imagine a case where you are about to release your application, and suddenly, just before the release date, you realize that your application does not compile anymore because a change from an underlying library crept in. For this reason configuration management has been invented. It allows controlling what software gets into the final product. Highly recommended.
@lguminski maybe we could cooperate on a solution. I'll do some more tests and try to figure out an approach. Also, I might've not explained our use case right, but that's not important right now... I'll get back to this issue later...
The solution with a dummy src/lib.rs does not work correctly, see https://github.com/rust-lang/cargo/issues/2644#issuecomment-600289241
- syn for some reason causing a rebuild for everything depending on it directly or transitively.
- Overwriting the dummy src/lib.rs with COPY in docker doesn't update the mtime, which breaks cargo's build cache.
I think docker is important enough for Rust's success to prefer a solution in cargo over asking all users to include workarounds in all their Dockerfiles. I think treating a path= dependency as an "extended workspace" is acceptable.
FWIW, I have a half-finished implementation that correctly (to my knowledge) filters the dependency graph, however I'm having some trouble propagating the information correctly to the compile step. I'll report back once I've solved that issue...
@kpcyrd
I see. Then maybe it still makes sense. Could you please do an experiment with this
RUN cargo install --root /tmp --git https://github.com/lguminski/cargo --branch dependencies-only
RUN /tmp/bin/cargo --exclude-project-sources
and tell me if this works as you expect?
I would be totally happy with an MVP. Maybe it won’t cover all the cases, but we can iterate.
Okay, I finally figured out how to propagate the information around in the right way. I created a PR at #8061
What is included:
- --dependencies to build all dependencies, including local ones, basically everything except root packages
- --remote-dependencies to build all remote dependencies, excluding root packages _and_ local packages.
Hopefully these two options should cover everyone's use cases :)
The naming of the options follows the same logic as the already existing option --tests: the command cargo build --tests reads like an English sentence, _"cargo, build tests"_. I have named the options similarly, i.e. cargo build --remote-dependencies reads like _"cargo, build remote dependencies"_. The two options are mutually exclusive.
patch sections are supported, the semantics are as follows: if a remote dependency is patched with a local one (path) and --remote-dependencies is used, the dependency is built _iff_ another remote dependency depends on it. Otherwise it is not built.
@lguminski I borrowed a piece of code from you - the is_local() function. I marked the relevant commit with your authorship to preserve credit. If you would like to contribute to the PR as well, I would be happy to provide commit access to my fork. _Edit:_ I went ahead and sent you an invite (please don't do force-pushes, we can squash some commits at the end of review if need be)...
I've tested the code with various projects, both personal and work ones, some of them are non-trivial (workspaces with local and remote dependencies, patches, proc-macros), but of course I doubt I've managed to do enough testing in the short time I had to implement this. There might be bugs. Testing is welcome.
Let me know what you think.
I wanted to offer some perspective and alternative solutions after reading lguminski's doubts on whether this was a useful endeavor at all (particularly around why it seemed so tricky to get right, and initially to argue for closing the issue altogether if no valuable solution could be found).
vojtechkral, your work seems like it alleviates those doubts after all and thus pretty much removes the need for such perspective.
Still, I had captured my (unfinished) thoughts (from before I saw the latest proposal) and now decided to put them in a separate gist instead. If anybody is interested, I believe they're still relevant (but not as important).
@technimad perhaps you could explain your use case, and tell if this solution brings really any value to you.
We build a rust application which is part of a bigger solution. This rust application is relatively small and uses one workspace. Building the dependencies takes a substantial amount of time relative to the compilation of our own code. Our dependencies include for instance a vendored openssl, and we compile for multiple platforms/targets. All our dependencies are remote dependencies (GitHub, crates.io).
In our CI running the full build, including dependencies, is not feasible due to the long build times. We've mitigated this by building a Docker image which contains all precompiled dependencies. This is currently done by copying our Cargo.lock and Cargo.toml to a dummy project and running cargo build for each target in the Dockerfile.
In our CI we run a compile script in a container based on this image; this script runs cargo build for each target and moves the compiled binaries to a dir where they can be picked up by our CI.
Your solution would make our Dockerfile more straightforward and maintainable. The same goes for the compilation script, which now includes some specific code to make the compilation idempotent (https://github.com/rust-lang/cargo/issues/2644#issuecomment-526931209), in case we need to run the script twice (not normal CI operation).
I hope this gives some perspective.
In our CI running the full build, including dependencies, is not feasible due to the long build times.
FYI, this sounds like it could also be solved by sccache (or similar). Though I don't know how to use it effectively on cloud-provided CI infrastructure. We found great benefits from it, but we're running local-hosted gitlab-ci runners where we can do that easily with a local Redis server too.
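For anyone wanting to try that route, the wiring is usually just a RUSTC_WRAPPER environment variable plus a cache backend; a rough sketch, where the Redis address is only a placeholder:
export RUSTC_WRAPPER=sccache
export SCCACHE_REDIS=redis://redis.internal:6379   # placeholder address, point at your own instance
cargo build --release
sccache --show-stats   # inspect cache hits/misses after the build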
I like the discussion here.
@t-nelis this is very good. Almost like the initial version of a section from Cargo Guide. As a user I would like to see a guide with best practices for building rust binaries with Docker. And I started thinking along the same lines as you
This may be why this feature is difficult to get right: it might not make sense to try to "fix" it on the cargo side. Instead, one might look to more modern features of OCI image builders that rectify this misstep.
@technimad thanks for the perspective. You are the right person to judge whether the solutions that appear here are useful.
@vojtechkral thank you for giving me credit and inviting me to work on your changes. Unfortunately I am afraid it is too early for me to put work into that. Your PR currently lacks documentation and testing, which would convince me that the problem you are solving is a generic problem of the Rust community, and not just a specific problem of your company.
I think this all discussion boils down to a question:
_what is the problem we want to solve?_
Two days ago I started asking myself the question: is my solution any better than
cargo vendor? And I closed my PR to give myself and the reviewer time to think about it.
Summing up, I think this is the right group of people to come up with the definition of the use case. And once we agree upon a use case, it will be easy to judge the solutions on the table.
And now I am stepping out of the discussion for a while, as I don't want to enforce my opinions. I will be less active here for 2-3 weeks to give you a chance to speak up.
I mostly skimmed this thread, so sorry if it was already mentioned, but I'd like this for a different reason: it helps when trying to measure and possibly optimize the build times of a project. In that scenario, I'm not interested in building the dependencies, and at best they can pollute the measurements (e.g. make a mess of -Zself-profile).
@t-nelis, the gist offers a really interesting perspective I would not have thought of myself.
Still, I think a solution in (vanilla) cargo would be much more useful. Plain old Dockerfiles remain the most portable image-building method (think of Google Cloud Build, GitLab shared runners, ...), at least until modern alternatives proliferate.
I believe that this would overcome one of the Rust adoption road-blocks also in the company I work in: people are used to less-than-a-minute CI build times, and CI minutes are not cheap. The alternatives:
- sccache / mounting a local dir into the image being built / novel image builders would require extra configuration/work and would be a minus point for introducing additional complexity;
- dummy main.rs/lib.rs / cargo-build-deps almost work, but still seem like work-arounds and have their own limitations/complexities.
To me, the additional considerations like should path dependencies be included or not? or a single dependency update causes a full rebuild [1] are not really important -- as @gilescope puts it, an MVP would already be good enough for me. Maybe good enough for the 99% -- and it can always be iterated on, modulo command line options [2].
To put it more high-level: Rust's unique powers come also from the (expensive) compilation model, where all dependencies are compiled (every time) along with the resulting binary. Locally, this is mitigated elegantly using cargo's cache. In CI environments, this could be mitigated elegantly using cargo build --(remote-)dep(endencie)s(-only).
[1] If a crate "deep" in the tree is updated, a lot of crates would probably need to be rebuilt anyway; also, the root crates built in Docker images would usually be binaries, where it is recommended to track Cargo.lock in git, so dependency updates are explicit (and probably grouped anyway).
[2] The semantics of released command line options probably cannot be changed, so I see defining good ones as the most crucial part of the PRs.
Your PR currently lacks documentation and testing, which would convince me that the problem you are solving is a generic problem of the Rust community, and not just a specific problem of your company.
I'll be adding tests & docs ASAP as time allows. I tried to design the features as generically as I can (I believe it's a superset of the previous PRs, correct me if I'm wrong). My effort was indeed partly motivated by some CI problems we face in the company I work for, but my work on this is _not_ on behalf of that company and is _not_ paid for; I'm doing this in my free time with the hope that the result will be useful to everyone.
In our CI running the full build, including dependencies, is not feasible due to the long build times.
FYI, this sounds like it could also be solved by sccache (or similar). Though I don't know how to use it effectively on cloud-provided CI infrastructure. We found great benefits from it, but we're running local-hosted gitlab-ci runners where we can do that easily with a local Redis server too.
This is probably true, but it means we have to add yet another component to our infrastructure. Which is probably not supported natively by our cloud CI (bitbucket).
We haven't gone down this path because of both of these reasons.
We build a solution in which one component is built in Rust. To keep things as simple as possible we try to limit the use of other tools and stick to tooling native to the main components. In this light the --dependencies-only (or a variant with another name) function would be a great fit.
FYI, this sounds like it could also be solved by sccache (or similar).
sccache could speed things up at build time, but there's still some gain in having a --deps-only flag: if docker can reuse the same layer instead of building a new one, that means we'll have shared layers across images, thus less storage used and lower bandwidth needed when pushing / pulling images
and sccache (or --mount=type=cache) can still be used on both build steps to speed things up when a dependency changes, for example
We're using sccache in production and while it does speed up the build, it is way slower than a local build (with already compiled dependencies). With sccache, cargo still thinks it needs to build all dependencies, which consumes some wall clock time (even if they are in sccache); sccache itself says it does not implement all compiler options, some calls are "non-cacheable", etc.
I'm looking forward to the --dependencies-only or similar feature.
I've added a bunch of tests & some basic docs in my PR.
I've also pushed my changes onto crates.io under a fork name vkcargo so that you can test my implementation easily if you want
Edit: I've closed my PR since it turns out the problem is not solved with the --dependencies (or variants thereof) flag.
With some of the insight from the comments above as well as a few iterations of PRs I think it's apparent that a --deps-only (or similar) flag doesn't solve the docker cache problem. Even with the flag implemented we still need a way to get a skeleton project into the image. Once that is done, the --deps-only flag doesn't really help since you might as well just run cargo build. The crux of the docker pre-built cache problem is in getting a skeleton of a project into an image, which is a different issue.
@vojtechkral What do you mean "getting a skeleton of a project into an image"?
It's entirely usual that a Dockerfile copies the dependency files into the build image individually, prior to other project files. So I imagine it would look like:
# Dependencies
COPY Cargo.toml Cargo.lock /build/dir/
RUN cargo build --deps-only
# Full build
COPY . /build/dir/
RUN cargo build --release
In fact, I'm currently using a variation of this already, and it's working nicely except that it's very clunky without --deps-only (https://github.com/intgr/ego/blob/master/varia/Dockerfile.tests#L13-L18).
@vojtechkral it has been mentioned a few times that --deps-only should work with a skeleton that's created with:
COPY Cargo.toml Cargo.lock ./
Creating a skeleton that's sufficient for cargo build causes mtime issues down the road. This becomes obvious when you try to implement the approach you're suggesting.
@intgr @kpcyrd None of that seems like a general solution, it won't work with workspaces for example. I'm also not sure about build.rs. Also, this is problematic in terms of cargo implementation details, since cargo build assumes resolving the manifest files (ie. looking up sources).
we still need a way to get a skeleton project into the image
this comment describes how it would work: https://github.com/rust-lang/cargo/issues/2644#issuecomment-335258680
@kpcyrd Can't the mtime issue be solved with touch? As for the syn problem, I don't understand the problem there...
I think this bug should be closed because the only use case anyone has put forth for --deps-only is already perfectly solved by docker buildkit's cache volumes [1], and the container building system is the appropriate layer in which to cache container builds.
To everyone who's continued to post asking for this feature - if you don't think buildkit cache volumes are sufficient, could you please give a detailed example of where that fails? At least then we'll understand the problem we're still trying to solve.
[1] I posted the first complete, actually-working solution using this technique upthread https://github.com/rust-lang/cargo/issues/2644#issuecomment-570749508
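For readers who haven't seen the technique, here is a condensed sketch of a cache-volume Dockerfile (my illustration, not the exact solution linked above; the binary name myapp is made up, and it needs DOCKER_BUILDKIT=1 plus the experimental syntax header as the very first line):
# syntax=docker/dockerfile:experimental
FROM rust:1.44
WORKDIR /app
COPY . .
# the registry and target dir live in cache mounts, so they persist across builds
# even though the COPY above invalidates this layer on every source change
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/app/target \
    cargo build --release && \
    cp target/release/myapp /usr/local/bin/myapp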
PHP's Composer allows me to just copy composer.json and composer.lock and have composer install be cached by Docker.
JavaScript's NPM allows me to just copy package.json and package-lock.json and have npm ci be cached by Docker.
No Buildkit, nothing, it just works and anyone would naturally expect.
But with Cargo for some reason we need to either use Buildkit and configure it in some way or do tricks with empty main.rs/lib.rs after which issues appear that are solved with touch such that Cargo can detect changes to source code.
I'm running things in CI (Jenkins), I'm not even sure its Docker integration works with Buildkit. What I do know is that both other languages I have to work with don't have this issue.
Saying that issue doesn't exist because of possible Buildkit usage is not fair IMO. I believe Cargo should be able to download and compile just dependencies and nothing else. I really don't understand why you still question usefulness of such a basic feature after many years of people requesting it :disappointed:
after many years of people requesting it
Well, I was in the camp of wanting Cargo to have this hack (while always believing this was the wrong layer to do it in), until the recent advent of buildkit solved the problem in full generality. Now I don't see the issue, and, no, I don't think "I want to avoid spending 10 minutes in a configuration file" is enough of a use-case to justify adding this complex machinery to Cargo, which will have to be maintained forever once people start relying on it to cache their builds. Especially because with a --dependencies-only flag, you'll still be spending those ten minutes in your build file.
Why do I think this is a hack? I think this is a hack because the definition of what this flag should do is just, "build whatever we want Docker's default caching system to cache". The problem is not defined in a principled way; it's just a hack to work around how docker's default build caching system works. Actually, even if this flag existed today and did the vaguely-defined thing that you want it to do, it would still be inferior to the cache volume solution. The reason is that docker's caching layers are all-or-nothing - either nothing changes, or the whole layer gets rebuilt. With cache volumes, cargo can rebuild just crates that have been updated at a finer level of granularity. And, cache volumes can be shared between users and build images, even network mounted to be shared between devs, but none of those options are available to cache layers.
All we want this for is caching, and caching is already solved in a better way and with zero code. All anyone has to do to get me to change my mind is to post a single intelligible example of something that --deps-only would solve but cache volumes don't solve.
One of my use cases is using cargo inside Nix, which does not have a volatile build cache "escape hatch" due to determinism concerns.
It is not an issue of "whatever docker wants to cache"; it's an issue of getting cargo to produce filesystem state not dependent on the source files in the project directory, as a phase run before cargo gets access to said source files so as to make decisions using them.
I would tentatively support --deps-only erroring outright if it _does_ find a src/main.rs, src/lib.rs, etc., for example.
@nazar-pc
Saying that issue doesn't exist because of possible Buildkit usage is not fair IMO. I believe Cargo should be able to download and compile just dependencies and nothing else. I really don't understand why you still question usefulness of such a basic feature after many years of people requesting it
Because it's not as simple and it's not clear that the --deps-only option is the solution.
Can you or someone else describe how you would use the --deps-only option to build a workspace containing 10 sub-packages in Docker?
it's not clear that the --deps-only option is the solution
How is that not clear? It allows making use of the layer caching docker has.
Can you or someone else describe how you would use the --deps-only option to build a workspace containing 10 sub-packages in Docker?
https://github.com/rust-lang/cargo/issues/2644#issuecomment-335258680
The linked comment does not address workspaces, and I expect at minimum you will also want to copy in the Cargo.toml in all the subdirectories. When I ran into a similar problem with node and lerna, I eventually gave up and COPY'd in the whole tree, then used find to delete everything not named package{,-lock}.json - not pretty but it worked.
it's not clear that the --deps-only option is the solution
How is that not clear? It allows making use of the layer caching docker has.
Yes, but only when you manage to copy a skeleton of the workspace.
Can you or someone else describe how you would use the --deps-only option to build a workspace containing 10 sub-packages in Docker?
That copies a single Cargo.toml file doesn't it? In a workspace with 10 sub-packages, you need to copy 11 Cargo.toml files.
Though now that I look at the docker syntax again, COPY Cargo.* */Cargo.* . may be sufficient.
@ohAitch I'm afraid that command copies all the Cargo.toml files in the same destination (I tried).
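The workaround people usually reach for is to spell out every member manifest together with its destination directory, which preserves the layout; a sketch with made-up member names:
COPY Cargo.toml Cargo.lock ./
COPY crates/api/Cargo.toml crates/api/
COPY crates/core/Cargo.toml crates/core/
# ...one COPY line per workspace member - exactly the maintenance burden being discussed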
Let me just try to list here some of the complexities you can run into while trying to create a pre-cached layer of dependencies and which are (also) the reason this problem is quite difficult to solve:
- A Cargo.toml won't resolve without associated source files present (i.e. at least a dummy main.rs).
- Workspaces: there may be multiple Cargo.toml files in an arbitrarily-complex directory structure.
- path dependencies, where the dependencies may be in the same directory or nearby, but may also be somewhere completely else and may not be considered part of the immediate project.
- patch sections; with patched dependencies it may happen that a package from crates.io depends on a local package specified with path.
- Configuration files: copying just the Cargo.toml file into a docker image may render a cached deps build useless if later on a config file comes along and changes build parameters.
... that list is probably incomplete and there might be other gotchas.
To everyone who's continued to post asking for this feature - if you don't think buildkit cache volumes are sufficient, _could you please give a detailed example of where that fails_? At least then we'll understand the problem we're still trying to solve.
@masonk you're talking about RUN --mount=type=cache, right? It's not an issue for me but it can be for some people: this is an experimental feature. And, obviously, it relies on building with buildkit only.
in my case, where I think it fails is because:
- if you have a multiple machine building containers, it becomes cumbersome to configure cache sharing
- depending on where your builder are hosted you might not have the option to use this feature
- if you do use this feature, it's not necessarily easy to configure it properly (think about concurrency or different build configurations (features, architecture, etc.?)
- the build will then make an entirely new image layer, most likely a big one, and that will cost a lot of storage and/or bandwidth
- it will still not be as quick as docker detecting that the layer already exists in cache
having said that, I think it would be a good idea to use that with --dep-only to avoid rebuilding everything when parts of the dependency tree change (if you’re lucky enough to have a docker daemon running in experimental mode and able to set DOCKER_BUILDKIT=1).
ps: same goes for sccache, it's nice to use it on top of --deps-only and buildkit cache but it's no replacement
@mathroc
In order to get me to change my mind, I need an example of something that cache volumes can't solve but that --deps-only can solve. I think your bullet points are pointing out real drawbacks of cache volumes, but it isn't clear to me how a hypothetical --deps-only - defined to do exactly the most useful thing for Docker build caching - would fix any of them.
if you have a multiple machine building containers, it becomes cumbersome to configure cache sharing
The story for distributed caching is strictly better for volumes than layers. Multiple machines building containers can't share layers at all. Or, maybe they can through a third party wrapper, but that's a huge amount of config, too. It doesn't work out of the box - I'm pretty sure about that.
depending on where your builder are hosted you might not have the option to use this feature
How is that different from needing nightly cargo? I expect it would be harder to build caching through cargo than through buildkit for a while. Buildkit is not really experimental anymore, as it's been around for years, is widely supported, is the officially blessed way forward for all of Docker, is shipped with standard Docker distros, and will soon be enabled by default for most distros.
if you do use this feature, it’s not necessarily easy to configure it properly (think about concurrency or different build configurations (features, architecture, etc.?)
Cache layers have absolutely nothing they can do to avoid invalidating the entire layer when a single file used as input is touched in any way. Cache volumes never perform worse than that and often perform much better.
Concurrent builds - yes it could be a problem, as I think Cargo just acquires a lockfile while building, but it's nothing that cache layers solve. They just "solve" the issue by never sharing work between machines. It's easy to do that for cache volumes, and is the default behavior anyway.
the build will then make an entirely new image layer, most likely a big one, and that will cost a lot of storage and/or bandwidth.
Hmm. This might be an actual thing. I believe that it would only create a new layer in cases where a cache layer would have been rebuilt anyway, but I'm only about 50% sure of that.
it will still not be as quick as docker detecting that the layer already exists in cache
Why not? Cargo and Docker have to do essentially the same op when scanning for cache invalidation. If there's a time difference at all, it's negligible, order of ms.
Following this issue is very tiresome, we already discussed the workspace topic, patch sections and path dependencies.
- Feature resolution doesn't rely on .rs files
- build.rs is not considered a dependency
- I don't see how configuration files have anything to do with this specifically
Please stop pushing for BUILDKIT until docker actually starts to enable it by default. It's not relevant for a majority of docker users yet and I think it's off-topic, as well as the discussion about build clusters and sccache.
One thing that the buildkit cache volume can't solve, though I don't know how much of an issue it really is, is the benefits of the cache layers at runtime in something like Kubernetes... Like, if I have 5 apps that share some set of dependencies around 100MB in a 6 node cluster, and I want to update them, that's 100MB per application per node that needs to be pulled down (about 3GB). The application itself might only be a few MB, which would be considerably less. It's probably pretty rare to have 5 apps sharing the exact same set of dependencies though (and if they don't then they wouldn't share the cache anyway unless there were multiple "dependency" cache layers).
And I see now that this was already mentioned..
Following this issue is very tiresome
Couldn't agree more...
we already discussed the workspace topic, patch sections and path dependencies.
Yes, but a viable solution wasn't proposed as far I know (although the thread is long).
Feature resolution doesn't rely on `.rs` files
That's true.
`build.rs` is not considered a dependency
build.rs can influence compile flags / linking. I don't know if this might be an issue wrt. dependencies; the link options probably don't affect building of dependencies, but I'm not sure. Edit: The -L option might...
I don't see how configuration files have anything to do with this specifically
IMO if a configuration file changes some compiler flags, the cached dependencies will be considered out of date since the compiler flags won't match.
@masonk
In order to get me to change my mind, I need an example of something that cache volumes can't solve but that --deps-only _can_ solve. I think your bullet points are pointing out real drawbacks of cache volumes, but it isn't clear to me how a hypothetical --deps-only - defined to do exactly the most useful thing for Docker build caching - would fix any of them.
if you have a multiple machine building containers, it becomes cumbersome to configure cache sharing
The story for distributed caching is strictly better for volumes than layers. Multiple machines building containers can't share layers _at all_. Or, maybe they can through a third party wrapper, but that's a huge amount of config, too. It doesn't work out of the box - I'm pretty sure about that.
I don't understand how it's better with volumes; with layers you just need --cache-from and a registry (that you probably already have if you're at the point where you have multiple build machines)
depending on where your builder are hosted you might not have the option to use this feature
How is that different from needing nightly cargo? I expect it would be harder to build caching through cargo than through buildkit for a while. Buildkit is not really experimental anymore, as it's been around for years, is widely supported, is the officially blessed way forward for all of Docker, is shipped with standard Docker distros, and will soon be enabled by default for most distros.
it’s different because it’s often easier to have rust nightly inside a build container than being able to tinker with the docker daemon configuration and/or build process
if you do use this feature, it’s not necessarily easy to configure it properly (think about concurrency or different build configurations (features, architecture, etc.?)
Cache layers have absolutely nothing they can do to avoid invalidating the entire layer when a single file used as input is touched in any way. Cache volumes never perform worse than that and often perform much better.
--only-deps would only be useful when there are no changes, true. Using --only-deps does not prevent using the volume cache for when there is a cache invalidation.
Concurrent builds - yes it could be a problem, as I think Cargo just acquires a lockfile while building, but it's nothing that cache layers solve. They just "solve" the issue by never sharing work between machines. It's easy to do that for cache volumes, and is the default behavior anyway.
it's not sharing work across machines that's the issue; concurrent builds could happen on the same machine as well
the build will then make an entirely new image layer, most likely a big one, and that will cost a lot of storage and/or bandwidth.
Hmm. This might be an actual thing. I believe that it would only create a new layer in cases where a cache layer would have been rebuilt anyway, but I'm only about 50% sure of that.
I’m a 100% sure :)
it will still not be as quick as docker detecting that the layer already exists in cache
Why not? Cargo and Docker have to do essentially the same op when scanning for cache invalidation. If there's a time difference at all, it's negligible, order of ms.
docker will only have to see that the Cargo.toml and Cargo.lock files are unchanged (i.e. the last dockerfile commands are cached too); it won't look into anything else at all.
no matter how many dependencies (and workspaces supposing those are handled too) docker won’t have anything to scan
here is an example of what it looks like:

the full line 12 is :
RUN --mount=type=cache,id=yarn-$APP-build,target=/usr/local/share/.cache/yarn \
yarn install --frozen-lockfile --no-progress --non-interactive
the same line appears at line 14 because this is a multi-stage build and it detected that the same layer from the builder-prod stage can be reused
you can notice that I'm also using a volume cache for when the lockfile changes
when the whole image is already in cache it takes 0.4 seconds to build from scratch:

otoh I understand very well that the way cargo works might make it difficult to implement. but there is
That copies a single Cargo.toml file doesn't it? In a workspace with 10 sub-packages, you need to copy 11 Cargo.toml files.
Please stop pushing for BUILDKIT until docker actually starts to enable it by default. It's not relevant for a majority of docker users yet and I think it's off-topic, as well as the discussion about build clusters and sccache.
I'm afraid that command copies all the Cargo.toml files in the same destination
My post might get buried again, but did anyone have any comments for the workaround (when at least needing to use docker) I detailed in https://github.com/rust-lang/cargo/issues/2644#issuecomment-544820077 . It simply caches cargo files via multistage build layers, no BUILDKIT or advanced volume juggling required.
One could even break up the /tmp folder into /tmp{1,2,3,...} to cluster packages into breakable docker cache layers, depending on the degree of volatility of the upstream packages in your workspace to help save more build time for just the downstream packages in active development.
I feel like a 3rd party rust cli could help shuffle the skeleton of the workspace around so one doesn't need to lean on sh. In other other build tools, like colcon in the ROS2 ecosystem, one can specify --packages-up-to to get a list of upstream package dependencies in the current workspace with respect to a specific package. That could then be used to programmatically filter the workspace into a corresponding hierarchy of docker image layers.
That said, having something like cargo build --dependencies-only that didn't need to check for any main.rs or lib.rs files could help simplify the last RUN directive in the cache stage.
# Filter or glob files to cache upon
RUN mkdir ./cache && cd ./ws && \
find ./ -name "Cargo.*" -o \
-name "lib.rs" -o -name "main.rs" | \
xargs cp --parents -t ../cache
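To illustrate that last point: if a hypothetical cargo build --dependencies-only accepted a package without any src files, the cache stage would no longer need to carry over entry-point files at all, and the filter above could shrink to just the manifests, e.g.:
RUN mkdir ./cache && cd ./ws && \
    find ./ -name "Cargo.*" | xargs cp --parents -t ../cache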
@ruffsl I don’t quite ‘get’ it. How does the caching work? I would think as soon as someone changes a source file, docker would invalidate that cache layer as all source files were loaded into it.
On the buildkit caching front, I have ~0 probability that docker will hit the same machine. Only by caching in a base image layer do I get reuse, but that's good enough (my docker is spelled kaniko, which does have its own way of caching, but whether I can get to it is another question)
https://github.com/nacardin/cargo-build-deps looks promising...
Maybe we should all help with cargo-build-deps and continue the discussion over there?
https://github.com/nacardin/cargo-build-deps looks promising...
I tried it and it doesn't work.
And besides, it still solves the wrong problem. It still involves manually recreating the workspace skeleton in the Dockerfile, which is cumbersome at best and not feasible for more complex projects.
The problem is not building dependencies but reliably and automatically recreating a cacheable workspace skeleton, taking into account all cargo features & complexities.
This issue should just be closed, it's going nowhere.
For me personally, the buildkit cache trick works, but there is now a big documentation issue: So many Dockerfile examples on the web still use the old hacky way with fake lib.rs etc.
My searches led me to this issue, but I didn't bother searching through hundreds of comments to find out the better solution.
It's been said already:
PHP's Composer allows me to just copy composer.json and composer.lock and have composer install be cached by Docker.
JavaScript's NPM allows me to just copy package.json and package-lock.json and have npm ci be cached by Docker.
Rust not having something palatable for a standard docker layer cache solution makes it a less good tool for many real world cases.
This issue should just be closed, it's going nowhere.
If you don't see this as a problem worth thinking about, then maybe not partake in the discussion?
Maybe we can solve the simpler problem first: single project builds, maybe also just those without path dependencies. Maybe we can find some 80/20 solution that solves most cases, but not all of them?
If you don't see this as a problem worth thinking about, then maybe not partake in the discussion?
See _what_ as a problem, that's the question. This issue is about the --deps-only option. The problem you talk about is integrating cargo in docker builds efficiently. The --deps-only option is presented here as a solution, which is pretty confusing, because it's at best a second step and it might not actually even be necessary (in case a skeleton is generated in a clever way with perhaps adjusted mtimes etc.).
A new issue with explicitly a topic of docker integration and not necessarily involving --deps-only would make more sense to me.
Maybe we can solve the simpler problem first: single project builds, maybe also just those without path dependencies. Maybe we can find some 80/20 solution that solves most cases, but not all of them?
Perhaps. If you're inclined that way, feel free to go ahead and implement something like that. I am personally not motivated to do that because a solution that only works in simple cases wouldn't solve problems for me. Edit: And I expect maintainers would be a lot less interested in a special-case solution too...
@ruffsl I don’t quite ‘get’ it. How does the caching work? I would think as soon as someone changes a source file, docker would invalidate that cache layer as all source files were loaded into it.
@gilescope , the approach in https://github.com/rust-lang/cargo/issues/2644#issuecomment-544820077 only busts the docker build cache layer in which cargo fetches/builds the dependencies if any of the Cargo .toml or .lock files change; any other source file changes would not break that particular layer in the build stage of the multistage Dockerfile.
You're still correct in that any source file change not ignored by .dockerignore would break docker build caches given the scope of the COPY ./ ./ directive, but only for the cache stage of the multistage Dockerfile. The cache stage is where one can invoke arbitrary logic on what should or shouldn't be used to bust cache layers in later stages, based on how you'd like to shuffle and prioritize files.
In my example approach, the cache stage is where I make a copy of the workspace using the same parent directory hierarchy, but one that only includes Cargo.* files, and substitute any main.rs or lib.rs files with skeleton placeholders. This skeleton workspace is what is first copied from the cache stage into the build stage, and it will be byte-for-byte the same in terms of file content (docker seems to ignore file metadata like timestamps) as long as no files used in making the skeleton workspace are changed (or added or removed).
Later in the build stage, the rest of the source files are copied over, and that is where other source file changes would end up breaking the docker build cache as desired. This can of course be customized by daisy-chaining or parallelizing stages to make the multistage build however complex you'd want.
Related: https://github.com/moby/moby/issues/15858#issuecomment-614450800
https://github.com/nacardin/cargo-build-deps is not working because of Cargo.lock parsing, so I created this repository https://github.com/errmac-v/cargo-build-dependencies based on cargo-build-deps. I think it will be useful until there is a general solution.
I'd like to give my perspective.
One thing that draws people to rust is the whole integrated ecosystem around the language. Installing a cross compiler toolchain used to be a dependency nightmare between compilers, libraries and scripts. It now is a simple rustup command.
So you want unit tests? Find a fitting framework, include it in your development pipeline, teach the developers to use this specific framework... A whole kludge. With Rust, it comes included: cargo test
A trade-off in the language design is that compilation takes longer in favour of zero-cost abstractions and guarantees. So at some point any Rust developer will run into the need to optimise their build times.
Within this rich integrated ecosystem, where lots of used-to-be-hard problems are taken care of, and in which a significant portion of build time is spent on building dependencies, it almost feels natural and expected that something like cargo build --dependencies-only exists.
I now understand, by following this thread, that there are quite some corner cases and different interpretations of what a 'dependency' is and what such a flag should do. Yet such a flag should exist, because it fits the expectation of the ecosystem. And because of the language design any project will, at some point, look at and try to optimise its compilation times.
The alternative presented in this thread, create a dummy project and copy Cargo.toml and .lock over, is exactly the kludge we used to have and which shouldn't exist in the Rust ecosystem.
The flag should cover the most common use cases (only external, versioned dependencies) and have very precise and specific wording in what the flag does. This to make it clear which cases are and aren't covered, and also to signal to future maintainers what the intention is and prevent feature creep in the future.
@technimad
And in which a significant portion of build time is spent on building dependencies, it almost feels natural and expected that something like cargo build --dependencies-only exists.
What do you need the flag for? What would be the point? Let me reiterate a key point here which I originally also didn't realize and implemented --dependencies-only in vain:
The --dependencies-only flag does NOT solve the problem of dependencies build caching in Docker images... unless it only needs the project's skeleton / metadata, like in @errmac-v's implementation.
If you run cargo build --dependencies-only in Docker on the full project, the resulting build, even if only containing dependencies, will be tied to the exact state of the project's sources and will be trashed as soon as the sources change. This is why you need a dummy project / skeleton - you need a representation of the project independent of the current state of the sources.
The alternative presented in this thread, create a dummy project and copy Cargo.toml and .lock over, is exactly the kludge we used to have and which shouldn't exist in the Rust ecosystem.
Maybe, but there's nothing that cargo can do about this, it's just the way docker works. Without some change in docker, you could only alleviate this by having a more automated way of generating the skeleton / metadata, but that's about it.
Well, it would still let us implement the kludge without all the problems associated with having to remove the correct fingerprint files, partial builds, etc.
I can't count how many times I've set up a build pipeline only to have the production image just print Hello world and exit because there was some fingerprint file I forgot to delete. This has also broken because of Rust upgrades where new files were being emitted but not cleaned up by my build steps.
Right now you need to
What do you need the flag for? What would be the point? Let me reiterate a key point here which I originally also didn't realize and implemented --dependencies-only in vain:
I gave an example earlier in the thread: I'd like this for profiling the compilation times of projects.
@Mange
3. Build test and release builds, to make sure all dependencies are downloaded, delete the fingerprints for release and debug builds without removing the compiled dependencies, delete the partial build files for the debug and release files, and make sure this works even after Rust/Cargo is upgraded, without really having a way to test it. (Brittle and hard)
I've also run into this and I believe this is an issue with docker whereby docker doesn't update the mtime to what the source file has and instead leaves mtime from a cached layer, which confuses cargo. Right now I just use touch to update mtimes of all source files, which seems easy enough. IMO the best solution would be to have an option to update mtime in docker.
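For reference, that touch workaround can be a single extra step in the Dockerfile, run after the real sources are copied over (just a sketch; the src/ layout is an assumption):
COPY src/ src/
RUN find src -name "*.rs" -exec touch {} + && cargo build --release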
I feel like docker's shortcomings are constantly being shifted to cargo here, with demands to fix them by hacking up things in cargo.
@lnicola
I gave an example earlier in the thread: I'd like this for profiling the compilation times of projects.
Okay, but in that case wouldn't it be better to have an option to output profiling info directly? That would probably entail a lot more useful detail compared to running time cargo build --dependencies and subtracting that from time cargo build ...
I've also run into this and I believe this is an issue with docker whereby docker doesn't update the mtime to what the source file has and instead leaves mtime from a cached layer, which confuses cargo.
I think the actual issue is that I wrote the source files a few minutes ago, while the generated project file was a few seconds ago. Ergo the Hello World file is newer and an older version is copied to the image.
Right now I just use touch to update mtimes of all source files, which seems easy enough. IMO the best solution would be to have an option to update mtime in docker.
It hurts when doing partial compilation, though. I usually want to share images between dev and release builds by using the same "dependencies are built" layers. So even when doing local "compiled on every save" I expect to be able to mount the correct directory and have things work.
Which they do, unless you consider someone checking out the code for the first time and then compiling the dev build. Then the dummy becomes newer and the project is not recompiled.
I feel like docker's shortcomings are constantly being shifted to cargo here, with demands to fix them by hacking up things in cargo.
While I really don't disagree with this sentiment, I also understand why people expect things to behave this way since Cargo is the black sheep. Almost every other kind of project separates dependencies from the project itself: Ruby, JS, C, C++, Python, and so on all have a "install dependencies" step and then you compile/build/test the project.
This is probably why Docker was even designed the way it was. And sure, Cargo has no obligation to work well with Docker, but the thought that we're sooo close to it working like most other languages (e.g. "work with Docker's workflow") really makes you wish for a simple step to get rid of the hacks.
What if cargo build could work without a src dir? Then we wouldn't need to run cargo new and we'd only have to copy the Cargo.toml file and build. No mtime issues either. This would also bring us even closer to some other languages' flow.
(Assuming that if you have something more exotic in Cargo.toml, you'll have to set it up anyway)
I've also run into this and I believe this is an issue with docker whereby docker doesn't update the mtime to what the source file has and instead leaves mtime from a cached layer, which confuses cargo.
I think the actual issue is that I wrote the source files a few minutes ago, while the generated project file was a few seconds ago. Ergo the Hello World file is newer and an older version is copied to the image.
I thought so too, the first thing I tried was setting the timestamp to an artificial old date when the skeleton was generated. But it solved nothing. I think the issue is that docker's COPY command updates the file contents but will re-use some weird cached mtime if the file is already there from the previous layer. Maybe I don't understand docker enough, not sure.
And sure, Cargo has no obligation to work well with Docker, but the thought that we're sooo close to it working like most other languages (e.g. "work with Docker's workflow") really makes you wish for a simple step to get rid of the hacks.
I don't understand what makes you think we're close, we're not. Solving this in cargo is hard both design-wise and implementation-wise (I know because unlike those downvoters I tried). Especially if you're going for a general solution, which I believe is the only approach that could actually be merged into cargo. Why ask for an 80% semi-hacky solution, which can't be accepted into cargo? That sort of thing belongs in a 3rd party crate...
[...] but the thought that we're sooo close to it working like most other languages [...]
I don't understand what makes you think we're close, we're not.
I meant from the end-user's perspective. We're "only" missing a single command to cover the majority (I assume) of cases - small binary apps for use internally at companies or in people's homes.
Why ask for an 80% semi-hacky solution, which can't be accepted into cargo?
I'm not asking for a hack. I've been patient here for a long time now and would prefer a solid solution.
I just wanted to address the seeming confusion on why so many people want a command to build all dependencies. It's very expected from a lot of developers coming from this side of the ecosystem; and developers coming from embedded, driver dev, GUI, etc. might not see the perspective here.
I'm not assuming your background, but at least I can share mine to perhaps help with the understanding.
I'm also a fan of being pragmatic. Rust 1.0 was not perfect but it was shipped anyway because it was better to release something than to stay in 0.x forever. I like that perspective. If we could solve a problem for 90% of cases and then work towards the other 10% if still needed, I think that's better than not doing anything. I also think that most of the exceptional cases here don't even assume or need this capability. If you have a monorepo or build a whole bunch of crates from the same codebase, then you are probably unlikely to package it as a Docker container to ship it to production in the same way as you build it for dev.
Right now we have people that seem to want to build dependencies because they are working on smaller projects where setting up custom CI software, or cache layers for Docker, or custom storage drivers is just too much to ask. I don't see comments from large teams with infra teams that work only with Rust and can afford to invest time in building their own CI services, so right now we don't have enough information on how to design for them. I don't think it's misguided to design a solution for the people with the problem and letting the others be a known unknown until more information is received.
No matter what, I appreciate that you are thinking about this problem and raising very valid objections. We can best avoid the XY problem if we keep asking questions. I don't think we should blindly accept solution X because people ask for solution X; better we describe problem Y and solve it with either X or Z.
I think you are mostly right; there's a risk that the solution doesn't cover 90%, but only like 40% while being a large foot-gun for the other 60%. We don't want that shipped anywhere near Cargo.
@Mange Ok, thanks for writing that and for being patient here.
It's very expected from a lot of developers coming from this side of the ecosystem; and developers coming from embedded, driver dev, GUI, etc. might not see the perspective here.
Well, that's true, but also I think it's that, as far as I know, tools like _npm_ or _composer_ are simpler in principle compared to cargo, which is why people may not realize that the request to add a --dependencies-only option to cargo requires a lot more consideration compared to those tools. At least that's my impression...
We can best avoid the XY problem if we keep asking questions. I don't think we should blindly accept solution X because people ask for solution X; better we describe problem Y and solve it with either X or Z.
Yeah. Asking for a not-perfect-but-still-viable solution is fine and the point about Rust 1.0 is valid, but it is really unclear to me where the line should be drawn, ie. what should (not) be supported and in particular how to do that such that the user is not confused (having a build flag that works 80% of the time may be pretty confusing).
I'm thinking that maybe a more straightforward approach would be to have a command or option to serialize the build of dependencies into a file (which could be checked into VCS), kind of like Cargo.lock, but for building dependencies, i.e. there would be more information (such as build flags etc). You could then copy a single file (even for a large workspace or so) and run cargo based on that. Cargo already has an (unstable) option to output a build plan; this could essentially be a subset of the build plan. _Edit:_ Or maybe Cargo.lock could be enriched to contain enough information to run a dependencies build.
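To make that concrete, the workflow could look roughly like the sketch below; the first command is the existing unstable flag, the second one is purely hypothetical and only illustrates the idea of replaying a dependencies-only plan:
cargo build --build-plan -Z unstable-options > deps-plan.json   # exists today, nightly only
cargo build-from-plan deps-plan.json                            # hypothetical command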
I'm thinking that maybe a more straightforward approach would be to have a command or option to serialize the build of dependencies into a file (which could be checked into VCS), kind of like Cargo.lock, but for building dependencies, i.e. there would be more information (such as build flags etc). You could then copy a single file (even for a large workspace or so) and run cargo based on that. Cargo already has an (unstable) option to output a build plan; this could essentially be a subset of the build plan.
I was about to suggest something similar - I think this could be a good way to go. Even if the build plan were not tracked in VCS, it is possible to generate it cheaply thanks to Docker staged builds (i.e. a preparatory stage could be used to generate it - it would run fast, so people could copy the entire source tree into it).
Edit: Or maybe Cargo.lock could be enriched to contain enough information to run a dependencies build.
That would be the best option from the Docker PoV, but I fear it would be controversial for everybody else: it would cause a lot (more) of the information from Cargo.toml to be duplicated into Cargo.lock (feature flags, build profiles, ...). For projects with a VCS-tracked Cargo.lock, people would have to remember to check it in after feature flag changes etc. (a small change of workflow).
I've been lurking on this thread for quite a while, and might have a helpful tidbit for those looking for a docker workaround. I did a scan and couldn't see anywhere this has been suggested, but the following seems to work for caching deps for me in a docker build:
# Create a fake main file
RUN echo "fn main() {println!(\"if you see this, the build broke\")}" > src/main.rs
# copy over your manifests
COPY ./Cargo.toml ./Cargo.toml
# this build step will cache your dependencies
RUN cargo build --release
COPY ./src ./src
COPY build.rs build.rs
RUN cargo build --release
RUN cp target/release/pepsi /usr/local/cargo/bin/pepsi
Pepsi is just the codename for a project of mine, and build.rs is essentially a single println in a main function. For some reason the build.rs is necessary in order to do an actual cache bust for the binary.
Anyway, not looking to comment either way, just thought I'd leave this here for docker folks looking for a quick workaround!
@trezm I have been playing with the same docker commands but ran into some strange thing where it is not rebuilding the binary. Does it have to do with the timestamps of the main.rs file? Why did the build.rs fix the problem?
@camerondavison to be 100% honest, I'm really not sure why the build.rs affects the result! I came across it as a solution by accident. I had a project that I have a dockerfile for that uses tonic, so I needed a build.rs to build proto files.
I noticed that I could super simplify the Dockerfile for that project, but when I copied the same Dockerfile to a new project (sans build.rs) it no longer worked because of what you mentioned. I added the build.rs file back with a single dummy println and it seemed to work again!
Really not sure why this would change anything, maybe something to do with the presence of a build.rs file forcing cargo to look at the actual timestamp on all the files? This is me wildly speculating, but I encourage you to try adding a dummy build.rs and see if you get the same results!
@trezm adding a build.rs file with
fn main() {}
does fix the problem. so crazy. I am going to keep trying to figure out why.
EDIT: found a solution
Because the dummy file that gets created is newer than the copied source code, Rust thinks the build artifacts it produces are newer than the source code. I was able to fix this problem by backdating (touching) the build artifacts for just my source code before doing the next build. Not sure if that is better than using the build.rs or not. This is the Dockerfile I ended up with:
FROM rust:1.44 as build
# app
ENV app=my-app
# dependencies
WORKDIR /tmp/${app}
COPY Cargo.toml Cargo.lock ./
# compile dependencies
RUN set -x\
&& mkdir -p src\
&& echo "fn main() {println!(\"broken\")}" > src/main.rs\
&& cargo build --release
# copy source and rebuild
COPY src/ src/
RUN set -x\
&& find target/release -type f -name "$(echo "${app}" | tr '-' '_')*" -exec touch -t 200001010000 {} +\
&& cargo build --release
@camerondavison adding to your work-around, you can replace the second-to-last line with: find target/release/ -type f -executable -maxdepth 1 -delete
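A minimal sketch of that variant in the context of the Dockerfile above (note that GNU find prefers -maxdepth before -type, so the options are reordered here):
# copy source and rebuild, deleting the old binaries instead of backdating them
COPY src/ src/
RUN set -x\
 && find target/release/ -maxdepth 1 -type f -executable -delete\
 && cargo build --release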
hey @trezm @camerondavison, this doesn't work if there are multiple workspace members. 😢
Bummer! Does @vn971's solution work? I don't claim to have any substantial knowledge about cargo, I was just hopefully helping with a workaround :)
@anitnilay20 just copy the needed Cargo.* files only, and otherwise repeat the steps? Should work?
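A rough sketch of that suggestion for a workspace with two members; the member names api and core are hypothetical, and the root Cargo.toml is assumed to already declare them as workspace members:
# copy every manifest, create dummy targets, and build once to cache dependencies
COPY Cargo.toml Cargo.lock ./
COPY api/Cargo.toml api/
COPY core/Cargo.toml core/
RUN mkdir -p api/src core/src\
 && echo "fn main() {}" > api/src/main.rs\
 && touch core/src/lib.rs\
 && cargo build --release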
@trezm Your solution does work for single-crate cargo projects, but it doesn't work for workspaces with multiple members.
This might be better raised as a Stack Overflow question or something; this ticket is about adding support for the flag. I think multi-member workspaces work if done correctly, in the right order.
@vn971 it worked; the issue was a bad Dockerfile. Anyway, thanks for your help.
Does anyone have a working version that can cache deps for workspaces too?
@dessalines You can check this https://github.com/sha-el/Vouch/blob/master/Dockerfile
@anitnilay20 Thank you so much, that did it!
It does re-fetch the workspace deps when anything changes in there, but any changes in the main project don't re-fetch the workspace deps!
I have been dealing with this issue for over a year now, in libs, bins, and workspaces, with projects with and without examples and benches, mostly in Docker images. Here are my 2 cents:
The underlying problem is the excessively high compilation time of slowly-varying code in the context of CI/CD. Specifically, there are a number of Rust projects where the majority of CI/CD time is spent compiling non-changing code that could be cached (e.g. check the logs of arrow/rust or its issues about Docker/CI).
The core issue is that, in the context of containerization (e.g. Docker), engines have no way of knowing that the resulting artifacts in target/*/deps are independent of src/ after RUN cargo build. As such, they must invalidate all artifacts in target/*/deps when src/ changes (e.g. via a COPY . . before the RUN). This is the reason why cargo build is not sufficient: we need a command to build the stuff in target/*/deps without requiring the dependants of target/*/deps (i.e. without src/).
Cargo supports this locally by not recompiling dependencies that are already compiled. However, it does this because it knows that target/*/deps is independent of src/; a generic technology like containerization has no way of knowing this.
In theory, solving this issue requires two different API endpoints:
1. one that declares which files are needed to build the dependencies (e.g. Cargo.toml, src/[lib.rs/main.rs] and optionally build.rs)
2. one that builds the dependencies into target/*/deps
IMO this issue is a sub-part of one of the top 5 requested improvements to Rust - compilation time: slow compilation is not only addressed by producing better plans for LLVM, but also by providing the tools to efficiently cache slowly-varying code, such as external dependencies.
Finally, why the current status quo of "replacing lib.rs/main.rs with a dummy" is not enough:
This approach works well for simple projects, but does not work for projects with examples or benches. Essentially, given a Cargo.toml with a [[bench]]/[[example]], cargo build [--lib] [--release] requires the respective files under benches/ or examples/ to be there. This implies that we either also copy the benches and examples (which causes a recompilation on every change to an example), or parse the Cargo.toml and touch a file for every [[bench]]/[[example]] in it (high maintenance; a rough sketch of this appears below).
When there are dependencies between workspace members, say w1 <- w2, we can't compile w2's dependencies because that requires compiling w1's dependencies as well, as @shepmaster mentioned.
One current "solution" for this is to repeat the step for non-workspace projects on each workspace including examples/benches (cumbersome).
In the end, a command like cargo build --dependencies-only would collect the files required to build target/*/deps based on the Cargo.tomls and would execute the downloads/compiles needed to reach that final state, in the same way the aforementioned hacks currently do, with varying levels of "hackiness".
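As a rough illustration of the "touch a file for every [[bench]]/[[example]]" workaround mentioned above, a hypothetical helper step (not from this thread) that assumes the default benches/<name>.rs and examples/<name>.rs paths and that name = appears on the line directly after each section header:
# create empty placeholder files for every declared bench and example
RUN mkdir -p benches examples\
 && grep -A1 '^\[\[bench\]\]' Cargo.toml | sed -n 's/^name *= *"\(.*\)"/benches\/\1.rs/p' | xargs -r touch\
 && grep -A1 '^\[\[example\]\]' Cargo.toml | sed -n 's/^name *= *"\(.*\)"/examples\/\1.rs/p' | xargs -r touch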
@jorgecarleitao
The underlying problem is the excessively high compilation time of slowly-varying code in the context of CI/CD
Why can you not use sccache for that?
It also automatically handles changes of dependency versions, compiler versions, and so on.
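For anyone curious, a minimal sketch of how sccache is typically wired into a Docker build; this assumes BuildKit for the cache mount, and installing a prebuilt sccache binary instead of cargo install would be faster:
# wrap every rustc invocation with sccache and persist its cache across builds
RUN cargo install sccache
ENV RUSTC_WRAPPER=sccache SCCACHE_DIR=/sccache
RUN --mount=type=cache,target=/sccache cargo build --release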
We have:
This is complex, and doesn't seem to be helping IRL, so I'm closing this
@jorgecarleitao
Why can you not use sccache for that?
It also automatically handles changes of dependency versions, compiler versions, and so on.
In our case we don’t go for sccache because:
Our product consists of multiple components written in different languages. We support this with a small team, so we stick to native tooling and defaults as much as possible.
For those who have the benefit of a BuildKit-based Docker CI process... this is how I've been able to achieve < 1 minute test builds and < 2 minute release builds.
For Testing:
# define dependencies for temporary build
COPY src/Cargo.toml src/Cargo.toml
COPY test/Cargo.toml test/Cargo.toml
COPY Cargo.lock Cargo.lock
COPY Cargo.toml Cargo.toml
# dummy files so we can compile and build dependencies
RUN echo "fn main(){}" > src/lib.rs
RUN echo "fn main(){}" > test/lib.rs
# cache dependency compilation
RUN --mount=type=cache,target=/root/.cargo/registry/ RUSTFLAGS="-C target-feature=-crt-static" cargo test --package test
# remove dummy files and compilation cache (not dependency cache)
RUN rm -rf src test
RUN rm -rf target/debug/**/libsrc*
RUN rm -rf target/debug/**/libtest*
# copy actual src over that tends to change more often than dependencies
COPY build/cdn build/cdn
COPY test test
COPY src src
# run another build using our actual source code
RUN RUSTFLAGS="-C target-feature=-crt-static" cargo test --package test
For release/deployment
# dummy files so we can compile and build dependencies
RUN mkdir src && echo "fn main(){}" > src/lib.rs
RUN echo "[workspace]" > Cargo.toml && echo "members = [\"src\"] " >> Cargo.toml
# define dependencies for temporary build
COPY src/Cargo.toml src/Cargo.toml
COPY Cargo.lock src/Cargo.lock
# cache dependency compilation
RUN --mount=type=cache,target=/root/.cargo/registry/ RUSTFLAGS="-C target-feature=-crt-static" cargo build --release --lib
# remove dummy files and compilation cache (not dependency cache)
RUN rm -rf src
RUN rm -rf target/release/**/libsrc*
# copy actual src over that tends to change more often than dependencies
COPY build/cdn build/cdn
COPY src src
# run another build using our actual source code
RUN RUSTFLAGS="-C target-feature=-crt-static" cargo build --release
FROM release as bin1
WORKDIR /rust
COPY src/cfg src/cfg
COPY --from=release /rust/target/release/bin1 bin1
ENTRYPOINT ["/rust/bin1"]
FROM release as bin2
WORKDIR /rust
COPY src/cfg src/cfg
COPY --from=release /rust/target/release/bin2 bin2
ENTRYPOINT ["/rust/bin2"]
Having lost more of my life than I care to mention trying to get non-trivial incremental compilation working in something docker-like, I have come to the conclusion that only by removing mtime from the picture can we achieve stable caching. I think we need https://github.com/rust-lang/cargo/issues/6529 to be solved before there's any point in having a command that builds just the dependencies (for which we have reasonable workarounds). Currently there's no way to work around the mtime dependency.
I figured out how to get this also working with cargo workspaces, using romac's fork of cargo-build-deps.
This example has my_app, and two workspace members: utils and db.
FROM rust:nightly as rust
# Cache deps
WORKDIR /app
RUN sudo chown -R rust:rust .
RUN USER=root cargo new myapp
# Install cache-deps
RUN cargo install --git https://github.com/romac/cargo-build-deps.git
WORKDIR /app/myapp
RUN mkdir -p db/src/ utils/src/
# Copy the Cargo tomls
COPY myapp/Cargo.toml myapp/Cargo.lock ./
COPY myapp/db/Cargo.toml ./db/
COPY myapp/utils/Cargo.toml ./utils/
# Cache the deps
RUN cargo build-deps
# Copy the src folders
COPY myapp/src ./src/
COPY myapp/db/src ./db/src/
COPY myapp/utils/src/ ./utils/src/
# Build for debug
RUN cargo build
In Python there is a file called dependencies.txt, so in your Dockerfile you do something like this:
WORKDIR /my-stuff
COPY dev.dependencies.txt dependencies.txt ./
# in dependencies I declared fastapi, which installs uvicorn
RUN pip install -r dev.dependencies.txt -r dependencies.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080", "--reload"]
So you don't have to download all dependencies every time you change the code. With Rust, I'm unable to generate a compilation cache and build without creating a dummy project. It would be great to have something like:
WORKDIR /my-stuff
COPY Cargo.toml Cargo.lock ./
# this runs cargo fetch, cargo install, and finally cargo build
RUN cargo cache build
COPY . .
RUN cargo build
CMD ["cargo", "watch", "-x", "run"]
Then cargo cache could contain clean and build subcommands (or generate, or something else that describes the action of building the cache).
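For what it's worth, the download half of the proposed cargo cache build already exists as cargo fetch; a sketch of how far today's tools get (only network caching, no compile caching, and a dummy src/lib.rs is still needed for the manifest to parse):
WORKDIR /my-stuff
COPY Cargo.toml Cargo.lock ./
# dummy target so the manifest parses
RUN mkdir -p src && touch src/lib.rs
# downloads (but does not compile) all locked dependencies
RUN cargo fetch
COPY . .
RUN cargo build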
Look at https://github.com/rust-lang/cargo/issues/5120
@uselessscat This is why I created #8362 , I believe that's more or less what you mean...
As mentioned in #8362, I believe cargo-chef is an MVP of what this could look like from an API perspective.
A sample Dockerfile looks like this:
FROM rust as planner
WORKDIR app
RUN cargo install cargo-chef
COPY . .
RUN cargo chef prepare --recipe-path recipe.json
FROM rust as cacher
WORKDIR app
RUN cargo install cargo-chef
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json
FROM rust as builder
WORKDIR app
COPY . .
COPY --from=cacher /app/target target
RUN cargo build --release --bin app
FROM rust as runtime
WORKDIR app
COPY --from=builder /app/target/release/app /usr/local/bin
ENTRYPOINT ["./usr/local/bin/app"]
It does not require Buildkit.
I still believe this functionality should somehow find its way into cargo, but hopefully we can explore the space a bit with cargo-chef outside of it to understand the tradeoffs and actual usage patterns.
@LukeMathWalker this is much cleaner to have the recipe planning inside docker as well... you should make this the default example in the cargo-chef readme.
I thought I did, but it turns out I had not pushed the commit to remote :sweat_smile: Updated! @dessalines