Cargo: Support for pre-built dependencies

Created on 9 Jan 2015 · 26 Comments · Source: rust-lang/cargo

Currently you can add dependencies using path or git. Cargo assumes this is a location to source code, which it will then proceed to build.

My use-case stems from integrating Cargo into a private build and dependency management system. I need to be able to tell Cargo to only worry about building the current package. That is, I will tell it where the other already-built libraries are.

Consider two projects: a (lib) and b (bin) such that b depends on a:

[package]

name = "b"
version = "0.0.1"
authors = ["me <[email protected]>"]

[dependencies.a]

path = "/tmp/rust-crates/a"

A clean build will output something like:

> cargo build -v
   Compiling a v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /tmp/rust-crates/a/src/lib.rs --crate-name a --crate-type lib -g -C metadata=10d34ebdfa7a5b84 -C extra-filename=-10d34ebdfa7a5b84 --out-dir /private/tmp/rust-crates/b/target/deps --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target/deps -L dependency=/private/tmp/rust-crates/b/target/deps`
/tmp/rust-crates/a/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/tmp/rust-crates/a/src/lib.rs:1 fn it_works() {
/tmp/rust-crates/a/src/lib.rs:2     println!("a works");
/tmp/rust-crates/a/src/lib.rs:3 }
   Compiling b v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /private/tmp/rust-crates/b/src/lib.rs --crate-name b --crate-type lib -g -C metadata=429959f67e51bc23 -C extra-filename=-429959f67e51bc23 --out-dir /private/tmp/rust-crates/b/target --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target -L dependency=/private/tmp/rust-crates/b/target/deps --extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib`
/private/tmp/rust-crates/b/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/private/tmp/rust-crates/b/src/lib.rs:1 fn it_works() {
/private/tmp/rust-crates/b/src/lib.rs:2     println!("b works");
/private/tmp/rust-crates/b/src/lib.rs:3 }

Importantly:

--extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib

Would it make sense to expose an extern option (in dependencies.a) for low-level customization?

[dependencies.a]

extern = "/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib"

This can be worked around by using a build script along the lines of:

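use std::fs;
use std::path::Path;

fn main() {
    // Copy the pre-built rlib of `a` into a directory that rustc will search.
    let from = Path::new("/tmp/rust-crates/a/target/liba-b2092cdbfc1953bd.rlib");
    let to = Path::new("/tmp/rust-crates/b/blah/liba-b2092cdbfc1953bd.rlib");
    fs::copy(from, to).unwrap();
    // Ask Cargo to add that directory to the library search path.
    println!("cargo:rustc-flags=-L /tmp/rust-crates/b/blah");
}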

But it is not ideal to have to do this with every project.

All 26 comments

This is a Rust problem even more than a Cargo problem. You can't guarantee that a pre-built Rust library will work unless it's built with the exact same SHA of the compiler.

Yes unfortunately this would require changes to rustc itself, so it's unable to be tackled at this time.

The specific restriction I'm referring to is that you're basically limited to only working with binaries generated by the _exact_ revision of the compiler you're using, as well as the _exact_ same set of dependencies.

Is there something I can read/follow that explains the issues relating to why this requirement is so strict? Is this expected to change over time?

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature? Even something like letting the build script emit --extern as part of the whitelisted flags that cargo:rustc-flags can configure would help me out (assuming that is easier to implement than another top-level dependency option).

/cc @aturon: I had a chat with Steve on IRC about this and he suggested getting your input.

One big reason I want to avoid building dependencies over and over is that our build system rebuilds consumers - in my example, a change to a would trigger a rebuild of b (including tests) and if b failed, the new version of a would not be released. The implication of this is that when building b, a would be rebuilt a second time. This becomes really wasteful.

I'm happy to contribute the change required to implement this if the team feels it is a worthwhile feature. I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

I actually meant @alexcrichton not @aturon :)

Is there something I can read/follow that explains the issues relating to why this requirement is so strict?

Unfortunately no :(. We don't have a ton of documentation in this area, just a bunch of cargo-culted knowledge. In general though this is largely because of two primary reasons (that I can think of):

  1. The ABI for a library is not stable between compilations, even when theoretically ABI-compatible modifications are made.
  2. The metadata format for libraries, while extensible, is not currently used in an extensible way, as it regularly breaks backwards compatibility.

Is this expected to change over time?

Certainly! We probably won't invest too much energy into it before 1.0, but I'd love to see progress in this area!

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature?

I suppose it depends on how much cargo integration you want. In the example you gave in the second comment, the manifest probably says that b depends on a, in which case cargo will already pass --extern for a when it compiles b. Cargo would not only have to forward your --extern flags, but it would _also_ have to know to turn off its own --extern. Additionally, it would then have to cut a out of the dependency graph entirely.

In principle allowing --extern from rustc-flags would be possible, but it may have surprising results!

I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

I agree this would definitely be bad! I'd like to home in on what's going on here first though.

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

I suppose my other questions would be based on that answer, so I'll hold off for that :)

Thanks for the detailed answer Alex!

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

Imagine a is built by Travis. It outputs liba, documentation and so forth - a collection of artifacts. People mostly discard these artifacts in practice, but you might imagine a system where those artifacts are retained. I'm sure this is not conceptually dissimilar to what your build bots do - you get some named and versioned output that you can later use either for development (à la rustup) of yet more projects or for deployment purposes.

To integrate with this build system, one only needs to implement the simple contract: provide something that can be executed that will produce build artifacts. This is just a one-line shell script that turns around and calls cargo build, and we're done.

Along comes project b. It starts off the same as a until we decide to use some of the functionality that a provides. In cargo, you just add the dependency to the manifest - name and version. This build system works in the same way, so we add it to its manifest too. The build system uses this manifest to provide the build artifacts of a for b at compile time. Now the only thing left to do is adjust the path attribute under [dependencies.a] to point to the build artifacts ($A_ARTIFACTS/src, if you will) and we're golden.

(We now have the dependency declared in two places. We can either live with the duplication, or adjust our build script to copy them from one into the other.)

However, we've just hit the first real problem: b needs the _source code_ of a to compile, but this doesn't really fit in with the concept of a build artifact. We can cheat by adjusting our shell script to also copy the src of a into its build directory.

Hopefully, at this point, I've answered the latter question: the intention _is_ to use cargo to build libraries such as a or binaries such as b. The reasons:

  • it does pretty much everything we'd otherwise have to implement (w.r.t. rustc)
  • Cargo.toml is nicer than other interfaces like make for customization
  • it's good to keep things similar to the way the rest of the world does it
  • it makes importing third-party projects easier

The question then, is what happens when a changes? At a high level, the system tracks dependencies, rebuilds them according to the graph and fails if something in the build breaks. This means that b would be rebuilt against the new artifacts of a. If we change our shell script to also execute cargo test, this means that b gets a chance to veto the build as a whole if the change breaks it in some way.

And this brings us to the second problem. a is being built twice. If c depends on b, then the build will build a three times and b twice. This becomes incredibly wasteful pretty quickly. In the context of this specific system, it is also redundant because any changes to a will trigger b to be rebuilt; whereas in the "normal" world, b will be rebuilt if a changes but _only when b is explicitly built_.
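To make the cost concrete, here is a small sketch that counts compilations under the assumption described above: every package's build recompiles all of its transitive dependencies from source. The function names and the three-package graph (c depends on b, b depends on a) are illustrative only and are not part of Cargo or of the build system being discussed.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Collects every package reachable from `pkg` through `deps`
/// (its transitive dependencies) into `seen`.
fn transitive_deps<'a>(
    pkg: &'a str,
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    seen: &mut BTreeSet<&'a str>,
) {
    if let Some(direct) = deps.get(pkg) {
        for &d in direct {
            if seen.insert(d) {
                transitive_deps(d, deps, seen);
            }
        }
    }
}

/// Counts how often each package is compiled when every package's build
/// also recompiles all of its transitive dependencies from source.
fn build_counts<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for &pkg in deps.keys() {
        let mut seen = BTreeSet::new();
        transitive_deps(pkg, deps, &mut seen);
        *counts.entry(pkg).or_insert(0) += 1; // the package's own build
        for d in seen {
            *counts.entry(d).or_insert(0) += 1; // recompiled for this consumer
        }
    }
    counts
}

fn main() {
    // Hypothetical graph from the discussion: c -> b -> a.
    let mut deps = BTreeMap::new();
    deps.insert("a", vec![]);
    deps.insert("b", vec!["a"]);
    deps.insert("c", vec!["b"]);
    for (pkg, n) in build_counts(&deps) {
        println!("{pkg} is compiled {n} time(s)");
    }
}
```

With pre-built artifacts each package would be compiled once, so the overhead grows with the depth of the dependency chain.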

As I mentioned in the issue overview, I can work around this by using a cargo build script that adds a -L option to rustc, provided I just dump the libs output by the builds of all of its (recursive) dependencies. This works and completely solves my problem. Incidentally, it removes the need for the other changes to the build script (don't need to copy source code in, don't need to declare dependencies in Cargo.toml).
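A minimal sketch of such a build script follows, assuming the outer build system exports the directory holding the dumped libraries in an environment variable. The variable name PREBUILT_DEPS_DIR and the helper function are hypothetical; the script uses the `cargo:rustc-link-search` directive, which has the same effect as passing `-L` through `cargo:rustc-flags`.

```rust
use std::env;
use std::path::Path;

/// Formats the build-script directive that adds `dir` to rustc's library
/// search path (the equivalent of `-L <dir>`).
fn link_search_directive(dir: &Path) -> String {
    format!("cargo:rustc-link-search={}", dir.display())
}

fn main() {
    // PREBUILT_DEPS_DIR is a hypothetical variable set by the outer build
    // system; it is not something Cargo defines.
    match env::var("PREBUILT_DEPS_DIR") {
        Ok(dir) => println!("{}", link_search_directive(Path::new(&dir))),
        Err(_) => eprintln!("PREBUILT_DEPS_DIR is not set; no search path added"),
    }
}
```

Reading the directory from the environment avoids hard-coding paths in every project, although each project still needs to carry the script.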

But then we hit the problem of ABI compatibility, for which it sounds like there is no solution yet. This means I'll need to find a way of (effectively) adding the rustc SHA to the tuple that identifies an artifact (similar to disambiguating 32/64 bit builds). Or just going with the aforementioned option of building all dependencies for each consumer.
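One way to fold the compiler revision into an artifact's identity is to parse the output of `rustc -vV`, which prints `commit-hash:` and `host:` lines. The sketch below assumes that output shape; the identifier format, the function names, and the sample values are made up for illustration.

```rust
/// Extracts the value of a `key: value` line from `rustc -vV` output.
fn field<'a>(verbose_version: &'a str, key: &str) -> Option<&'a str> {
    verbose_version.lines().find_map(|line| {
        let (k, v) = line.split_once(": ")?;
        (k == key).then_some(v.trim())
    })
}

/// Builds an artifact identifier that includes the target and the compiler
/// commit, so libraries from different compiler revisions are never mixed.
fn artifact_id(name: &str, version: &str, verbose_version: &str) -> Option<String> {
    let host = field(verbose_version, "host")?;
    let commit = field(verbose_version, "commit-hash")?;
    Some(format!("{name}-{version}-{host}-{commit}"))
}

fn main() {
    // Sample text in the shape printed by `rustc -vV`; the values are made up.
    let sample = "rustc 1.0.0 (deadbeef 2015-01-01)\n\
                  binary: rustc\n\
                  commit-hash: deadbeef\n\
                  host: x86_64-unknown-linux-gnu\n\
                  release: 1.0.0\n";
    println!("{:?}", artifact_id("a", "0.0.1", sample));
}
```

In a real setup the text would come from running the compiler that is about to be used, and an artifact would only be reused when the full identifier matches.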

A question you might also ask is: "would hosting a crates.io mirror help with this?". It doesn't. Not because it doesn't "work", but because it only meets some of the requirements (such as private code, not having direct dependencies on external sources for security reasons).

One huge benefit we get out of a single extensible build system is that adding a dependency on a Rust package is no different to adding a dependency on a C, Ruby, Java, Python or Haskell package - they're just named and versioned artifacts. A big use case for me is going to be enabled by that: for example, authoring Rubygems in Rust to speed up performance-critical code paths.

I hope this detail makes my initial question more clear: cargo does things I'd otherwise have to implement myself, but it also does things I'd like to skip. Specifically, I'd like to be able to use something like path for exact control over where the dependency lives, but I don't want cargo to try to build it.

FWIW, I'm probably going to go with:

  • include source code in build artifacts
  • rebuild dependencies for each consumer when that consumer is built
  • the build system's build script should copy in Rust dependencies from the build system's manifest to Cargo.toml and specify the path

Alex points out that http://doc.crates.io/build-script.html#overriding-build-scripts could be extended to support overriding rust crates. We could then generate .cargo/config files on the fly.

Alright, after reading that over (thanks for taking the time to write it up!) it sounds like what we discussed on IRC is the best way to move forward with this. Specifically I'd be thinking of something like:

# .cargo/config
[target.$triple.rust.foo]
libs = ["path/to/libfoo.rlib", "path/to/libfoo.so"]
dep_dirs = [ ... ]

Note that the current overrides (target.$triple.$lib) I think may want to be renamed to target.$triple.native.$lib to give us some more leeway. When cargo detects this form of override, however, it will not build libfoo but instead just pass --extern foo=... to the paths listed and -L dependency=... to all of the values in dep_dirs.
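As a rough illustration of the translation described here, the sketch below turns one such override into extra rustc arguments. The struct mirrors the fields of the proposed config section and is hypothetical; it is not a type from Cargo, and the proposal itself is only a suggestion made in this thread.

```rust
/// Mirrors the fields of the override proposed above; not a real Cargo type.
struct RustOverride {
    name: String,
    libs: Vec<String>,
    dep_dirs: Vec<String>,
}

/// Builds the extra rustc arguments for one overridden crate:
/// `--extern name=path` per library and `-L dependency=dir` per directory.
fn rustc_args(o: &RustOverride) -> Vec<String> {
    let mut args = Vec::new();
    for lib in &o.libs {
        args.push("--extern".to_string());
        args.push(format!("{}={}", o.name, lib));
    }
    for dir in &o.dep_dirs {
        args.push("-L".to_string());
        args.push(format!("dependency={}", dir));
    }
    args
}

fn main() {
    // Paths are placeholders in the spirit of the config example.
    let o = RustOverride {
        name: "foo".to_string(),
        libs: vec!["path/to/libfoo.rlib".to_string()],
        dep_dirs: vec!["path/to/deps".to_string()],
    };
    println!("{}", rustc_args(&o).join(" "));
}
```

The harder part, which this sketch does not cover, is removing the overridden crate from the dependency graph so that Cargo does not also try to build it.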

One problem I can foresee, however, is that you mentioned not wanting to share the source code between projects. Cargo would still need some of the source, however, to read data such as the Cargo.toml. It doesn't need the entire code base, but it'll need at least that much.

Does that sound like what would work for you?

bump?

Please support this, it's frustrating that it doesn't work yet.
I have to prebuild the ring crate because on the server where I don't have root the GCC version is too old to build ring, so I build it on a server where I'm root and copy it over.
Is there any way right now to use the prebuilt rlib instead of compiling it from crates.io? Maybe with a build.rs script?

Nominated for discussion at the Cargo team meeting.

Was progress made on this at the meeting?

I created DHL as a workaround for this issue.

I don't know about "pre-built" dependencies, but it'd be nice to be able to build a variety of leaf crates without building the dependency crates more than once, if the leaf crates request the same features from the dependencies.

Bump?

Probably not: still nominated..

I'm still a bit in the dark about what happened here. Was anything discussed at the team meeting?

TL;DR Would just like to add another +1 for support for pre-built binaries. It would be great to get a follow up on what was discussed at the meeting.

Motivation Story

Last night we ran a workshop on the nannou creative coding framework. Seeing as nannou supports audio, graphics, lasers, etc along with quite a high-level API in a cross-platform manner, it has a lot of dependencies. It took between 5 minutes and 25 minutes (depending on the user's machine) for users just to build nannou and all of its dependencies for the first time in order for us to begin working through the examples together. Ideally in the future we would write a build script that attempted to first fetch pre-built dependencies before falling back to building from the src. It seems like the feature described within this issue would help to simplify this.

What about some global cache on disk for both the downloaded source code and the built binaries, so that if the compiler and crate version match it can avoid a rebuild? Similar to yarn.

This is also what Nix wants to do. There won't be a compiler mismatch for packages built with Nix, because Nix will use the compiler as a build input to the package.

Bump. Does anybody know what's happening here?

I would be interested to know what's happening here. Our company has some products in Rust and we've been working on a new build system. If you use Docker to do your builds, it's actually a good way to get around this problem, because you can do a cargo build during the container build process to cache the built dependencies. However, we also have to support building on Windows and macOS and you obviously can't get those in a container. This issue makes it difficult to have good build times in a build environment that makes use of on demand slaves.

@Twey I have a working solution for Nix, see https://github.com/rust-lang/cargo/pull/7079#issuecomment-508163585 for some context. Hopefully the patch in that PR can come in handy for other build systems.

Could this potentially be done by including the rustc version in the library format? I've created a pre-RFC about it.

How to fix ABIs across versions of rustc... That would be this issue then:

https://github.com/rust-lang/rfcs/issues/600

Elsewhere I also saw some discussions about if / how to implement binary packages. Well for that aspect there is already an open standard:

https://theupdateframework.com/adoptions/

At the bottom of that page there is a link to an implementation in Rust, which is still being developed. However, it may be of interest to people here.
