Cargo: Offline use with dependencies

Created on 11 Dec 2014 · 26 comments · Source: rust-lang/cargo

It seems cargo needs to update the package list from GitHub regardless of whether the dependencies are available locally (via path entries in .cargo/config files), and it aborts building a project if GitHub cannot be reached to synchronize the package list. This seriously restricts the usefulness of cargo when offline.

Most helpful comment

@najamelan I think your analysis is correct. If there's already an up-to-date Cargo.lock file and everything referenced there is in cache, Cargo can work without connectivity. Whenever it needs to create or update a Cargo.lock file, Cargo will update its copy of the registry in order to have the latest and greatest versions.

I think this is where Cargo can be improved: working with an out-of-date registry is often a fine fallback for when connectivity is unavailable.

All 26 comments

After a build cargo will generate a Cargo.lock which should prevent cargo from ever touching the network again after that. Do you have a Cargo.lock and cargo is still fetching remote data?

@alexcrichton So, is the recommended procedure that we commit a Cargo.lock and distribute that with our source tree for machines that do not have network access during a build?

This scenario came up during the meeting with the RelEng folks in Portland. After the sources are staged to the machines, the build machines themselves do not have network access and should not be attempting to touch it.

is the recommended procedure that we commit a Cargo.lock and distribute that with our source tree for machines that do not have network access during a build?

In general, applications should have a Cargo.lock checked in.

With https://github.com/rust-lang/cargo/issues/1063 implemented, then I would imagine the process would look like:

  1. Configure all output to be placed in the project directory
  2. On a machine with a network, execute cargo fetch (using the lockfile)
  3. Upload the project directory to a builder with no internet access for the build
  4. Build the project

Step 4 should not use the network because all dependencies have been fetched and they're all stored locally. @larsbergstrom does that sound like it'd be an ok process for RelEng? I just want to make sure that it's ok to have a step where dependencies are downloaded.
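For concreteness, here is a rough sketch of that flow, assuming the downloaded registry and crate sources are kept inside the project tree by pointing CARGO_HOME at it (directory and project names are illustrative only):

# on a machine with network access
cd myproject
CARGO_HOME=$PWD/.cargo cargo fetch       # resolves against Cargo.lock and downloads everything into ./.cargo

# copy the whole project directory, including ./.cargo, to the offline builder, then:
CARGO_HOME=$PWD/.cargo cargo build --release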

@alexcrichton I believe that will work for the two scenarios that they seemed to care the most about:
1) releng builders without internet access
2) being able to send a "full source distribution" to partners that does not require any internet access to build - this is how FFOS is supposedly sometimes delivered to partners building devices, who may not have internet access behind their corporate firewalls

Thanks!

@alexcrichton: As for having Cargo.lock: yes, that file was in my directory IIRC. But I have since copied what I needed from collect-rs into my project to make it compile offline again, so it's no longer easy to test how cargo handles this project's dependencies without an internet connection. Obviously, this is not desirable, but I saw no other way that day and I just didn't want to stop coding.

I hope "external" but locally available dependencies (accessable via path entries in .cargo/config files) will also work offline. So, ideally, one would be able to work on several projects locally that might depend on each other without cargo ever requiring internet access.

I think the holes around this have been plugged. If you have a Cargo.lock and continue to hit the network, please let me know!

@alexcrichton: Just letting you know that with cargo-0.3.0-nightly (14e8ed9 2015-06-15) I cannot build clippy because I'm on vacation (so unless I connect via mobile, I'm effectively offline). Granted, I could lock all dependencies to the local cache, but that's tedious. An --offline switch would have been quite helpful.

Is it recommended to commit the entire Cargo registry structure? It seems as though there is a lot of extra metadata in there, but I'm not sure how much Cargo needs.

@alexcrichton actually, it looks like I am running into this.

I have my repository where I've run CARGO_HOME=.cargo cargo fetch and then committed the entire .cargo tree. In my build environment (Gentoo portage), I move the .cargo tree into a temporary location and then run CARGO_HOME="${WORKDIR}/.cargo" cargo build. It then fails with:

Updating registry `https://github.com/rust-lang/crates.io-index`
failed to fetch `https://github.com/rust-lang/crates.io-index`

Caused by:
  [12] SSL error: error:140E0114:SSL routines:SSL_shutdown:uninitialized

@crawford it looks like something wasn't persisted in the cache as Cargo determined that it needed to download something.

@alexcrichton is there any way to get very verbose output from Cargo (short of strace)? I'm not 100% certain that my paths are correct. The man page says "Cargo can be instructed to use a .cargo subdirectory in a different location by setting the CARGO_HOME environment variable.", but I've had to explicitly include ".cargo" as part of my path.

You can try crawling through RUST_LOG=debug but there's probably a lot of output from that.
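For example, something like this captures it to a file for digging through later (the filename is arbitrary; the debug output goes to stderr):

RUST_LOG=debug cargo build --release --verbose 2> cargo-debug.log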

Well, the manifest path is correct.

DEBUG:cargo::build: executing; cmd=cargo-build; args=["cargo", "build", "--release", "--verbose", "--target=x86_64-pc-linux-gnu"]
DEBUG:cargo::ops::cargo_compile: compile; manifest-path=/build/amd64-usr/var/tmp/portage/coreos-base/coreos-metadata-0.1.0/work/coreos-metadata-0.1.0/Cargo.toml
DEBUG:cargo::ops::cargo_compile: loaded package; package=coreos-metadata v0.0.0 (file:///build/amd64-usr/var/tmp/portage/coreos-base/coreos-metadata-0.1.0/work/coreos-metadata-0.1.0)
DEBUG:cargo::ops::cargo_compile: loaded config; configs={}
DEBUG:cargo::core::resolver: activating coreos-metadata v0.0.0 (file:///build/amd64-usr/var/tmp/portage/coreos-base/coreos-metadata-0.1.0/work/coreos-metadata-0.1.0)
DEBUG:cargo::core::registry: load/missing  registry https://github.com/rust-lang/crates.io-index
DEBUG:cargo: handle_error; err=CliError { error: ChainedError { error: failed to fetch `https://github.com/rust-lang/crates.io-index`, cause: Error { klass: 12, message: "SSL error: error:140E0114:SSL routines:SSL_shutdown:uninitialized" } }, unknown: true, exit_code: 101 }

CARGO_HOME is set to:

/build/amd64-usr/var/tmp/portage/coreos-base/coreos-metadata-0.1.0/work/.cargo

Do you have a set of steps I can do to reproduce? Sounds like there may be quite a few pieces of infrastructure in play, so getting a set of steps to reproduce may help finding where the bug is.

Sure.

  1. git clone https://github.com/coreos/coreos-metadata.git
  2. cd coreos-metadata
  3. rm -rf .cargo (we are going to fetch it ourselves)
  4. CARGO_HOME=/tmp/.cargo cargo fetch
  5. CARGO_HOME=/tmp/.cargo cargo build

Then it updates the registry. In the log, I see a bunch of this:

DEBUG:cargo::core::registry: load/missing  registry https://github.com/rust-lang/crates.io-index

When I ran those steps locally everything worked as expected, so is there perhaps something else at play here?

Strange. I ran this again this morning and now it works. I've noticed that if there is a long wait between fetching and building, it will fail. When I looked at the git diff of my local registry, I saw:

changed:       .cargo/registry/index/github.com-0a35038f75765ae4 (new commits)

Once I have these changes, the build works. What is that hash appended to "github.com-"? Is there any other time-sensitive information encoded in the registry (i.e. how does the registry know it is up to date)?

@alexcrichton ah, interesting. Git was messing me up. The registry index is actually empty in the repo, but git told me everything was up-to-date (because it's a nested git repo). If I _actually_ add the index, it builds. I should have checked that first.
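For anyone hitting the same thing, one way to spot it is that git records a nested repository as a single gitlink entry rather than tracking its files, so listing the tracked contents of the index shows just one line instead of the individual files (paths as in the layout above):

git ls-files .cargo/registry/index/    # a single gitlink entry means the index contents were never actually committed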

Good to know it's working at least!

Yes, thanks for your time.

I just want to report that this is still a problem. I am a bit new to Rust and don't have internet at home. I needed some dependencies, so I added them to one crate and compiled; it downloads everything. I even went into the .cargo directory to call cargo doc (because that also needs internet), compiled the crates, and ran the examples, thinking that I would be fine. Then I go home, add those new dependencies to a different crate, compile, and ouch: internet needed. This is pretty confusing. Until now I have been using "*" as the version because I'm happy to have the latest version of my dependencies most of the time; maybe that's not smart? But even specifying the exact versions I had on disk didn't work, and even specifying them as { path = "/home/.cargo/...." } to hardcode their paths to the local versions didn't help, even though I can compile and run the examples in those crates.

So I just stopped working till I could make it to an internet access...

It would be nice for less privileged users if rust was easier to use offline.

As a suggestion: with Ruby I just do "gem install xxx" or "gem update" when online, and then Ruby as a programming language no longer depends on the internet; I can generate the docs with yard offline as well. That's a model that works for offline development.

I just figured out something: cargo wants internet as soon as a dependency changes in Cargo.toml, regardless of whether it's available locally or not. What changes in Cargo.lock is the dependencies = [...] array. I think it's fine that cargo checks whether there is a newer version of a crate, but failing that it should compile anyway if everything is available locally, and in reasonable time. Or there could be an offline switch so the internet is only tried when it really has to be.
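A hypothetical illustration of that behaviour (crate name and version are chosen arbitrarily):

# add any new entry under [dependencies] in Cargo.toml, e.g. rand = "0.3", then:
cargo build
# Cargo has to re-resolve and rewrite Cargo.lock, so the build starts with
#   Updating registry `https://github.com/rust-lang/crates.io-index`
# even if a matching version of the crate is already in the local cache.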

@najamelan I think your analysis is correct. If there's already an up-to-date Cargo.lock file and everything referenced there is in cache, Cargo can work without connectivity. Whenever it needs to create or update a Cargo.lock file, Cargo will update its copy of the registry in order to have the latest and greatest versions.

I think this is where Cargo can be improved: working with an out-of-date registry is often a fine fallback for when connectivity is unavailable.

@SimonSapin I'm even thinking that the usefulness of updating the registry probably depends on the TOML file. If a TOML file specifies the versions of dependencies, it's probably not useful to call home if that specific version is already available offline. If however the TOML file specifies dependencies as "*", maybe an update is a good idea?

I also see an argument for only updating anything if the user runs cargo update, cargo fetch, or dependencies are missing locally. The point is that it's probably a good idea for developers to be conscious about changing dependency versions, so as to make sure to re-run unit tests, check changelogs, update documentation, etc. It would make for a very predictable model. On top of this, choices like this could be configured in the TOML file (e.g. update = automatic).
