Virtual-environments: Old cargo cache left over from the build in Windows and Linux

Created on 18 May 2020 · 13 Comments · Source: actions/virtual-environments

Describe the bug
Parts of the ~/.cargo file tree baked into the virtual environments windows-latest and ubuntu-latest contain cached crate registry data left over from the build.

Area for Triage:
Rust

Question, Bug, or Feature?:
Bug

Virtual environments affected

  • [ ] macOS 10.15
  • [ ] Ubuntu 16.04 LTS
  • [x] Ubuntu 18.04 LTS
  • [ ] Windows Server 2016 R2
  • [x] Windows Server 2019

Expected behavior
The only contents of the ~/.cargo directory that should be present in the pristine virtual environment are the Rust toolchain binaries in bin/ and the env script, and possibly some configuration dotfiles. The subdirectories registry and git cache dependencies for package builds and should either be left empty or not be present.

Actual behavior
A description with steps to reproduce the issue.

  1. Run this GitHub workflow:
name: Test cargo behavior

on: push

jobs:
  generate-lockfile:
    strategy:
      matrix:
        expunge-index: [true, false]
    runs-on: windows-latest
    steps:

      - uses: actions/checkout@v2

      - name: Install Rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true

      - if: ${{ matrix.expunge-index }}
        name: Clear cargo registry index
        run: |
          rm -r -fo ~\.cargo\registry\index

      - run: cargo generate-lockfile -vv
  2. Check the time it takes to run cargo generate-lockfile in each of the two jobs.

Cargo attempts to check out the head of the crates.io registry index repository into a predictably named directory under ~/.cargo/registry/index.
Apparently, if the working copy directory is already present, but its local head commit is way back in the current history of the frequently updated index, the git backend ends up doing something that works way slower on Windows than a clean clone. This should probably be fixed in cargo, but it will take more effort than doing a bit more post-build cleanup here.

There are also crate sources left in ~/.cargo/registry/cache and ~/.cargo/registry/src; the whole thing looks like a leftover from a cargo install performed during the build of the virtual environment. These files need to be removed, as they only bloat the images.
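
The post-build cleanup asked for here could look roughly like the sketch below. The function name `clean_cargo_caches` is made up for illustration and is not taken from the image generation scripts; the sketch assumes the default layout, where `registry` and `git` sit directly under the cargo home directory.

```shell
# Hypothetical helper, not part of the virtual-environments scripts.
# Removes cargo's download caches while leaving bin/, env and config files alone.
clean_cargo_caches() {
    # Use the argument if given, else $CARGO_HOME, else ~/.cargo.
    cargo_home="${1:-${CARGO_HOME:-$HOME/.cargo}}"

    # Refuse to run against something that is not a directory.
    [ -d "$cargo_home" ] || return 1

    # registry/ holds the index clone, downloaded .crate archives (cache/)
    # and unpacked sources (src/); git/ holds checkouts of git dependencies.
    rm -rf "$cargo_home/registry" "$cargo_home/git"
}
```

Called with no arguments after the cargo install step, the function would act on the default cargo home; the installed binaries in bin/ are not touched.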

Labels: Rust, Ubuntu, Windows, investigate

All 13 comments

This and this are the culprits for the crate build. The cached crates are from the dependency graph of the bindgen tools.
Curiously, bindgen and cbindgen do not seem to be installed on macOS. I wonder if this has been overlooked.

Hi, @alexcrichton,
What do you think about safely removing the ~/.cargo/registry/cache and ~/.cargo/registry/src folders after running cargo install bindgen cbindgen?

I would personally say that it's a nice feature to have a pre-populated cache in all the images. That should help amortize the initial update of the index since much of it is already downloaded.

If Windows is super slow though in bringing the registry forward that's definitely something that should be fixed on Cargo's side!

> I would personally say that it's a nice feature to have a pre-populated cache in all the images. That should help amortize the initial update of the index since much of it is already downloaded.

I would expect there to be more flexible caching schemes than baking in an index working copy that is only updated with the frequency of the virtual environment image. The index I see in a test run made today had its head commit made on Apr 30. It does not currently make the index sync much faster on Windows, compared to the time it takes to clone from scratch.

With the Cargo.lock checked into the repository, it's easy to organize a cache that would hit the perfect registry state for the build most of the time, or fall back to one that's close enough.
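
As a rough illustration of keying a cache on Cargo.lock, the sketch below derives a key from the lockfile contents. The function name `cargo_cache_key` and the `<prefix>-cargo-<hash>` key format are assumptions made for this example, not something any caching action prescribes.

```shell
# Hypothetical helper: derive a cache key from the contents of Cargo.lock.
cargo_cache_key() {
    prefix="$1"                  # e.g. an OS name; any stable string works
    lockfile="${2:-Cargo.lock}"  # defaults to Cargo.lock in the current directory
    [ -f "$lockfile" ] || return 1

    # Hash the lockfile; fall back to shasum where sha256sum is missing.
    if command -v sha256sum >/dev/null 2>&1; then
        hash=$(sha256sum "$lockfile" | cut -d ' ' -f 1)
    else
        hash=$(shasum -a 256 "$lockfile" | cut -d ' ' -f 1)
    fi

    printf '%s-cargo-%s\n' "$prefix" "$hash"
}
```

An unchanged lockfile yields the same key, so the cache hits exactly; when the lockfile changes, the shared `<prefix>-cargo-` part could serve as a fallback prefix for restoring a cache that is close enough.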

For testing Rust libraries, more elaborate caching schemes are possible. I've got one that synchronizes the registry index from a "cold" state only once, and only fetches that part of the registry cache before it generates the Cargo.lock that is used for keying the cache for everything else. I intended to have the index pre-restored from a differently affine cache too, but cargo generate-lockfile works fast enough on the current vanilla ubuntu-latest environment that I did not bother. Now, with the same test workflow linked above, I figure that it's exactly the accidentally pre-populated index that makes it fast on Linux. Still, proper caching should make it reliably fast.

I don't think that the dependency sources from a bindgen build are very good pre-population material, either; I expect most workflow builds to lock on different versions of the crates and depend on too many other crates, so this occasionally pre-populated cache won't be of much help. In any case, it's safe to remove src, as any missing sources are auto-unpacked by cargo from the crate archives in cache.

This is not merely a performance problem, it's causing caching failures on Windows: https://github.com/actions/cache/issues/198#issuecomment-633390550

The images with the cleaned cargo cache have been deployed.
@mzabaluev I'm going to close the issue, but feel free to contact us if you have any concerns.
Thank you!

The directory is confirmed clear on Windows, but the changes do not seem to be deployed on ubuntu-latest yet: test run

@mzabaluev thanks for the update!
@vmapetr could you take a look, please?

FYI, it's a little bit surprising that recursively listing or removing a nonexistent directory can take 15 or even more than 30 seconds on Windows.

Hi @mzabaluev! Sorry for the late reply. Is this issue still relevant for you?

@AlenaSviridenko It was relevant insofar as I had to add annoying workarounds on Windows.
Now that the Windows environment has been cleaned up, there should be no more ill effects, apart from the waste of filesystem space in the Linux environment. I don't know if this might affect setup times.

@mzabaluev thanks for the update. I've prepared a PR with the fix for Ubuntu: https://github.com/actions/virtual-environments/pull/1217

@mzabaluev /home/runner/.cargo/registry is empty across all ubuntu images now. I'm going to close the issue.
Thank you!
