Rust distribution uses massive amount of storage space

Created on 23 Nov 2016 · 7 Comments · Source: rust-lang/rust

I'm a school student trying to work with Rust on lab computers. Students have between 0.5 GB and 1.5 GB of storage space.

My .multirust folder looks like this:

cpaten2@teaching:~/.multirust$ du -h --threshold=5MB
192M    ./toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib
192M    ./toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu
192M    ./toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib
356M    ./toolchains/nightly-x86_64-unknown-linux-gnu/lib
12M     ./toolchains/nightly-x86_64-unknown-linux-gnu/bin
368M    ./toolchains/nightly-x86_64-unknown-linux-gnu
368M    ./toolchains
368M    .

Is any of this superfluous? As it is, I'd have to delete everything else in my user account to get Rust to run.

A-rustbuild C-enhancement

All 7 comments

cc @brson

If I run a duplicate file finder on the nightly toolchain, every one of the .so libraries is duplicated.

There are 39 duplicated .so files, and they are not hard links.

duff -r .multirust/toolchains/nightly-x86_64-unknown-linux-gnu/
2 files in group 1 (5089280 bytes, hash 79db019d5bca0ec568de06c5c8e87fd4cf402b44)
.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/librustc_driver-1357b93f.so
.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_driver-1357b93f.so
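For anyone without duff installed, the same check can be roughly reproduced with a short Python sketch. This is not how duff works internally; it is a stand-in that groups files by size and SHA-1 digest and then asks whether two paths already share an inode. The function names (`file_digest`, `find_duplicates`, `already_linked`) are made up for this illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by size, then digest; keep groups of 2+."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        # A unique size cannot have a duplicate; empty files are ignored.
        if len(paths) < 2 or size == 0:
            continue
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)


def already_linked(a, b):
    """True if both paths refer to the same file on disk (same inode)."""
    return os.path.samefile(a, b)
```

Calling `find_duplicates` on the toolchain directory should return one list of paths per duplicate group, and `already_linked` on a pair from a group tells you whether the two copies really occupy separate space.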

One is the compiler's version and the other is the version built by that compiler. In theory they should be identical by stage2, but I believe we've always been a bit leery about guaranteeing that.

According to duff they are byte-for-byte identical. A quick hack for @lilred is to use hard links to deduplicate them; that saves 160 MB.
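A minimal sketch of that hard-link hack in Python, assuming both copies live on the same filesystem (hard links cannot cross filesystems) and that nothing rewrites the files in place afterwards. The helper name `hardlink_duplicate` and the `.dedup-tmp` suffix are invented for this example; a later `rustup update` will most likely reinstall separate copies.

```python
import filecmp
import os


def hardlink_duplicate(keep, dup):
    """Replace dup with a hard link to keep if the contents are byte-identical.

    Returns the number of bytes saved (0 if nothing was changed).
    """
    if os.path.samefile(keep, dup):
        return 0  # already the same inode, nothing to do
    if not filecmp.cmp(keep, dup, shallow=False):
        return 0  # contents differ, leave both files alone
    size = os.path.getsize(dup)
    tmp = dup + ".dedup-tmp"  # hypothetical temporary name next to dup
    os.link(keep, tmp)        # create a second name for keep's data
    os.replace(tmp, dup)      # swap it in place of the duplicate
    return size
```

For the pair reported above, `keep` would be the copy under `lib/` and `dup` the copy under `lib/rustlib/x86_64-unknown-linux-gnu/lib/`. The comparison is done on full contents rather than on a hash, so a file is only replaced when it really is identical.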

It seems like before we install them we could check if they're actually byte-identical, and if they are only install a single copy.

Typical rustup update log below. Note that it uses two components, rustc for the host and rust-std for the target; in the common case these overlap, so that's the explanation for the duplicate files.

rustup update nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
info: downloading component 'rustc'
 51.3 MiB /  51.3 MiB (100 %)   7.9 MiB/s ETA:   0 s                
info: downloading component 'rust-std'
 61.0 MiB /  61.0 MiB (100 %)  12.1 MiB/s ETA:   0 s                
info: downloading component 'cargo'
info: installing component 'rustc'
info: installing component 'rust-std'
info: installing component 'cargo'

  nightly-x86_64-unknown-linux-gnu updated - rustc 1.15.0-nightly (d5814b03e 2016-11-23)

This is still a fairly large problem today. The distribution has kept growing, primarily because we are adding crates.io crates to the compiler's build, so their rlibs and dylibs are stored and shipped with the compiler.

  498.2 MiB  /nightly-x86_64-unknown-linux-gnu
  436.8 MiB  /beta-x86_64-unknown-linux-gnu
  385.3 MiB  /stable-x86_64-unknown-linux-gnu
  385.3 MiB  /1.17.0-x86_64-unknown-linux-gnu
  303.5 MiB  /1.8.0-x86_64-unknown-linux-gnu