Problem
$ rustup component remove rust-docs
info: removing component 'rust-docs'
info: rolling back changes
error: could not create temp directory: /acct/jynelson/.local/lib/rustup/tmp/ibqoveyhjujffe52_dir
# I ran cargo clean so I had enough disk space
$ rustup component remove rust-docs
info: removing component 'rust-docs'
warning: during uninstall component rust-docs was not found
$ ls ~/.local/lib/rustup/toolchains/stable-x86_64-unknown-linux-gnu/share/
doc man zsh
$ rustup toolchain install stable
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
stable-x86_64-unknown-linux-gnu unchanged - rustc 1.47.0 (18bf6b4f0 2020-10-07)
error: rustup is not installed at '/acct/jynelson/.local/lib/cargo'
Notes
Output of rustup --version: rustup 1.20.2 (13979c968 2019-10-16)
Output of rustup show:
Default host: x86_64-unknown-linux-gnu
rustup home: /acct/jynelson/.local/lib/rustup
I'd test with a newer version, but apparently rustup self update is also confused:
$ rustup self update
error: rustup is not installed at '/acct/jynelson/.local/lib/cargo'
rustup shouldn't be installed at .local/lib/cargo anyway. Have you set some environment variables, perhaps? What does 'rustup show' output?
This might be a case of https://github.com/rust-lang/rustup/issues/2417 but I doubt it, I think something different is going on here.
I have no idea what we do when we get ENOSPC trying to make a tempdir, nor how we ended up with a broken set of metadata. Working out how to reproduce this will be "interesting".
@rbtcollins here's a smaller reproduction, with only CARGO_HOME set but not RUSTUP_HOME.
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$ cat /dev/urandom > oops-too-big
cat: write error: Input/output error
cat: write error: Disk quota exceeded
$ rustup component remove rust-docs
info: removing component 'rust-docs'
info: rolling back changes
error: could not create temp directory: /acct/jynelson/.rustup/tmp/mr5scujdx35d5xjb_dir
$ rustup component remove rust-docs
info: removing component 'rust-docs'
warning: during uninstall component rust-docs was not found
Strangely, rustup self update is no longer broken, and I can now uninstall the whole toolchain:
$ rustup self update
info: checking for self-updates
rustup unchanged - 1.22.1
$ rustup toolchain uninstall stable
info: uninstalling toolchain 'stable-x86_64-unknown-linux-gnu'
info: toolchain 'stable-x86_64-unknown-linux-gnu' uninstalled
So I consider this 'works for me'. But it would still be nice to fix the inconsistent toolchain state.
@kinnison here's that strace:
write(2, "\33[1m", 4) = 4
write(2, "info: ", 6info: ) = 6
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33(B\33[m", 6) = 6
write(2, "removing component 'rust-docs'", 30removing component 'rust-docs') = 30
write(2, "\n", 1
) = 1
statx(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", O_RDONLY|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
read(3, "cargo-x86_64-unknown-linux-gnu\nc"..., 213) = 212
read(3, "", 1) = 0
close(3) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
getrandom(NULL, 0, GRND_NONBLOCK) = 0
getrandom("\x7f\x61\xbf\x93\x73\x6d\xe7\x31\x49\x53\xb0\xbb\xee\xfc\x51\x0c\x6d\x02\xa8\x25\x9e\xc6\x48\xba\xb3\xff\x50\xce\x7c\x35\x7b\x76", 32, 0) = 32
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7ffe97eb8710) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3
close(3) = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
read(3, "cargo-x86_64-unknown-linux-gnu\nc"..., 8192) = 212
read(3, "", 8192) = 0
write(4, "cargo-x86_64-unknown-linux-gnu\nc"..., 177) = 177
close(4) = -1 EDQUOT (Disk quota exceeded)
close(3) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7ffe97eb8110) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3
close(3) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", O_RDONLY|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0100644) = 4
statx(4, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0
fchmod(4, 0100644) = 0
copy_file_range(3, NULL, 4, NULL, 212, 0) = 212
close(4) = -1 EDQUOT (Disk quota exceeded)
close(3) = 0
rename("/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components") = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/manifest-rust-docs-x86_64-unknown-linux-gnu", O_RDONLY|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=24, ...}) = 0
read(3, "dir:share/doc/rust/html\n", 50) = 24
read(3, "", 26) = 0
close(3) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp/tx3h0o7z7p7_09_y_dir", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7ffe97eb8160) = -1 ENOENT (No such file or directory)
mkdir("/acct/jynelson/.rustup/tmp/tx3h0o7z7p7_09_y_dir", 0777) = -1 EDQUOT (Disk quota exceeded)
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7ffe97eb87f0) = -1 ENOENT (No such file or directory)
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33[1m", 4) = 4
write(2, "info: ", 6info: ) = 6
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33(B\33[m", 6) = 6
write(2, "rolling back changes", 20rolling back changes) = 20
write(2, "\n", 1
) = 1
rename("/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components") = 0
statx(AT_FDCWD, "/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", AT_STATX_SYNC_AS_STAT, STATX_ALL, 0x7ffe97eb8320) = -1 ENOENT (No such file or directory)
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33[31m", 5) = 5
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33[1m", 4) = 4
write(2, "error: ", 7error: ) = 7
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\33(B\33[m", 6) = 6
write(2, "could not create temp directory:"..., 33could not create temp directory: ) = 33
write(2, "/acct/jynelson/.rustup/tmp/tx3h0"..., 47/acct/jynelson/.rustup/tmp/tx3h0o7z7p7_09_y_dir) = 47
write(2, "\n", 1
) = 1
sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=8192}, NULL) = 0
munmap(0x7fa25274e000, 12288) = 0
exit_group(1) = ?
+++ exited with 1 +++
So the syscall sequence which fills me with the heebie jeebies in particular is:
openat(AT_FDCWD, "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components", O_RDONLY|O_CLOEXEC) = 3
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=212, ...}) = 0
openat(AT_FDCWD, "/acct/jynelson/.rustup/tmp/lq90se25u5caz4hl_file", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0100644) = 4
statx(4, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_BASIC_STATS, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0
fchmod(4, 0100644) = 0
copy_file_range(3, NULL, 4, NULL, 212, 0) = 212
close(4) = -1 EDQUOT (Disk quota exceeded)
close(3) = 0
rename("/acct/jynelson/.rustup/tmp/3buply84_zc_1f8f_file", "/acct/jynelson/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/components") = 0
I smell transactional rollback leaping blindly into the unknown... which, if correct, makes it a case of #2417 indeed, at least for the corrupt components list.
Hmm, no, the rollback starts later; the corruption here starts when a failure is signalled via close() but not detected in our code.
I suspect we need some File::flush() calls to fix this. Note that this doesn't imply fsync in Rust.
https://github.com/rust-lang/rust/blob/4c0c5e099a3b1f1c6ad53115189c2710495588b3/library/std/src/sys/unix/fs.rs#L861
So from a performance perspective this is a no-op: all we are doing is ensuring that the intermediate buffers accumulating the content are flushed out to the OS before the drop-is-close semantics kick in. For most, but not all, file systems that will be enough to surface this error earlier. The exception is NFS, where we can't flush the OS buffer short of an fsync. We may have to do that for some select files, but as it won't actually fix the out-of-space issue, we'll need to think more on this anyway: the goal here should be to not corrupt things, rather than 'succeeding'.
Though the fact that copy_file_range succeeded and close(4) failed leaves me in some doubt about this. Possibly we are just flat out of luck and really do need close() as an actually fallible syscall here.
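For reference, a minimal sketch of the flush-before-drop idea discussed above, assuming the content goes through a std BufWriter (an assumption; the thread does not say which writer rustup uses). The helper name write_buffered is hypothetical. As the strace suggests, an error that the kernel only reports at close() would still slip past this.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical helper: write `contents` through a BufWriter and flush explicitly.
fn write_buffered(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    writer.write_all(contents)?;
    // Without this call, BufWriter's Drop would flush and silently discard any error.
    // This only pushes user-space buffers to the OS; it is not an fsync.
    writer.flush()?;
    Ok(())
}
```

This catches write errors that would otherwise be swallowed on drop, but it cannot observe a failure that is first reported by close().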
We can access close() ourselves by getting the fd via https://doc.rust-lang.org/std/os/unix/io/trait.AsRawFd.html + https://docs.rs/libc/0.2.65/libc/fn.close.html
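A rough sketch of what that could look like, not a tested fix. Two assumptions: it uses IntoRawFd rather than AsRawFd so that File's Drop does not close the descriptor a second time, and it declares close() directly instead of pulling in the libc crate, purely to keep the snippet dependency-free (the `unsafe extern` form needs a recent Rust toolchain). The name close_checked is hypothetical.

```rust
use std::fs::File;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::IntoRawFd;

// Declared by hand so the sketch needs only std; libc::close is the same symbol.
unsafe extern "C" {
    fn close(fd: c_int) -> c_int;
}

/// Hypothetical helper: close `file` and report any error from close(2).
fn close_checked(file: File) -> io::Result<()> {
    // into_raw_fd() gives up ownership, so the fd is not closed again on drop.
    let fd = file.into_raw_fd();
    // SAFETY: `fd` was just released from `file` and nothing else owns it.
    let rc = unsafe { close(fd) };
    if rc == 0 {
        Ok(())
    } else {
        // On a quota failure like the one in the strace this would carry EDQUOT.
        Err(io::Error::last_os_error())
    }
}
```

A caller would invoke close_checked(file)? on the temp file before the rename, and skip the rename if it returns an error.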
@rbtcollins if you want to push a change using flush() or sync_all(), I can build rustup from source and see if that helps (as opposed to a full close() call).
There's some chance that sync_all would force the error to be exposed, but it would also have a crippling performance impact, so other than exploring the space to understand things, I really don't want to do that. And flush() is a no-op on the file, so for our case here I think it will have no effect.
To keep all the info in one place, there is https://crates.io/crates/close-file but it's not quite right in the head. Alex also wrote https://gist.github.com/alexcrichton/0489d44efb7b3a6aa96fae044dd1be23 which is a bit better, though still not fully correct or ideal.
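Purely for exploring the space, here is a sketch of the sync_all() variant: write the new contents to a temp file, sync, and only rename over the real file if that succeeded. Whether sync_all() actually surfaces the quota error on this file system is unverified, and the performance cost mentioned above applies. The helper name replace_file_checked and its parameters are hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: replace `target` with `contents` only if the write was synced.
fn replace_file_checked(target: &Path, tmp: &Path, contents: &[u8]) -> io::Result<()> {
    let written = (|| -> io::Result<()> {
        let mut file = File::create(tmp)?;
        file.write_all(contents)?;
        // Unlike Drop, sync_all() returns the error from the underlying sync call.
        file.sync_all()
    })();
    match written {
        // Only now is it reasonable to replace the original file.
        Ok(()) => fs::rename(tmp, target),
        Err(e) => {
            // Best-effort cleanup; `target` is left untouched.
            let _ = fs::remove_file(tmp);
            Err(e)
        }
    }
}
```

The point of the shape is that a failed write leaves the original components file in place instead of renaming a truncated temp file over it.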