Given that probably the most common model for handling errors in Rust programs is just to bubble them up, it is far too common to see errors like this:
mqudsi@ZBOOK /m/c/U/m/g/mytube> env RUSTC_WRAPPER=sccache cargo build
error: failed to run `rustc` to learn about target-specific information
Caused by:
process didn't exit successfully: `sccache rustc - --crate-name ___ --print=file-names -C target-cpu=native -C link-arg=-fuse-ld=lld --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=cfg` (exit code: 2)
--- stderr
error: No such file or directory (os error 2)
Note the last line: sccache failed because it couldn't find some file. Which file? Who knows. You'll need to debug or grep the code to find out.
This is actually because std::io::Error is guilty of the same thing: it just bubbled up the OS error without adding any context.
This particular sccache error is just the one I was dealing with last before filing this issue, but I run into this about every other day when dealing with third-party crates in the ecosystem. In my own code, every crate has to wrap std::io::Error in a new error type that includes the file path. At the call site where a std::io::Result is first returned, every std::io::Error gets mapped to that custom error, which carries the path in question, if any.
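For readers who have not written this pattern, here is a minimal sketch of the kind of wrapper described above. The names PathIoError and open_with_path are hypothetical and chosen only for illustration; they are not from std or from any particular crate.

```rust
use std::error::Error;
use std::fmt;
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical wrapper type: an io::Error plus the path it relates to, if any.
#[derive(Debug)]
pub struct PathIoError {
    pub path: Option<PathBuf>,
    pub source: io::Error,
}

impl fmt::Display for PathIoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.path {
            // Prefix the OS error text with the path when one is known.
            Some(p) => write!(f, "{}: {}", p.display(), self.source),
            None => write!(f, "{}", self.source),
        }
    }
}

impl Error for PathIoError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Hypothetical helper: open a file and map any failure right at the call site,
/// copying the path into the error only when the open actually fails.
pub fn open_with_path(path: &Path) -> Result<File, PathIoError> {
    File::open(path).map_err(|source| PathIoError {
        path: Some(path.to_path_buf()),
        source,
    })
}
```

With this in place, a failed open_with_path(Path::new("missing.txt")) displays the path followed by the OS error text, instead of the bare OS error alone.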
I would like to propose that - at least in debug mode, although preferably always - any IO operations that involve a named path should include that path in their error text, and perhaps even the error type should be extended to include an Option<PathBuf> field (or select APIs returning a different error type).
I know this comes with some caveats. It's hard to use a reference to the path to generate the error text since that would tie the lifetime of the error to (likely) the scope it was called from, making bubbling it up harder, so it would necessarily have to copy the value. This increases the size of std::io::Error for cases where errors are not being bubbled up, but I've been mulling this in the back of my head for some time now, and I think the ergonomics outweigh the cost... and I'm also hopeful someone can figure some clever trick around that.
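To make the copy-versus-borrow trade-off concrete, here is a rough sketch of what can already be done in user code today while keeping std::io::Error as the error type. The WithPath trait and the read_config function are hypothetical names, not part of std. The path is only borrowed for the duration of the call and is copied into the error message on the failure branch alone.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical extension trait: attach a path to an io::Error's message.
pub trait WithPath<T> {
    fn with_path(self, path: &Path) -> io::Result<T>;
}

impl<T> WithPath<T> for io::Result<T> {
    fn with_path(self, path: &Path) -> io::Result<T> {
        // The path is formatted (and allocated) only when an error occurred,
        // so the success path pays nothing beyond the branch.
        self.map_err(|e| io::Error::new(e.kind(), format!("{}: {}", path.display(), e)))
    }
}

/// Hypothetical caller: the returned error owns its message, so it can be
/// bubbled up freely without tying its lifetime to `path`.
pub fn read_config(path: &Path) -> io::Result<String> {
    fs::read_to_string(path).with_path(path)
}
```

One cost of this particular approach is that the rebuilt error keeps the ErrorKind but no longer reports the raw OS error code through raw_os_error(), which is part of why a dedicated field on the error type is the more attractive design.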
fs::File consists of a Unix file descriptor, which lacks any filename information. fs::File cannot be expanded because doing so would violate Rust's zero-cost abstraction goal.
anyhow::Error allows adding context if you like.
Also, io::{Read,Write,Cursor} need io::Error, but should really be made usable in no_std and even without alloc, which limits our options for adding anyhow::Error functionality.
We cannot include this information in a zero cost way, so it would be a price that everyone would need to pay (even if they don't want to). As such, it would not fit well into std; a library providing a wrapper around std which adds relevant context could be viable though. Thanks for the suggestion, though!
If you want to discuss this further, I recommend going to internals.rust-lang.org.
Would it be inappropriate even in debug mode only?
We would need an RFC at minimum to design such a change and I personally am not sure that it's possible given that we only ship one std (which is always compiled in release mode).
(Regardless, this is a sufficiently large request to need an RFC if you want to pursue it further).
Yes, I'd wager most code that benefits from this in debug would benefit in release too, making the debug effort wasted. In C/C++ you must attach such contextual information manually. anyhow::Error improves manual contexts with clean error propagation.
On Linux, you could read the path of an open file using /proc/self/fd and fs::read_link:
https://stackoverflow.com/a/1189582/667457 It can return an incorrect name if the file gets moved, but roughly this should work:
use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::PathBuf;

// Linux only: /proc/self/fd/<fd> is a symlink to the file the descriptor refers to.
fn file_to_path(file: &File) -> io::Result<PathBuf> {
    let fd = file.as_raw_fd();
    // format! avoids mixing fmt::Error into an io::Result via `?`.
    std::fs::read_link(format!("/proc/self/fd/{}", fd))
}
In principle, you could write an anyhow::io crate that duplicates the std::io traits with such functionality. It's plausible one could define some new architecture linux-with-filename-error-context in https://github.com/rust-lang/rust/tree/master/src/libstd/sys but probably easier doing this through new traits.
I object to the idea that adding this is not zero cost. io::Error already contains an enum with several variants that needs to be copied around. File::open and other functions using Path already have to allocate CString for conversions. Adding this allocated string to io::Error is literally two MOV instructions around syscall. Syscalls are much more expensive than two MOV instructions, so if anyone wants to optimize anything, they should optimize number of syscalls, not number of move instructions associated with each syscall.
The benefit of adding Path to io::Error is enormous and should not be dismissed just because it requires two more MOV instructions.
The only issue I see is that paths will not be present in operations after opening the file. However in my experience errors after opening the file are extremely rare and they are pretty much guaranteed to be problems with whole filesystem (full, device unplugged...) in which case the specific file doesn't matter that much anyway.
I'm willing to write the RFC if my arguments convinced you that it's worth pursuing.
I think I've personally been slowly more convinced over time that this is probably the right thing to do if we can pull it off on benchmarks and such (seems not implausible, given existing allocations etc) -- I think the right next step is to kick off some discussion on internals.rust-lang.org and get a read on how libs team feels about this (#t-libs on Zulip is probably best for that). See also https://github.com/rust-lang/rfcs/pull/2979, which tries to codify some of these processes :)