Runtime: Define and add `FileSystemError` enum and matching property to IOException

Created on 22 Dec 2018  路  62Comments  路  Source: dotnet/runtime

(similar to dotnet/corefx#34220)

Scenario: I want to create and write some default data into a file unless it already exists.

A naive implementation would be:
```c#
if (!File.Exists(path))
{
using (var fileStream = File.Open(path, FileMode.CreateNew))
{
// ...
}
}

The problem with this is that I'm accessing the file system at 2 points in time. As @jaredpar has taught me, `File.Exists` is evil. There's no guarantee that because `File.Exists` returned false, the file still won't exist when calling `File.Open`.

To be robust, we should just call `File.Open` and catch the exception it throws in case the file already exists. The problem is that it throws `System.IO.IOException` with a message of "The file already exists". There is no specific exception type for this scenario. At first it would seem that the only thing we can do is catch the exception depending on its message string (which is a terrible idea), but luckily, there is a specific HResult for this failure, leaving us with:

```c#
Stream fileStream = null;

try
{
    fileStream = File.Open(path, FileMode.CreateNew);
}
catch (IOException e) when (e.HResult == -2147024816) // FILE_EXISTS
{
}

if (fileStream != null)
{
    using (fileStream)
    {
        // ...
    }
}

This works OK but makes the code seem unreadable and maybe even brittle because we're depending on a magic constant that comes somewhere from the Windows API. I'm not sure if this code works outside of Windows at all, but even if it does, it's definitely not obvious.

Please add a dedicated exception type for this kind of failure.

api-suggestion area-System.IO

Most helpful comment

that matches library capabilities of other programming languages (C, Go, ....).

I find it hard to believe that C# programmers would expect the API to follow the patterns used in C and Go rather than C++ and Java.

All 62 comments

For what it's worth, this is something that shows up relatively often in .NET and it can be a little frustrating. Many APIs throw a generic exception type with no way to determine the failure other than checking the exception message or if you're lucky, an HResult.

Please consider this in future APIs especially in an area like IO where this is not a contract exception - these exceptions are meant to be caught and dealt with. There's no way of knowing whether a file system action would fail in advance.

Question: Why not just catch IOException. Why would you need to differentiate between this failure and other IO failures?

Answer: If the file already exists, that's fine and we don't need to do anything in that scenario. But in case of other IO failures, we might want to show an error message to the user, a "retry" dialog or have some other fallback logic.

This makes sense to me, right now we hand the user either 0x80070050 (ERROR_FILE_EXISTS) or 0x10014 (EEXIST). Same for the linked issue, the user has to look for ERROR_DIR_NOT_EMPTY or ENOTEMPTY or whatever it may be on any future OS.

That's doubly unfortunate because .NET is designed around exceptions not error codes. It is clumsy to handle both in the same method. Also, it forces platform specific code - and even on one platform several error codes may map to a single condition for example removing a non empty directory can give ENOTEMPTY or EEXISTS just on Linux. If the caller is not diligent, they may use the message field instead of the code, which will break when localized (or, more often if the OS is localized or changes the text).

If we do anything here, and it would be nice, it should not be bit by bit. We should do an audit to come up with a complete proposal, examining all the places where we overload IOException. In a quick glance it is used for include

  • Treating a directory as a file
  • Permission denied/read only (if not UnauthorizedAccessException)
  • File exists
  • Sharing violation
  • Invalid path in some cases
  • Part of path not found
  • Too large file mapping
  • Too many pipes
  • Invalid handle
  • Pipe disconnected
  • Write failed
  • Compression errors, eg., corruption
  • Copying to itself
  • Too many open files
  • File too large
  • File stream invalid position eg overrwite in append
  • Some registry access errors
  • Many network errors
  • Console errors, semaphores, ..
  • Memory streams
  • Any other IO error that's not specifically handled

Perhaps one reason IOException is so broad may have been a concern that it is easy to break code inadvertently by changing the exception type, if the differences are fine grained. (I know this is the reason that XmlException is used for so many distinct cases.) This does not seem like a good reason, with good unit tests and a stable platform. PathTooLongException, FileNotFoundException, DirectoryNotFoundException, DriveNotFoundException are all good stable fine grained exception types, and all derive from IOException. Also, if code depends on the error code, they already depend on the "type" of error.

As an aside we should also have a first class field for the path, if any. Right now that also has to be parsed out of the message if it's needed and not already known.

@stephentoub @JeremyKuhne @jkotas

We would only want to do this for cases that are significant, worth distinguishing, are not programming errors, are stable, well defined, and can be programmatically distinguished.

Looking through our code I see five potential ones:

  • DirectoryAlreadyExistsException mapping to EISDIR in context of rename/move directory or copy file to path of directory, or possibly when failure due to treating a directory as a file, some ERROR_ALREADY_EXISTS cases
  • FileAlreadyExistsException mapping to ERROR_FILE_EXISTS/EEXIST in context of FileMode.CreateNew, or extracting from zip to existing file, possibly ENOTDIR when treating a directory as a file, some ERROR_ALREADY_EXISTS
  • DirectoryNotEmptyException mapping to ERROR_DIR_NOT_EMPTY/ENOTEMPTY/EEXISTS in context of removing directory
  • UnauthorizedIOException mapping to ERROR_ACCESS_DENIED/EACCES/EPERM/?EBUSY if currently throwing IOException (not UnauthorizedAccessException)
  • SharingViolationException mapping to ERROR_SHARING_VIOLATION/EWOULDBLOCK except where eg it's due to attempting to copy a file to a directory. (Windowsism)

That leaves alone the long tail of random IO errors, and non file system errors (networking, semaphores, ..).

This is just a sketch of a possible choice, I did not analyze carefully.

https://github.com/dotnet/corefx/issues/31917 is related to this. The current file I/O APIs are broken for "file already exist" and other similar situations that the programs often want to recover from gracefully.

We can fix it by either:
(1) Introducing more fine grained exceptions, like what is proposed here. This option encourages using exception handling for control flow that is not desirable.
And/or (2) Introducing new set of non-throwing APIs, like what is proposed in dotnet/corefx#31917. The advantage of this option is that it is more robust better performing design that matches library capabilities of other programming languages (C, Go, ....).

My vote would be to spend energy on (2). It is the more forward looking option.

Are they complementary?

This is an adjustment to help all the existing IO code written using exceptions be a bit more robust, with the addition of little or no code in CoreFX beyond the types - just changing some places we throw IOException to throw something more specific. It is just tuning an existing exception hierarchy.

The other proposal is a new approach to IO, which will need substantial design, establishing a new set of abstract IO codes, etc, ... It will take time, and does not help existing code that is not worth rewriting to the try model.

I do not think they are complementary. The value of (1) is going to be lost once we have (2). Once we have (2), the fine-grained exception hierarchy will be obsolete - the guidance is going to use the new APIs and to not use exception handling for control flow.

I agree that (2) is more work than (1).

We have been in similar situations before. For example, we used to have number of suggestions for new API variants that take arrays or unmanaged pointers. We could have spent energy on adding those and it would certainly be an incremental improvement on top of the existing designs. We have not done that. Instead, we have spent our energy on adding Span that was a superior design. API version of the Innovator's Dilemma.

@jkotas I don't think this is quite the same as unmanaged pointers vs span. With span, you'll be able to use the same code as you would have if the API took an unmanaged pointer. It was worthwhile to wait for the new API because it was a generalization of the proposed one.

New non-throwing IO APIs on the other hand introduce a completely new paradigm. It's not a generalization. It's an alternative way of doing the same thing and it's somewhat a matter of style. I also believe that throwing and non-throwing alternatives should have the same feature set. Having the throwing versions lack certain exceptions for some very specific error codes, while having exceptions for the majority of other ones seems like it would be surprising to any user who isn't familiar with the fact that the non-throwing APIs are newer or that we're trying to encourage them to use the non-throwing API (which won't be obvious at all unless you actually make the exception hierarchy [Obsolete]).

Furthermore, it's not as simple as saying "do not use exceptions for control flow". In this case, you might consider "file already exists" to be part of control flow since it's an expected situation, but other kinds of IO failures would not be as expected and might be handled in a try-catch higher up the call stack. Exceptions would be more convenient for that use case.

that matches library capabilities of other programming languages (C, Go, ....).

I find it hard to believe that C# programmers would expect the API to follow the patterns used in C and Go rather than C++ and Java.

Having both non-throwing and throwing versions of the same APIs is common for parsing or collections in C#. It is not a completely new paradigm. The two variants do not always have the same feature set and the throwing version is missing completely in some cases.

C# programmers would expect the API to follow the patterns

I have said capabilities, not patterns. The capability here is to be able to do non-exceptional file I/O error handling in robust performant way. It means without using exceptions in C# - throwing an exception in C# is a lot more expensive operation than opening a file.

Java

I believe exceptions for control flow are generally more acceptable in Java than in C#. For example, Java has int.Parse only. C# has both int.Parse and int.TryParse.

C++

Exceptions for control flow are generally less acceptable in C++ than in C#. I believe it is fairly common to see error codes returned in C++, e.g. https://en.cppreference.com/w/cpp/filesystem/create_directory. The standard C++ library does not have rich I/O exception hiearchy to encourage using exceptions for control flow.

other kinds of IO failures would not be as expected and might be handled in a try-catch higher up the call stack

This issue can be addressed by the API design, e.g. the API can have enum that says when to throw vs. when to return an error code.

Having both non-throwing and throwing versions of the same APIs is common for parsing or collections in C#. It is not a completely new paradigm.

Sure. I meant it's a new paradigm to use with IO APIs. You will have to migrate your code over to this paradigm. It's very different from your example of pointers and Span.

I have said capabilities, not patterns. The capability here is to be able to do non-exceptional file I/O error handling in robust performant way.

I am not arguing against that, I like that proposal as well. But I think there should be consistency between these 2 paradigms. It's really a matter of style as to which one you prefer. I don't think we should be dictating that people use the new non-throwing if they don't want to deal with HResults.

The two variants do not always have the same feature set

But every time you have a throwing and a non-throwing alternative, the throwing alternative is never less complete - it's the Try-version that can lose some information. Isn't that the case?

I find it hard to believe that C# programmers would expect the API to follow the patterns used in C and Go rather than C++ and Java.

My point here was: users would also expect do to be able to the same thing with the throwing APIs as opposed to having to use the non-throwing one. Not that I don't want the non-throwing APIs to exist. But I think the throwing ones will always be seen as more idiomatic in .NET. It's definitely not something that people will consider to be obsolete.

I have said capabilities, not patterns

I guess what I meant was: "the preferred API to follow the patterns used in C and Go..."

It's really a matter of style as to which one you prefer.

We tend to form a strong opinion about the preferred style during the API design and add the new APIs around it. We do not generally add duplicate APIs to support other styles. Having one preferred style has a lot of value for platform usability. Also, the preferred styles tend to change over time as the new requirements or new fundamental features show up.

I think these are the questions to answer from the API design point of view:

  • What should the one preferred style for non-exceptional file I/O handling be? I am pretty sure that it should be the one that does not use exceptions for control flow.
  • Can we have both fine-grained exception hierarchy and error codes with full parity between them? Yes, but it comes with negative points for duplicate APIs.

I don't think we should be dictating that people use the new non-throwing if they don't want to deal with HResults.

Error codes do not imply HResults. .NET has number of domain-specific error code enums actually, e.g. SerialError, SocketError or WebSocketError. Maybe we should consider adding FileSystemError enum and matching property to IOException. The error code enum is much more lightweight than
fine grained exception hierarchy. And it would work nicely for both exception-throwing and error-code returning APIs and minimize duplication at the same time.

You will have to migrate your code over to this paradigm. It's very different from your example of pointers and Span.

I would expect that migrating the code to use error codes for file I/O would be much easier on average than migrating the code to use Spans. Opening a file and similar operations are usually scoped to a single method that one can change easily. On the other hand, Span is about passing data around. It is not unusual for Spans migration to involve non-trivial refactoring (changing many method arguments from arrays to Spans, etc.).

consider adding FileSystemError enum and matching property to IOException

That sounds fine to me too. I do think something should be done, as we have made it worse when we made it cross platform.

Maybe we should consider adding FileSystemError enum and matching property to IOException.

I think that's fine too.

Hey, I've got a file system error table lying around. The intended usage is

catch (IOException ex) when (ex.HResult == Emet.FileSystem.IOErrors.AlreadyExists)

catch (IOException ex) when (ex.HResult == Emet.FileSystem.IOErrors.AlreadyExists)

I don't think we want to change what the value of HResult is; even if we pick the existing Win32 values (which is rather limiting and do not necessarily match to sufficient distinct causes) that would break anyone expecting the Unix code today. Plus, it's a poor name for a cross platform code. On top of that, it's weakly typed as int rather than a nice enum.

This would require an API proposal:

  1. to add a property to IOException named perhaps Error or IOError (I think FileSystemError is a poor choice as, at least on Windows, it did not necessarily originate with a file system operation.). It would have the type of the enum in 2.

  2. to add a new enumeration named eg IOError and populate it with several useful entries perhaps based on my analysis above. We do not need to think of them all now, it's not a breaking change to add more later, but it would need a (brief) review so better to do it now.

@Neme12 any interest in doing this? The process is here and you would append to your top post because that makes it easier on the review committee.

@Neme12 any interest in doing this? The process is here and you would append to your top post because that makes it easier on the review committee.

Sorry, I was interested in making a proposal for the exception type, but I don't think I'm the right person to determine the shape of a new enum for all IO errors. My knowledge there is very limited. I would feel much more comfortable if someone else took over this.

No problem. I"ll let this sit here for a volunteer.

Maybe the property should be called IOErrorCode to match SocketException.SocketErrorCode, WebSocketException.WebSocketErrorCode and to a lesser extent HttpResponseMessage.StatusCode.

It seems it should include a None or Success value -- although that would never be used in an exception, in dotnet/corefx#31917 it would be returned even on success.

No problem. I"ll let this sit here for a volunteer.

@danmosemsft up-for-grabs?

@MarcoRossignoli yep, but this is a bunch of work up front to define and reach consensus on the cases/values for API review. I made a good start above.

As @jaredpar has taught me, File.Exists is evil. There's no guarantee that because File.Exists returned false, the file still won't exist when calling File.Open

It's not evil, it's a run-of-the-mill race condition :)

File,Open is not very intention-revealing. This doesn't leave a very stable foundation for making generic robust code. File.Open is here to stay, it's foundational, a _wrapper_ to implementation details. It's probably best to treat it as foundational and not use it directly when possible. E.g. Because File.Open is used by many things with different goals it's near impossible to change it (non-breaking) to suit specific needs or scenarios.

So, the real solution is to have new intention-revealing interfaces (methods) that can encapsulate this type of logic. A quick-and-easy intention-revealing naming technique is the _desired outcome_. Since the intention isn't to simply open the file, the intention is to perform an operation on the file, the required outcome is a concrete reader or writer. e.g. the outcome in your example is to have a writable stream or a stream writer. So, we could create a method with a name describing that outcome. Maybe something like "CreateStreamWriter"? But, StreamWriter is a bit vague too, and your example uses Create, so maybe "CreateEmptyFileStreamWriter"? But, wait, don't we have a method that does this already? File.CreateText does exactly what your code describes

Creates or opens a file for writing UTF-8 encoded text. If the file already exists, its contents are overwritten.

So, I think there's a better way to provide value than to create a new FileAlreadyExistsException exception. While I might not think the naming is perfect, there's already something there that provides the _end goal_ without adding a new _means goal_ that gets you to the same end goal without spending much time ensuring a non-breaking change and reaching consensus.

I don't think we want to change what the value of HResult is

Correct. They way my thing lying around works is it's a platform-specific binary. If you build for win32 you get one version of Emet.Filesystems.dll. If you build for Linux you get a different dll. If you build platform-independent the loader selects the correct one at runtime.

@peteraritchie

File.Exists is evil because it returns false on transient IO errors at the hardware level (not a bad disk--SAN was overloaded). I've lost data due to this.

@joshudson Another good reason to provide functionality to get the outcome you need rather than try to change something that can't realistically be "fixed" because it already does what it's supposed to do.

@Neme12 hope you don't mind me changing the title.

I put this into Emet.Filesystems on nuget. IOErrors exports read-only variables that warp to the correct values depending on architecture.

This is still up for grab right ? I would like to do this. If what to do is fixed that is.

Also, this is the first time i am going for this project or open source for that matter. Do you think that its a good start for me ? as it does look straightforward enough. Or should i be looking at something else ?

@SweetShot: You need header files for a bunch of OSes and a lot of patience but not much else. I don't work here but I say go for it.

As up-for-grabs means it has the go ahead, go for it. Worst that can happen is the PR isn't accepted :-)

This is still up for grab right ?

What is up for grabs here is making an official proposal. Part of that proposal could (and probably should) include some level of a sample PR.

@SweetShot do you have what you need to start?

I have sent you a collaborator invite. It is simply so I can assign you this issue. If you accept, it auto subscribes you to the repo. You probably don't want that, and you can switch it off by going to the repo homepage and changing to 'unwatch' at the top.

I have accepted and i will start working on it. As this is my first time, if i need any help will reply here.

Thanks !

+1. Our team uses .NET as a runtime that abstracts out the platform and whether HRESULT or an enum we need a uniform way to identify distinct situations to take corrective actions. And sometimes it is not specific enough at an exception level and therefore HResult is used extensively.

I hope that the current mapping behaviour in https://github.com/dotnet/coreclr/blob/b88f2f635b02c130ae00b4d8aee3e766a60698d5/src/pal/src/file/file.cpp#L3330 will continue, as we depend on it. For example the "write failed" class mentioned by @danmosemsft has sub-cases like ENOSPC mapped to ERROR_DISK_FULL, which we depend on taking corrective action, such as telling the customer to add storage. This is the value of .NET, that we can just check HResult everywhere and not bother about the platform.

Well, that explains why when I tested for the behavior I didn't find it. Only file errors map error codes, not file system errors.

@SweetShot could you give an estimate of your timeline? No pressure or obligation - I realize this is volunteer work 馃槃 but I would also be happy to get this into 3.0, and right now we are 3 previews deep into that schedule. If you're not likely to have much time in the near future, then myself or someone else might be able to take a look.

Hey @danmosemsft I have time to work on it, time is not a problem. So far i could just get the project setup and compile/run tests. Now i will be actually working on this. As i have never worked on this before it will take me time to understand and implement it (especially this is why I cant really give a time estimate on it right now, will be probably able to give by end of this week). So if you need it for next release feel free to pick it up. Meanwhile i will keep working on it, so even if not this time, will be surely able to complete next issue :)

@SweetShot bear in mind there's two separate phases here and you only need to look at the first right now -- which is to make an API proposal. That means in this case
1) Do an analysis of all the places that throw IOException (or a derived exception) in our codebase, and determine what an appropriate "cause" might be for each one. I have a start at that here and here. I would also look through documented Win32 error codes (look for a list on the internet with codes like ERROR_ACCESS_DENIED AND ERROR_INVALID_DRIVE etc) and Unix error codes (like ENOSPC) The goal is to define a proposed enumeration of "causes" that has a reasonable granularity, not too coarse or fine grained, with reasonable names for each one. This enumeration should have minimal or no OS-specific entries, hopefully everything is OS abstracted (like "access denied", "file not found", "disk full" etc). Also pick a name for the enumeration like the FileSystemError proposed above.
2) Create a new issue in the format described in the API review guidelines and link to that.
3) We will gather comments from the community including folks above.
4) I will cause this to go to formal API review.
5) Based on feedback we make any adjustments they require.

The work above involves no actual coding. After that phase the 2nd phase is implementation. It's TBD who actually does that. Could be you if you like and your time allows!

If you schedule allows you to complete item (1) and (2) within the next 1-2 weeks that's should be enough time to do the implementation for 3.0 release.

Let me know if you have any questions at all.

@danmosemsft I think I should be able to do item (1) and (2) from that in given timeline for sure :) Already working on it.

Hey @danmosemsft by

all the places that throw IOException (or a derived exception) in our codebase

you mean only corefx or also coreclr ? because IOException seems to be shared in both with exact same file. I am not sure how coreclr and corefx are linked but a lot of test cases in corefx projects expects IOException thrown by methods/classes from coreclr.

and it seems like the file IOException from corefx and IOException from coreclr are shared even though they have different commit hash (probably because of two different repos). So addition of Enum will affect both ?

Let me know if I have to check in both repos.

@SweetShot IOException born in CoreClr repo and it's automatically mirrored(for compiling needs and code reuse) by a bot on other repos, read markdown https://github.com/dotnet/coreclr/tree/master/src/System.Private.CoreLib/shared
Usually types born in CoreFx repo, but sometimes we need to use these also on "managed" part of CoreClr(System.Private.CoreLib solution https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/System.Private.CoreLib.sln) because for some reason are linked/used by "internal workings of the runtime"
The update of IOException should be done on coreclr repo, base class library(where we expose some types defined on coreclr) and tests lives in corefx.
The "analysis" of usage I think in both, I don't know at the moment if there are some exception intherits IOException on coreclr and exposed to users by corefx.
I hope this helps I'm a contrib and maybe missing something in that case @danmosemsft will correct me!

@MarcoRossignoli Hey, also Im wondering how do I open all files src while loading Test solution for any project. Because I am on windows and non windows files like FileSystem.Unix.cs, FileSystem.WinRT.cs dont show up directly in the solution and i have to do it manually by adding files to solution. Is there a way to configure it to load all files ?

@SweetShot: I believe you can't because they can't be in the same solution at the same time. However, Visual Studio's solution explorer has a Show All Files button. Try it.

@SweetShot AFAIK you cannot load unix files on windows(I use VS Code on linux vm for that), but for analysis you could try with some grep/findstr command on all *.cs files(in past I did it for task like this).

Right, personally I use VS Code and lots of grep/searching and navigating around, making notes.

Got it ! and what about coreclr ? should I look into that too ? So far I have gathered all locations in corefx with IOException with reason/possible enum value. Possible enum names are not finalized but draft is available here

Hello @SweetShot

  1. Yes, please also look in coreclr under src\System.Private.Corelib.

  2. Please let's couch in terms of an enum with standard naming. You might find it easier to use a Github Gist instead of a spreadsheet as it has colorizing and shows changes.

I think your list is a little too fine grained. We do not want very fine grained because it can make it harder to have stable error codes, also it is more API burden, more to document, and more code to write to check the codes. We only need to cover the major cases. Maybe only filesystem ones for now. We can always add more if a use case is proven.

Eg., how about starting with this. What have I missed, or can be removed? Would you reword any of these?
c# public enum FileSystemError { FileNotFound, FileAlreadyExists, DirectoryNotFound, DirectoryAlreadyExists, DirectoryNotEmpty, // When removing a directory DiskNotFound, // or "Volume"? DiskFull, // or "InsufficientSpace" ? DiskNotReady, // Disk exists, but cannot be used. Do we need "DiskReadOnly" ? AccessDenied, // Insufficient privilege FileInUse, // Sharing violation FileTooLarge, InvalidName, // Invalid characters in name InvalidPath, // Path too long, malformed? Same as InvalidName case? InvalidHandle, WriteFailure, // Random write failure, or beyond end, etc. ReadFailure, // Same for read InvalidStream, // Bad stream or invalid location in stream passed to the API InsufficientResources, // too many open files, handles OutOfMemory, CopyToSelf, // Cannot copy over self DecompressionFailure, // Invalid content of compressed file RemotePathNotFound, // ??? something for trying to write to \\doesnotexist\foo ? server not found? share not found? Undefined, // We would set this for all other cases, including non filesystem issues, hopefully the HRESULT is useful instead, or at least the message text... }

Hello @danmosemsft

I have looked into places with IOException in coreclr shared, it doesn't have any newer categories that can be added to the FileSystemEnum.

I validated proposed enum against all possible causes, considering only filesystem for now, this enum covers all cases but Move across volumes (which i have added). I also have added more comments for extra errors where same enum value can be mapped.
As far as InvalidName and InvalidPath is concerned I think it should be kept separate as it comes from 2 distinct coding errors by developers and not cause of os.

Also InsufficientResources and OutOfMemory sounds similar at least to me, while we know they denote two separate conditions. Maybe InsufficientResources can be named as TooManyHandles it will also include files as files are handles anyway.

    public enum FileSystemError
    {
        FileNotFound,  
        FileAlreadyExists, 
        DirectoryNotFound,
        DirectoryAlreadyExists,
        DirectoryNotEmpty,   // When removing a directory
        DiskNotFound, // or "Volume"? Same as device unavailable?
        DiskFull, // or "InsufficientSpace" ? 
        DiskNotReady, // Disk exists, but cannot be used. Do we need "DiskReadOnly" ? 
        MoveAcrossVolumes, // can be clubbed with InvalidPath ? i.e. dest path for move?
        AccessDenied,  // Insufficient privilege, Also for Invalid Mode? write on readonly? bad file mode?
        FileInUse, // Sharing violation, Already open 
        FileTooLarge,
        InvalidName,   // Invalid characters in name 
        InvalidPath, // Path too long, malformed? Same as InvalidName case?
        InvalidHandle, 
        WriteFailure,  // Random write failure, or beyond end, etc. also for io_http/net_io
        ReadFailure,   // Same for read
        InvalidStream, // Bad stream or invalid location in stream passed to the API, maps to SeekBeforeBegin/AfterEnd
        TooManyHandles,   // too many open files, handles, 
        OutOfMemory, // Memory mapped files, 
        CopyToSelf, // Cannot copy over self
        DecompressionFailure,  // Invalid content of compressed file maps to unexpected end of stream, dir name with data, extract outside zip
        RemotePathNotFound, // ??? something for trying to write to \\doesnotexist\foo ?  server not found? share not found?
        Undefined,  // We would set this for all other cases, including non filesystem issues, hopefully the HRESULT is useful instead, or at least the message text...
    }

You've missed quite a lot that I've actually had to handle.

InvalidFunction or NotSupported, // Windows calls this InvalidFunction while *nix calls this ENOTSP; not sure which is better
TooManyLinks,  // Too many hard links to this file
OutOfFiles, // cannot create any more files on this device
TooManySymbolicLinks, // ELOOP path traversal tried to descend too many symbolic links
ReadChecksum, // disk reported CRC error on read
IsADirectory, // tried a file operation on a directory
IsNotADirectory, // tried a directory operation on a file
NotATerminal , // tried a terminal operation on a pipe
NotSeekableDevice, // tried to seekon something that can't be seeked on
BrokenPipe, // tried to write to a pipe when the reader is closed
QuotaExceeded, // there's plenty of space on the disk, but not for this account
OperationNotSupportedOnSymbolicLink, // Windows does not allow hard link to symbolic link
RemoteIOError, // WriteFailure or ReadFailure on the other end of a network filesystem reports this; I suppose you *could* fold this into WriteFailure  or ReadFailure; I only handle this one specifically for generating better error messages.
DeletePending // tried to open a file with a pending delete; can possibly be folded into FileInUse

.

InsufficientResources,   // too many open files, handles

That name is rather horrible. TooManyOpenFiles would be much better. Ninad Sheth is already confused by the name.

Also, clubbing MoveAcrossVolumes into something else would be bad for me. I handle this one specifically too.

Thanks @joshudson.

Some of these are fine grained distinctions.

  1. Would the caller know what to test for? For example if I try to create a directory at the location of an existing file, would the caller know whether to expect IsADirectory or FileAlreadyExists?
  2. Can we make the distinctions all platforms from the error code without extra cost? (For example, when CreateDirectory fails on Windows with ERROR_ALREADY_EXISTS, we already have to do more IO to determine whether it was a file or a directory that already existed.)
  3. Are these meaningful and well defined on all platforms, if that matters? Part of the purpose here is abstracting platforms.
  4. What granularity does the app need? If it is not much (eg app wants to choose between "ask user for a new location" and "just show IO failed") then testing against many possiblities means some verbose code -- unless this becomes a big [Flags] enumeration with some catch-all entries...
  5. Is Undefined an appropriate catch-all given in many cases we will have a well defined distinct error code, it just simply isn't represented in the enumeration (yet?)

For you @joshudson I'm guessing you want as granular as possible, since you're using HRESULT already. @arunjvs is this also true for your case?

For example if I try to create a directory at the location of an existing file, would the caller know whether to expect IsADirectory or FileAlreadyExists?

IsADirectory is never the result of a name collision; it is only ever the result of trying to do something like new FileStream(@"C:\", ...); Same general idea for IsNotADirectory

On looking back at it, I think we can fold QuotaExceeded into DiskFull.

Specific cases that come up a lot:

try {
  // ...
   CreateHardLink()
   // ...
} catch (InvalidFunction) {
    /* Use the FAT algorithm */
}

try {
    new FileStream(...)
} catch (IsADirectory)
{
    /* recursive call */
}

try {
    file.Position = somebigvaule;
} catch (NotSeekableDevice) {
    /* read-discard loop to get to new position */
}

for (int tries = 0; tries < number; tries++)
{
   try {
       file = new FileStream(...)
    } catch (FileInUse)
    {
         Threading.Sleep(30);
    }
}

try {
    DeviceIOControl()
} catch (NotATerminal ) { /* Yoda would approve */ }


try {
} catch (BrokenPipe) throw new ThreadAbortException();
} catch (IOException) { /* stuff to do */
}

In theory, the following reduce mappings work well enough:

  • DirectoryNotFound -> FileNotFound
  • QuotaExceeded -> DiskFull
  • OutOfFiles -> DiskFull
  • QuotaExceeded -> DiskFull
  • DecompressionFailure -> ReadFailure
  • InvalidHandle -> die! (throw something uncatchable)

I have never had to catch InvalidStream but only because seeking past the end is valid.

I created a table of the error code mappings I know of and where possible, specific causes. I also checked this against the various mappings in .NET Core itself. If you have suggestions to improve/fill in this table, let me know and I will update it.

Perhaps this will help us identify a reasonable granularity of causes, that are also meaningful across platforms.

https://gist.github.com/danmosemsft/ee382954ebdda9807d54a569dd662eb4

@joshudson @sweetshot @tmds any feedback on my table?

Incidentally, as a separate issue, one coudl argue for a Path field on IOException. Most often it is only accessible by parsing the string.

I posted some comments on the gist. I don't see anything to add to them.

@joshudson my bad, I missed those somehow. thanks.

@danmosemsft I had not picked up this issue, I'll take a look at the issue and table coming week.

@tmds for sure no urgency on this one, at this point it must be a post 3.0 change if we do anything here.

I've read through the issue. Once the enum is there, it's Undefined value can later be split into more specific values (the user should add a catch IOException, and not check Undefined). Other values can not be split, because that will break existing code.

The added-value of the enum is to be able to handle certain specific values (Exists being the one requested in the issue). Programs won't be able to handle all cases, so for some errors, the UI will show the Exception.Message.

Perhaps we should keep the enum small to begin with, and map most errors to Undefined.
We can see on a per method basis what error the programmer may want to explicitly handle.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

EgorBo picture EgorBo  路  3Comments

jchannon picture jchannon  路  3Comments

GitAntoinee picture GitAntoinee  路  3Comments

yahorsi picture yahorsi  路  3Comments

matty-hall picture matty-hall  路  3Comments