Crystal: [RFC] Getting rid of Errno

Created on 20 Feb 2020  Â·  29Comments  Â·  Source: crystal-lang/crystal

Replacing Errno

Currently there are many places in the standard library where an Errno exception is raised when a syscall or a call to a C function returns an error condition. The value comes from the errno value and we use the system provided error message.

This approach has some issues, most of them already discussed (#8305, #5003):

  • Portability: even though errno should be defined by POSIX, different platforms have different error codes. Also Windows has a separate set of error codes.
  • Abstraction leak: the Errno exception handled at the user code it's sometimes leaking the syscall being executed underneath.
  • Hard to handle: there is a single Errno class (with no subclasses). Doing a rescue Errno catches all kind of errors. Also some IO functions raises either a IO::Error or Errno and callers must rescue from both classes.

Proposed replacements

Instead of raising a generic Errno, some specific exception classes will be used depending on the are where the error occurs. All these errors will contain, at least, the error message provided by the system for the errno code.

Note: These are the subclasses that I found so far necessary to add to the hierarchy but it's not by any means finished and I'm open to suggestions to add more.

IO::Error

IO module is where most of the errors coming from the system are raised. It makes sense because it's where most of the side effects happen.

IO::Error will replace Errno for all the general errors. However some subclasses of it will be created.

IO::Timeout

This one already exists, but it's not a subclass of IO::Error. This will be fixed.

IO::FileSystemError

This subclass of IO::Error will be raised for operations with the filesystem (mostly from File and FileUtils classes)

Some subclasses might also be necessary. So far I found the need to create a IO::NotFoundError so it can be handed specifically for some situations.

Other subclasses we could add: AccessDeniedError, FileExistsError, NotDirectoryError, etc.

IO::SocketError

Similar to the previous exception, but for errors directly related with a network socket.
There is currently a Socket::Error, so we can actually rename it and make it a subclass of IO::Error (Socket functions currently raises a mix of IO::Error and Socket::Error).

RuntimeError

For other situations where a system error occurs, a RuntimeError will be raised. For example, issues with threads, processes

Exception Hierarchy

+- RuntimeError
+- IO::Error
   +- IO::FileSystemError
   |  +- IO::NotFoundError
   |  +- ...
   +- IO::SocketError
   +- IO::Timeout
refactor discussion stdlib

Most helpful comment

I agree with @straight-shoota. Each namespace should contain their own set of errors. IO is an abstraction and musn't care about file system or socket errors that happen to use IO.

For example ENOENT (no such file or directory) has little to do with IO, it's related to the file system (open) _before_ actual IO (read, write). I think it would fit better as a File::NotFound < File::Error instead. Same for EPERM, EEXISTS, EMFILE, ...

All 29 comments

A few notes on naming:

  • IO::Timeout should also be renamed to IO::TimoutError to express its nature.
  • IO::FileSystemError: File and Dir are not part of IO. The error seems misplaced in IO namespace.
  • IO::SocketError is similar.

I'm also with @RX14 in https://github.com/crystal-lang/crystal/issues/8305#issuecomment-544484046 that these specific error types should have properties with specific error details.

There's also the unresolved discussion in #5003 about low-level access to errno numbers. I suppose this would probably be mostly relevant for RuntimeError which could - as I understand it - represent any kind of error that can be returned by a libc call.

Since this are all Errors eg. IO::TImeoutError should they be namespaced under error?

  • Error::IO::Timeout
  • Error::IO::FileSystem
  • Error::IO::Socket

I agree with @straight-shoota. Each namespace should contain their own set of errors. IO is an abstraction and musn't care about file system or socket errors that happen to use IO.

For example ENOENT (no such file or directory) has little to do with IO, it's related to the file system (open) _before_ actual IO (read, write). I think it would fit better as a File::NotFound < File::Error instead. Same for EPERM, EEXISTS, EMFILE, ...

I disagree, we should move File into IO::File instead 😆

Seriously speaking, I think any file system operation actually has to do a lot with IO. File system IS I/O. The problem here is that the module that abstracts binary streams is called IO, just like in Ruby. If we were using Stream instead, IO::Error could be used both for Stream and File errors with less controversy.

I forgot to mention in my post, but I took some inspiration from Java's FileSystemException. Even though the exception is defined in a different namespace, it still inherits from IOException.

We need to think in the practical use of these exceptions. There might be calls that could fail either at a filesystem operation or at the binary I/O. And it would be unfair for the caller to always remember to handle IO::Error and File::Error each time. We could move FileSystemError into the File namespace, but I still would make it inherit from IO::Error because of it nature.

Regarding IO::Timeout, I think the word "timeout" already stands for an error. Not the same with "file system". We don't have a strict rule in Crystal, just like in Ruby, for naming classes, and I'm ok with that. Given that said, I don't have a strong opinion on that, so if most people think IO::Timeout is not clear enough and don't mind about tiping five more characters each time, I'm open to rename it :)

One more think I forgot to mention. Just like Java's FileSystemException, FileSystemError have a file property with the path of the problematic file. And an optional other with another path for those errors related with two files (move, link, etc.).

I'm not a big fun however of exposing the original errno into the exceptions. It leads to non-portable code again. For those errors that have a probable use in error handling, let's create subclasses. Otherwise, I think most of the errors will be just logged, so exposing more detail in a structured way will just overcomplicate things.

Having File::Error inherit IO::Error sounds fine.

There are lot's of methods in IO dealing with timeout in a sense that's not immediately related to error handling. So IO::Timeout should be clear what it is about. IMO all error types should end with Error. I don't see any benefit from omitting that bit of explicitness just to save a few key strokes (which is completely irrelevant with autocomplete).

where the exception came from (filesystem, socket, misc.) is less important than what the error was (timeout, address resolution, file not found). So I'd prefer to not go crazy with subclassing depending on where the error was raised. Many of the filesystem errors can only be raised from filesystem code.

@straight-shoota you're right, that's a good observation about timeout

So, the hierarchy moves to something like this:

+- RuntimeError
+- IO::Error
|  +- IO::TimeoutError
+- File::Error < IO::Error
|  +- File::NotFoundError
|  +- File::AccessDeniedError
|  +- File::AlreadyExistsError
|  +- ...
+- Socket::Error < IO::Error

I like it. It reads a little bit better.

One think I still cannot make my mind yet is about the original errno. As I said a few minutes ago, I'm opposed of exposing it within the exceptions because it might lead to write non-portable code. And most of the time it will be just logged out and not handled in any specific way. On the other hand what if we didn't think in advance with enough completeness, and there is actually a scenario where the original error code is needed. Or what if logging just the translated errno to the corresponding message makes it harder to debug.

I see three options:

  1. The portability argument wins: let's hide the OS and errno and create as many exception classes as we need
  2. Let's add the errno within exceptions, maybe with a more generic and non-OS specific name (error_code?) but let's clarify in the documentation that we discourage writing code that depends on this value
  3. Let's raise another lower-level exception from the System classes (UNIXError, WinError) and wrap them on the previously described exceptions, setting them as the cause. Keep in mind that exceptions are slow, so raising twice have a performance impact.

What do you think?

3. Let's raise another lower-level exception from the System classes (UNIXError, WinError) and wrap them on the previously described exceptions, setting them as the cause. Keep in mind that exceptions are slow, so raising twice have a performance impact.

You don't neccesarily need to raise twice to provide a cause. I actually like this idea!

My take on this is quite the opposite, probably because I see cross-platform language platforms as a noble goal, but questionable utility. Either way, please don't take away my ability to operate within POSIX environment. When I see something like "I'm not a big fun however of exposing the original errno into the exceptions" I start to become nervous. I don't need anything beyond Linux and never will, please leave me my libc as is.

Having said that, Crystal's inability to seletively rescue Errnos is an invonvenience, leading to re-raising what shouldn't have been caught in the first place, so what has been proposed and rejected already is a Ruby-like Errno hierarchy. I still don't understand the reasons Crystal doesn't do this. Something to do with compiler optimisation? Too many classes?

I like this last hierarchy, too.

The problem here is that the module that abstracts binary streams is called IO

You're right. What we call IO is actually a stream :+1:
It's still kinda weird to have an IO::TimeoutError when a select is failing, or some other cases (a non responding Actor, a Future computation taking forever). Mostly because of this naming confusion, thought.

One think I still cannot make my mind yet is about the original errno.

  • We better not raise twice.
  • As long as the raised exception is explicit enough, there is no need to wrap the underlying Errno, and it would defeat the whole purpose of this RFC.
  • Generic exceptions (RuntimeError, SystemError) may need to permit access to the underlying errno or WinError value to not hinder the ability to react to them properly.

NOTE: there are too many WinError values to wrap them all.

I like this last hierarchy, too.

The problem here is that the module that abstracts binary streams is called IO

You're right. What we call IO is actually a stream :+1:
It's still kinda weird to have an IO::TimeoutError when a select is failing, or some other cases (a non responding Actor, a Future computation taking forever). Mostly because of this naming confusion, thought.

One think I still cannot make my mind yet is about the original errno.

My take:

  • We better not raise twice.
  • As long as the raised exception is explicit enough, there is no need to wrap the underlying system error, that would defeat the purpose of this RFC.
  • There are many errno values, and much more WinError values to consider mapping them all —we'd flood the compiler with classes.
  • Generic exceptions are needed in addition, and should give access to the underlying error value to not hinder the ability to react on them.
  • I'm not fond of per-target exceptions like UNIXError or WinError. Going along with the RFC, I'd prefer a generic SystemError.
  • I don't know how to report the error value... maybe a generic #value returning either an Errno or WinError enum value? We can't return a mere Int32 since it would be confusing whether we got an errno or WinError otherwise.

So the list above contains 81 errnos, is that a problem for compiler? That's an interesting question for my own future architectural decisions.

@stronny Yes, it's a bit slower: https://github.com/crystal-lang/crystal/pull/445#issuecomment-75605400

Personally I would happily trade this amount of time for nice classes but I guess I'm fine with status quo.

@stronny I'm pretty sure most other users wouldn't agree with that trade-off because it only offers a benefit for very system-specific applications. Still, the pretty limited use cases of libc functions in stdlib actually don't need to cover many different errors at all. That's why a few specific error types as suggested in this RFC suffice.

If you do custom calls to libc functions in your code, you can obviously make them raise whatever exception you want.

I disagree with this notion about system specificity, but as I've said I'm fine with current status quo. The proposed RFC seems unhelpful to me:

  1. Portability is an illusion in my opinion. I have no idea how you would develop correct programs that satisfy both POSIX and winapi architectures, but I have some ideas how that would hurt my ability to develop correct pregrams with no such goal in mind. What I would suggest is developing two different programs, moving all non system specific logic into a library that could be included from both places. Many more or less successful portable solutions do not operate on low level, requiring a system specific runtime, Crystal however does link native binaries.

  2. Abstraction leak is not a problem. Crystal is a low level language with GC.

  3. Hard to handle is a bit of an irritation indeed, however moving _some_ errnos into classes does not make it go away, fragmenting the semantics and turning a universal pattern, if dated and inconvenient into several special cases.

This is all of course my personal opinion.

Portability is an illusion in my opinion

The same Java code usually works pretty well in different OSes. I believe the same should apply to Go. C# has done an effort in that regard too. I think D, Nim and Rust are running the same in different platforms. I don't see why Crystal should not do this.

Abstraction leak is not a problem. Crystal is a low level language with GC.

Crystal is a high level language with low level constructs used to build the abstractions. But it's mainly a high-level language.

But yes, it's also my personal opinion, but also the way we designed and are designing the language.

For the record, Python uses an exception hierarchy like this, but all these
exceptions still derive from OSError which contains a platform-specific
error value. Thus, you can still access errno if you know you're on Posix,
but otherwise you can safely ignore it and use the platform independent
exception hierarchy.

On Tue, Mar 3, 2020, 7:58 AM Ary Borenszweig notifications@github.com
wrote:

Portability is an illusion in my opinion

The same Java code usually works pretty well in different OSes. I believe
the same should apply to Go. C# has done an effort in that regard too. I
think D, Nim and Rust are running the same in different platforms. I don't
see why Crystal should not do this.

Abstraction leak is not a problem. Crystal is a low level language with GC.

Crystal is a high level language with low level constructs used to build
the abstractions. But it's mainly a high-level language.

But yes, it's also my personal opinion, but also the way we designed and
are designing the language.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/crystal-lang/crystal/issues/8827?email_source=notifications&email_token=AAM4YSKZE3GIJ2M6467DBUTRFUEHNA5CNFSM4KYVQEMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOENTSSMI#issuecomment-593963313,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAM4YSKHWKDLTJPXGB25D4TRFUEHNANCNFSM4KYVQEMA
.

I won't argue my case further, I just find it peculiar that you compare Crystal to Java and C#. Maybe I'm just not aware of huge existing codebases, but my understanding is that Windows development does not usually involve Go, D, Nim or Rust.

When I coded in D I was using Windows and I made a native GUI app. So I think D in Windows works perfectly fine. Obviously to makee that GUI I used something specific to Windows, but the standard library and some packages are used worked flawlessly there.

Thank you guys for all the feedback! It's hard to please everybody on topics like this but I'm trying to make up mind on a model that fits well for the level of abstraction that we pretend for Crystal, but at the same time don't overcomplicate things for unexpected scenarios or people that is trying to push the language forward the limits we could have thought.

Checking how Rust deal with these things for example, they have a struct Error and the implementations of that struct may contain the OS error (errno) or not. There is a raw_os_error function that returns the int value in case the error instance has support for it.

In Crystal terms I think it could be translated to an optional value within the Exception class. So, exceptions created from an OS error could have that property, and although in most cases it wouldn't be necessary to check its value, it's still there for someone who requires it.

I'm already working on all these changes and I'm planning to refactor the current Errno class into an enum. The same would happen for WinError and I see this os_error property within Exception could have it's type different depending on the platform. So it would be an Errno for Unix and Errno | WinError for Windows. Platform dependent, but irrelevant for most cases if we design the exception hierarchy properly. It should be properly documented and discouraged to be used unless strictly necessary.

@waj Alternatively to a dedicated property raw_os_error, we could also use the already existing cause property for this. It's essentially the same concept: A specific exception like IO::NotFoundError is caused by some kind of OSError. So it would make sense to use that, when it's already available: IO::NotFoundError.new("error message", cause: OSError.new(Errno.value)). This adds only a little overhead for allocating the second exception instance. But that's negligible compared to the cost for unwinding the stack. We actually don't need to unwind twice: when both exceptions are instantiatedraised at the same location, they can share the same callstack instance.

@straight-shoota I considered that option but I don't like it for some reasons:

  • It feels hacky: It's not an actual raised exception
  • Checking for the OS error requires looking into the causes recursively
  • The stack is not always the same. Some places where there's an abstraction of the system the Errno is returned and processed somewhere up in the stack where the exception is actually raised.
  • An option is raise without the backtrace, but again it feels hacky
  • The backtrace is assigned when the exception is raised, not when created

I agree with @waj . All you need is the errno code, which is a number. In Windows it's probably similar. Adding an intermediate exception will make things a bit more cumbersome.

It feels hacky: It's not an actual raised exception

Well it's the libc equivalent of an error type in Crystal.

Checking for the OS error requires looking into the causes recursively

Are you suggesting raw_os_error should recurse?

Are you suggesting raw_os_error should recurse?
No, I didn't explain correctly. cause is recursive, so under the situation of nested exceptions it's hard to distinguish OS error from real nested (re-raised) exceptions.

I don't know. At the end I think this item is really subjective. I'm preparing the PR and I already have the Errno and WinError as enums. It provides all the available information with a really minimal overhead, and that's what we often do in Crystal: provide lightweight abstractions for simple things, with minimal or not overhead at all.

Was this page helpful?
0 / 5 - 0 ratings