Chapel: Allow messages in new Errors

Created on 18 May 2018  路  27Comments  路  Source: chapel-lang/chapel

Straightforward feature request, allow Error() to accept a message which is displayed when the error is thrown.

Libraries / Modules Feature Request user issue

Most helpful comment

With the caveat that I'm probably the least error-savvy person commenting here, the name StringError suggests to me "an error occurred when you tried to do something with a string" rather than "this error carries a string with it that will serve as its default message". Is there a precedent for this name, or another like what we're looking for, in other error hierarchies? (I generally like the idea of having such a subclass of Error, so am only balking at StringError as the name).

All 27 comments

I believe the string field was intentionally left out of Error() base class for performance reasons (although it's not totally clear to me) - @mppf or @psahabu could probably elaborate further. There has also been discussion about providing a subclass of error that would provide this feature.

I think either of these solutions would be nice from a user-perspective.

It's more that something doesn't need to have a string error in order to be throwable. But I have no objection to StringError or something like that.

We have this in NumSuch

class DimensionMatchError : NumSuchError {
  var msg: string,
      expected: int,
      actual: int;

  proc init(expected: int, actual:int, msg="") {
    super.init();
    this.complete();
    this.expected = expected;
    this.actual = actual;
    this.msg = msg;
  }

  proc message() {
    return this.msg + "Error matching dimensions.  Expected: " + this.expected + " Actual: " + this.actual;
  }
}

So I'll use that for now, but it feels like a base language issue. How do the other kids do it?

Going off of what @mppf said, Error is used to create a throwable thing that can print a message. For a throwable thing that contains only a string for the message, we can create a separate StringError.

With the caveat that I'm probably the least error-savvy person commenting here, the name StringError suggests to me "an error occurred when you tried to do something with a string" rather than "this error carries a string with it that will serve as its default message". Is there a precedent for this name, or another like what we're looking for, in other error hierarchies? (I generally like the idea of having such a subclass of Error, so am only balking at StringError as the name).

It's more that something doesn't need to have a string error in order to be throwable.

Error is used to create a throwable thing that can print a message.

@mppf && @psahabu - Say the base class Error did have a message field that defaulted to an empty string. What would the impact be on users?

My desire for this (and I imagine @buddha314's as well) is coming from python, where you can attach a message to any exception class:

raise NameError('Something went wrong')

What would the impact be on users?

I've been curious what others would say about this, but since nobody has yet... my assumption is that it would require storing and constructing the empty string for all errors, even if some swath of error cases didn't want or need it. Hard to say how bad that would be. (I also have to admit that I dislike thrashing in design and wish that @ben-albrecht had raised the question at the recent time when we were removing the string from the top-level Error class rather than now).

Reversing the question: What would the impact be on users if you had to use an "ErrorWithString" subclass to get the Python-like behavior rather than the top-level "Error" class?

Of course, I have an opinion! Doing new Error("hey stupid!") is pretty natural, it's how I treat my kids. However, if an Error Primer said something like "the default Error class does not take a string argument for performance reasons. Instead use ErrorMsg("hey stupid")" I think that would be pretty easy to find an follow. I'd even Stack Overflow it if you like when it's ready.

What would the impact be on users if you had to use an "ErrorWithString" subclass to get the Python-like behavior rather than the top-level "Error" class?

One impacted use-case is a user trying to give a custom error message to an error-type from the standard error hierarchy:

throw new IOError('Unable to write to output file');

With some error types including the message field and others not, this could lead to confusion for users when they were able to override the attached message for one type of error, but not another.

Other than that, needing to use ErrorWithString (or whatever it would be named) over Error would probably be a common gotcha for users coming from languages with message-supported errors. I agree with @buddha314 that this could be dealt with by providing good documentation and examples.

Could you get both behaviors and keep the messageless efficiency if
the Error's optional message is stored indirectly in another class:

class Error {
  var msg: owned aString;

  proc init() { msg = nil; }
  proc init(s: string) { msg = new owned aString(s); }
}

What about the void type?

class Error {
  var msg;
  proc init() {
    msg = _void;
  }
  proc init(s: string) {
    msg = s;
  }
}

I thought then catch e: IOError wouldn't compile due to
genericness, since e.msg would need to be of a single static type.
Guess I could have tested it.

@cassella Don't over-think it. Just launch the question out there and realize 5 minutes of pride for "you idiot, it doesn't work" is cheaper than 20 minutes thinking you're doing it wrong.

@cassella - that may very well be true, I haven't fully tested. I was just throwing ideas against the wall

I tried both suggestions out in ChapelError.chpl, with the following test case:

proc main() throws {
  throw new Error('foo');
}

The class-wrapped approach seems to work as intended, whereas the void approach results in the following error when I tried it:

$CHPL_HOME/modules/internal/ChapelError.chpl:113: error: returning a generic type variable is not supported
$CHPL_HOME/modules/internal/ChapelBase.chpl:887: internal error: assertion error [functionResolution.cpp:1927]
Note: This source location is a guess.

An approach like this is appealing to me, if we're OK with error instances carrying around an extra pointer by default.

Reminded of this issue when I recently had to reimplement Errors with messages yet again.

To summarize where we left off, class-wrapping the message and storing as nil by default seems like a reasonable solution to providing messages in Errors by default without a significant performance overhead.

Are there any objections to this approach?

I keep hoping to hear more from @mppf on this issue since I believe he was involved in the effort to remove the string from the root Error class (?). I don't trust my instincts, because I've never really worked much with languages that have an established error hierarchy, but wonder whether the inconvenience in having to class-wrap a message is really better than either learning to use a StringError subclass or always paying the cost of an empty / unused string.

but wonder whether the inconvenience in having to class-wrap a message is really better than either learning to use a StringError subclass or always paying the cost of an empty / unused string.

Good point.

As demonstrated in the above example, there would be no inconvenience to the user, since the class-wrappedness of the string is handled within the implementation.

For the implementer of new error subclasses, we might be able to abstract away the fact that message is a class-wrapped string, through a mechanism like forwarding.

I keep hoping to hear more from @mppf on this issue since I believe he was involved in the effort to remove the string from the root Error class (?).

I said earlier: https://github.com/chapel-lang/chapel/issues/9584#issuecomment-390312083

It's more that something doesn't need to have a string error in order to be throwable.

In other words, since we can only throw a subclass of Error, I wanted that class to be as simple as possible. Having a string field is not fundamental to error handling, even if it is convenient. I still think this is a good reason.

What would the drawback be of having Error containing a string field actually be? Besides making the interface less "pure", it adds to memory usage. I doubt it would have any significant performance impact. In fact I think the discussion about the performance impact of it is a distraction from the real problem, which I get to below.

@buddha314 said earlier:

Of course, I have an opinion! Doing new Error("hey stupid!") is pretty natural, it's how I treat my kids. However, if an Error Primer said something like "the default Error class does not take a string argument for performance reasons. Instead use ErrorMsg("hey stupid")" I think that would be pretty easy to find an follow. I'd even Stack Overflow it if you like when it's ready.

I think having a class ErrorMsg : Error that had the string field, possibly with another name, like StringError would be OK. We'd just need to call it out in the Errors document.

In any case doing something like throw new Error("It didn't work") or throw new ErrorMsg("could not open file") goes against what I would consider to be reasonable usage of the error handling mechanism. The error handling mechanism allows errors to be filtered/handled/caught/rethrown based on the dynamic type of the error. I really really really don't want to see libraries that throw Error with different string messages and then applications that check for particular string error messages to handle them. I think that not having a string message field in Error nudges users in the right direction (you should be creating new subclasses of Error).

For this reason, even if we added ErrorMsg, I'd want it to be pure virtual; I'd want you to not be able to create and throw one directly, just as I don't think it's reasonable to throw new Error(). I don't think we have a good way to enforce that today.

This is one of those moments where there's a tension between code that's quick/easy to write and code that makes a reasonable library. My hope is that the protoype module idea makes space for quick code, but it's certainly the case that even in that context one might want to "throw an error" when sketching out the program. That would be the time to use ErrorMsg directly, and maybe new ErrorMsg would only be allowed in prototype modules.

I think having a class ErrorMsg : Error that had the string field, possibly with another name, like StringError would be OK. We'd just need to call it out in the Errors document.

As mentioned above, one drawback to this approach is that it leads to a division of errors in the error hierarchy - those that do have messages and those that don't. Users who tend to attach additional information to errors will inevitably stumble over this division constantly, because it won't be easy to remember which errors fall into what category. Giving up some of the purity of the minimalistic base Error class will remove this "gotcha" from the language's error handling implementation.

The error handling mechanism allows errors to be filtered/handled/caught/rethrown based on the dynamic type of the error. I really really really don't want to see libraries that throw Error with different string messages and then applications that check for particular string error messages to handle them.

Messages attached to errors are generally useful for carrying additional information beyond the error name and stack trace in a user-friendly format, to help the user understand what went wrong if the error is thrown and uncaught. I think including additional information about the error is a good practice we want to nudge users towards.

The argument about abusing the messages to determine flow of the program is interesting. I would consider this kind of abuse well-established as a bad practice and wouldn't put too much weight on swaying users away from it in our design decisions. Anecdotally, I have not yet witnessed this kind of abuse in languages that support messages in errors by default, such as python.

I think that not having a string message field in Error nudges users in the right direction (you should be creating new subclasses of Error).

I don't buy this as an argument for not making the base class more generally useful. If we want to disallow throwing a base Error, we should look into mechanisms of enforcing this.

Summarizing my take on this:

Today, attaching a message to an error requires a user to look up the Error.chpl implementation, find an example of an error that attaches a message such as IllegalArgumentError, and define their own error type with a message field, borrowing a subset of the implementation from IllegalArgumentError. I believe we all agree this is not an acceptable story for attaching messages to errors.

Supporting ErrorMsg or StringError would be a step in the right direction from where we are today. However, unless we come up with an easy way to distinguish errors with and without messages (such as naming scheme, submodule division, etc.), I predict that users will frequently stumble over the pitfall of passing a message to an error that was derived from Error instead of StringError. This will impact users coming from languages with messages built into all errors, especially.

I think sacrificing some purity of the base Error implementation is worth the increased usefulness and consistency of Chapel's errors.

I said earlier: #9584 (comment)

I'd seen that, but was hoping for more, which is why I asked for it (and got it this time, thanks).

However, unless we come up with an easy way to distinguish errors with and without messages (such as naming scheme, submodule division, etc.), I predict that users will frequently stumble over the pitfall of passing a message to an error that was derived from Error instead of StringError.

I'm not sure I buy this argument Ben. How is this different from any other class hierarchy where you can only understand what properties a given subclass does or doesn't have without paying attention to it?

Logistically, I'd like to park this discussion until we get past the code/doc freeze for 1.18 given that it's not something we're likely to act on in this release (nor do I think you're proposing we do). Everybody has lots to do and I'd prefer not to rush this discussion or have it impact their other deadlines.

Ran into this again

Ran into this again

@lydia-duncan - Whenever you get time, could you elaborate and maybe mention if it's a use-case that hasn't been explored yet?

Fair XD

I wanted a good partial solution to "hey, there should probably be an Error subclass hierarchy here but that'll take design time and in the meanwhile I just want to throw an Error instead of halting". Being able to write new SomeError("my message") would allow me to give information about what went wrong in the meanwhile.

FWIW, I think the base Error class not supporting a string message
field is going to make things more confusing and less convenient for
users.

I think @ben-albrecht's saying that given a hierarchy of Error that
doesn't include a string, (say, NodeError) every time you want to
add a subclass (say, NodeMemError) that takes a string, you'd have
to do some tedious work to hook it up to the new class's .message().
(Possibly combine it with super.message()?)

(And if the answer is to have NodeError and its whole hierarchy be
stringful, wouldn't that be the case for most error hierarchies? And
if so, I think it would be more convenient if that support were in
Error to start with.)

If I follow @mppf's Error / StringError suggestion, I think when
defining your own hierarchy, you'd have to have separate hierarchies
for stringless and stringful errors. (E.g. one starting from
NodeError: Error, and one from NodeStringError: StringError that
NodeMemError could inhert from). But then when someone wants to
catch all errors that are conceptually Node errors -- and only those
errors -- they'd have to write something about both hierarchies, even
though the stringfulness of some of the errors is immaterial to them:

try {
   something;
} catch e: NodeError {
   do something about a NodeError;
} catch e: NodeStringError {
   do probably the same something;
}

(Since NodeError derives from Error and not from StringError, I
can't make NodeMemError be a subclass of NodeError and also
inherit StringError's stringfulness. So I can't write a single
catch that will catch only my Node-related errors since they're
split between two hierarchies.)

(And if NodeError has any nontrivial implementation, it would need
to be replicated in NodeStringError.)

Would it help to consider the actual string field to be a private
field? (Looking towards support of private fields.) If one of the
base Error class's initializers accepted a string, but there was
no direct way for a catcher to see it? (Besides calling .message()
which they can already do)

I think @ben-albrecht's saying that given a hierarchy of Error that
doesn't include a string, (say, NodeError) every time you want to
add a subclass (say, NodeMemError) that takes a string, you'd have
to do some tedious work to hook it up to the new class's .message().
(Possibly combine it with super.message()?)

I think this is just going to be the way it is. E.g. even if we had Error have a string field, a subclass like NodeError could choose not to use it, and not provide an initializer accepting a string at all. Then the subclass wanting a string argument (NodeMemError in your example) needs to create a proc message that combines the string with the parent error's message.

(And if the answer is to have NodeError and its whole hierarchy be
stringful, wouldn't that be the case for most error hierarchies? And
if so, I think it would be more convenient if that support were in
Error to start with.)

I don't think the existence of the field is what matters here as much as what NodeError provides as an initializer. Does it provide initializer accepting a string message?

If I follow @mppf's Error / StringError suggestion, I think when
defining your own hierarchy, you'd have to have separate hierarchies
for stringless and stringful errors.

I don't see why having a string field in Error changes this situation at all. The stringless/stringful errors are defined by whether or not their initializer accepts a string, and that's still up to whomever designs these classes. When designing such a class hierarchy, it may in fact be reasonable to make all of them accept a string, and store that string in a field in a parent class. But it is not obvious to me why it helps this situation much to put the string field in Error. (If that did matter, couldn't such a hierarchy inherit from StringError?)

But back to the point in the issue:

Error() to accept a message which is displayed when the error is thrown.

I still don't think we should do this, because I'd like code to avoid throwing Error in general. (We do it in some tests of error handling, but I view it as an anti-pattern). This is the reason I'm much more comfortable with StringError.

If there were some reason why having a string field in Error would help, I'm open to that, but I do not want it to be possible to write new Error("Some Error Text") - or at least I'd want such code to generally produce a warning, since it's generally going against the design and intended use of error handling. In particular, how could one possibly catch and respond to such an error without starting to do string comparisons on the error message?

For archival purposes:

In an offline meeting we reached a consensus to allow messages in the base Error class (Implemented in #14545).

Was this page helpful?
0 / 5 - 0 ratings